Hi, I have two relations: *rows* (>10 GB) and *tinyDictionary* (<1 MB).
I want to take each tuple from *rows* and attach the whole *tinyDictionary* to it, then pass both to a Python UDF:

result = FOREACH someRelation GENERATE udf.my_python_udf(single_row_from_rows, whole_tinyDictionary);

How can I do that? There is a solution using DistributedCache, but I would like to avoid writing Java. Also, *tinyDictionary* could be spread across several files, which would be hard to deal with.
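
To make the question concrete, here is a minimal sketch of the kind of UDF I have in mind. The field layout is purely an assumption for illustration: the first field of each row is taken to be the lookup key, and *tinyDictionary* is taken to hold (key, value) pairs. It also assumes the usual Jython mapping, where a Pig tuple arrives as a Python tuple and a bag as a list of tuples. In a real Pig script the function would also need an output schema declaration, which is left out here so the snippet runs as plain Python.

```python
# Hypothetical UDF body: the name matches the call above, but the
# field layout (key in row[0], dictionary as (key, value) pairs) is assumed.
def my_python_udf(row, tiny_dictionary):
    # Build an in-memory lookup from the small relation's tuples.
    lookup = dict(tiny_dictionary)
    # Assumed: the first field of the row is the key to look up.
    key = row[0]
    # Return the original row with the looked-up value appended
    # (None when the key is not present in the dictionary).
    return row + (lookup.get(key),)


if __name__ == "__main__":
    sample_row = ("a", 1)
    sample_dictionary = [("a", "x"), ("b", "y")]
    print(my_python_udf(sample_row, sample_dictionary))
```

The open question is how to get the whole of *tinyDictionary* into that second argument for every row without going through Java.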