Hi,

I have a RDD (rdd1)where each line is split into an array ["a", "b", "c],
etc.
And I also have a local dictionary p (dict1) stores key value pair {"a":1,
"b": 2, c:3}
I want to replace the keys in the rdd with the its corresponding value in
the dict:
rdd1.map(lambda line: [dict1[item] for item in line])

But this task is not distributed, I believe the reason is the dict1 is a
local instance.
Can any one provide suggestions on this to parallelize this?


Thanks,
Best,
Peng

Reply via email to