Re: Long-running job OOMs driver process

2016-11-18 Thread Nathan Lande
+1 to not threading. What does your load look like? If you are loading many files and caching them in N RDDs rather than 1 RDD, this could be an issue. If the above two things don't fix your OOM issue, without knowing anything else about your job, I would focus on your caching strategy as a
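
To illustrate the N-RDDs-versus-1-RDD point in the reply above, here is a minimal sketch, not the poster's actual job. The function names `load_separately` and `load_as_one` are hypothetical, and `sc` is assumed to be an existing `SparkContext`; the sketch relies only on `SparkContext.textFile` (which accepts a comma-separated list of paths) and `RDD.cache`.

```python
def load_separately(sc, paths):
    """Pattern the reply warns about: one cached RDD per input file.

    Every element of the returned list is a separate RDD handle that
    stays cached until it is explicitly unpersisted.
    """
    return [sc.textFile(path).cache() for path in paths]


def load_as_one(sc, paths):
    """Alternative: read all files into a single RDD and cache it once.

    textFile accepts a comma-separated list of paths, so the whole
    input is represented by one RDD.
    """
    return sc.textFile(",".join(paths)).cache()
```

With a live context this would be called as `load_as_one(sc, ["a.txt", "b.txt"])`, where the file names are placeholders. Whether this change alone resolves a driver OOM depends on the rest of the job, which the thread does not show.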

Re: How do I convert json_encoded_blob_column into a data frame? (This may be a feature request)

2016-11-16 Thread Nathan Lande
json looks like)
> 2. In my case, toJSON on RDD doesn't seem to help a lot. Attached a screenshot. Looks like I got the same data frame as my original one.
>
> Thanks much for these examples.
>
> On Wed, Nov 16, 2016 at 2:54 PM, Nathan Lande <nathanla...@gmail.com>