Re: [Spark DataFrame]: Passing DataFrame to custom method results in NullPointerException

2018-01-22 Thread Matteo Cossu
Hello, I did not understand your question very well. However, I can tell you that calling .collect() on an RDD brings all of its data to the driver node, so you should use it only when the RDD is very small. Your function "validate_hostname" depends on a DataFrame. It's not
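
A minimal sketch of one common way to avoid both the driver-side .collect() on the large RDD and the DataFrame dependency inside validate_hostname. It assumes the DataFrame holds a small list of known hostnames; the LogLine case class, the "hostname" column, and the sample data are hypothetical, since the original code is not shown in the thread. The general point is that a DataFrame cannot be used inside a function that runs on the executors, while a plain Set wrapped in a broadcast variable can.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type; the poster's actual case class is not shown.
case class LogLine(hostname: String, message: String)

object ValidateHostnames {
  // Pure check against a plain Set, so it is safe to call inside executor closures.
  def validateHostname(line: LogLine, known: Set[String]): Boolean =
    known.contains(line.hostname)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("validate-hostnames")
      .master("local[*]") // local mode, for illustration only
      .getOrCreate()
    import spark.implicits._

    // Stand-ins for the poster's RDD and DataFrame (assumed shapes).
    val lines = spark.sparkContext.parallelize(Seq(
      LogLine("a.example.com", "ok"),
      LogLine("zzz.invalid", "bad")
    ))
    val dataFrame = Seq("a.example.com", "b.example.com").toDF("hostname")

    // Collect only the small lookup column to the driver, then broadcast it.
    val known = spark.sparkContext.broadcast(
      dataFrame.select("hostname").as[String].collect().toSet
    )

    // The filter runs on the executors and touches only the broadcast Set.
    val valid = lines.filter(line => validateHostname(line, known.value))

    // Collect only the (presumably much smaller) filtered result.
    valid.collect().foreach(println)

    spark.stop()
  }
}
```

If the lookup table is too large to broadcast, an alternative is to convert the RDD to a DataFrame and express the check as a join.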

[Spark DataFrame]: Passing DataFrame to custom method results in NullPointerException

2018-01-15 Thread abdul.h.hussain
Hi, my Spark app maps lines from a text file to case classes stored in an RDD. When I run the following code on this RDD: .collect.map(line => if(validate_hostname(line, data_frame)) line).foreach(println) it correctly calls the method validate_hostname by passing the case class and
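
A side note on the quoted expression, sketched in plain Scala with no Spark involved: an `if` without an `else` inside `map` does not drop the rejected elements. It yields `()` for them and widens the element type to `Any`, so `filter` is usually what is meant. The LogLine class and the isValid check here are hypothetical stand-ins for the poster's case class and validate_hostname.

```scala
// Hypothetical record type and check, standing in for the poster's code.
case class LogLine(hostname: String)

object FilterVsMap {
  def isValid(line: LogLine): Boolean = line.hostname.endsWith(".example.com")

  val lines: Seq[LogLine] = Seq(LogLine("a.example.com"), LogLine("zzz.invalid"))

  // `if` without `else`: rejected elements become (), and the type widens to Any.
  val mapped: Seq[Any] = lines.map(line => if (isValid(line)) line)

  // `filter` keeps the element type and actually removes rejected elements.
  val filtered: Seq[LogLine] = lines.filter(isValid)

  def main(args: Array[String]): Unit = {
    mapped.foreach(println)   // includes a "()" line for the rejected element
    filtered.foreach(println) // only the valid element
  }
}
```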