If you only mark the field as @transient, the object is skipped during 
serialization, so on the worker the field is null after deserialization. When 
the worker tries to use it, you get a NullPointerException (NPE). 

Marking it lazy as well defers initialization to first use. If that first use 
happens after serialization (e.g. on the worker), the worker checks whether the 
field has been initialized and, finding that it has not, initializes it locally. 

I think if you *do* reference the lazy val before serializing you will likely 
get an NPE. 
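
To make the difference concrete, here is a minimal sketch that uses plain Java 
serialization as a stand-in for shipping an object to a worker. HeavyTool, 
EagerHolder, LazyHolder and roundTrip are hypothetical names made up for this 
illustration (HeavyTool plays the role of the CoreNLP object); this is not 
Spark's actual code path, only an approximation of it. 

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for a heavy, non-serializable object (e.g. a CoreNLP pipeline).
class HeavyTool {
  def annotate(text: String): String = text.toUpperCase
}

// @transient only: the field is skipped when serializing and is null after deserializing,
// because Java deserialization does not re-run this class's constructor.
class EagerHolder extends Serializable {
  @transient val tool: HeavyTool = new HeavyTool
}

// @transient lazy: nothing is built until first access, so an instance that was
// never touched before serialization builds its own HeavyTool after deserialization.
class LazyHolder extends Serializable {
  @transient lazy val tool: HeavyTool = new HeavyTool
}

object TransientLazyDemo {
  // Serialize to bytes and read back; an assumption-level approximation of
  // what happens when an object travels from the driver to a worker.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val eager = roundTrip(new EagerHolder)
    println(eager.tool == null)          // true: using eager.tool here would throw an NPE

    val lzy = roundTrip(new LazyHolder)  // tool was not accessed before serialization
    println(lzy.tool.annotate("hello"))  // HELLO: initialized on first use after the round trip
  }
}
```

The sketch deliberately leaves out the case where the lazy val is referenced 
before serializing, since I have not verified how that behaves. 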


> On Nov 25, 2014, at 1:05 AM, Theodore Vasiloudis 
> <theodoros.vasilou...@gmail.com> wrote:
> 
> Great, Ian's approach seems to work fine.
> 
> Can anyone provide an explanation as to why this works, but passing the
> CoreNLP object itself
> as transient does not?
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-and-Stanford-CoreNLP-tp19654p19739.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 
