subject:"Re\: Spark and Stanford CoreNLP"

Re: Spark and Stanford CoreNLP

2015-05-27 Thread mathewvinoj

Evan, could you please look into this post? The link is below. Any thoughts or suggestions would be really appreciated. http://apache-spark-user-list.1001560.n3.nabble.com/Spark-partition-issue-with-Stanford-NLP-td23048.html -- View this message in context: http://apache-spark-user-list.1001560.n3.nabb

Re: Spark and Stanford CoreNLP

2014-11-25 Thread Evan R. Sparks

Chris, thanks for stopping by! Here's a simple example. Imagine I've got a corpus of data, which is an RDD[String], and I want to do some POS tagging on it. In naive Spark, that might look like this: val props = new Properties.setAnnotators("pos") val proc = new StanfordCoreNLP(props) val data =

Re: Spark and Stanford CoreNLP

2014-11-25 Thread Christopher Manning

I’m not (yet!) an active Spark user, but saw this thread on Twitter … and am involved with Stanford CoreNLP. Could someone explain how things need to be to work better with Spark, since that would be a useful goal? That is, while Stanford CoreNLP is not quite uniform (being developed by vario

Re: Spark and Stanford CoreNLP

2014-11-25 Thread Evan Sparks

If you only mark it as transient, then the object won't be serialized, and on the worker the field will be null. When the worker goes to use it, you get an NPE. Marking it lazy defers initialization to first use. If that use happens to be after serialization time (e.g. on the worker), then the
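
The difference described above can be reproduced without Spark by round-tripping an object through plain Java serialization, which is the mechanism Spark uses by default when shipping closures. What follows is only a minimal sketch: `ExpensiveTagger`, `EagerHolder`, `LazyHolder`, and `TransientDemo` are hypothetical names invented for illustration, with `ExpensiveTagger` standing in for the CoreNLP object.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for an expensive, non-serializable resource
// such as a CoreNLP pipeline (assumption, not the real API).
class ExpensiveTagger {
  def tag(s: String): String = s.toUpperCase
}

// Only @transient: the field is skipped during serialization and the
// constructor is not re-run on deserialization, so it comes back null.
class EagerHolder extends Serializable {
  @transient val tagger = new ExpensiveTagger
}

// @transient lazy: the field is still skipped, but it is rebuilt on
// first access after deserialization.
class LazyHolder extends Serializable {
  @transient lazy val tagger = new ExpensiveTagger
}

object TransientDemo {
  // Serializes and deserializes a value in memory, roughly what happens
  // when an object travels from the driver to a worker.
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

After `TransientDemo.roundTrip(new EagerHolder)` the `tagger` field is null, so calling `tagger.tag(...)` would throw the NPE mentioned above, whereas `TransientDemo.roundTrip(new LazyHolder).tagger` builds a fresh `ExpensiveTagger` on first use.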

Re: Spark and Stanford CoreNLP

2014-11-25 Thread Theodore Vasiloudis

Great, Ian's approach seems to work fine. Can anyone provide an explanation as to why this works, but passing the CoreNLP object itself as transient does not? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-and-Stanford-CoreNLP-tp19654p19739.html Sent

Re: Spark and Stanford CoreNLP

2014-11-24 Thread Evan R. Sparks

Neat hack! This is cute and actually seems to work. The fact that it works is a little surprising and somewhat unintuitive. On Mon, Nov 24, 2014 at 8:08 AM, Ian O'Connell wrote: > > object MyCoreNLP { > @transient lazy val coreNLP = new coreNLP() > } > > and then refer to it from your map/redu

Re: Spark and Stanford CoreNLP

2014-11-24 Thread Evan R. Sparks

This is probably not the right venue for general questions on CoreNLP - the project website (http://nlp.stanford.edu/software/corenlp.shtml) provides documentation and links to mailing lists and Stack Overflow topics. On Mon, Nov 24, 2014 at 9:08 AM, Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wr

Re: Spark and Stanford CoreNLP

2014-11-24 Thread Madabhattula Rajesh Kumar

Hello, I'm new to Stanford CoreNLP. Could anyone share good training material and examples (Java or Scala) on NLP? Regards, Rajesh On Mon, Nov 24, 2014 at 9:38 PM, Ian O'Connell wrote: > > object MyCoreNLP { > @transient lazy val coreNLP = new coreNLP() > } > > and then refer to it from your

Re: Spark and Stanford CoreNLP

2014-11-24 Thread Ian O'Connell

object MyCoreNLP { @transient lazy val coreNLP = new coreNLP() } and then refer to it from your map/reduce/mapPartitions or the like, and it should be fine (presuming it's thread-safe). It will only be initialized once per classloader per JVM. On Mon, Nov 24, 2014 at 7:58 AM, Evan Sparks wrote: > We ha
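
A small sketch of the singleton-holder idea, runnable without Spark or CoreNLP. `SlowPipeline`, `PipelineHolder`, `InitStats`, and `HolderDemo` are hypothetical names made up for illustration. The counter exists only to make the "initialized once" behavior observable. The key point, as far as I understand it, is that a function referring to a Scala `object` names it statically rather than capturing an instance, so nothing expensive has to be serialized with the closure.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Counts constructions so the "initialized once" claim can be checked.
object InitStats {
  val constructed = new AtomicInteger(0)
}

// Hypothetical stand-in for a costly, non-serializable CoreNLP pipeline.
class SlowPipeline {
  InitStats.constructed.incrementAndGet()
  // Placeholder "annotation": counts whitespace-separated tokens.
  def annotate(text: String): Int = text.split("\\s+").length
}

// The holder: the pipeline is created on first access in whichever JVM
// touches it, and reused for every later call in that JVM.
object PipelineHolder {
  @transient lazy val pipeline = new SlowPipeline
}

object HolderDemo {
  // Shaped like a function passed to map: it refers to PipelineHolder
  // statically, so no SlowPipeline instance is captured in the closure.
  val countTokens: String => Int =
    text => PipelineHolder.pipeline.annotate(text)
}
```

Mapping `HolderDemo.countTokens` over any number of strings in one JVM constructs `SlowPipeline` a single time. The thread-safety caveat still applies, since several tasks in the same executor JVM would share that one instance.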

Re: Spark and Stanford CoreNLP

2014-11-24 Thread Evan Sparks

We have gotten this to work, but it requires instantiating the CoreNLP object on the worker side. Because of the initialization time it makes a lot of sense to do this inside of a .mapPartitions instead of a .map, for example. As an aside, if you're using it from Scala, have a look at sistanlp,
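
To illustrate why `.mapPartitions` helps with initialization cost, here is a sketch that uses plain iterators in place of an RDD. `CostlyTagger`, `PartitionStats`, and `PartitionDemo` are hypothetical names, and the `/X` tag is a placeholder rather than real POS output. The function has the `Iterator[T] => Iterator[U]` shape that `RDD.mapPartitions` accepts, so the expensive object is built once per partition instead of once per record.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Counts how many taggers were built, to compare against the number of partitions.
object PartitionStats {
  val built = new AtomicInteger(0)
}

// Hypothetical stand-in for an object with a high initialization cost.
class CostlyTagger {
  PartitionStats.built.incrementAndGet()
  def tag(word: String): String = word + "/X" // placeholder tag, not real POS output
}

object PartitionDemo {
  // Builds the tagger once, then lazily tags every record in the partition.
  def tagPartition(records: Iterator[String]): Iterator[String] = {
    val tagger = new CostlyTagger
    records.map(word => tagger.tag(word))
  }
}
```

With a real `RDD[String]` named `data`, the call would presumably look like `data.mapPartitions(PartitionDemo.tagPartition)`. Because the tagger is created inside the function, it is instantiated on the worker side and never serialized.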

10 matches
