Would love to know if somebody has tried this. The only possible problem I can foresee is non-serializable libraries; otherwise there is no reason it should not work.
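To make the serialization concern concrete, here is a rough sketch (untested on a real cluster) of the usual workaround: build the NLP object inside the function that runs on each partition, so the object itself never has to be pickled and shipped. `Tokenizer` is a hypothetical stand-in for a non-serializable library object, and `can_pickle` and `tokenize_partition` are names made up for this example. With pyspark I would expect to pass the function to `rdd.mapPartitions`, and to import NLTK inside it, assuming NLTK is installed on every worker.

```python
import pickle
import re
import threading


class Tokenizer:
    """Hypothetical stand-in for an NLP object that cannot be pickled."""

    def __init__(self):
        # Thread locks cannot be pickled, so instances of this class cannot either.
        self._lock = threading.Lock()
        self._pattern = re.compile(r"\w+|[^\w\s]+")

    def tokenize(self, text):
        with self._lock:
            return self._pattern.findall(text)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def tokenize_partition(lines):
    """Tokenize an iterable of lines, creating the tokenizer locally.

    The tokenizer is built here, once per call, instead of being captured
    from the driver. Assumed pyspark usage (not run here):
        tokens = rdd.mapPartitions(tokenize_partition)
    With NLTK, the import would go inside this function, for example
        from nltk.tokenize import wordpunct_tokenize
    """
    tokenizer = Tokenizer()
    for line in lines:
        yield tokenizer.tokenize(line)


if __name__ == "__main__":
    print(can_pickle(Tokenizer()))  # the object itself cannot be shipped
    print(list(tokenize_partition(["Spark runs NLTK, too."])))
```

The cost is one tokenizer construction per partition rather than per record, which is usually acceptable when setup is expensive.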
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Wed, Mar 12, 2014 at 11:10 AM, shankark <shankark+...@gmail.com> wrote:

> (apologies if this was sent out multiple times before)
>
> We are about to start a large-scale text-processing research project and
> are debating between two alternatives for our cluster -- Spark and Hadoop.
> I've researched possibilities of using NLTK with Hadoop and see that
> there's some precedent (
> http://blog.cloudera.com/blog/2010/03/natural-language-processing-with-hadoop-and-python/).
> I wanted to know how easy it might be to use NLTK with pyspark, or if
> scalanlp is mature enough to be used with the Scala API for Spark/mllib.
>
> Thanks!
>
> ------------------------------
> View this message in context: NLP with Spark
> <http://apache-spark-user-list.1001560.n3.nabble.com/NLP-with-Spark-tp2612.html>
> Sent from the Apache Spark User List mailing list archive
> <http://apache-spark-user-list.1001560.n3.nabble.com/> at Nabble.com.