Re: java.net.SocketException on reduceByKey() in pyspark

2014-04-22 Thread benlaird
I was getting this error after upgrading my nodes to Python2.7. I suspected the problem was due to conflicting Python versions, but my 2.7 install seemed correct on my nodes. I set the PYSPARK_PYTHON variable to my 2.7 install (as I still had 2.6 installed and linked to the 'python' executable, w

Re: java.net.SocketException on reduceByKey() in pyspark

2014-03-19 Thread Uri Laserson
> scala.Option.foreach(Option.scala:236) at >>> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619) >>> at >>> org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207) >>> at &

Re: java.net.SocketException on reduceByKey() in pyspark

2014-03-02 Thread Nicholas Chammas
h.Mailbox.processMailbox(Mailbox.scala:237) at >> akka.dispatch.Mailbox.run(Mailbox.scala:219) at >> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) >> at >> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java

Re: java.net.SocketException on reduceByKey() in pyspark

2014-02-28 Thread Nicholas Chammas
t.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > >>> > > > The lambda passed to flatMap() returns a list of tuples; take() works fine > just on the flatMap(). > > Where would I start to troubleshoot this error? > > The error output includes m

java.net.SocketException on reduceByKey() in pyspark

2014-02-28 Thread nicholas.chammas
I've done a whole bunch of things to this RDD, and now when I try to sortByKey(), this is what I get: >>> flattened_po.flatMap(lambda x: map_to_database_types(x)).sortByKey()14/02/28 23:18:41 INFO spark.SparkContext: Starting job: sortByKey at :114/02/28 23:18:41 INFO scheduler.DAGScheduler: Got j