OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Zoltán Tóth
Hi, When I execute the Spark ML Logisitc Regression example in pyspark I run into an OutOfMemory exception. I'm wondering if any of you experienced the same or has a hint about how to fix this. The interesting bit is that I only get the exception when I try to write the result DataFrame into a

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Zoltán Tóth
Aaand, the error! :) Exception in thread "org.apache.hadoop.hdfs.PeerCache@4e000abf" Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "org.apache.hadoop.hdfs.PeerCache@4e000abf" Exception in thread "Thread-7" Exception: java.lang.OutOfMemoryError thrown

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Zoltán Zvara
Hey, I'd try to debug, profile ResolvedDataSource. As far as I know, your write will be performed by the JVM. On Mon, Sep 7, 2015 at 4:11 PM Tóth Zoltán wrote: > Unfortunately I'm getting the same error: > The other interesting things are that: > - the parquet files got

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread boci
Hi, Can you try to using save method instead of write? ex: out_df.save("path","parquet") b0c1 -- Skype: boci13, Hangout: boci.b...@gmail.com On Mon, Sep 7, 2015 at

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Tóth Zoltán
Unfortunately I'm getting the same error: The other interesting things are that: - the parquet files got actually written to HDFS (also with .write.parquet() ) - the application gets stuck in the RUNNING state for good even after the error is thrown 15/09/07 10:01:10 INFO spark.ContextCleaner:

Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread Zsolt Tóth
Hi, I ran your example on Spark-1.4.1 and 1.5.0-rc3. It succeeds on 1.4.1 but throws the OOM on 1.5.0. Do any of you know which PR introduced this issue? Zsolt 2015-09-07 16:33 GMT+02:00 Zoltán Zvara : > Hey, I'd try to debug, profile ResolvedDataSource. As far as I