Hi,
When I execute the Spark ML Logistic Regression example in PySpark, I run
into an OutOfMemory exception. I'm wondering if any of you has experienced
the same or has a hint about how to fix this.
The interesting bit is that I only get the exception when I try to write
the result DataFrame into a
Aaand, the error! :)
Exception in thread "org.apache.hadoop.hdfs.PeerCache@4e000abf"
Exception: java.lang.OutOfMemoryError thrown from the
UncaughtExceptionHandler in thread
"org.apache.hadoop.hdfs.PeerCache@4e000abf"
Exception in thread "Thread-7"
Exception: java.lang.OutOfMemoryError thrown
Hey, I'd try to debug or profile ResolvedDataSource. As far as I know, your
write will be performed by the JVM.
On Mon, Sep 7, 2015 at 4:11 PM Tóth Zoltán wrote:
> Unfortunately I'm getting the same error:
> The other interesting things are that:
> - the parquet files got
Hi,
Can you try using the save method instead of write?
e.g.: out_df.save("path", "parquet")
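
To make the comparison concrete, here is a minimal sketch that calls whichever of the two APIs the DataFrame exposes. The helper name write_parquet is made up for illustration and is not part of Spark; it only assumes that df is a PySpark DataFrame, that df.write (the DataFrameWriter) is available from Spark 1.4 on, and that df.save(path, source) is the older call, deprecated since 1.4.

```python
def write_parquet(df, path):
    """Write df to path as Parquet and report which API was used.

    write_parquet is a hypothetical helper, not a Spark function.
    df is assumed to be a PySpark DataFrame.
    """
    if hasattr(df, "write"):
        # Spark 1.4+: go through the DataFrameWriter.
        df.write.parquet(path)
        return "write.parquet"
    # Older style call (deprecated since Spark 1.4).
    df.save(path, "parquet")
    return "save"
```

Calling write_parquet(out_df, "path") on both code paths would show whether the OOM depends on the API used or only on the write itself.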
b0c1
--
Skype: boci13, Hangout: boci.b...@gmail.com
On Mon, Sep 7, 2015 at
Unfortunately I'm getting the same error:
The other interesting things are that:
- the Parquet files actually got written to HDFS (also with
.write.parquet())
- the application stays stuck in the RUNNING state for good even after the
error is thrown
15/09/07 10:01:10 INFO spark.ContextCleaner:
Hi,
I ran your example on Spark 1.4.1 and 1.5.0-rc3. It succeeds on 1.4.1 but
throws the OOM on 1.5.0. Do any of you know which PR introduced this
issue?
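
Since there is a known-good and a known-bad version, one way to narrow this down is to bisect the commits in between. Below is a minimal sketch of that idea in Python. The function first_bad and the is_bad callback are hypothetical names; in practice is_bad would have to build Spark at the given commit and run the example, and `git bisect` automates the same search.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is True.

    Assumptions: commits are ordered oldest to newest, commits[0] is
    known good, commits[-1] is known bad, and every commit after the
    first bad one is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return commits[hi]
```

Each step halves the range, so even a few thousand commits between the two versions would need only around a dozen builds.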
Zsolt
2015-09-07 16:33 GMT+02:00 Zoltán Zvara :
> Hey, I'd try to debug, profile ResolvedDataSource. As far as I