It's a perfectly good method :) Your JVM is simply hungry for more memory. It seems fishy to you because your data is small? Every RDD keeps its own data independently in memory, so an iterative job that caches each step can hold many copies at once. You can check memory usage on the Spark dashboard.
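Not part of the original reply, but to make that concrete: a minimal PySpark sketch (app name and sizes are made up) of an iterative loop where each pass caches a new RDD. Unpersisting the previous iteration's RDD keeps old cached copies from piling up on the executor heap:

from pyspark import SparkContext

sc = SparkContext(appName="iteration-demo")  # hypothetical app name

# Initial data set; kept in memory for the first pass.
current = sc.parallelize(range(1000000)).cache()

for i in range(20):
    # Each pass builds a brand-new RDD; if every one stays cached,
    # heap usage grows with the number of iterations.
    nxt = current.map(lambda x: x * 2).cache()
    nxt.count()            # force materialization of this iteration
    current.unpersist()    # drop the previous iteration's cached blocks
    current = nxt

print(current.take(5))
sc.stop()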
Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi

On Wed, Feb 19, 2014 at 1:51 AM, zhaoxw12 <zhaox...@mails.tsinghua.edu.cn> wrote:

> I run Spark in standalone mode and program it from Python. The OutOfMemory
> error always occurs after some number of steps of the iteration process,
> even though memory usage should be almost the same in each step. The
> problem can be avoided by increasing the Java heap size or by running on a
> smaller data set. I'm very confused about this. Has someone met the same
> error? How do I fix it? I think increasing the heap size is not a good
> method.
>
> Thanks
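(Not from the original thread.) For the "Java heap size" part of the quoted question: the executor heap in Spark is controlled by the spark.executor.memory property, which can be set on a SparkConf before the context is created. A minimal sketch, with a purely illustrative 4g value:

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("memory-demo")            # hypothetical app name
        .set("spark.executor.memory", "4g"))  # example value; tune to your cluster
sc = SparkContext(conf=conf)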