So you have 100 partitions (blocks). This might be too many for your dataset. Try setting a smaller number of blocks, e.g., 32 or 64. Once ALS starts iterating, you can see the shuffle read/write sizes in the "Stages" tab of the Spark web UI. Vary the number of blocks and compare the numbers there. The Kryo serializer doesn't help much here. You can try disabling it (though I don't think it caused the failure).

-Xiangrui
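As a rough sketch of the sweep suggested above: the helper below lists power-of-two block counts that stay at or below the total core count, and a thin wrapper passes one of them to `ALS.train` from `pyspark.mllib.recommendation`. The function names, the starting value of 16, and the assumption of one core per executor (80 cores in total) are illustrative only and do not come from the thread.

```python
def candidate_block_counts(total_cores, start=16):
    """Return power-of-two multiples of `start` that do not exceed total_cores.

    `total_cores` is an assumption supplied by the caller, e.g.
    num_executors * cores_per_executor for the cluster in question.
    """
    counts = []
    n = start
    while n <= total_cores:
        counts.append(n)
        n *= 2
    return counts


def train_with_blocks(ratings, rank, iterations, lambda_, blocks):
    """Train an MLlib ALS model with an explicit number of blocks.

    pyspark is imported lazily so the helper above can be used without Spark.
    `ratings` is expected to be an RDD of Rating or (user, product, rating).
    """
    from pyspark.mllib.recommendation import ALS
    return ALS.train(ratings, rank, iterations=iterations,
                     lambda_=lambda_, blocks=blocks)


if __name__ == "__main__":
    # Hypothetical cluster: 80 executors with 1 core each.
    for blocks in candidate_block_counts(80):
        print("try blocks =", blocks)
```

For each candidate, one would rerun training with `train_with_blocks(...)` and compare the shuffle read/write sizes reported in the web UI.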
On Fri, Jun 26, 2015 at 11:00 AM, Ayman Farahat <ayman.fara...@yahoo.com> wrote:
> Hello,
> I checked on my partitions/storage and here is what I have.
>
> I have 80 executors,
> 5 G per executor.
>
> Do I need to set additional params,
> say cores?
>
> spark.serializer org.apache.spark.serializer.KryoSerializer
> # spark.driver.memory 5g
> # spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
> spark.shuffle.memoryFraction 0.3
> spark.storage.memoryFraction 0.65
>
> RDD Name           Storage Level                      Cached Partitions  Fraction Cached  Size in Memory  Size in Tachyon  Size on Disk
> ratingBlocks [1]   Memory Deserialized 1x Replicated  257                129%             4.1 GB          0.0 B            0.0 B
> itemOutBlocks [2]  Memory Deserialized 1x Replicated  100                100%             7.3 MB          0.0 B            0.0 B
> 38 [3]             Memory Serialized 1x Replicated    193                97%              5.6 GB          0.0 B            0.0 B
> userInBlocks [4]   Memory Deserialized 1x Replicated  100                100%             2.8 GB          0.0 B            0.0 B
> itemFactors-1 [5]  Memory Deserialized 1x Replicated  69                 69%              8.4 MB          0.0 B            0.0 B
> itemInBlocks [6]   Memory Deserialized 1x Replicated  69                 69%              1455.3 MB       0.0 B            0.0 B
> userFactors-1 [7]  Memory Deserialized 1x Replicated  100                100%             35.0 GB         0.0 B            0.0 B
> userOutBlocks [8]  Memory Deserialized 1x Replicated  100                100%             1062.7 MB       0.0 B            0.0 B
>
> [1] http://mithrilblue-jt1.blue.ygrid.yahoo.com:8088/proxy/application_1433921068880_943447/storage/rdd?id=44
> [2] http://mithrilblue-jt1.blue.ygrid.yahoo.com:8088/proxy/application_1433921068880_943447/storage/rdd?id=53
> [3] http://mithrilblue-jt1.blue.ygrid.yahoo.com:8088/proxy/application_1433921068880_943447/storage/rdd?id=38
> [4] http://mithrilblue-jt1.blue.ygrid.yahoo.com:8088/proxy/application_1433921068880_943447/storage/rdd?id=47
> [5] http://mithrilblue-jt1.blue.ygrid.yahoo.com:8088/proxy/application_1433921068880_943447/storage/rdd?id=65
> [6] http://mithrilblue-jt1.blue.ygrid.yahoo.com:8088/proxy/application_1433921068880_943447/storage/rdd?id=52
> [7] http://mithrilblue-jt1.blue.ygrid.yahoo.com:8088/proxy/application_1433921068880_943447/storage/rdd?id=54
> [8] http://mithrilblue-jt1.blue.ygrid.yahoo.com:8088/proxy/application_1433921068880_943447/storage/rdd?id=48
>
> On Jun 26, 2015, at 8:26 AM, Xiangrui Meng <men...@gmail.com> wrote:
>
> > number of CPU cores or less.