You should pull in this PR: https://github.com/apache/spark/pull/5364
It should resolve that. It is already in master.

Best,
Reza
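On the config question below: on YARN, Spark properties can be passed at submit time with --conf, so the block manager heartbeat timeout (the 45000 ms in the warning) does not need a cluster-side change. A sketch, assuming the Spark 1.x property name (spark.storage.blockManagerSlaveTimeoutMs) and a hypothetical application class and jar; check the configuration page for your exact version:

```shell
# Sketch: raise the BlockManager heartbeat timeout when submitting to YARN.
# Property name per Spark 1.x docs; class/jar names below are placeholders.
spark-submit \
  --master yarn-cluster \
  --conf spark.storage.blockManagerSlaveTimeoutMs=120000 \
  --class com.example.ColSimJob \
  your-app.jar
```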
On Fri, Apr 10, 2015 at 8:32 AM, Debasish Das <debasish.da...@gmail.com> wrote:
> Hi,
>
> I am benchmarking the row vs. column similarity flow on 60M x 10M matrices.
>
> Details are in this JIRA:
> https://issues.apache.org/jira/browse/SPARK-4823
>
> For testing I am using the Netflix data, since the structure is very
> similar: 50K x 17K near-dense similarities.
>
> There are 17K items, so I have not activated the threshold in
> columnSimilarities yet (it is at 1e-4).
>
> I am running Spark on YARN with 20 nodes, 4 cores, 16 GB, shuffle
> threshold 0.6.
>
> I keep getting the following from the column similarity code on the 1.2
> branch. Should I use master?
>
> 15/04/10 11:08:36 WARN BlockManagerMasterActor: Removing BlockManager
> BlockManagerId(5, tblpmidn36adv-hdp.tdc.vzwcorp.com, 44410) with no
> recent heart beats: 50315ms exceeds 45000ms
>
> 15/04/10 11:09:12 ERROR ContextCleaner: Error cleaning broadcast 1012
> java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
>     at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>     at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>     at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>     at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>     at scala.concurrent.Await$.result(package.scala:107)
>     at org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:137)
>     at org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:227)
>     at org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
>     at org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:66)
>     at org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:185)
>     at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:147)
>     at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:138)
>     at scala.Option.foreach(Option.scala:236)
>     at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:138)
>     at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:134)
>     at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:134)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1468)
>     at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:133)
>     at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
>
> I know how to increase the 45 s timeout to something higher, since this
> is a compute-heavy job, but on YARN I am not sure how to set that config.
>
> In any case, that is only a warning and should not affect the job.
>
> Any idea how to improve the runtime, other than increasing the threshold
> to 1e-2? I will try that next.
>
> Has the Netflix dataset been benchmarked on the column-based similarity
> flow before? The similarity output from this dataset becomes near dense,
> which makes it interesting for stress testing.
>
> Thanks,
> Deb
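For readers following along: columnSimilarities computes pairwise cosine similarities between matrix columns, and the threshold argument (e.g. 1e-2) trades exactness for speed via DIMSUM sampling. A minimal brute-force illustration of the quantity being computed (not the sampled DIMSUM algorithm, and not Spark code; plain Python on a tiny row-major matrix):

```python
import math

def column_similarity(rows, i, j):
    """Cosine similarity between columns i and j of a row-major matrix.

    This is the exact (threshold = 0) quantity that columnSimilarities
    approximates at scale; here it is computed naively for illustration.
    """
    dot = sum(r[i] * r[j] for r in rows)
    norm_i = math.sqrt(sum(r[i] ** 2 for r in rows))
    norm_j = math.sqrt(sum(r[j] ** 2 for r in rows))
    return dot / (norm_i * norm_j)

# Tiny 2 x 2 example: column 0 = (1, 0), column 1 = (1, 1).
rows = [[1.0, 1.0], [0.0, 1.0]]
print(column_similarity(rows, 0, 1))  # 1/sqrt(2) ~= 0.7071
```

With a nonzero threshold, column pairs whose similarity falls below it may be dropped or approximated, which is why raising it (e.g. to 1e-2) shrinks the near-dense output and the shuffle.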