Re: Spark on other parallel filesystems

2014-04-05 Thread Venkat Krishnamurthy
Christopher, just to clarify: by ‘load ops’ do you mean RDD actions that result in IO? Venkat From: Christopher Nguyen <c...@adatao.com> Reply-To: "user@spark.apache.org" <user@spark.apache.org> Date: Saturday, April 5, 2014 at 8:49 AM To: "user@spark.

Re: Heartbeat exceeds

2014-04-05 Thread Andrew Or
Setting spark.worker.timeout should not help you. What this value means is that the master checks every 60 seconds whether the workers are still alive, as the documentation describes. But this value also determines how often the workers send HEARTBEAT messages to notify the master of their liveness
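To make the link between the timeout and the heartbeat rate concrete, here is a minimal sketch. It assumes the standalone worker sends heartbeats at one quarter of `spark.worker.timeout`; that ratio and the helper name `heartbeat_interval_seconds` are assumptions for illustration and are not stated in the message above.

```python
# Assumed ratio: the worker heartbeats four times per timeout window.
# This is an illustrative assumption, not a value quoted in the thread.
HEARTBEATS_PER_TIMEOUT = 4


def heartbeat_interval_seconds(worker_timeout_s: float) -> float:
    """Seconds between worker heartbeats for a given spark.worker.timeout."""
    return worker_timeout_s / HEARTBEATS_PER_TIMEOUT


# Compare the default timeout (60 s) with the raised value from the thread (600 s).
for timeout in (60, 600):
    interval = heartbeat_interval_seconds(timeout)
    print(f"timeout={timeout}s -> heartbeat every {interval:.0f}s")
```

Under this assumption, raising the timeout widens the master's tolerance window but also spaces the heartbeats further apart, so the number of heartbeats that fit into one window stays the same.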

Re: Heartbeat exceeds

2014-04-05 Thread Debasish Das
This does not seem to help: export SPARK_JAVA_OPTS="-Dspark.local.dir=/app/spark/tmp -Dspark.worker.timeout=600 -Dspark.akka.timeout=200 -Dspark.storage.blockManagerSlaveTimeoutMs=30" Getting the message leads to GC failure, followed by the master declaring the worker dead! This is related to

Re: How to create a RPM package

2014-04-05 Thread Will Benton
Hi Rahul, As Christophe pointed out, Spark has been in Fedora Rawhide (which will become Fedora 21) for a little while now. (I haven't announced it here because Rawhide is a little too bleeding-edge for most end-users.) With native packages of any kind, there are a couple of considerations:

Re: Spark on other parallel filesystems

2014-04-05 Thread Christopher Nguyen
Avati, depending on your specific deployment config, there can be up to a 10X difference in data loading time. For example, we routinely parallel load 10+GB data files across small 8-node clusters in 10-20 seconds, which would take about 100s if bottlenecked over a 1GigE network. That's about the m
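A rough back-of-the-envelope check of the 1GigE figure, as a sketch only. It assumes decimal units (1 GB = 10^9 bytes), a single 1 Gbit/s link, and a hypothetical 80% effective throughput to stand in for protocol overhead; only the 10 GB file size comes from the message above.

```python
def transfer_seconds(size_gb: float, link_gbit_per_s: float, efficiency: float = 1.0) -> float:
    """Time in seconds to move size_gb gigabytes over one network link.

    efficiency is the assumed fraction of line rate actually achieved.
    """
    size_bits = size_gb * 1e9 * 8
    effective_bits_per_s = link_gbit_per_s * 1e9 * efficiency
    return size_bits / effective_bits_per_s


# 10 GB over 1GigE at full line rate, then with the assumed 80% efficiency.
ideal = transfer_seconds(10, 1.0)
with_overhead = transfer_seconds(10, 1.0, efficiency=0.8)
print(f"line rate: {ideal:.0f}s, with assumed overhead: {with_overhead:.0f}s")
```

At line rate the transfer takes 80 seconds, and the assumed overhead brings it to about 100 seconds, in line with the figure quoted in the message.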

Re: Heartbeat exceeds

2014-04-05 Thread Debasish Das
From the documentation this is what I understood: 1. spark.worker.timeout: Number of seconds after which the standalone deploy master considers a worker lost if it receives no heartbeats. Default: 60. I increased it to 600. It was pointed out before that if there is GC overload and the worker take

Re: Heartbeat exceeds

2014-04-05 Thread azurecoder
Interested in a resolution to this. I'm building a large triangular matrix so doing similar to ALS - lots of work on the worker nodes and keep timing out. Tried a few updates to akka frame sizes, timeouts and blockmanager but unable to complete. Will try the blockmanagerslaves property now and let

Re: Heartbeat exceeds

2014-04-05 Thread Debasish Das
@patrick I think there is a bug... When this timeout happens, I suddenly see some negative ms numbers in the Spark UI. I tried to send a pic showing the negative ms numbers, but it was rejected by the mailing list... I will send it to your gmail... From the archive I saw some more suggestions: >> It se

Re: Redirect Incubator pages

2014-04-05 Thread Pat McDonough
I'm looking forward to that myself! Seems to be hung up with Apache infrastructure though. https://issues.apache.org/jira/plugins/servlet/mobile#issue/INFRA-7398 On Apr 4, 2014 11:19 PM, "Andrew Ash" wrote: > I occasionally see links to pages in the spark.incubator.apache.org domain. > Can we H