Re: master attempted to re-register the worker and then took all workers as unregistered

2014-01-15 Thread Nan Zhu
I found the reason for the weird behaviour: the executor throws an exception when starting, due to a bug in the application code (I forgot to set an env variable used by the application code on every machine). Then the master seems to remove the worker from the list (?) but the worker keeps

Anyone know how to submit spark job to yarn in java code?

2014-01-15 Thread John Zhao
Now I am working on a web application and I want to submit a spark job to hadoop yarn. I have already done my own assembly and can run it from the command line with the following script: export YARN_CONF_DIR=/home/gpadmin/clusterConfDir/yarn export
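[Editor's note: for context, in the Spark 0.8.x line the YARN submission entry point was the org.apache.spark.deploy.yarn.Client class, normally invoked via the spark-class script. A hedged sketch of invoking it programmatically from application code — the argument names follow the 0.8.1 YARN docs and should be verified against your build; all paths and class names below are placeholders:]

```scala
// Hedged sketch, not a definitive recipe: invoke the 0.8.x YARN client
// programmatically with the same flags the spark-class script would pass.
// Requires SPARK_JAR and YARN_CONF_DIR to be set in the environment,
// as in the script quoted above. All paths/names below are placeholders.
val clientArgs = Array(
  "--jar", "/path/to/your-app-assembly.jar",
  "--class", "com.example.YourSparkJob",
  "--num-workers", "2",
  "--master-memory", "2g",
  "--worker-memory", "1g")
org.apache.spark.deploy.yarn.Client.main(clientArgs)
```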

Re: Anyone know how to submit spark job to yarn in java code?

2014-01-15 Thread Philip Ogren
Great question! I was writing up a similar question this morning and decided to investigate some more before sending. Here's what I'm trying. I have created a new scala project that contains only spark-examples-assembly-0.8.1-incubating.jar and

Re: Please help: change $SPARK_HOME/work directory for spark applications

2014-01-15 Thread Nan Zhu
Hi Jin, It’s SPARK_WORKER_DIR; see line 48 of WorkerArguments.scala: if (System.getenv("SPARK_WORKER_DIR") != null) { workDir = System.getenv("SPARK_WORKER_DIR") } Best, -- Nan Zhu On Wednesday, January 15, 2014 at 2:03 PM, Chen Jin wrote: Hi, Currently my application jars and logs
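[Editor's note: the quoted line from WorkerArguments.scala is a plain env-var fallback. A minimal sketch of the same pattern — the "work" default here is illustrative only, not necessarily Spark's actual default:]

```scala
// Fall back to a default when SPARK_WORKER_DIR is unset.
// The "work" default below is illustrative only.
var workDir = "work"
if (System.getenv("SPARK_WORKER_DIR") != null) {
  workDir = System.getenv("SPARK_WORKER_DIR")
}
println("using work dir: " + workDir)
```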

Please help: change $SPARK_HOME/work directory for spark applications

2014-01-15 Thread Chen Jin
Hi, Currently my application jars and logs are stored in $SPARK_HOME/work, and I would like to change it to somewhere with more space. Could anyone advise me on this? Changing the log dir is straightforward (just export SPARK_LOG_DIR); however, there is no environment variable for

Exception in thread DAGScheduler scala.MatchError: None (of class scala.None$)

2014-01-15 Thread Soren Macbeth
Howdy, I'm having some trouble understanding what this exception means, i.e., what the problem it's complaining about is. The full stack trace is here: https://gist.github.com/sorenmacbeth/6f49aa1852d9097deee4 I'm doing a simple map and then reduce. TIA
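[Editor's note: on what the exception itself means — Scala throws scala.MatchError when a match expression receives a value that no case clause covers. A minimal sketch, unrelated to Spark internals, that reproduces the same message:]

```scala
// Matching None against a match expression with no `case None` clause
// throws: scala.MatchError: None (of class scala.None$)
val opt: Option[Int] = None
opt match {
  case Some(x) => println(x)
}
```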

libraryDependencies configuration is different for sbt assembly vs sbt run

2014-01-15 Thread kamatsuoka
When I run sbt assembly, I use the provided configuration in the build.sbt library dependency to avoid conflicts in the fat jar: libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.1-incubating" % "provided" But if I want to do sbt run, I have to remove the provided, otherwise it doesn't
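[Editor's note: one commonly cited workaround for this, sketched below in sbt 0.13-era syntax (verify against your sbt version), is to keep the dependency provided for assembly but restore the provided-inclusive classpath for sbt run:]

```scala
// build.sbt sketch: keep spark-core "provided" so it stays out of the
// fat jar, but put provided jars back on the classpath used by `sbt run`.
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.1-incubating" % "provided"

run in Compile <<= Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run))
```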

Re: Exception in thread DAGScheduler scala.MatchError: None (of class scala.None$)

2014-01-15 Thread Soren Macbeth
0.8.1-incubating running locally. On January 15, 2014 at 2:28:00 PM, Mark Hamstra (m...@clearstorydata.com) wrote: Spark version? On Wed, Jan 15, 2014 at 2:19 PM, Soren Macbeth so...@yieldbot.com wrote: Howdy, I'm having some trouble understanding what this exception means, i.e., what the

Re: Exception in thread DAGScheduler scala.MatchError: None (of class scala.None$)

2014-01-15 Thread Mark Hamstra
Okay, that fits with what I was expecting. What does your reduce function look like? On Wed, Jan 15, 2014 at 2:33 PM, Soren Macbeth so...@yieldbot.com wrote: 0.8.1-incubating running locally. On January 15, 2014 at 2:28:00 PM, Mark Hamstra

Re: Exception in thread DAGScheduler scala.MatchError: None (of class scala.None$)

2014-01-15 Thread Soren Macbeth
I'm working on a Clojure DSL, so my map and reduce functions are in Clojure, but I updated the gist to include the code. https://gist.github.com/sorenmacbeth/6f49aa1852d9097deee4 (map-reduce-1) works as expected; however, (map-reduce) throws that exception. I've traced the types and outputs

Re: Anyone know how to submit spark job to yarn in java code?

2014-01-15 Thread Philip Ogren
My problem seems to be related to this: https://issues.apache.org/jira/browse/MAPREDUCE-4052 So, I will try running my setup from a Linux client and see if I have better luck. On 1/15/2014 11:38 AM, Philip Ogren wrote: Great question! I was writing up a similar question this morning and

Reading files on a cluster / shared file system

2014-01-15 Thread Ognen Duzlevski
On a cluster where the nodes and the master all have access to a shared filesystem/files - does spark read a file (like one resulting from sc.textFile()) in parallel/different sections on each node? Or is the file read on master in sequence and chunks processed on the nodes afterwards? Thanks!

jarOfClass method not found in SparkContext

2014-01-15 Thread arjun biswas
Hello All, I have installed spark on my machine and was successful in running sbt/sbt package as well as sbt/sbt assembly. I am trying to run the examples in Java from Eclipse. To be precise, I am trying to run the JavaLogQuery example from Eclipse. The issue is I am unable to resolve this

Re: jarOfClass method not found in SparkContext

2014-01-15 Thread Tathagata Das
Could it be possible that you have an older version of JavaSparkContext (i.e. from an older version of Spark) in your path? Please check that there aren't two versions of Spark accidentally included in your class path used in Eclipse. It would not give errors in the import (as it finds the

RE: Anyone know how to submit spark job to yarn in java code?

2014-01-15 Thread Liu, Raymond
Hi, Regarding your questions: 1) when I run the above script, which jar is being submitted to the yarn server? Both the jar that the SPARK_JAR env var points to and the one that --jar points to are submitted to the yarn server. 2) It looks like the spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar plays the role of

Re: Reading files on a cluster / shared file system

2014-01-15 Thread Tathagata Das
If you are running a distributed Spark cluster over the nodes, then the reading should be done in a distributed manner. If you give sc.textFile() a local path to a directory in the shared file system, then each worker should read a subset of the files in the directory by accessing them locally.
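[Editor's note: sketched concretely — the path below is a placeholder and must be visible at the same location on every node:]

```scala
// With a shared filesystem mounted identically on all nodes, each worker
// reads the partitions/files assigned to it locally; the driver does not
// read the whole file and ship chunks out.
val lines = sc.textFile("/shared/data/input")  // placeholder path
val numLines = lines.count()                   // runs in parallel on workers
```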

Re: Master and worker nodes in standalone deployment

2014-01-15 Thread Nan Zhu
you can start a worker process on the master node so that all nodes in your cluster can participate in the computation Best, -- Nan Zhu On Wednesday, January 15, 2014 at 11:32 PM, Manoj Samel wrote: When spark is deployed on a cluster in standalone deployment mode (V 0.81), one of the

Re: Master and worker nodes in standalone deployment

2014-01-15 Thread Manoj Samel
Thanks. Could you still explain what the master process does? On Wed, Jan 15, 2014 at 8:36 PM, Nan Zhu zhunanmcg...@gmail.com wrote: you can start a worker process on the master node so that all nodes in your cluster can participate in the computation Best, -- Nan Zhu On Wednesday,

Re: Master and worker nodes in standalone deployment

2014-01-15 Thread Nan Zhu
it keeps track of the running worker processes, creates executors for tasks on the worker nodes, communicates with the driver program, etc. -- Nan Zhu On Wednesday, January 15, 2014 at 11:37 PM, Manoj Samel wrote: Thanks, Could you still explain what does master process does ? On Wed, Jan 15,

Re: jarOfClass method not found in SparkContext

2014-01-15 Thread arjun biswas
Thanks for pointing me to that mistake. Yes, I was using the spark 0.8.1 incubating jar and the master branch code examples. I corrected the mistake. Regards On Wed, Jan 15, 2014 at 5:51 PM, Patrick Wendell pwend...@gmail.com wrote: Hm, are you sure you haven't included the master branch of