master attempted to re-register the worker and then took all workers as unregistered

2014-01-14 Thread Nan Zhu
Hi all, I’m trying to deploy Spark in standalone mode. Everything goes as usual: the web UI is accessible, and the master node wrote some logs saying all workers are registered: 14/01/15 01:37:30 INFO Slf4jEventHandler: Slf4jEventHandler started 14/01/15 01:37:31 INFO ActorSystemImpl: RemoteSe…

Re: Spark SequenceFile Java API Repeat Key Values

2014-01-14 Thread Michael Quinlan
Matei and Andrew, Thank you both for your prompt responses. Matei is correct in that I am attempting to cache a large RDD for repeated queries. I was able to implement your suggestion in a Scala version of the code, which I've copied below. I should point out two minor details: LongWritable.clone()…
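
The code the poster refers to is cut off by the digest, so here is a minimal sketch of the general pattern under discussion, assuming LongWritable keys and Text values (`sc` and `path` are placeholders): Hadoop's reader reuses the same Writable instances for every record, so each record must be copied out (by cloning the Writables, as the poster did, or by converting to plain values) before caching.

    import org.apache.spark.SparkContext
    import org.apache.hadoop.io.{LongWritable, Text}

    // sequenceFile() hands back the *same* LongWritable/Text objects for every
    // record; copy the contents out before caching, or all cached entries will
    // alias the last record read.
    val rdd = sc.sequenceFile(path, classOf[LongWritable], classOf[Text])
      .map { case (k, v) => (k.get, v.toString) } // primitive/String copies
      .cache()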

Re: Spark writing to disk when there's enough memory?!

2014-01-14 Thread Matei Zaharia
Hey Majd, I believe Shark sets up data to spill to disk, even though the default storage level in Spark is memory-only. In terms of those executors, it looks like data distribution was unbalanced across them, possibly due to data locality in HDFS (some of the executors may have had more data).
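
For reference, the distinction Matei draws maps onto Spark's storage levels; a minimal sketch (`rddA` and `rddB` are placeholders, and an RDD's level can only be set once):

    import org.apache.spark.storage.StorageLevel

    // Spark's default: partitions that don't fit in memory are dropped and
    // recomputed from lineage, never written to disk.
    rddA.persist(StorageLevel.MEMORY_ONLY)

    // What Shark effectively opts into: partitions that don't fit spill to disk.
    rddB.persist(StorageLevel.MEMORY_AND_DISK)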

Re: Stalling during large iterative PySpark jobs

2014-01-14 Thread Matei Zaharia
Hi Jeremy, If you look at the stdout and stderr files on that worker, do you see any earlier errors? I wonder if one of the Python workers crashed earlier. It would also be good to run “top” and see if more memory is used during the computation. I guess the cached RDD itself fits in less than 5…

Re: WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

2014-01-14 Thread Aureliano Buendia
On Tue, Jan 14, 2014 at 5:52 PM, Christopher Nguyen wrote: > Aureliano, this sort of jar-hell is something we have to deal with, > whether Spark or elsewhere. How would you propose we fix this with Spark? > Do you mean that Spark's own scaffolding caused you to pull in both > Protobuf 2.4 and 2.5…

RE: question on using spark parallelism vs using num partitions in spark api

2014-01-14 Thread Hussam_Jarada
I am using local. Thanks, Hussam. From: Huangguowei [mailto:huangguo...@huawei.com] Sent: Tuesday, January 14, 2014 4:43 AM To: user@spark.incubator.apache.org Subject: Re: question on using spark parallelism vs using num partitions in spark api “Using Spark 0.8.1 … java code running on 8 CPUs wi…
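
To make the distinction the thread is asking about concrete, a hedged sketch in the 0.8.x style, where configuration is set via system properties before the context is created (the names, paths, and values here are illustrative only):

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._ // implicits for pair-RDD operations

    // Fallback partition count for operations not given an explicit one.
    System.setProperty("spark.default.parallelism", "8")
    val sc = new SparkContext("local[8]", "parallelism-demo")

    val counts = sc.textFile("input.txt")
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _, 16) // an explicit numPartitions overrides the default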

Re: Controlling hadoop block size

2014-01-14 Thread Aureliano Buendia
On Tue, Jan 14, 2014 at 5:00 PM, Archit Thakur wrote: > Hadoop block size decreased, do you mean HDFS block size? That is not > possible. Sorry for the terminology mix-up. In my question, 'hadoop block size' should probably be replaced by 'number of RDD partitions'. I'm getting a large number of small…
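
Since each RDD partition becomes one output file, the number of part files tracks the number of tasks. A sketch of reining that in with coalesce(), using placeholder data and paths:

    import org.apache.spark.SparkContext._ // implicits for saveAsSequenceFile

    // 200 partitions would produce 200 small part-NNNNN files...
    val data = sc.parallelize(1 to 1000000, 200).map(i => (i.toLong, i.toString))

    // ...so merge partitions before saving to get 8 larger files instead.
    data.coalesce(8).saveAsSequenceFile("hdfs:///tmp/output")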

Re: WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

2014-01-14 Thread Christopher Nguyen
Aureliano, this sort of jar-hell is something we have to deal with, whether Spark or elsewhere. How would you propose we fix this with Spark? Do you mean that Spark's own scaffolding caused you to pull in both Protobuf 2.4 and 2.5? Or do you mean the error message should have been more helpful? Se…
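
One common way out of this kind of dependency clash is to exclude the transitive Protobuf and pin a single version, as in this hypothetical build.sbt sketch (the coordinates and versions are illustrative, not taken from the thread):

    // Keep the transitively pulled-in Protobuf out, then pin one version.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.1-incubating" excludeAll(
      ExclusionRule(organization = "com.google.protobuf")
    )
    libraryDependencies += "com.google.protobuf" % "protobuf-java" % "2.5.0"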

Re: WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

2014-01-14 Thread Aureliano Buendia
On Tue, Jan 14, 2014 at 5:07 PM, Archit Thakur wrote: > How much memory are you setting for the executor JVM? > This problem comes when either there is a communication problem between > Master and Worker, or you do not have any memory left. E.g., you specified 75G > for your executor and your machine has a m…

Re: WARN ClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

2014-01-14 Thread Archit Thakur
How much memory are you setting for the executor JVM? This problem comes when either there is a communication problem between Master and Worker, or you do not have any memory left. E.g., you specified 75G for your executor and your machine has a memory of 70G. On Thu, Jan 9, 2014 at 11:27 PM, Aureliano Buen…
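
For completeness, a sketch of how executor memory is typically set in this era of Spark (the master URL and value are placeholders); the point is that the requested amount must fit on the worker machine:

    import org.apache.spark.SparkContext

    // Request 4g per executor. If this exceeds what the workers advertise,
    // the job sits at "Initial job has not accepted any resources" forever.
    System.setProperty("spark.executor.memory", "4g")
    val sc = new SparkContext("spark://master-host:7077", "memory-demo")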

Re: Getting java.net.UnknownHostException

2014-01-14 Thread Archit Thakur
Try running ./bin/start-slave.sh 1 spark://A-IP:PORT. Thx, Archit_Thakur. On Sat, Jan 11, 2014 at 7:18 AM, Khanderao kand wrote: > For "java.net.UnknownHostException": did you check something basic, that you > are able to connect to A from B, and checked /etc/hosts? > On Fri, Jan 10, 2014 at 7…

Re: Controlling hadoop block size

2014-01-14 Thread Archit Thakur
Hadoop block size decreased: do you mean HDFS block size? That is not possible; the block size of HDFS is never affected by your Spark jobs. "For a big number of tasks, I get a very high number of 1 MB files generated by saveAsSequenceFile()." What do you mean by "big number of tasks"? No. of files…

Re: Akka error kills workers in standalone mode

2014-01-14 Thread Archit Thakur
You are getting a NullPointerException, which makes it fail. That it runs in local mode means you are ignoring the fact that many of the classes won't be initialized on the worker/executor node, even though you might have initialized them in your master/driver JVM. To check: does your code work when you…
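
A hedged illustration of the failure mode described here (all names hypothetical, `rdd` is a placeholder): a singleton field assigned only in the driver JVM is still null in each executor JVM, so the job works in local mode but throws NullPointerException on a cluster.

    object Lookup {
      var table: java.util.Map[String, String] = null // assigned on the driver only
    }

    Lookup.table = new java.util.HashMap[String, String]() // driver-side init

    // Executors run this closure in a separate JVM where Lookup.table was never
    // assigned: NPE on the cluster, but fine in local mode.
    rdd.map(x => Lookup.table.get(x))

    // One fix: make the initialization lazy so every JVM builds its own copy.
    object SafeLookup {
      lazy val table: java.util.Map[String, String] = new java.util.HashMap()
    }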

Re: yarn SPARK_CLASSPATH

2014-01-14 Thread Tom Graves
The right way to set up YARN/Hadoop is tricky, as it's really very dependent upon your usage of it. Since HBase is a Hadoop service, you might just add it to your Hadoop config yarn.application.classpath and have it on the classpath for all users/applications of that grid. In this way you are tr…

Akka error kills workers in standalone mode

2014-01-14 Thread vuakko
Spark fails to run practically any standalone-mode jobs sent to it. Local mode works, and spark-shell works even in standalone mode, but sending any other jobs manually fails, with the worker posting the following error: 2014-01-14 15:47:05,073 [sparkWorker-akka.actor.default-dispatcher-5] INFO org.apac…

Shark runtime error

2014-01-14 Thread Kishore kumar
I installed Spark and Scala to run Shark with the help of this document: https://github.com/amplab/shark/wiki/Running-Shark-on-a-Cluster. When I run Shark, the error which I am getting is: [root@localhost bin]# shark Starting the Shark Command Line Client WARNING: org.apache.hadoop.metrics.…

Re: question on using spark parallelism vs using num partitions in spark api

2014-01-14 Thread Huangguowei
“Using Spark 0.8.1 … java code running on 8 CPUs with 16G RAM, single node” Local or standalone (single node)? From: leosand...@gmail.com [mailto:leosand...@gmail.com] Sent: January 14, 2014 13:42 To: user Subject: Re: question on using spark parallelism vs using num partitions in spark api I think the para…

Re: Running Spark on Mesos

2014-01-14 Thread deric
I've deleted the whole /tmp/mesos on each slave, but it didn't help (this one was running on Mesos 0.15.0). I've tried different Mesos versions (0.14, 0.15, 0.16-rc1, 0.16-rc2). Now Spark is compiled with mesos-0.15.0.jar, but it doesn't seem to have any impact on this: java.lang.NullPointerException…