Re: How to read a multipart s3 file?

2014-05-07 Thread Nicholas Chammas
Amazon also strongly discourages the use of s3:// because the block file system it maps to is deprecated. http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-file-systems.html Note: The configuration of Hadoop running on Amazon EMR differs from the default configuration

Re: How to read a multipart s3 file?

2014-05-07 Thread Han JU
Just a few additions to the other answers: If you output to, say, `s3://bucket/myfile`, then you can use this path as the input of other jobs (sc.textFile('s3://bucket/myfile')). By default all `part-xxx` files will be used. There's also `sc.wholeTextFiles` that you can play with. If your file is
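As a local analogy for what happens when you point sc.textFile at a job's output directory (Spark reads every `part-xxxxx` file under the path as one logical dataset), here is a minimal plain-Python sketch; the directory layout and contents are illustrative, not from the thread:

```python
import glob
import os
import tempfile

# Mimic a Spark job's output directory: several part-xxxxx
# files plus the empty _SUCCESS marker Hadoop writes.
out = tempfile.mkdtemp()
for i, chunk in enumerate([["a", "b"], ["c"], ["d", "e"]]):
    with open(os.path.join(out, "part-%05d" % i), "w") as f:
        f.write("\n".join(chunk) + "\n")
open(os.path.join(out, "_SUCCESS"), "w").close()

# Read the directory as one dataset, the way sc.textFile(dir)
# does: every part-* file, in order, line by line.
lines = []
for path in sorted(glob.glob(os.path.join(out, "part-*"))):
    with open(path) as f:
        lines.extend(f.read().splitlines())

print(lines)  # ['a', 'b', 'c', 'd', 'e']
```

The key point is that the output "file" is really a directory, so downstream jobs take the directory path, not an individual part file.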

Re: Easy one

2014-05-07 Thread Ian Ferreira
Thanks! From: Aaron Davidson ilike...@gmail.com Reply-To: user@spark.apache.org Date: Tuesday, May 6, 2014 at 5:32 PM To: user@spark.apache.org Subject: Re: Easy one If you're using standalone mode, you need to make sure the Spark Workers know about the extra memory. This can be configured
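The worker-memory setting Aaron refers to lives in conf/spark-env.sh on each standalone worker. A hedged sketch (the values here are illustrative, not from the thread):

```shell
# conf/spark-env.sh on each standalone worker (illustrative values)
# Total memory this worker may hand out to executors on the node:
SPARK_WORKER_MEMORY=4g
```

The application must then request its share per executor, e.g. via the spark.executor.memory configuration property; raising only one of the two is a common source of "extra memory not visible" confusion.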

Is there anything that I need to modify?

2014-05-07 Thread Sophia
[root@CHBM220 spark-0.9.1]# SPARK_JAR=.assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar ./bin/spark-class org.apache.spark.deploy.yarn.Client --jar examples/target/scala-2.10/spark-examples_2.10-assembly-0.9.1.jar --class org.apache.spark.examples.SparkPi --args yarn-standalone

Re: How to use spark-submit

2014-05-07 Thread Tathagata Das
Doesn't the run-example script work for you? Also, are you on the latest commit of branch-1.0? TD On Mon, May 5, 2014 at 7:51 PM, Soumya Simanta soumya.sima...@gmail.com wrote: Yes, I'm struggling with a similar problem where my classes are not found on the worker nodes. I'm using
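For reference, the general shape of a spark-submit invocation in the 1.0 branch looks like the sketch below; the class name, jar path, and master URL are placeholders, not values from this thread:

```shell
# Sketch of a spark-submit call (Spark 1.0-era syntax);
# class, jar, and master URL are illustrative placeholders.
./bin/spark-submit \
  --class org.example.MyApp \
  --master spark://master-host:7077 \
  path/to/my-app.jar arg1 arg2
```

The application jar must contain (or declare on the classpath) every class the workers need, which is the usual cause of classes "not found on the worker nodes."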

Unable to load native-hadoop library problem

2014-05-07 Thread Sophia
Hi, everyone, [root@CHBM220 spark-0.9.1]# SPARK_JAR=.assembly/target/scala-2.10/spark-assembly_2.10-0.9.1-hadoop2.2.0.jar ./bin/spark-class org.apache.spark.deploy.yarn.Client --jar examples/target/scala-2.10/spark-examples_2.10-assembly-0.9.1.jar --class org.apache.spark.examples.SparkPi --args

Re: details about event log

2014-05-07 Thread wxhsdp
any ideas? thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/details-about-event-log-tp5411p5476.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark and Java 8

2014-05-07 Thread Kristoffer Sjögren
Running Hadoop and HDFS on an unsupported JVM runtime sounds a little adventurous. But as long as Spark can run in a separate Java 8 runtime, it's all good. I think having lambdas and type inference is huge when writing these jobs, versus using Scala (paying the price of complexity, poor tooling, etc.)

Re: sbt run with spark.ContextCleaner ERROR

2014-05-07 Thread Tathagata Das
Okay, this needs to be fixed. Thanks for reporting this! On Mon, May 5, 2014 at 11:00 PM, wxhsdp wxh...@gmail.com wrote: Hi, TD, I tried on v1.0.0-rc3 and still got the error -- View this message in context:

Re: master attempted to re-register the worker and then took all workers as unregistered

2014-05-07 Thread Cheney Sun
Hi Nan, In the worker's log, I see the following exception thrown when it tries to launch an executor. (The SPARK_HOME is wrongly specified on purpose, so there is no such file /usr/local/spark1/bin/compute-classpath.sh). After the exception was thrown several times, the worker was requested to kill the