Re: Build Spark Against CDH5

2014-02-28 Thread Brian Brunner
After successfully building the official 0.9.0 release I attempted to build off of the github code again and was successfully able to do so. Not really sure what happened, but it works now.

Re: Spark streaming on ec2

2014-02-28 Thread Aureliano Buendia
Also, in this talk http://www.youtube.com/watch?v=OhpjgaBVUtU on using Spark Streaming in production, the author seems to have missed the topic of how to manage cloud instances. On Fri, Feb 28, 2014 at 6:48 PM, Aureliano Buendia buendia...@gmail.com wrote: What's the updated way of deploying

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-02-28 Thread Egor Pahomov
Spark 0.9 uses protobuf 2.5.0. Hadoop 2.2 uses protobuf 2.5.0. protobuf 2.5.0 can read messages serialized with protobuf 2.4.1. So there is no reason why you can't read messages from Hadoop 2.2 with protobuf 2.5.0; you probably have 2.4.1 somewhere on your classpath. Of course it's very bad,
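
One way to check the classpath suspicion above is to ask the JVM where it loaded a protobuf class from. This is only a rough sketch: ClassOrigin and locationOf are hypothetical helper names, and it reports what the current JVM (for example the Spark shell driver) sees, not what the executors see. com.google.protobuf.Message is the standard protobuf interface.

```scala
import scala.util.Try

// Hypothetical helper, not part of Spark or Hadoop.
object ClassOrigin {
  // Returns the jar or directory a class was loaded from, if the JVM reports one.
  // Yields None when the class is not on the classpath or has no code source.
  def locationOf(className: String): Option[String] =
    Try(Class.forName(className)).toOption
      .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)
}

// Example use from a shell session:
// ClassOrigin.locationOf("com.google.protobuf.Message").foreach(println)
```

If the printed location points at a 2.4.1 jar, that jar is the one winning on the classpath.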

Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1

2014-02-28 Thread Egor Pahomov
In that same pom:

<profile>
  <id>yarn</id>
  <properties>
    <hadoop.major.version>2</hadoop.major.version>
    <hadoop.version>2.2.0</hadoop.version>
    <protobuf.version>2.5.0</protobuf.version>
  </properties>
  <modules>
    <module>yarn</module>
  </modules>
</profile>

Spark stream example SimpleZeroMQPublisher high cpu usage

2014-02-28 Thread Aureliano Buendia
Hi, Running: ./bin/run-example org.apache.spark.streaming.examples.SimpleZeroMQPublisher tcp://127.0.1.1:1234 foo causes over 100% CPU usage on OS X. Given that it's just a simple zmq publisher, this shouldn't be expected. Is there something wrong with that example?

Re: Spark streaming on ec2

2014-02-28 Thread Nicholas Chammas
Yeah, the Spark on EMR bootstrap scripts referenced here http://aws.amazon.com/articles/4926593393724923 need some polishing. I had a lot of trouble just getting through that tutorial. And yes, the version of Spark they're using is 0.8.1. On Fri, Feb 28, 2014 at 2:39 PM, Aureliano Buendia

Connection Refused When Running SparkPi Locally

2014-02-28 Thread Benny Thompson
I'm trying to run a simple execution of the SparkPi example. I started the master and one worker, then executed the job on my local cluster, but end up getting a sequence of errors all ending with Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection

How to provide a custom Comparator to sortByKey?

2014-02-28 Thread Tao Xiao
I am using Spark 0.9. I have an array of tuples, and I want to sort these tuples using the sortByKey API as follows in the Spark shell:

val A: Array[(String, String)] = Array(("1", "One"), ("9", "Nine"), ("3", "three"), ("5", "five"), ("4", "four"))
val P = sc.parallelize(A)
// MyComparator is an example, maybe I have
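
For what it's worth, one possible approach is to wrap the key in a small class that carries its own ordering. As far as I know, sortByKey in Spark 0.9 needs the key type to be viewable as Ordered, so it does not take a Comparator argument directly. The sketch below is a local stand-in using plain Scala collections; NumericKey and SortByKeySketch are hypothetical names, and ordering by numeric value is just an assumed example of a custom comparison.

```scala
// Hypothetical wrapper: orders string keys by numeric value instead of lexicographically.
// Case classes are serializable, which matters once the keys are shipped to executors.
case class NumericKey(raw: String) extends Ordered[NumericKey] {
  def compare(that: NumericKey): Int = raw.toInt.compare(that.raw.toInt)
}

object SortByKeySketch {
  val pairs: Array[(String, String)] =
    Array(("1", "One"), ("9", "Nine"), ("3", "three"), ("5", "five"), ("4", "four"))

  // Local stand-in for the RDD version: wrap keys, sort by the wrapper, unwrap.
  def sortedLocally: Array[(String, String)] =
    pairs
      .map { case (k, v) => (NumericKey(k), v) }
      .sortWith(_._1 < _._1) // `<` comes from Ordered[NumericKey]
      .map { case (k, v) => (k.raw, v) }

  // Assumed Spark shell equivalent (not run here):
  //   sc.parallelize(pairs).map { case (k, v) => (NumericKey(k), v) }.sortByKey()
}
```

The wrapper keeps the comparison logic next to the key type, so a different ordering only means changing the compare method.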