I'm going through the big data mini course
(http://ampcamp.berkeley.edu/big-data-mini-course/launching-a-bdas-cluster-on-ec2.html)
and am getting urllib2 connection-refused errors. They're thrown
in the check_spark_cluster() function while waiting for the cluster to start.
The URL is
Hi all,
Thanks for taking the time to read this.
When I first tried to write a new Java class and use Spark in it, I
always got an exception:
Exception in thread main org.apache.spark.SparkException: Job aborted:
Task not serializable: java.io.NotSerializableException:
org.dcu.test.SparkPrefix
Hi Jie,
It seems that SparkPrefix is not serializable. You can try adding
implements Serializable and see if that solves the problem.
Yadid
On 12/13/13 5:10 AM, Jie Deng wrote:
Hi all,
Thanks for taking the time to read this.
When I first tried to write a new Java class and use Spark in it,
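[Editor's sketch] A minimal illustration of that fix, assuming SparkPrefix is referenced from an RDD closure; the field and method below are hypothetical, not from the original post:

    import java.io.Serializable;

    // Hypothetical version of the class from the stack trace. Implementing
    // Serializable lets Spark ship instances to the worker nodes when the
    // class is referenced from a closure passed to an RDD operation.
    public class SparkPrefix implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String prefix;

        public SparkPrefix(String prefix) {
            this.prefix = prefix;
        }

        // Example method that might be called from inside a map function.
        public String apply(String s) {
            return prefix + s;
        }
    }

Anything referenced from a function passed to an RDD operation is serialized along with the task, which is why the whole enclosing class must be serializable (or the closure rewritten so it captures only serializable values).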
Thanks Yadid, that really works!
So does that mean the static method worked only because Spark had not
distributed the task yet, and that the right way to use Spark is to make
every class implement Serializable?
Thanks a lot!
2013/12/13 Yadid Ayzenberg ya...@media.mit.edu
Hi Jie,
It seems that SparkPrefix is
Great, Thanks a lot!!!
2013/12/13 Yadid Ayzenberg ya...@media.mit.edu
In order for Spark to ship your objects to the slaves, they must be
serializable.
Also make sure to read the Data Serialization section in the tuning guide:
http://spark.incubator.apache.org/docs/latest/tuning.html
If
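[Editor's sketch] The Data Serialization section of that tuning guide mainly covers switching to Kryo. A hedged sketch of the Spark 0.8-era setup it describes, reusing the hypothetical SparkPrefix class from earlier in the thread (the registrator class is also hypothetical):

    import com.esotericsoftware.kryo.Kryo;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.serializer.KryoRegistrator;

    public class KryoSetup {
        // Hypothetical registrator listing the application classes Spark
        // will ship to the slaves.
        public static class MyRegistrator implements KryoRegistrator {
            @Override
            public void registerClasses(Kryo kryo) {
                kryo.register(SparkPrefix.class); // class from this thread
            }
        }

        public static void main(String[] args) {
            // Spark 0.8-style configuration: system properties must be set
            // before the SparkContext is created.
            System.setProperty("spark.serializer",
                "org.apache.spark.serializer.KryoSerializer");
            System.setProperty("spark.kryo.registrator",
                "KryoSetup$MyRegistrator");
            JavaSparkContext sc = new JavaSparkContext("local", "KryoSetup");
        }
    }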
Thank you, I have one year of homework ;)
On Thu, Dec 12, 2013 at 2:51 PM, Imran Rashid im...@quantifind.com wrote:
Ah, got it, makes a lot more sense now. I couldn't figure out what w was;
I should have figured it was weights.
As Evan suggested, using zip is almost certainly what you want.
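[Editor's sketch] A minimal illustration of the zip suggestion, assuming the Java API's zip, which pairs elements by position (both RDDs must have the same number of partitions and the same number of elements per partition). The data and names are hypothetical:

    import java.util.Arrays;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class ZipExample {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext("local", "ZipExample");
            // Hypothetical feature values and their weights ("w" in the thread).
            JavaRDD<Double> points = sc.parallelize(Arrays.asList(1.0, 2.0, 3.0));
            JavaRDD<Double> weights = sc.parallelize(Arrays.asList(0.5, 0.25, 0.25));
            // zip pairs elements positionally instead of joining by key.
            JavaPairRDD<Double, Double> weighted = points.zip(weights);
            for (Tuple2<Double, Double> p : weighted.collect()) {
                System.out.println(p._1() + " * " + p._2());
            }
        }
    }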
Well, it only uses my user name when I run my application in local mode
(i.e., Spark is running on my laptop with a master URL of local). Not
a general solution for you, I'm afraid!
On 12/12/2013 5:38 PM, Koert Kuipers wrote:
Hey Philip,
how do you get spark to write to hdfs with your user
Hi Spark Community,
I would like to expose my Spark application/libraries via a web service
in order to launch jobs, interact with users, etc. I'm sure there are
hundreds of ways to think about doing this, each with a variety of
technology stacks that could be applied. So, I know there is no
https://github.com/apache/incubator-spark/pull/222
On Fri, Dec 13, 2013 at 8:36 AM, Philip Ogren philip.og...@oracle.comwrote:
Hi Spark Community,
I would like to expose my Spark application/libraries via a web service in
order to launch jobs, interact with users, etc. I'm sure there are
That's great. I didn't realize this was in master already.
On Thu, Dec 12, 2013 at 8:10 PM, Shao, Saisai saisai.s...@intel.com wrote:
Hi Koert,
Spark with multi-user support has been merged in master branch with patch (
https://github.com/apache/incubator-spark/pull/23), you can check out
Hi,
I'm not sure if this is the right place to talk about this; if not, I'm very
sorry about that.
- 9-9:30am The State of Spark, and Where We're Going
http://spark-summit.org/talk/zaharia-the-state-of-spark-and-where-were-going/
– pptx
Hey Philip,
To elaborate a bit, this is a proposed patch for integrating something like
a RESTful server into Spark. If you wanted to take a look at the
documentation in that patch and comment on whether it would partially or
fully solve your use case, that would be great.
- Patrick
On Fri,
Thanks for reporting this; we'll figure it out.
On Fri, Dec 13, 2013 at 10:04 AM, Nan Zhu zhunanmcg...@gmail.com wrote:
Hi,
I'm not sure if this is the right place to talk about this; if not, I'm very
sorry about that.
- 9-9:30am The State of Spark, and Where We’re Going
I think that you want the lookup() method in PairRDDFunctions?
http://spark.incubator.apache.org/docs/latest/api/core/index.html#org.apache.spark.rdd.PairRDDFunctions
It is supposed to be more efficient than filter...
Shankari
On Thu, Dec 12, 2013 at 7:30 PM, Yadid ya...@media.mit.edu wrote:
Right, if your RDD has a Partitioner, then lookup() will use it to
determine which partition contains the key you want to look up, and only
run a task on that partition.
That still doesn't efficiently solve the lookup-a-set-of-keys problem, but
extending lookup() to efficiently handle a
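[Editor's sketch] An illustration of that partitioner-aware path; the data, key, and partition count are hypothetical:

    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.HashPartitioner;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class LookupExample {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext("local", "LookupExample");
            JavaPairRDD<String, Integer> pairs = sc.parallelizePairs(Arrays.asList(
                new Tuple2<String, Integer>("a", 1),
                new Tuple2<String, Integer>("b", 2),
                new Tuple2<String, Integer>("a", 3)));
            // Give the RDD a partitioner and cache it; lookup() can then run
            // a task on just the one partition that can hold the key, instead
            // of scanning every partition the way filter() would.
            JavaPairRDD<String, Integer> partitioned =
                pairs.partitionBy(new HashPartitioner(4)).cache();
            List<Integer> values = partitioned.lookup("a"); // [1, 3]
            System.out.println(values);
        }
    }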
Hi Spark user@ and dev@ list members,
We are happy to announce that videos and slides of all talks from the first
Spark Summit last week, Dec 2-3 in Downtown SF, are now available on the
Spark Summit 2013 webpage at http://spark-summit.org/summit-2013. There is
a link for each talk's slides and
Hey Nan,
Thanks for pointing that out. It should be fixed now.
Andy
-- Forwarded message --
From: Nan Zhu zhunanmcg...@gmail.com
Date: Fri, Dec 13, 2013 at 10:04 AM
Subject: some wrong link in Spark Summit web page
To: user@spark.incubator.apache.org
Hi,
I'm not sure
Oops, meant to send to the entire list...
Original Message
Subject:Re: reading a specific key-value
Date: Fri, 13 Dec 2013 14:56:22 -0500
From: Yadid Ayzenberg ya...@media.mit.edu
To: K. Shankari shank...@eecs.berkeley.edu
It says more efficient if the RDD
It means that the partitioner (Option[Partitioner]) field of the RDD is
Some(p), not None. That, in turn, means that for a key k, the RDD knows
how to find which partition contains k.
In order for that to be true, the RDD has to have been partitioned by key,
and after that only
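[Editor's sketch] That partitioner field is visible from the Java API as well; a small illustration (the data and partition count are hypothetical):

    import java.util.Arrays;
    import org.apache.spark.HashPartitioner;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class PartitionerField {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext("local", "PartitionerField");
            JavaPairRDD<String, Integer> pairs = sc.parallelizePairs(
                Arrays.asList(new Tuple2<String, Integer>("k", 1)));
            // A freshly parallelized RDD has no partitioner: prints None.
            System.out.println(pairs.rdd().partitioner());
            // After partitionBy, the field is Some(HashPartitioner).
            JavaPairRDD<String, Integer> byKey =
                pairs.partitionBy(new HashPartitioner(4));
            System.out.println(byKey.rdd().partitioner());
        }
    }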
Thanks, I understand. I'm using the Java newAPIHadoopRDD. It seems that there
is no way to define that partitioner when creating the RDD, correct?
Does this mean I have to call partitionBy? It seems like it would be a
lot more efficient to be able to define the partitioner on RDD creation.
Yadid
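[Editor's sketch] That is the usual workaround; a hedged sketch, where the input format, key/value types, and partition count are assumptions (and a real job would first point the Configuration at the input data):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.spark.HashPartitioner;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class PartitionAfterLoad {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext("local", "PartitionAfterLoad");
            Configuration conf = new Configuration();
            // (a real job would set the input path on conf here)
            // newAPIHadoopRDD takes no Partitioner, so the loaded RDD has none.
            JavaPairRDD<Text, IntWritable> raw = sc.newAPIHadoopRDD(
                conf, SequenceFileInputFormat.class, Text.class, IntWritable.class);
            // Repartition by key once and cache, so the shuffle cost is paid
            // a single time and later lookup() calls hit a single partition.
            JavaPairRDD<Text, IntWritable> byKey =
                raw.partitionBy(new HashPartitioner(8)).cache();
            System.out.println(byKey.rdd().partitioner()); // Some(HashPartitioner)
        }
    }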
Yup, this should be in Spark 0.9 and 0.8.1.
Matei
On Dec 13, 2013, at 9:41 AM, Koert Kuipers ko...@tresata.com wrote:
That's great. I didn't realize this was in master already.
On Thu, Dec 12, 2013 at 8:10 PM, Shao, Saisai saisai.s...@intel.com wrote:
Hi Koert,
Spark with