Hi!
We use Spark to process logs in batches and persist the end result in a db.
Last week, we re-ran the job on the same data a couple of times, only to find
that one run had more results than the rest. Digging through the logs, we
found that a task had been lost and marked for resubmission.
Hi Patrick,
I'm getting the following warning while running a second user:
WARN component.AbstractLifeCycle: FAILED
SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address
already in use
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at
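For context, each Spark driver starts a web UI listener (4040 is the default port you see in the warning), so a second driver on the same machine finds the port already taken. The failure mode can be reproduced outside Spark with two plain server sockets; a minimal sketch, with the port chosen dynamically rather than hard-coded to 4040:

```scala
import java.net.{BindException, ServerSocket}

object BindDemo {
  def main(args: Array[String]): Unit = {
    // Stand-in for the first driver's web UI holding its port open.
    val first = new ServerSocket(0) // 0 = let the OS pick a free port
    val port = first.getLocalPort
    try {
      // Stand-in for a second process binding the same port: this
      // throws java.net.BindException while `first` is still listening.
      new ServerSocket(port).close()
      println("unexpected: second bind succeeded")
    } catch {
      case _: BindException => println(s"Address already in use: $port")
    } finally {
      first.close()
    }
  }
}
```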
Hi All,
I am trying to run the CassandraTest sample, but I am running into an error
when creating the Spark context:
val sc = new SparkContext("spark://pulasthi-laptop:7077", "casDemo")
This line causes the following error, and I don't understand why
mesos/Scheduler is needed. I have set up a local
Thank you, guys. Much appreciated.
-d
On Sun, Nov 24, 2013 at 10:12 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
Yup, it’s also important to have low latency between the drivers and the
workers. If you plan to expose this to the outside (e.g. offer a shell
interface), it would be better
When it gets stuck, what does it show in the web UI? Also, can you run
a jstack on the process and attach the output... that might explain
what's going on.
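For the jstack step, a typical invocation looks like the following; the pid and output filename are placeholders, and both tools ship with the JDK:

```shell
# List running JVMs to find the driver/worker pid.
jps -l
# Dump all thread stacks of that JVM to a file you can attach.
jstack 12345 > jstack-output.txt
```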
On Mon, Nov 25, 2013 at 11:30 AM, Vijay Gaikwad vijay...@gmail.com wrote:
I am using apache spark 0.8.0 to process a large data file and
There is a pull request currently to fix this exact issue, I believe, at
https://github.com/apache/incubator-spark/pull/192. It's very small and
only touches the script files, so you could apply it to your current
version and distribute it to the workers. The fix here is that you add an
additional
Thanks, will try it out!
Grega
--
*Grega Kešpret*
Analytics engineer
Celtra — Rich Media Mobile Advertising
celtra.com http://www.celtra.com/ | @celtramobile http://www.twitter.com/celtramobile
On Mon, Nov 25, 2013 at 11:54 PM, Aaron Davidson ilike...@gmail.com wrote:
thanks
On Mon, Nov 25, 2013 at 3:00 PM, Ankur Chauhan achau...@brightcove.com wrote:
Have a look at
https://github.com/apache/incubator-spark/blob/master/run-example
That should help you figure out how to run a jar file using spark (given
you have the classpath/dependencies set up).
--
This shows how to serialize user classes. I wanted Spark to serialize all
shuffle files and object files using Kryo. How can I specify that? Or would
that be done by default if I just set spark.serializer to kryo?
On Mon, Nov 25, 2013 at 7:42 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
How do you know Spark doesn't also use Kryo for shuffled files? Are there
metrics or logs somewhere that make you believe it's normal Java
serialization?
On Mon, Nov 25, 2013 at 4:46 PM, Mayuresh Kunjir
mayuresh.kun...@gmail.com wrote:
This shows how to serialize user classes. I wanted Spark
Hi Matei, I've clarified the documentation to include this information in
this pull request. Can you take a look?
https://github.com/apache/incubator-spark/pull/206
On Mon, Nov 25, 2013 at 5:03 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
Yeah, if you just set spark.serializer to Kryo, it
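For reference, in the Spark 0.8 era these settings were Java system properties read when the SparkContext is created; a minimal sketch, where the registrator class name is a hypothetical placeholder you would replace with your own:

```scala
// Must run before `new SparkContext(...)`; Spark 0.8 reads its
// configuration from system properties.
System.setProperty("spark.serializer",
  "org.apache.spark.serializer.KryoSerializer")
// Optional: a registrator that registers your classes with Kryo for
// more compact output ("mypackage.MyRegistrator" is hypothetical).
System.setProperty("spark.kryo.registrator", "mypackage.MyRegistrator")
```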
Andrew,
I don't think so. I run Spark on EC2 all the time. It only started
crashing when I upgraded my project to 0.8. The python script gets the
latest AMI, which has Spark 0.7.3 installed (not 0.8). There has to be
some standard procedure, probably involving git pull and the copy-dir
Hi Walrus theCat,
We have been successfully using Spark 0.8 on EC2 ever since it was released
and we do this
several times a day.
We use spark-ec2.py with the new version option (--spark-version=0.8.0) to
spin up the Spark 0.8 cluster on EC2.
The key is to use the new spark-ec2.py and not the
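A launch command using that option might look like the following; the key pair, identity file, and cluster name are placeholders, and the script requires AWS credentials to be configured:

```shell
# Hypothetical example of spinning up a Spark 0.8 cluster on EC2.
./spark-ec2 -k my-keypair -i ~/my-keypair.pem \
  --spark-version=0.8.0 launch my-cluster
```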