Hi,
I am using Spark 1.4 and an issue has occurred.
I am trying to use the aggregate function:
JavaRDD<String> rdd = some rdd;
HashMap<Long, TypeA> zeroValue = new HashMap<Long, TypeA>();
// add initial key-value pair for zeroValue
rdd.aggregate(zeroValue,
new
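The snippet is cut off above; for reference, here is a minimal Scala sketch of the same aggregate pattern. The key/value types and the seqOp/combOp logic are assumptions, since the original Java code is truncated:

import scala.collection.mutable
import org.apache.spark.rdd.RDD

// Hypothetical example: count lines by length, merging the per-partition maps.
def lineLengthCounts(rdd: RDD[String]): mutable.HashMap[Long, Int] = {
  val zeroValue = mutable.HashMap[Long, Int]()
  rdd.aggregate(zeroValue)(
    (acc, line) => { val k = line.length.toLong; acc(k) = acc.getOrElse(k, 0) + 1; acc },  // seqOp
    (a, b) => { b.foreach { case (k, v) => a(k) = a.getOrElse(k, 0) + v }; a }             // combOp
  )
}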
Are you using 1.4.0? If yes, use 1.4.1
-- Original message --
From: 周千昊; qhz...@apache.org;
Sent: Thursday, August 13, 2015, 6:04 PM
To: dev dev@spark.apache.org;
Subject: please help with ClassNotFoundException
Hi, I am using Spark 1.4 and an issue has occurred
I am running a Spark job with only two operations: mapPartitions and then collect(). The output data size of mapPartitions is very small, one integer per partition. I saw there is a stage 2 for this job that runs this Java program. I am not a Java programmer. Could anyone please let me know what
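The message is cut off above; for reference, a minimal sketch of the kind of job described, assuming the per-partition work simply produces one integer (the actual logic is not shown in the original):

import org.apache.spark.rdd.RDD

// One integer per partition (here just the element count), then collect to the driver.
def perPartitionCounts(rdd: RDD[String]): Array[Int] =
  rdd.mapPartitions(iter => Iterator(iter.size)).collect()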
That was intentional - what's your use case that requires configs not starting with "spark."?
On Thu, Aug 13, 2015 at 8:16 AM, rfarrjr rfar...@gmail.com wrote:
Ran into an issue setting a property on the SparkConf that wasn't made
available on the worker. After some digging[1] I noticed that
Yes, I guess so. I have seen this bug before.
-- Original message --
From: 周千昊; z.qian...@gmail.com;
Sent: Thursday, August 13, 2015, 9:30 PM
To: Sea 261810...@qq.com; dev@spark.apache.org;
Subject: Re: please help with ClassNotFoundException
Hi
Ran into an issue setting a property on the SparkConf that wasn't made available on the worker. After some digging[1] I noticed that only properties that start with "spark." are sent by the scheduler. I'm not sure if this was intended behavior or not.
Using Spark Streaming 1.4.1 running on Java
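For anyone hitting the same thing, a hedged workaround sketch based on the behavior described above: give your own keys a "spark." prefix so the scheduler forwards them to the executors. The key and URL below are made up:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("prefix-demo")
  .set("spark.myapp.registry.url", "http://registry:8081")  // hypothetical key; the "spark." prefix gets it shipped to executors
  // a key without the prefix, e.g. "myapp.registry.url", would not be forwarded
val sc = new SparkContext(conf)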
Hi,
sampledVertices is a HashSet of vertices
var sampledVertices: HashSet[VertexId] = HashSet()
In each iteration, I am making a list of neighborVertexIds
val neighborVertexIds = burnEdges.map((e: Edge[Int]) => e.dstId)
I want to add these neighborVertexIds to sampledVertices
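A minimal sketch of one way to do that, assuming burnEdges is an RDD[Edge[Int]] (if it is already a local collection, drop the collect()):

import scala.collection.mutable.HashSet
import org.apache.spark.graphx.{Edge, VertexId}
import org.apache.spark.rdd.RDD

// Collect the destination ids to the driver and add them to the driver-side set.
def addNeighbors(sampledVertices: HashSet[VertexId], burnEdges: RDD[Edge[Int]]): Unit = {
  val neighborVertexIds = burnEdges.map((e: Edge[Int]) => e.dstId).collect()
  sampledVertices ++= neighborVertexIds
}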
oh I see, you are defining your own RDD Partition types, and you had a
bug where partition.index did not line up with the partitions slot in
rdd.getPartitions. Is that correct?
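For context, a minimal sketch of the invariant being discussed: each Partition's index must match its slot in the array returned by getPartitions. Class names here are illustrative, not from the original code:

import org.apache.spark.{Partition, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

class MyPartition(override val index: Int) extends Partition

class MyRDD(sc: SparkContext, numParts: Int) extends RDD[Int](sc, Nil) {
  // Slot i must hold the partition whose index is i.
  override def getPartitions: Array[Partition] =
    Array.tabulate(numParts)(i => new MyPartition(i))

  override def compute(split: Partition, context: TaskContext): Iterator[Int] =
    Iterator(split.index)
}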
On Thu, Aug 13, 2015 at 2:40 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
I figured that out, and these are my
subscribe
Hello,
Any idea on why this is happening?
Thanks
Naga
-- Forwarded message --
From: Naga Vij nvbuc...@gmail.com
Date: Wed, Aug 12, 2015 at 5:47 PM
Subject: - Spark 1.4.1 - run-example SparkPi - Failure ...
To: u...@spark.apache.org
Hi,
I am evaluating Spark 1.4.1
Any idea on
Hi Naga,
This has happened here sometimes when the Spark cluster didn't have enough memory and the Java GC entered an infinite loop trying to free some.
To fix this I just added more memory to the workers of my cluster; alternatively, you can increase the number of partitions of your RDD, using the
Have a look at spark.shuffle.manager; you can switch between sort and hash with this configuration.
From the configuration docs: spark.shuffle.manager (default: sort) - Implementation to use for shuffling data. There are two implementations available: sort and hash. Sort-based shuffle is more memory-efficient and is the default option.
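A minimal sketch of setting it programmatically on the SparkConf (it can also be passed with --conf spark.shuffle.manager=hash at submit time, as shown later in the digest):

import org.apache.spark.SparkConf

// Switch from the default sort-based shuffle to the hash-based one.
val conf = new SparkConf()
  .setAppName("shuffle-demo")
  .set("spark.shuffle.manager", "hash")  // or "sort", the default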
Hi Sea,
Is it the same issue as https://issues.apache.org/jira/browse/SPARK-8368
Sea 261810...@qq.com wrote on Thursday, August 13, 2015 at 6:52 PM:
Are you using 1.4.0? If yes, use 1.4.1
-- Original message --
*From:* 周千昊; qhz...@apache.org;
*Sent:* Thursday, August 13, 2015, 6:04 PM
*To:*
Has anyone run into this?
-- Forwarded message --
From: Naga Vij nvbuc...@gmail.com
Date: Wed, Aug 12, 2015 at 5:47 PM
Subject: - Spark 1.4.1 - run-example SparkPi - Failure ...
To: u...@spark.apache.org
Hi,
I am evaluating Spark 1.4.1
Any idea on why run-example SparkPi
See the first section on https://spark.apache.org/community
On Thu, Aug 13, 2015 at 9:44 AM, Naga Vij nvbuc...@gmail.com wrote:
subscribe
That works.
Unfortunately it doesn't, because our version of Hive has different syntax elements and thus I need to patch them in (and a few other minor things). It would be great if there were a developer API at a somewhat higher level.
On Thu, Aug 13, 2015 at 2:19 PM, Reynold Xin r...@databricks.com
I believe for Hive, there is already a client interface that can be used to
build clients for different Hive metastores. That should also work for your
heavily forked one.
For Hadoop, it is definitely a bigger project to refactor. A good way to
start evaluating this is to list what needs to be
Hi,
I have asked this before but didn't receive any comments; with the impending release of 1.5 I wanted to bring this up again.
Right now, Spark is very tightly coupled with OSS Hive and Hadoop, which causes me a lot of work every time there is a new version because I don't run OSS Hive/Hadoop
Thanks Marcelo!
The reason I was asking that question is that I was expecting my Spark job to be a map-only job. In other words, it should finish after the mapPartitions run for all partitions. This is because the job is only mapPartitions() plus count(), where mapPartitions only yields one
Cool, seems like the designs are very close.
Here is my latest blog on my work with HBase and Spark. Let me know if you have any questions. There should be two more blogs next month talking about bulk load through Spark (14150), which is committed, and SparkSQL (14181), which should be done next week.
Retry sending this again ...
-- Forwarded message --
From: Reynold Xin r...@databricks.com
Date: Thu, Aug 13, 2015 at 12:15 AM
Subject: [ANNOUNCE] Spark 1.5.0-preview package
To: dev@spark.apache.org dev@spark.apache.org
In order to facilitate community testing of the 1.5.0
Is this through Java properties? For Java properties, you can pass them using spark.executor.extraJavaOptions.
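For example, a minimal sketch of passing a Java system property to the executors this way; the property name and value are made up:

import org.apache.spark.SparkConf

// -D options set here become JVM system properties on every executor.
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions", "-Dregistry.url=http://registry:8081")  // hypothetical property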
On Thu, Aug 13, 2015 at 2:11 PM, rfarrjr rfar...@gmail.com wrote:
Thanks for the response.
In this particular case we passed a URL that would be leveraged when configuring some
Hi Naga,
If you are trying to use classes from this jar, you will need to call the addJar method on the SparkContext, which will put this jar in all the workers' context.
Even when you execute it in standalone mode.
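A minimal sketch of what that looks like; the jar path is a placeholder:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("addjar-demo"))
// Ships the jar to every executor so its classes are available on the workers.
sc.addJar("/path/to/extra-classes.jar")  // placeholder path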
2015-08-13 16:02 GMT-03:00 Naga Vij nvbuc...@gmail.com:
Hi Dirceu,
Thanks for
Thanks for the response.
In this particular case we passed a URL that would be leveraged when configuring some serialization support for Kryo. We are using a schema registry and leveraging it to efficiently serialize Avro objects without the need to register specific records or schemas up
Thanks Josh for the initiative.
I think reducing the redundancy in QA bot posts would make discussion in the GitHub UI more focused.
Cheers
On Thu, Aug 13, 2015 at 7:21 PM, Josh Rosen rosenvi...@gmail.com wrote:
Prototype is at https://github.com/databricks/spark-pr-dashboard/pull/59
On Wed,
I tried accessing just now.
It took several seconds before the page showed up.
FYI
On Thu, Aug 13, 2015 at 7:56 PM, Cheng, Hao hao.ch...@intel.com wrote:
I found that https://spark-prs.appspot.com/ has been super slow when opening it in a new window recently, not sure if it is just me or everybody
I found that https://spark-prs.appspot.com/ has been super slow when opening it in a new window recently. Not sure if it is just me or if everybody experiences the same; is there any way to speed it up?
From: Josh Rosen [mailto:rosenvi...@gmail.com]
Sent: Friday, August 14, 2015 10:21 AM
To: dev
Subject: Re:
OK, thanks, probably just myself…
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Friday, August 14, 2015 11:04 AM
To: Cheng, Hao
Cc: Josh Rosen; dev
Subject: Re: Automatically deleting pull request comments left by AmplabJenkins
I tried accessing just now.
It took several seconds before the
Prototype is at https://github.com/databricks/spark-pr-dashboard/pull/59
On Wed, Aug 12, 2015 at 7:51 PM, Josh Rosen rosenvi...@gmail.com wrote:
*TL;DR*: would anyone object if I wrote a script to auto-delete pull
request comments from AmplabJenkins?
Currently there are two bots which post
I have no idea... We use Scala. You upgraded to 1.4 so quickly... Are you using Spark in production? Spark 1.3 is better than Spark 1.4.
-- Original message --
From: 周千昊; z.qian...@gmail.com;
Sent: Friday, August 14, 2015, 11:14
To: Sea 261810...@qq.com;
Hi Sea,
I have updated Spark to 1.4.1; however, the problem still exists. Any idea?
Sea 261810...@qq.com wrote on Friday, August 14, 2015 at 12:36 AM:
Yes, I guess so. I have seen this bug before.
-- Original message --
*From:* 周千昊; z.qian...@gmail.com;
*Sent:* Thursday, August 13, 2015, 9:30 PM
*To:*
Hi Cheez,
You can set the parameter spark.shuffle.manager when you submit the Spark
job.
--conf spark.shuffle.manager=hash
Thank you,
Ranjana
On Thu, Aug 13, 2015 at 2:26 AM, cheez 11besemja...@seecs.edu.pk wrote:
I understand that the current master branch of Spark uses sort-based shuffle.
(I tried to send this last night but somehow the ASF mailing list rejected my mail)
In order to facilitate community testing of the 1.5.0 release, I've built a
preview package. This is not a release candidate, so there is no voting
involved. However, it'd be great if community members can start