Hi,
UpdateStateByKey: if you can briefly describe the issue you are facing with
this, that would be great.
Regarding not keeping the whole dataset in memory, you can tweak the
remember parameter so that checkpointing happens at the appropriate time.
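For example, a rough sketch of the two knobs together (the batch interval, paths, and the socket source below are just illustrative, not from your setup):

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

  val conf = new SparkConf().setAppName("state-sketch").setMaster("local[2]") // master set only for a local try-out
  val ssc = new StreamingContext(conf, Seconds(10))
  ssc.checkpoint("hdfs:///checkpoints/app") // required by updateStateByKey; path is a placeholder
  ssc.remember(Minutes(5))                  // keep generated RDDs around for 5 minutes

  // Running count per key; `pairs` stands in for whatever DStream you have.
  val pairs = ssc.socketTextStream("localhost", 9999).map(w => (w, 1))
  val counts = pairs.updateStateByKey[Int] { (newValues: Seq[Int], state: Option[Int]) =>
    Some(newValues.sum + state.getOrElse(0)) // fold the new batch into the old state
  }
  counts.print()
  ssc.start()
  ssc.awaitTermination()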
Thanks
Twinkle
On Thursday, June 18, 2015, Nipun Arora
Thanks Saisai.
On Wed, May 20, 2015 at 11:23 AM, Saisai Shao sai.sai.s...@gmail.com
wrote:
I think this is the PR you could refer to:
https://github.com/apache/spark/pull/2994
2015-05-20 13:41 GMT+08:00 twinkle sachdeva twinkle.sachd...@gmail.com:
Hi,
As Spark Streaming is being nicely integrated with consuming messages from
Kafka, I thought of asking the forum: is there any implementation
available for pushing data to Kafka from Spark Streaming too?
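For concreteness, the kind of thing I mean is pushing from foreachRDD/foreachPartition with a plain Kafka producer, roughly like this sketch (the broker address, topic, and DStream are placeholders):

  import java.util.Properties
  import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
  import org.apache.spark.streaming.dstream.DStream

  def pushToKafka(lines: DStream[String]): Unit = {
    lines.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        // Create the producer inside the partition so it is never shipped over the wire.
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9092")
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        val producer = new KafkaProducer[String, String](props)
        records.foreach(r => producer.send(new ProducerRecord[String, String]("out-topic", r)))
        producer.close()
      }
    }
  }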
Any link(s) will be helpful.
Thanks and Regards,
Twinkle
Hi,
Can you please share the compression and other settings you are using?
Thanks,
Twinkle
On Wed, May 6, 2015 at 4:15 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
I'm facing this error in Spark 1.3.1
https://issues.apache.org/jira/browse/SPARK-4105
Does anyone know what's the
Hi Archit,
What is your use case and what kind of metrics are you planning to add?
Thanks,
Twinkle
On Fri, Apr 17, 2015 at 4:07 PM, Archit Thakur archit279tha...@gmail.com
wrote:
Hi,
We are planning to add new Metrics in Spark for the executors that got
killed during the execution. Was
Hi,
If you have the same SparkContext, then you can cache the query result by
caching the table ( sqlContext.cacheTable(tableName) ).
Maybe you can also have a look at the Ooyala server.
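For example (Spark 1.x SQLContext API; the table name is made up, and sqlContext is assumed to exist):

  sqlContext.cacheTable("my_table")                   // cache the registered table in memory
  val rows = sqlContext.sql("SELECT * FROM my_table") // this and later queries hit the cache
  // sqlContext.uncacheTable("my_table")              // release the memory when done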
On Tue, Apr 14, 2015 at 11:36 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
You can use a tachyon based
Hi,
In one of the applications we built, which had no cloning involved, we
set spark.storage.memoryFraction very low, and yes, that gave us
performance benefits.
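For reference, this is roughly how it is set (the value 0.1 is illustrative; the Spark 1.x default was 0.6):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("low-storage-fraction")
    .set("spark.storage.memoryFraction", "0.1") // leave more heap for execution
  val sc = new SparkContext(conf)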
Regarding that issue, you should also look at the data you are trying to
broadcast, as sometimes creating that data
Hi,
In Spark, there are two settings regarding the number of cores: one at the
task level, spark.task.cpus, and another that drives the number of cores
per executor, spark.executor.cores.
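For example (values are illustrative), an executor then runs spark.executor.cores / spark.task.cpus tasks concurrently:

  import org.apache.spark.SparkConf

  val conf = new SparkConf()
    .set("spark.executor.cores", "4") // cores each executor gets
    .set("spark.task.cpus", "2")      // cores reserved per task, so 4 / 2 = 2 concurrent tasks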
Apart from using more than one core for a task that has to call some other
external API etc., is there
a lot of them in a short span of time, it means there's
probably something going wrong in the app or on the cluster.
On Wed, Apr 1, 2015 at 7:08 PM, twinkle sachdeva
twinkle.sachd...@gmail.com wrote:
Hi,
Thanks Sandy.
Another way to look at this is whether we would like to have our long
Hi,
In Spark over YARN, there is a property, spark.yarn.max.executor.failures,
which controls the maximum number of executor failures an application will
survive.
If the number of executor failures (due to any reason, like OOM or machine
failure) exceeds this value, then the application quits.
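For example, the limit can be raised like this (the value 20 is illustrative; if I remember correctly, the default in Spark 1.x is roughly twice the requested number of executors):

  import org.apache.spark.SparkConf

  val conf = new SparkConf()
    .set("spark.yarn.max.executor.failures", "20") // fail the app only after 20 executor failures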
solution could be to allow a maximum number of failures within any
given time span. E.g. a max failures per hour property.
-Sandy
On Tue, Mar 31, 2015 at 11:52 PM, twinkle sachdeva
twinkle.sachd...@gmail.com wrote:
Hi,
In spark over YARN, there is a property spark.yarn.max.executor.failures
will be submitted to the Spark cluster based on priority. Jobs with a
lower priority, or below some threshold, will be discarded.
Thanks,
Abhi
On Mon, Mar 16, 2015 at 10:36 PM, twinkle sachdeva
twinkle.sachd...@gmail.com wrote:
Hi Abhi,
You mean each task of a job can have different
Hi,
Maybe this is what you are looking for:
http://spark.apache.org/docs/1.2.0/job-scheduling.html#fair-scheduler-pools
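For example, a minimal sketch of routing jobs to a pool (the pool name is made up; per-pool weights and minShare come from the XML file described in those docs):

  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(new SparkConf().setAppName("pools-sketch").set("spark.scheduler.mode", "FAIR"))
  sc.setLocalProperty("spark.scheduler.pool", "high-priority") // jobs from this thread use the pool
  // ... submit the high-priority jobs ...
  sc.setLocalProperty("spark.scheduler.pool", null)            // back to the default pool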
Thanks,
On Mon, Mar 16, 2015 at 8:15 PM, abhi abhishek...@gmail.com wrote:
Hi
Currently, all the jobs in Spark get submitted using a queue. I have a
requirement where
...@sigmoidanalytics.com
wrote:
Most likely, that particular executor is stuck in a GC pause. What operation
are you performing? You can try increasing the parallelism if you see only
one executor doing the task.
Thanks
Best Regards
On Fri, Feb 27, 2015 at 11:39 AM, twinkle sachdeva
twinkle.sachd
Hi,
Is there any relation between removing the block manager of an executor and
marking that executor as lost?
In my setup, even after removing the block manager (after failing to do some
operation), it is taking more than 20 minutes to mark that executor as lost.
Following are the logs:
15/03/03 10:26:49
Hi,
I am running a Spark application on YARN in cluster mode.
One of my executors appears to be in a hung state for a long time, and
finally gets killed by the driver.
Compared to other executors, it has not received the StopExecutor message
from the driver.
Here are the logs at the end of this
Hi,
What file format is used to write files during shuffle write?
Is it dependent on the Spark shuffle manager or on the output format?
Is it possible to change the file format for shuffle, irrespective of the
output format of the file?
Thanks,
Twinkle
Hi,
In our job, we need to process the data in small chunks, so as to avoid GC
and other issues. For this, we are using the old Hadoop API, as it lets us
specify parameters like minPartitions.
Does anyone know if there is a way to do the same via the new Hadoop API
also? How that way will be different
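For concreteness, a rough sketch of the two APIs as I understand them (paths and sizes are placeholders, and sc is assumed); in the new API the split count appears to be driven by the Hadoop Configuration rather than a minPartitions argument:

  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.io.{LongWritable, Text}
  import org.apache.hadoop.mapred.TextInputFormat
  import org.apache.hadoop.mapreduce.lib.input.{TextInputFormat => NewTextInputFormat}

  // Old API: minPartitions is a direct argument.
  val oldApi = sc.hadoopFile("hdfs:///data/input",
    classOf[TextInputFormat], classOf[LongWritable], classOf[Text], 100)

  // New API: influence the splits through the Hadoop Configuration instead.
  val hadoopConf = new Configuration()
  hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize", (64 * 1024 * 1024).toString)
  val newApi = sc.newAPIHadoopFile("hdfs:///data/input",
    classOf[NewTextInputFormat], classOf[LongWritable], classOf[Text], hadoopConf)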
Hi,
Try running the following in the Spark folder:
bin/run-example SparkPi 10
If this runs fine, just look at the set of arguments being passed via this
script, and try in a similar way.
Thanks,
On Thu, Oct 16, 2014 at 2:59 PM, Christophe Préaud
christophe.pre...@kelkoo.com wrote:
Hi,
I have
Hi,
I have been using Spark SQL with YARN.
It works fine in yarn-client mode, but in yarn-cluster mode we are
facing two issues. Is yarn-cluster mode not recommended for Spark SQL using
HiveContext?
*Problem #1*
We are not able to use any query with a very simple filtering operation,
like
Hi,
Can somebody please share the plans regarding Java version support for
Apache Spark 1.2.0 or near-future releases?
Will Java 8 become the fully supported version in Apache Spark 1.2, or will
Java 1.7 suffice?
Thanks,
https://github.com/apache/spark/pull/2382. As a workaround, you can use
lowercase letters in field names instead.
Cheng
On 9/25/14 1:18 PM, twinkle sachdeva wrote:
Hi,
I am using HiveContext to fire SQL queries inside Spark. I have
created a SchemaRDD (let's call it cachedSchema
Hi,
I am using HiveContext to fire SQL queries inside Spark. I have
created a SchemaRDD (let's call it cachedSchema) inside my code.
If I fire a SQL query (Query 1) on top of it, it works.
But if I refer to Query 1's result inside another SQL query, that fails.
Note that I have already
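In sketch form, the pattern is roughly this (table and column names simplified; hiveContext and the Hive table src are assumed):

  val cachedSchema = hiveContext.sql("SELECT key, value FROM src")
  cachedSchema.registerTempTable("cached_schema")                    // make the SchemaRDD queryable

  val query1 = hiveContext.sql("SELECT key FROM cached_schema")      // Query 1: this works
  query1.registerTempTable("query1_result")                          // registered before reuse

  val query2 = hiveContext.sql("SELECT COUNT(*) FROM query1_result") // referring to Query 1's result: this fails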
Hi,
Has anyone else also experienced
https://issues.apache.org/jira/browse/SPARK-2604?
It is an edge case of misconfiguration, where the executor memory requested
is the same as the maximum memory allowed by YARN. In such a situation, the
application stays in a hung state, and the reason is not logged