Hello all,
I have a couple of Spark Streaming questions. Thanks.
1. In the case of stateful operations, the data is, by default, persisted
in memory.
Does "in memory" mean MEMORY_ONLY? When is it removed from memory?
2. I do not see any documentation for spark.cleaner.ttl.
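For what it's worth, stateful DStreams (e.g. from updateStateByKey) are persisted by the framework automatically with a serialized in-memory storage level by default, not plain MEMORY_ONLY, and old generated RDDs are cleaned up automatically once they fall out of scope; checkpointing must be enabled for stateful operations. A minimal sketch, assuming a local socket source and a made-up checkpoint path:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StatefulSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("stateful-sketch").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(10))
    // Checkpointing is mandatory for stateful operations; the state RDDs are
    // persisted in memory (serialized) by default and cleaned automatically.
    ssc.checkpoint("/tmp/streaming-checkpoint")  // hypothetical path

    val lines  = ssc.socketTextStream("localhost", 9999)
    val counts = lines.map(w => (w, 1))
      .updateStateByKey[Int]((values: Seq[Int], state: Option[Int]) =>
        Some(values.sum + state.getOrElse(0)))  // running count per key

    counts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```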
> On 22 June 2016 at 15:54, pandees waran <pande...@gmail.com> wrote:
> Hi Mich, please let me know if you have any thoughts on the below.
>
> -- Forwarded message --
> From: pandees waran <pande...@gmail.com>
> Date: Wed, Jun 22, 2016 at 7:53 AM
> Subject: spark streaming questions
> To: user@spark.apache.org
Hello all,
I have a few questions regarding Spark Streaming:
* I am wondering whether anyone uses Spark Streaming with workflow
orchestrators such as Data Pipeline/SWF/any other framework. Are there any
advantages/drawbacks to using a workflow orchestrator for Spark Streaming?
* How do you guys manage
Thanks Cody, that's what I thought.
Currently, in cases where I want global ordering, I am doing a collect()
call and going through everything on the client.
I wonder if there is a better way to achieve globally ordered execution
across micro-batches?
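For reference, the collect-and-sort approach described above can be sketched as follows; the event type, its timestamp field, and the handle function are hypothetical:

```scala
// Sketch: global ordering within each micro-batch by sorting across
// partitions and pulling results back to the driver. This serializes
// processing on the driver and only orders events within a batch,
// not across batches.
stream.foreachRDD { rdd =>
  rdd.sortBy(event => event.timestamp)  // total order across partitions
     .toLocalIterator                   // fetch partitions to the driver in order
     .foreach(event => handle(event))   // hypothetical handler, runs on the driver
}
```

toLocalIterator avoids materializing the whole batch at once the way collect() does, at the cost of one job per partition.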
I am having some trouble with
Hi,
I wanted to understand the foreachPartition logic. In the code below, I am
assuming the iterator is executed in a distributed fashion.
1. Assuming I have a stream of timestamp data which is sorted, will the
String iterator in foreachPartition process each line in order?
2. Assuming I have
Ordering would be on a per-partition basis, not global ordering.
You typically want to acquire resources inside the foreachpartition
closure, just before handling the iterator.
http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
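The resource-acquisition pattern from that guide looks roughly like this; createNewConnection and send are placeholders for whatever client library you use:

```scala
dstream.foreachRDD { rdd =>
  rdd.foreachPartition { partitionOfRecords =>
    // Acquire the resource here, on the executor, once per partition --
    // not outside the closure, where it would have to be serialized
    // from the driver to the workers.
    val connection = createNewConnection()  // hypothetical factory
    partitionOfRecords.foreach(record => connection.send(record))
    connection.close()
  }
}
```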
Hi, Dear Spark Streaming Developers and Users,
We are prototyping with Spark Streaming and hit the following 2 issues that
I would like to seek your expertise on.
1) We have a Spark Streaming application in Scala that reads data from Kafka
into a DStream, does some processing and outputs a
Thanks Anwar.
On Tue, Jun 17, 2014 at 11:54 AM, Anwar Rizal anriza...@gmail.com wrote:
On Tue, Jun 17, 2014 at 5:39 PM, Chen Song chen.song...@gmail.com wrote:
Hey
I am new to spark streaming and apologize if these questions have been
asked.
* In StreamingContext, reduceByKey() seems
Hey
I am new to spark streaming and apologize if these questions have been
asked.
* In StreamingContext, reduceByKey() seems to only work on the RDDs of the
current batch interval, not including RDDs of previous batches. Is my
understanding correct?
* If the above statement is correct, what
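On the first question: yes, reduceByKey operates only on the RDDs of the current batch interval. To aggregate across batches you can use a windowed variant such as reduceByKeyAndWindow, or keep running state with updateStateByKey. A sketch, assuming `pairs` is a DStream of (key, count) tuples and the durations are arbitrary:

```scala
import org.apache.spark.streaming.Seconds

// Counts per key over the last 60 seconds, recomputed every 10 seconds.
// The window must be a multiple of the batch interval; checkpointing is
// required if you use the more efficient inverse-function variant.
val windowedCounts = pairs.reduceByKeyAndWindow(
  (a: Int, b: Int) => a + b,  // combine values within the window
  Seconds(60),                // window length
  Seconds(10)                 // slide interval
)
```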