While checkpointing RDDs as a part of an application that doesn't use
spark-streaming, I observed that the checkpointed files are not being
cleaned up even after the application completes successfully.
Is it because we assume that checkpointing would be primarily used for
spark-streaming applicati
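
One possible workaround for the leftover files described above is to remove the checkpoint directory yourself once the job is done. The following is only a minimal sketch under stated assumptions: the checkpoint directory lives on the local filesystem (an HDFS path would need the Hadoop filesystem API instead), and the helper name `scoped_checkpoint_dir` is hypothetical, not part of Spark. The Spark calls appear only in comments. Spark's configuration docs also list `spark.cleaner.referenceTracking.cleanCheckpoints` (default `false`), which may be worth checking before hand-rolling cleanup.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def scoped_checkpoint_dir(prefix="rdd-checkpoints-"):
    """Create a local checkpoint directory and remove it when the block exits.

    Hypothetical helper for illustration; it is not a Spark API.
    """
    path = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield path
    finally:
        # Runs whether the block succeeded or raised, so no files are left behind.
        shutil.rmtree(path, ignore_errors=True)


if __name__ == "__main__":
    with scoped_checkpoint_dir() as ckpt:
        # In a real job (assumption: an existing SparkContext `sc` and an RDD `rdd`):
        #   sc.setCheckpointDir(str(ckpt))
        #   rdd.checkpoint()
        #   rdd.count()  # an action forces the checkpoint to be written
        (ckpt / "rdd-0").mkdir()  # stand-in for the files Spark would write
        print("checkpoint dir in use:", ckpt)
    print("removed after exit:", not ckpt.exists())
```

Deleting the directory is only safe once nothing will read the checkpointed RDDs again, so the `with` block should wrap the whole job.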
I don't think we will have trouble with whatever rule is adopted for
accepting proposals. Considering committers' votes binding (if that is what
we choose) is an established practice as long as it isn't for specific
votes, like a release vote. From the Apache docs: "Who is permitted to vote
is
Thanks Fred - that is very helpful.
> Delivering low latency, high throughput, and stability simultaneously: Right
> now, our own tests indicate you can get at most two of these characteristics
> out of Spark Streaming at the same time. I know of two parties that have
> abandoned Spark Streaming b
On Tue, Oct 11, 2016 at 10:55 AM, Michael Armbrust
wrote:
>> *Complex event processing and state management:* Several groups I've
>> talked to want to run a large number (tens or hundreds of thousands now,
>> millions in the near future) of state machines over low-rate partitions of
>> a high-rate
This is super helpful, thanks for writing it up!
> *Delivering low latency, high throughput, and stability simultaneously:* Right
> now, our own tests indicate you can get at most two of these
> characteristics out of Spark Streaming at the same time. I know of two
> parties that have abandoned S
On Thu, Oct 6, 2016 at 12:37 PM, Michael Armbrust wrote:
>
> [snip!]
> Relatedly, I'm curious to hear more about the types of questions you are
> getting. I think the dev list is a good place to discuss applications and
> if/how structured streaming can handle them.
>
Details are difficult to s
This is a follow-up to this unanswered October 2015 issue:
http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-streaming-failed-recovery-from-checkpoint-td14832.html
The issue is that the Spark driver checkpoints an RDD, deletes it, the job
restarts, and *the new driver tries to load
As a subscriber to the dev and user mailing lists, I would like
to avoid receiving a flood of job recruiting emails.
In my personal opinion, permitting them would effectively mean allowing
this list to be used as a means of profit for an organisation.
On 7 Oct 2016 5:05