What are you expecting? If you want to update every key on every
batch, the work is going to be linear in the total number of keys,
not just the keys that received data... there's no real way around that.
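
To make the cost concrete, here is a rough plain-Python sketch (not Spark
code) that models the two behaviours discussed in this thread. All names in
it (update_all_keys, update_batch_keys, count_idle_batches) are made up for
illustration, and it treats the mapWithState-style update as one call per key
present in the batch, which is a simplification.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical update function type: takes the new events for a key
# (possibly an empty list) and the previous state, returns the new state.
UpdateFn = Callable[[List[str], Optional[int]], int]


def count_idle_batches(events: List[str], state: Optional[int]) -> int:
    """State is the number of consecutive batches without events for a key."""
    if events:
        return 0
    return (state or 0) + 1


def group_by_key(batch: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (key, event) pairs from one batch by key."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key, event in batch:
        grouped[key].append(event)
    return dict(grouped)


def update_all_keys(state: Dict[str, int], batch: List[Tuple[str, str]],
                    fn: UpdateFn) -> Tuple[Dict[str, int], int]:
    """Models updateStateByKey: fn runs for every known key, every batch."""
    grouped = group_by_key(batch)
    new_state: Dict[str, int] = {}
    calls = 0
    for key in set(state) | set(grouped):
        new_state[key] = fn(grouped.get(key, []), state.get(key))
        calls += 1
    return new_state, calls


def update_batch_keys(state: Dict[str, int], batch: List[Tuple[str, str]],
                      fn: UpdateFn) -> Tuple[Dict[str, int], int]:
    """Models mapWithState: fn runs only for keys that appear in the batch."""
    grouped = group_by_key(batch)
    new_state = dict(state)
    for key, events in grouped.items():
        new_state[key] = fn(events, state.get(key))
    return new_state, len(grouped)


# 1000 keys already in state, but only two of them get events in this batch.
state = {f"k{i}": 0 for i in range(1000)}
batch = [("k1", "a"), ("k2", "b"), ("k1", "c")]

state_all, calls_all = update_all_keys(state, batch, count_idle_batches)
state_batch, calls_batch = update_batch_keys(state, batch, count_idle_batches)

print("calls when touching every key:", calls_all)
print("calls when touching only keys in the batch:", calls_batch)
```

In this model only the first variant ever notices that a key such as k3 had
no events, and it pays for that by visiting every key in the state on each
batch.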

On Tue, Oct 11, 2016 at 9:49 AM, Daan Debie <debie.d...@gmail.com> wrote:
> That's nice and all, but I'd rather have a solution involving mapWithState
> of course :) I'm just wondering why it doesn't support this use case yet.
>
> On Tue, Oct 11, 2016 at 3:41 PM, Cody Koeninger <c...@koeninger.org> wrote:
>>
>> They're telling you not to use the old function because it's linear on the
>> total number of keys, not keys in the batch, so it's slow.
>>
>> But if that's what you really want, go ahead and do it, and see if it
>> performs well enough.
>>
>>
>> On Oct 11, 2016 6:28 AM, "DandyDev" <debie.d...@gmail.com> wrote:
>>
>> Hi there,
>>
>> I've built a Spark Streaming app that accepts certain events from Kafka,
>> and I want to keep some state between the events. I've successfully used
>> mapWithState for that. The problem is that I want the state for every key
>> to be updated on every batchInterval, because a "lack" of events is also
>> significant to the use case. This doesn't seem possible with mapWithState,
>> unless I'm missing something.
>>
>> Previously I looked at updateStateByKey, which says:
>> > In every batch, Spark will apply the state update function for all
>> > existing keys, regardless of whether they have new data in a batch or
>> > not.
>>
>> That is what I want. However, I've seen several tutorials/blog posts where
>> the advice was not to use updateStateByKey anymore, and to use mapWithState
>> instead.
>>
>> So my questions:
>>
>> - Can the mapWithState state function be called every batchInterval, even
>> when no events exist for that interval?
>> - If not, is it okay to use updateStateByKey instead? Or will it be
>> deprecated in the near future?
>> - If mapWithState doesn't support my need, is there another way to
>> accomplish the goal of updating state every batchInterval, that still uses
>> mapWithState, together with some other mechanism?
>>
>> Thanks in advance!
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Can-mapWithState-state-func-be-called-every-batchInterval-tp27877.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
>
