Re: mapWithState question

2017-01-30 Thread Cody Koeninger
Keep an eye on

https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging

although it'll likely be a while before it is released.

On Mon, Jan 30, 2017 at 3:41 PM, Tathagata Das
 wrote:
> If you care about the semantics of those writes to Kafka, then you should be
> aware of two things.
> 1. There are no transactional writes to Kafka.
> 2. So, when tasks get re-executed due to any failure, your mapping function
> will also be re-executed, and the writes to Kafka can happen multiple times.
> So you may only get an at-least-once guarantee about those Kafka writes.
>
>
> On Mon, Jan 30, 2017 at 10:02 AM, shyla deshpande 
> wrote:
>>
>> Hello,
>>
>> TD, your suggestion works great. Thanks
>>
>> I have one more question: I need to write to Kafka from within the
>> mapWithState function. Just wanted to check if this is a bad pattern in any
>> way.
>>
>> Thank you.
>>
>>
>>
>>
>>
>> On Sat, Jan 28, 2017 at 9:14 AM, shyla deshpande
>>  wrote:
>>>
>>> That's a great idea. I will try that. Thanks.
>>>
>>> On Sat, Jan 28, 2017 at 2:35 AM, Tathagata Das
>>>  wrote:

>>>> 1 state object for each user.
>>>> union both streams into a single DStream, and apply mapWithState on it
>>>> to update the user state.
>>>>
>>>> On Sat, Jan 28, 2017 at 12:30 AM, shyla deshpande
>>>>  wrote:
>>>>>
>>>>> Can multiple DStreams manipulate a state? I have a stream that gives me
>>>>> total minutes the user spent on a course material. I have another stream
>>>>> that gives me chapters completed and lessons completed by the user. I want
>>>>> to keep track for each user total_minutes, chapters_completed and
>>>>> lessons_completed. I am not sure if I should have 1 state or 2 states.
>>>>> Can I lookup the state for a given key just like a map outside the
>>>>> mapfunction?
>>>>>
>>>>> Appreciate your input. Thanks


>>>
>>
>




Re: mapWithState question

2017-01-30 Thread shyla deshpande
Thanks. Appreciate your input.

On Mon, Jan 30, 2017 at 1:41 PM, Tathagata Das 
wrote:

> If you care about the semantics of those writes to Kafka, then you should
> be aware of two things.
> 1. There are no transactional writes to Kafka.
> 2. So, when tasks get re-executed due to any failure, your mapping function
> will also be re-executed, and the writes to Kafka can happen multiple times.
> So you may only get an at-least-once guarantee about those Kafka writes.
>
>
> On Mon, Jan 30, 2017 at 10:02 AM, shyla deshpande <
> deshpandesh...@gmail.com> wrote:
>
>> Hello,
>>
>> TD, your suggestion works great. Thanks
>>
>> I have one more question: I need to write to Kafka from within the
>> mapWithState function. Just wanted to check if this is a bad pattern in any
>> way.
>>
>> Thank you.
>>
>>
>>
>>
>>
>> On Sat, Jan 28, 2017 at 9:14 AM, shyla deshpande <
>> deshpandesh...@gmail.com> wrote:
>>
>>> That's a great idea. I will try that. Thanks.
>>>
>>> On Sat, Jan 28, 2017 at 2:35 AM, Tathagata Das <
>>> tathagata.das1...@gmail.com> wrote:
>>>
>>>> 1 state object for each user.
>>>> union both streams into a single DStream, and apply mapWithState on it
>>>> to update the user state.
>>>>
>>>> On Sat, Jan 28, 2017 at 12:30 AM, shyla deshpande <
>>>> deshpandesh...@gmail.com> wrote:
>>>>
>>>>> Can multiple DStreams manipulate a state? I have a stream that gives
>>>>> me total minutes the user spent on a course material. I have another
>>>>> stream that gives me chapters completed and lessons completed by the
>>>>> user. I want to keep track for each user total_minutes,
>>>>> chapters_completed and lessons_completed. I am not sure if I should
>>>>> have 1 state or 2 states. Can I lookup the state for a given key just
>>>>> like a map outside the mapfunction?
>>>>>
>>>>> Appreciate your input. Thanks
>>>>>


>>>
>>
>


Re: mapWithState question

2017-01-30 Thread Tathagata Das
If you care about the semantics of those writes to Kafka, then you should
be aware of two things.
1. There are no transactional writes to Kafka.
2. So, when tasks get re-executed due to any failure, your mapping function
will also be re-executed, and the writes to Kafka can happen multiple times.
So you may only get an at-least-once guarantee about those Kafka writes.
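
To illustrate the re-execution point above, here is a minimal sketch in plain
Python (not Spark or Kafka code). The list `sink` stands in for a Kafka topic,
and `process_partition`, `run_with_retry`, and `fail_after` are hypothetical
names introduced only for this illustration. It assumes a failed task is simply
rerun from the start of its input, which is what produces the duplicates.

```python
def process_partition(records, sink, fail_after=None):
    """Model of a mapping function that writes to an external sink as a side effect."""
    for i, record in enumerate(records):
        sink.append(record)  # non-transactional write: it cannot be rolled back
        if fail_after is not None and i + 1 == fail_after:
            raise RuntimeError("simulated task failure")


def run_with_retry(records, sink, fail_after):
    """Run the task once; on failure, re-execute it from the beginning."""
    try:
        process_partition(records, sink, fail_after)
    except RuntimeError:
        # The retry replays every record, including those already written.
        process_partition(records, sink)


sink = []
run_with_retry(["a", "b", "c"], sink, fail_after=2)
print(sink)  # "a" and "b" appear twice: at-least-once, not exactly-once
```

Every record reaches the sink at least once, but records written before the
failure are written again on retry, so a downstream consumer would have to
tolerate or deduplicate repeated messages.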


On Mon, Jan 30, 2017 at 10:02 AM, shyla deshpande 
wrote:

> Hello,
>
> TD, your suggestion works great. Thanks
>
> I have one more question: I need to write to Kafka from within the
> mapWithState function. Just wanted to check if this is a bad pattern in any
> way.
>
> Thank you.
>
>
>
>
>
> On Sat, Jan 28, 2017 at 9:14 AM, shyla deshpande
> wrote:
>
>> That's a great idea. I will try that. Thanks.
>>
>> On Sat, Jan 28, 2017 at 2:35 AM, Tathagata Das <
>> tathagata.das1...@gmail.com> wrote:
>>
>>> 1 state object for each user.
>>> union both streams into a single DStream, and apply mapWithState on it
>>> to update the user state.
>>>
>>> On Sat, Jan 28, 2017 at 12:30 AM, shyla deshpande <
>>> deshpandesh...@gmail.com> wrote:
>>>
>>>> Can multiple DStreams manipulate a state? I have a stream that gives
>>>> me total minutes the user spent on a course material. I have another
>>>> stream that gives me chapters completed and lessons completed by the
>>>> user. I want to keep track for each user total_minutes,
>>>> chapters_completed and lessons_completed. I am not sure if I should
>>>> have 1 state or 2 states. Can I lookup the state for a given key just
>>>> like a map outside the mapfunction?
>>>>
>>>> Appreciate your input. Thanks

>>>
>>>
>>
>


Re: mapWithState question

2017-01-30 Thread shyla deshpande
Hello,

TD, your suggestion works great. Thanks

I have one more question: I need to write to Kafka from within the
mapWithState function. Just wanted to check if this is a bad pattern in any
way.

Thank you.





On Sat, Jan 28, 2017 at 9:14 AM, shyla deshpande 
wrote:

> That's a great idea. I will try that. Thanks.
>
> On Sat, Jan 28, 2017 at 2:35 AM, Tathagata Das <
> tathagata.das1...@gmail.com> wrote:
>
>> 1 state object for each user.
>> union both streams into a single DStream, and apply mapWithState on it to
>> update the user state.
>>
>> On Sat, Jan 28, 2017 at 12:30 AM, shyla deshpande <
>> deshpandesh...@gmail.com> wrote:
>>
>>> Can multiple DStreams manipulate a state? I have a stream that gives me
>>> total minutes the user spent on a course material. I have another
>>> stream that gives me chapters completed and lessons completed by the user. I
>>> want to keep track for each user total_minutes, chapters_completed and
>>> lessons_completed. I am not sure if I should have 1 state or 2 states. Can
>>> I lookup the state for a given key just like a map outside the mapfunction?
>>>
>>> Appreciate your input. Thanks
>>>
>>
>>
>


Re: mapWithState question

2017-01-28 Thread shyla deshpande
That's a great idea. I will try that. Thanks.

On Sat, Jan 28, 2017 at 2:35 AM, Tathagata Das 
wrote:

> 1 state object for each user.
> union both streams into a single DStream, and apply mapWithState on it to
> update the user state.
>
> On Sat, Jan 28, 2017 at 12:30 AM, shyla deshpande <
> deshpandesh...@gmail.com> wrote:
>
>> Can multiple DStreams manipulate a state? I have a stream that gives me
>> total minutes the user spent on a course material. I have another stream
>> that gives me chapters completed and lessons completed by the user. I
>> want to keep track for each user total_minutes, chapters_completed and
>> lessons_completed. I am not sure if I should have 1 state or 2 states. Can
>> I lookup the state for a given key just like a map outside the mapfunction?
>>
>> Appreciate your input. Thanks
>>
>
>


Re: mapWithState question

2017-01-28 Thread Tathagata Das
1 state object for each user.
union both streams into a single DStream, and apply mapWithState on it to
update the user state.
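
A minimal sketch of this idea in plain Python (a model of the pattern, not
Spark code): both inputs are merged into one stream of (user, event) pairs, and
a single per-user state object is updated for each event. `UserProgress`,
`update_state`, `run_batch`, and the event tags "minutes", "chapters", and
"lessons" are hypothetical names chosen for illustration; in Spark the update
function would be the one passed to mapWithState.

```python
from dataclasses import dataclass
from itertools import chain


@dataclass
class UserProgress:
    """One state object per user, holding all three counters."""
    total_minutes: int = 0
    chapters_completed: int = 0
    lessons_completed: int = 0


def update_state(state, event):
    """Apply one tagged event to a user's state and return the updated state."""
    kind, value = event
    if kind == "minutes":
        state.total_minutes += value
    elif kind == "chapters":
        state.chapters_completed += value
    elif kind == "lessons":
        state.lessons_completed += value
    else:
        raise ValueError(f"unknown event kind: {kind}")
    return state


def run_batch(states, *streams):
    """Union the input streams and update the per-user state for every event."""
    for user, event in chain(*streams):
        states[user] = update_state(states.get(user, UserProgress()), event)
    return states


minutes_stream = [("alice", ("minutes", 30)), ("bob", ("minutes", 10))]
progress_stream = [("alice", ("chapters", 1)), ("alice", ("lessons", 3))]

states = run_batch({}, minutes_stream, progress_stream)
print(states["alice"])
```

Tagging each event with its kind lets the two differently shaped inputs share
one element type, which is what makes the union possible.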

On Sat, Jan 28, 2017 at 12:30 AM, shyla deshpande 
wrote:

> Can multiple DStreams manipulate a state? I have a stream that gives me
> total minutes the user spent on a course material. I have another stream
> that gives me chapters completed and lessons completed by the user. I
> want to keep track for each user total_minutes, chapters_completed and
> lessons_completed. I am not sure if I should have 1 state or 2 states. Can
> I lookup the state for a given key just like a map outside the mapfunction?
>
> Appreciate your input. Thanks
>


mapWithState question

2017-01-28 Thread shyla deshpande
Can multiple DStreams manipulate a state? I have a stream that gives me
total minutes the user spent on a course material. I have another stream
that gives me chapters completed and lessons completed by the user. I want
to keep track for each user total_minutes, chapters_completed and
lessons_completed. I am not sure if I should have 1 state or 2 states. Can
I lookup the state for a given key just like a map outside the mapfunction?

Appreciate your input. Thanks