Re: mapWithState question
Keep an eye on https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging although it'll likely be a while.

On Mon, Jan 30, 2017 at 3:41 PM, Tathagata Das wrote:
> If you care about the semantics of those writes to Kafka, then you should be
> aware of two things.
> 1. There are no transactional writes to Kafka.
> 2. So, when tasks get re-executed due to any failure, your mapping function
> will also be re-executed, and the writes to Kafka can happen multiple times.
> So you may only get an at-least-once guarantee for those Kafka writes.
Re: mapWithState question
Thanks. Appreciate your input.

On Mon, Jan 30, 2017 at 1:41 PM, Tathagata Das wrote:
> If you care about the semantics of those writes to Kafka, then you should
> be aware of two things.
> 1. There are no transactional writes to Kafka.
> 2. So, when tasks get re-executed due to any failure, your mapping function
> will also be re-executed, and the writes to Kafka can happen multiple times.
> So you may only get an at-least-once guarantee for those Kafka writes.
Re: mapWithState question
If you care about the semantics of those writes to Kafka, then you should be aware of two things.
1. There are no transactional writes to Kafka.
2. So, when tasks get re-executed due to any failure, your mapping function will also be re-executed, and the writes to Kafka can happen multiple times. So you may only get an at-least-once guarantee for those Kafka writes.

On Mon, Jan 30, 2017 at 10:02 AM, shyla deshpande wrote:
> Hello,
>
> TD, your suggestion works great. Thanks.
>
> I have one more question: I need to write to Kafka from within the
> mapWithState function. I just wanted to check whether this is a bad
> pattern in any way.
>
> Thank you.
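
To illustrate the at-least-once point in this message, here is a minimal plain-Python sketch. It does not use Spark or Kafka: `kafka_topic`, `mapping_function`, and `run_task` are hypothetical stand-ins (a list for the topic, a function with a side effect for the mapping function, a loop for a task), and the failure is simulated. It only shows why re-running a task whose function writes to an external system produces duplicate writes.

```python
from typing import List, Optional, Tuple

# Stand-in for a Kafka topic: an append-only log with no transactions.
kafka_topic: List[Tuple[str, int]] = []


def mapping_function(user: str, minutes: int) -> int:
    # Side effect inside the mapping function (stand-in for a producer send).
    kafka_topic.append((user, minutes))
    return minutes


def run_task(records: List[Tuple[str, int]], fail_after: Optional[int] = None) -> List[int]:
    """Apply mapping_function to every record; optionally fail partway through."""
    out = []
    for i, (user, minutes) in enumerate(records):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated task failure")
        out.append(mapping_function(user, minutes))
    return out


records = [("alice", 30), ("bob", 10), ("carol", 5)]

try:
    # First attempt: two records are written before the task fails.
    run_task(records, fail_after=2)
except RuntimeError:
    pass

# The retry reprocesses the whole input, so the first two writes repeat.
result = run_task(records)

print(result)       # [30, 10, 5]
print(kafka_topic)  # 5 entries: alice and bob appear twice, carol once
```

The computed result is correct after the retry, but the external log has already received the writes from the failed attempt, so anything reading that log has to tolerate duplicates.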
Re: mapWithState question
Hello,

TD, your suggestion works great. Thanks.

I have one more question: I need to write to Kafka from within the mapWithState function. I just wanted to check whether this is a bad pattern in any way.

Thank you.

On Sat, Jan 28, 2017 at 9:14 AM, shyla deshpande wrote:
> That's a great idea. I will try that. Thanks.
Re: mapWithState question
That's a great idea. I will try that. Thanks.

On Sat, Jan 28, 2017 at 2:35 AM, Tathagata Das wrote:
> One state object for each user.
> Union both streams into a single DStream, and apply mapWithState on it to
> update the user state.
Re: mapWithState question
One state object for each user. Union both streams into a single DStream, and apply mapWithState on it to update the user state.

On Sat, Jan 28, 2017 at 12:30 AM, shyla deshpande wrote:
> Can multiple DStreams manipulate a state? I have a stream that gives me
> the total minutes the user spent on course material. I have another stream
> that gives me chapters completed and lessons completed by the user. I
> want to keep track, for each user, of total_minutes, chapters_completed and
> lessons_completed. I am not sure if I should have 1 state or 2 states. Can
> I look up the state for a given key, just like a map, outside the mapping
> function?
>
> Appreciate your input. Thanks.
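
The suggestion above (one state object per user, fed by the union of both streams) can be modelled in plain Python. This is a sketch of the idea only, not Spark code: the event classes, `UserState`, `update_user_state`, and `process_batch` are hypothetical names, list concatenation stands in for the DStream union, and a dict stands in for the keyed state that mapWithState maintains. It also assumes that both streams carry increments rather than running totals, which the thread does not state.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple, Union


@dataclass(frozen=True)
class Minutes:
    """Event from the first stream: minutes spent (assumed to be an increment)."""
    minutes: int


@dataclass(frozen=True)
class Progress:
    """Event from the second stream: chapters and lessons completed (assumed increments)."""
    chapters: int
    lessons: int


Event = Union[Minutes, Progress]


@dataclass
class UserState:
    total_minutes: int = 0
    chapters_completed: int = 0
    lessons_completed: int = 0


def update_user_state(state: UserState, event: Event) -> UserState:
    """Fold one event into the user's single state object, whichever stream it came from."""
    if isinstance(event, Minutes):
        return UserState(state.total_minutes + event.minutes,
                         state.chapters_completed,
                         state.lessons_completed)
    return UserState(state.total_minutes,
                     state.chapters_completed + event.chapters,
                     state.lessons_completed + event.lessons)


def process_batch(states: Dict[str, UserState],
                  minutes_stream: Iterable[Tuple[str, Event]],
                  progress_stream: Iterable[Tuple[str, Event]]) -> Dict[str, UserState]:
    # Stand-in for the union: both streams share one (user, event) shape.
    unioned = list(minutes_stream) + list(progress_stream)
    for user, event in unioned:
        states[user] = update_user_state(states.get(user, UserState()), event)
    return states


states: Dict[str, UserState] = {}
process_batch(states,
              [("alice", Minutes(30)), ("bob", Minutes(10))],
              [("alice", Progress(1, 3))])
process_batch(states, [("alice", Minutes(15))], [])

print(states["alice"])
# UserState(total_minutes=45, chapters_completed=1, lessons_completed=3)
```

Giving both streams a common event type is what makes the union possible; the update function then branches on the event kind, so a single state object per user is enough.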
mapWithState question
Can multiple DStreams manipulate a state? I have a stream that gives me the total minutes the user spent on course material. I have another stream that gives me chapters completed and lessons completed by the user. I want to keep track, for each user, of total_minutes, chapters_completed and lessons_completed. I am not sure if I should have 1 state or 2 states. Can I look up the state for a given key, just like a map, outside the mapping function?

Appreciate your input. Thanks.