On Thu, Apr 25, 2019, 5:26 PM Maximilian Michels <m...@apache.org> wrote:
> Completely agree that CombiningState is nicer in this example. Users may
> still want to use ValueState when there is nothing to combine.

I've always had trouble coming up with any good examples of this.

> Also, users already know ValueState from the Java SDK.

Maybe we should deprecate that :)

On 25.04.19 17:12, Robert Bradshaw wrote:
> > On Thu, Apr 25, 2019 at 4:58 PM Maximilian Michels <m...@apache.org> wrote:
> >>
> >> I forgot to give an example, just to clarify for others:
> >>
> >>> What was the specific example that was less natural?
> >>
> >> Basically every time we use ListState to express ValueState, e.g.
> >>
> >>   next_index, = list(state.read()) or [0]
> >>
> >> Taken from:
> >> https://github.com/apache/beam/pull/8363/files#diff-ba1a2aed98079ccce869cd660ca9d97dR301
> >
> > Yes, ListState is much less natural here. I think generally
> > CombiningValue is often a better replacement. E.g. the Java example
> > reads
> >
> >   public void processElement(
> >       ProcessContext context, @StateId("index") ValueState<Integer> index) {
> >     int current = firstNonNull(index.read(), 0);
> >     context.output(KV.of(current, context.element()));
> >     index.write(current+1);
> >   }
> >
> > which is replaced with bag state
> >
> >   def process(self, element, state=DoFn.StateParam(INDEX_STATE)):
> >     next_index, = list(state.read()) or [0]
> >     yield (element, next_index)
> >     state.clear()
> >     state.add(next_index + 1)
> >
> > whereas CombiningState would be more natural (than ListState, and
> > arguably than even ValueState), giving
> >
> >   def process(self, element, index=DoFn.StateParam(INDEX_STATE)):
> >     yield element, index.read()
> >     index.add(1)
> >
> >>
> >> -Max
> >>
> >> On 25.04.19 16:40, Robert Bradshaw wrote:
> >>> https://github.com/apache/beam/pull/8402
> >>>
> >>> On Thu, Apr 25, 2019 at 4:26 PM Robert Bradshaw <rober...@google.com> wrote:
> >>>>
> >>>> Oh, this is for the indexing example.
> >>>>
> >>>> I actually think using CombiningState is cleaner than ValueState.
> >>>>
> >>>> https://github.com/apache/beam/blob/release-2.12.0/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py#L262
> >>>>
> >>>> (The fact that one must specify the accumulator coder is, however,
> >>>> unfortunate. We should probably infer that if we can.)
> >>>>
> >>>> On Thu, Apr 25, 2019 at 4:19 PM Robert Bradshaw <rober...@google.com> wrote:
> >>>>>
> >>>>> The desire was to avoid the implicit disallowed combination wart in
> >>>>> Python (until we could make sense of it), and also ValueState could be
> >>>>> surprising with respect to older values overwriting newer ones. What
> >>>>> was the specific example that was less natural?
> >>>>>
> >>>>> On Thu, Apr 25, 2019 at 3:01 PM Maximilian Michels <m...@apache.org> wrote:
> >>>>>>
> >>>>>> @Pablo: Thanks for following up with the PR! :)
> >>>>>>
> >>>>>> @Brian: I was wondering about this as well. It makes the Python state
> >>>>>> code a bit unnatural. I'd suggest adding a ValueState wrapper around
> >>>>>> ListState/CombiningState.
> >>>>>>
> >>>>>> @Robert: Like Reuven pointed out, we can disallow ValueState for merging
> >>>>>> windows with state.
> >>>>>>
> >>>>>> @Reza: Great. Let's make sure it has Python examples out of the box.
> >>>>>> Either Pablo or I could help there.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Max
> >>>>>>
> >>>>>> On 25.04.19 04:14, Reza Ardeshir Rokni wrote:
> >>>>>>> Pablo, Kenneth and I have a new blog ready for publication which covers
> >>>>>>> how to create a "looping timer"; it allows for default values to be
> >>>>>>> created in a window when no incoming elements exist. We just need to
> >>>>>>> clear a few bits before publication, but it would be great to have that
> >>>>>>> also include a Python example. I wrote it in Java...
> >>>>>>>
> >>>>>>> Cheers
> >>>>>>>
> >>>>>>> Reza
> >>>>>>>
> >>>>>>> On Thu, 25 Apr 2019 at 04:34, Reuven Lax <re...@google.com> wrote:
> >>>>>>>
> >>>>>>> Well state is still not implemented for merging windows even for
> >>>>>>> Java (though I believe the idea was to disallow ValueState there).
> >>>>>>>
> >>>>>>> On Wed, Apr 24, 2019 at 1:11 PM Robert Bradshaw <rober...@google.com> wrote:
> >>>>>>>
> >>>>>>> It was unclear what the semantics were for ValueState for merging
> >>>>>>> windows. (It's also a bit weird as it's inherently a race condition
> >>>>>>> wrt element ordering, unlike Bag and CombineState, though you can
> >>>>>>> always implement it as a CombineState that always returns the latest
> >>>>>>> value which is a bit more explicit about the dangers here.)
> >>>>>>>
> >>>>>>> On Wed, Apr 24, 2019 at 10:08 PM Brian Hulette <bhule...@google.com> wrote:
> >>>>>>> >
> >>>>>>> > That's a great idea! I thought about this too after those
> >>>>>>> > posts came up on the list recently. I started to look into it,
> >>>>>>> > but I noticed that there's actually no implementation of
> >>>>>>> > ValueState in userstate. Is there a reason for that? I started
> >>>>>>> > to work on a patch to add it but I was just curious if there was
> >>>>>>> > some reason it was omitted that I should be aware of.
> >>>>>>> >
> >>>>>>> > We could certainly replicate the example without ValueState
> >>>>>>> > by using BagState and clearing it before each write, but it
> >>>>>>> > would be nice if we could draw a direct parallel.
> >>>>>>> >
> >>>>>>> > Brian
> >>>>>>> >
> >>>>>>> > On Fri, Apr 12, 2019 at 7:05 AM Maximilian Michels <m...@apache.org> wrote:
> >>>>>>> >>
> >>>>>>> >> > It would probably be pretty easy to add the corresponding
> >>>>>>> >> > code snippets to the docs as well.
> >>>>>>> >>
> >>>>>>> >> It's probably a bit more work because there is no section
> >>>>>>> >> dedicated to state/timer yet in the documentation. Tracked here:
> >>>>>>> >> https://jira.apache.org/jira/browse/BEAM-2472
> >>>>>>> >>
> >>>>>>> >> > I've been going over this topic a bit. I'll add the
> >>>>>>> >> > snippets next week, if that's fine by y'all.
> >>>>>>> >>
> >>>>>>> >> That would be great. The blog posts are a great way to get
> >>>>>>> >> started with state/timers.
> >>>>>>> >>
> >>>>>>> >> Thanks,
> >>>>>>> >> Max
> >>>>>>> >>
> >>>>>>> >> On 11.04.19 20:21, Pablo Estrada wrote:
> >>>>>>> >> > I've been going over this topic a bit. I'll add the
> >>>>>>> >> > snippets next week, if that's fine by y'all.
> >>>>>>> >> > Best
> >>>>>>> >> > -P.
> >>>>>>> >> >
> >>>>>>> >> > On Thu, Apr 11, 2019 at 5:27 AM Robert Bradshaw <rober...@google.com> wrote:
> >>>>>>> >> >
> >>>>>>> >> > That's a great idea! It would probably be pretty easy
> >>>>>>> >> > to add the corresponding code snippets to the docs as well.
> >>>>>>> >> >
> >>>>>>> >> > On Thu, Apr 11, 2019 at 2:00 PM Maximilian Michels <m...@apache.org> wrote:
> >>>>>>> >> > >
> >>>>>>> >> > > Hi everyone,
> >>>>>>> >> > >
> >>>>>>> >> > > The Python SDK still lacks documentation on state
> >>>>>>> >> > > and timers.
> >>>>>>> >> > >
> >>>>>>> >> > > As a first step, what do you think about updating
> >>>>>>> >> > > these two blog posts with the corresponding Python code?
> >>>>>>> >> > >
> >>>>>>> >> > > https://beam.apache.org/blog/2017/02/13/stateful-processing.html
> >>>>>>> >> > > https://beam.apache.org/blog/2017/08/28/timely-processing.html
> >>>>>>> >> > >
> >>>>>>> >> > > Thanks,
> >>>>>>> >> > > Max
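
For anyone reading this thread later, the indexing comparison discussed above can be tried without a runner. The following is only a rough sketch: ToyBagState, ToyCombiningState, index_with_bag, index_with_combining and latest_or_none are hypothetical names made up for illustration and are not part of the Beam SDK. The toy classes only mimic the read/add/clear calls used in the snippets quoted in the thread, for a single key and with elements arriving in a fixed order.

```python
# Toy stand-ins for per-key state. These are NOT Beam classes; they only
# imitate the read()/add()/clear() calls that appear in the thread.

class ToyBagState:
    """Hypothetical bag state: read() returns everything added since clear()."""

    def __init__(self):
        self._items = []

    def read(self):
        return list(self._items)

    def add(self, value):
        self._items.append(value)

    def clear(self):
        self._items = []


class ToyCombiningState:
    """Hypothetical combining state: read() applies combine_fn to all inputs."""

    def __init__(self, combine_fn):
        self._combine_fn = combine_fn
        self._inputs = []

    def read(self):
        return self._combine_fn(self._inputs)

    def add(self, value):
        self._inputs.append(value)


def index_with_bag(elements):
    """Bag-state version: read, emit, clear, then write the next index."""
    state = ToyBagState()
    out = []
    for element in elements:
        # An empty bag falls back to [0]; the trailing comma unpacks one value.
        next_index, = list(state.read()) or [0]
        out.append((element, next_index))
        state.clear()
        state.add(next_index + 1)
    return out


def index_with_combining(elements):
    """Combining-state version: the index is the sum of the 1s added so far."""
    index = ToyCombiningState(sum)
    out = []
    for element in elements:
        out.append((element, index.read()))
        index.add(1)
    return out


def latest_or_none(values):
    """Assumed 'latest value' combiner; its result depends on input order."""
    return values[-1] if values else None


if __name__ == "__main__":
    print(index_with_bag(["a", "b", "c"]))
    print(index_with_combining(["a", "b", "c"]))
```

Both functions assign the same indices here. The latest_or_none combiner is one way to read the remark about implementing ValueState as a CombineState that returns the latest value: its result depends on the order in which inputs arrive, which is exactly the ordering hazard raised in the thread.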