On Thu, Apr 25, 2019 at 4:58 PM Maximilian Michels <m...@apache.org> wrote:
>
> I forgot to give an example, just to clarify for others:
>
> > What was the specific example that was less natural?
>
> Basically every time we use ListState to express ValueState, e.g.
>
>    next_index, = list(state.read()) or [0]
>
> Taken from:
> https://github.com/apache/beam/pull/8363/files#diff-ba1a2aed98079ccce869cd660ca9d97dR301

Yes, ListState is much less natural here. I think generally
CombiningValue is often a better replacement. E.g. the Java example
reads

  public void processElement(
      ProcessContext context, @StateId("index") ValueState<Integer> index) {
    int current = firstNonNull(index.read(), 0);
    context.output(KV.of(current, context.element()));
    index.write(current+1);
  }

which is replaced with bag state

  def process(self, element, state=DoFn.StateParam(INDEX_STATE)):
    next_index, = list(state.read()) or [0]
    yield (element, next_index)
    state.clear()
    state.add(next_index + 1)

whereas CombiningState would be more natural (than ListState, and
arguably than even ValueState), giving

  def process(self, element, index=DoFn.StateParam(INDEX_STATE)):
    yield element, index.read()
    index.add(1)

>
> -Max
>
> On 25.04.19 16:40, Robert Bradshaw wrote:
> > https://github.com/apache/beam/pull/8402
> >
> > On Thu, Apr 25, 2019 at 4:26 PM Robert Bradshaw <rober...@google.com> wrote:
> >>
> >> Oh, this is for the indexing example.
> >>
> >> I actually think using CombiningState is more cleaner than ValueState.
> >>
> >> https://github.com/apache/beam/blob/release-2.12.0/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py#L262
> >>
> >> (The fact that one must specify the accumulator coder is, however,
> >> unfortunate. We should probably infer that if we can.)
> >>
> >> On Thu, Apr 25, 2019 at 4:19 PM Robert Bradshaw <rober...@google.com>
> >> wrote:
> >>>
> >>> The desire was to avoid the implicit disallowed combination wart in
> >>> Python (until we could make sense of it), and also ValueState could be
> >>> surprising with respect to older values overwriting newer ones. What
> >>> was the specific example that was less natural?
> >>>
> >>> On Thu, Apr 25, 2019 at 3:01 PM Maximilian Michels <m...@apache.org>
> >>> wrote:
> >>>>
> >>>> @Pablo: Thanks for following up with the PR! :)
> >>>>
> >>>> @Brian: I was wondering about this as well. It makes the Python state
> >>>> code a bit unnatural.
> >>>> I'd suggest to add a ValueState wrapper around
> >>>> ListState/CombiningState.
> >>>>
> >>>> @Robert: Like Reuven pointed out, we can disallow ValueState for merging
> >>>> windows with state.
> >>>>
> >>>> @Reza: Great. Let's make sure it has Python examples out of the box.
> >>>> Either Pablo or me could help there.
> >>>>
> >>>> Thanks,
> >>>> Max
> >>>>
> >>>> On 25.04.19 04:14, Reza Ardeshir Rokni wrote:
> >>>>> Pablo, Kenneth and I have a new blog ready for publication which covers
> >>>>> how to create a "looping timer" it allows for default values to be
> >>>>> created in a window when no incoming elements exists. We just need to
> >>>>> clear a few bits before publication, but would be great to have that
> >>>>> also include a python example, I wrote it in java...
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> Reza
> >>>>>
> >>>>> On Thu, 25 Apr 2019 at 04:34, Reuven Lax <re...@google.com> wrote:
> >>>>>
> >>>>>     Well state is still not implemented for merging windows even for
> >>>>>     Java (though I believe the idea was to disallow ValueState there).
> >>>>>
> >>>>>     On Wed, Apr 24, 2019 at 1:11 PM Robert Bradshaw
> >>>>>     <rober...@google.com> wrote:
> >>>>>
> >>>>>         It was unclear what the semantics were for ValueState for merging
> >>>>>         windows. (It's also a bit weird as it's inherently a race condition
> >>>>>         wrt element ordering, unlike Bag and CombineState, though you can
> >>>>>         always implement it as a CombineState that always returns the latest
> >>>>>         value which is a bit more explicit about the dangers here.)
> >>>>>
> >>>>>         On Wed, Apr 24, 2019 at 10:08 PM Brian Hulette
> >>>>>         <bhule...@google.com> wrote:
> >>>>>         >
> >>>>>         > That's a great idea! I thought about this too after those
> >>>>>         > posts came up on the list recently.
> >>>>>         > I started to look into it, but I noticed that there's actually no
> >>>>>         > implementation of ValueState in userstate. Is there a reason for
> >>>>>         > that? I started to work on a patch to add it but I was just curious
> >>>>>         > if there was some reason it was omitted that I should be aware of.
> >>>>>         >
> >>>>>         > We could certainly replicate the example without ValueState by
> >>>>>         > using BagState and clearing it before each write, but it would be
> >>>>>         > nice if we could draw a direct parallel.
> >>>>>         >
> >>>>>         > Brian
> >>>>>         >
> >>>>>         > On Fri, Apr 12, 2019 at 7:05 AM Maximilian Michels
> >>>>>         > <m...@apache.org> wrote:
> >>>>>         >>
> >>>>>         >> > It would probably be pretty easy to add the corresponding
> >>>>>         >> > code snippets to the docs as well.
> >>>>>         >>
> >>>>>         >> It's probably a bit more work because there is no section dedicated to
> >>>>>         >> state/timer yet in the documentation. Tracked here:
> >>>>>         >> https://jira.apache.org/jira/browse/BEAM-2472
> >>>>>         >>
> >>>>>         >> > I've been going over this topic a bit. I'll add the
> >>>>>         >> > snippets next week, if that's fine by y'all.
> >>>>>         >>
> >>>>>         >> That would be great. The blog posts are a great way to get started with
> >>>>>         >> state/timers.
> >>>>>         >>
> >>>>>         >> Thanks,
> >>>>>         >> Max
> >>>>>         >>
> >>>>>         >> On 11.04.19 20:21, Pablo Estrada wrote:
> >>>>>         >> > I've been going over this topic a bit. I'll add the snippets next week,
> >>>>>         >> > if that's fine by y'all.
> >>>>>         >> > Best
> >>>>>         >> > -P.
> >>>>>         >> >
> >>>>>         >> > On Thu, Apr 11, 2019 at 5:27 AM Robert Bradshaw
> >>>>>         >> > <rober...@google.com> wrote:
> >>>>>         >> >
> >>>>>         >> >     That's a great idea! It would probably be pretty easy to add the
> >>>>>         >> >     corresponding code snippets to the docs as well.
> >>>>>         >> >
> >>>>>         >> >     On Thu, Apr 11, 2019 at 2:00 PM Maximilian Michels
> >>>>>         >> >     <m...@apache.org> wrote:
> >>>>>         >> >     >
> >>>>>         >> >     > Hi everyone,
> >>>>>         >> >     >
> >>>>>         >> >     > The Python SDK still lacks documentation on state and timers.
> >>>>>         >> >     >
> >>>>>         >> >     > As a first step, what do you think about updating these two blog
> >>>>>         >> >     > posts with the corresponding Python code?
> >>>>>         >> >     >
> >>>>>         >> >     > https://beam.apache.org/blog/2017/02/13/stateful-processing.html
> >>>>>         >> >     > https://beam.apache.org/blog/2017/08/28/timely-processing.html
> >>>>>         >> >     >
> >>>>>         >> >     > Thanks,
> >>>>>         >> >     > Max
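
For anyone skimming the thread, the two approaches discussed here (a ValueState-style wrapper over bag state, and a sum-combining state) can be compared in a rough pure-Python sketch. This is only an illustration under stated assumptions: it does not use the Beam API, and InMemoryBagState, ValueStateWrapper, InMemorySumState and the two assign_indices_* functions are hypothetical names made up for this example. The in-memory classes merely mimic the read/add/clear calls that appear in the snippets in this thread.

```python
class InMemoryBagState:
    """Hypothetical stand-in for a bag/list state cell (not the Beam class)."""

    def __init__(self):
        self._items = []

    def read(self):
        return iter(self._items)

    def add(self, value):
        self._items.append(value)

    def clear(self):
        self._items = []


class ValueStateWrapper:
    """Hypothetical ValueState-like view: the bag holds at most one element."""

    def __init__(self, bag, default=None):
        self._bag = bag
        self._default = default

    def read(self):
        values = list(self._bag.read())
        return values[0] if values else self._default

    def write(self, value):
        # Clearing before adding keeps a single value, like an overwrite.
        self._bag.clear()
        self._bag.add(value)


class InMemorySumState:
    """Hypothetical stand-in for a combining state that sums its inputs."""

    def __init__(self):
        self._total = 0

    def read(self):
        return self._total

    def add(self, value):
        self._total += value


def assign_indices_with_value_state(elements):
    # Read-modify-write, as in the Java ValueState example.
    index = ValueStateWrapper(InMemoryBagState(), default=0)
    out = []
    for element in elements:
        current = index.read()
        out.append((element, current))
        index.write(current + 1)
    return out


def assign_indices_with_combining_state(elements):
    # Read the running sum, then add 1; no explicit clear is needed.
    index = InMemorySumState()
    out = []
    for element in elements:
        out.append((element, index.read()))
        index.add(1)
    return out
```

Both functions pair each element with a running index starting at 0. The wrapper hides the clear-then-add step behind write(), while the combining variant avoids the read-modify-write pattern altogether.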