On Thu, Apr 25, 2019 at 4:58 PM Maximilian Michels <m...@apache.org> wrote:
>
> I forgot to give an example, just to clarify for others:
>
> > What was the specific example that was less natural?
>
> Basically every time we use ListState to express ValueState, e.g.
>
>    next_index, = list(state.read()) or [0]
>
> Taken from:
> https://github.com/apache/beam/pull/8363/files#diff-ba1a2aed98079ccce869cd660ca9d97dR301

Yes, ListState is much less natural here. I think generally
CombiningValue is often a better replacement. E.g. the Java example
reads


@ProcessElement
public void processElement(
      ProcessContext context, @StateId("index") ValueState<Integer> index) {
    // Treat missing state as index 0.
    int current = firstNonNull(index.read(), 0);
    context.output(KV.of(current, context.element()));
    index.write(current + 1);
}


which is replaced with bag state


def process(self, element, state=DoFn.StateParam(INDEX_STATE)):
    next_index, = list(state.read()) or [0]
    yield (element, next_index)
    state.clear()
    state.add(next_index + 1)


whereas CombiningState would be more natural (than ListState, and
arguably than even ValueState), giving


def process(self, element, index=DoFn.StateParam(INDEX_STATE)):
    yield element, index.read()
    index.add(1)
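
To make the comparison concrete without depending on a runner, here is a
rough sketch that models the two approaches in plain Python. The BagState
and CombiningState classes below are hypothetical in-memory stand-ins that
I made up for illustration; they are not the Beam classes and only mimic
read/add/clear. The two helper functions mirror the two process() bodies
above.

```python
# Hypothetical in-memory stand-ins, NOT the Beam state classes.
# They only model the read/add/clear behaviour discussed in this thread.

class BagState:
    def __init__(self):
        self._items = []

    def read(self):
        return iter(self._items)

    def add(self, value):
        self._items.append(value)

    def clear(self):
        self._items = []


class CombiningState:
    def __init__(self, combine_fn):
        self._combine_fn = combine_fn
        self._inputs = []

    def read(self):
        # The combined value of everything added so far.
        return self._combine_fn(self._inputs)

    def add(self, value):
        self._inputs.append(value)


def index_with_bag(elements, state):
    # Bag state used as a single value: read, clear, then re-add.
    for element in elements:
        next_index, = list(state.read()) or [0]
        yield element, next_index
        state.clear()
        state.add(next_index + 1)


def index_with_combining(elements, state):
    # Combining state: the sum of the added 1s is the index.
    for element in elements:
        yield element, state.read()
        state.add(1)


bag_result = list(index_with_bag(['a', 'b', 'c'], BagState()))
combining_result = list(
    index_with_combining(['a', 'b', 'c'], CombiningState(sum)))
```

Both variants assign the same indices; the combining one just has no
read-modify-write step and no special case for empty state, since sum()
of nothing is already 0.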




>
> -Max
>
> On 25.04.19 16:40, Robert Bradshaw wrote:
> > https://github.com/apache/beam/pull/8402
> >
> > On Thu, Apr 25, 2019 at 4:26 PM Robert Bradshaw <rober...@google.com> wrote:
> >>
> >> Oh, this is for the indexing example.
> >>
> >> I actually think using CombiningState is cleaner than ValueState.
> >>
> >> https://github.com/apache/beam/blob/release-2.12.0/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py#L262
> >>
> >> (The fact that one must specify the accumulator coder is, however,
> >> unfortunate. We should probably infer that if we can.)
> >>
> >> On Thu, Apr 25, 2019 at 4:19 PM Robert Bradshaw <rober...@google.com> wrote:
> >>>
> >>> The desire was to avoid the implicit disallowed combination wart in
> >>> Python (until we could make sense of it), and also ValueState could be
> >>> surprising with respect to older values overwriting newer ones. What
> >>> was the specific example that was less natural?
> >>>
> >>> On Thu, Apr 25, 2019 at 3:01 PM Maximilian Michels <m...@apache.org> wrote:
> >>>>
> >>>> @Pablo: Thanks for following up with the PR! :)
> >>>>
> >>>> @Brian: I was wondering about this as well. It makes the Python state
> >>>> code a bit unnatural. I'd suggest adding a ValueState wrapper around
> >>>> ListState/CombiningState.
> >>>>
> >>>> @Robert: Like Reuven pointed out, we can disallow ValueState for merging
> >>>> windows with state.
> >>>>
> >>>> @Reza: Great. Let's make sure it has Python examples out of the box.
> >>>> Either Pablo or I could help there.
> >>>>
> >>>> Thanks,
> >>>> Max
> >>>>
> >>>> On 25.04.19 04:14, Reza Ardeshir Rokni wrote:
> >>>>> Pablo, Kenneth and I have a new blog post ready for publication which
> >>>>> covers how to create a "looping timer". It allows default values to be
> >>>>> created in a window when no incoming elements exist. We just need to
> >>>>> clear up a few bits before publication, but it would be great to have it
> >>>>> also include a Python example; I wrote it in Java...
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> Reza
> >>>>>
> >>>>> On Thu, 25 Apr 2019 at 04:34, Reuven Lax <re...@google.com
> >>>>> <mailto:re...@google.com>> wrote:
> >>>>>
> >>>>>      Well state is still not implemented for merging windows even for
> >>>>>      Java (though I believe the idea was to disallow ValueState there).
> >>>>>
> >>>>>      On Wed, Apr 24, 2019 at 1:11 PM Robert Bradshaw
> >>>>>      <rober...@google.com <mailto:rober...@google.com>> wrote:
> >>>>>
> >>>>>          It was unclear what the semantics were for ValueState for merging
> >>>>>          windows. (It's also a bit weird as it's inherently a race condition
> >>>>>          wrt element ordering, unlike Bag and CombineState, though you can
> >>>>>          always implement it as a CombineState that always returns the latest
> >>>>>          value which is a bit more explicit about the dangers here.)
> >>>>>
> >>>>>          On Wed, Apr 24, 2019 at 10:08 PM Brian Hulette
> >>>>>          <bhule...@google.com <mailto:bhule...@google.com>> wrote:
> >>>>>           >
> >>>>>           > That's a great idea! I thought about this too after those
> >>>>>          posts came up on the list recently. I started to look into it,
> >>>>>          but I noticed that there's actually no implementation of
> >>>>>          ValueState in userstate. Is there a reason for that? I started
> >>>>>          to work on a patch to add it but I was just curious if there was
> >>>>>          some reason it was omitted that I should be aware of.
> >>>>>           >
> >>>>>           > We could certainly replicate the example without ValueState
> >>>>>          by using BagState and clearing it before each write, but it
> >>>>>          would be nice if we could draw a direct parallel.
> >>>>>           >
> >>>>>           > Brian
> >>>>>           >
> >>>>>           > On Fri, Apr 12, 2019 at 7:05 AM Maximilian Michels
> >>>>>          <m...@apache.org <mailto:m...@apache.org>> wrote:
> >>>>>           >>
> >>>>>           >> > It would probably be pretty easy to add the corresponding
> >>>>>          code snippets to the docs as well.
> >>>>>           >>
> >>>>>           >> It's probably a bit more work because there is no section
> >>>>>          dedicated to
> >>>>>           >> state/timer yet in the documentation. Tracked here:
> >>>>>           >> https://jira.apache.org/jira/browse/BEAM-2472
> >>>>>           >>
> >>>>>           >> > I've been going over this topic a bit. I'll add the
> >>>>>          snippets next week, if that's fine by y'all.
> >>>>>           >>
> >>>>>           >> That would be great. The blog posts are a great way to get
> >>>>>          started with
> >>>>>           >> state/timers.
> >>>>>           >>
> >>>>>           >> Thanks,
> >>>>>           >> Max
> >>>>>           >>
> >>>>>           >> On 11.04.19 20:21, Pablo Estrada wrote:
> >>>>>           >> > I've been going over this topic a bit. I'll add the
> >>>>>          snippets next week,
> >>>>>           >> > if that's fine by y'all.
> >>>>>           >> > Best
> >>>>>           >> > -P.
> >>>>>           >> >
> >>>>>           >> > On Thu, Apr 11, 2019 at 5:27 AM Robert Bradshaw
> >>>>>           >> > <rober...@google.com <mailto:rober...@google.com>> wrote:
> >>>>>           >> >
> >>>>>           >> >     That's a great idea! It would probably be pretty easy
> >>>>>          to add the
> >>>>>           >> >     corresponding code snippets to the docs as well.
> >>>>>           >> >
> >>>>>           >> >     On Thu, Apr 11, 2019 at 2:00 PM Maximilian Michels
> >>>>>           >> >     <m...@apache.org <mailto:m...@apache.org>> wrote:
> >>>>>           >> >      >
> >>>>>           >> >      > Hi everyone,
> >>>>>           >> >      >
> >>>>>           >> >      > The Python SDK still lacks documentation on state
> >>>>>          and timers.
> >>>>>           >> >      >
> >>>>>           >> >      > As a first step, what do you think about updating
> >>>>>          these two blog
> >>>>>           >> >     posts
> >>>>>           >> >      > with the corresponding Python code?
> >>>>>           >> >      >
> >>>>>           >> >      >
> >>>>>           >> >      > https://beam.apache.org/blog/2017/02/13/stateful-processing.html
> >>>>>           >> >      >
> >>>>>           >> >      > https://beam.apache.org/blog/2017/08/28/timely-processing.html
> >>>>>           >> >      >
> >>>>>           >> >      > Thanks,
> >>>>>           >> >      > Max
> >>>>>           >> >
> >>>>>
