Nice!

On Mon, 18 May 2020, 05:52 Mohil Khare, <[email protected]> wrote:
> Hi Reza and others,
> As suggested, I have opened
> https://issues.apache.org/jira/browse/BEAM-10019 which I think might be a
> good addition to beam pipeline patterns.
>
> Thanks
> Mohil
>
> On Mon, Apr 6, 2020 at 6:28 PM Mohil Khare <[email protected]> wrote:
>
>> Sure thing.. I would love to contribute.
>>
>> Thanks
>> Mohil
>>
>> On Mon, Apr 6, 2020 at 6:17 PM Reza Ardeshir Rokni <[email protected]> wrote:
>>
>>> Great! BTW if you get the time and want to contribute back to beam,
>>> there is a nice section to record cool patterns:
>>>
>>> https://beam.apache.org/documentation/patterns/overview/
>>>
>>> This would make a great one!
>>>
>>> On Tue, 7 Apr 2020 at 09:12, Mohil Khare <[email protected]> wrote:
>>>
>>>> No ... that's a valid answer. Since I wanted to have a long window size
>>>> per key and since we can't use state with session windows, I am using a
>>>> sliding window of, let's say, 72 hrs which starts every hour.
>>>>
>>>> Thanks a lot Reza for your input.
>>>>
>>>> Regards
>>>> Mohil
>>>>
>>>> On Mon, Apr 6, 2020 at 6:09 PM Reza Ardeshir Rokni <[email protected]> wrote:
>>>>
>>>>> Depends on the use case. Global state comes with the technical debt of
>>>>> having to do your own GC, but comes with more control. You could
>>>>> implement the pattern above with a long FixedWindow as well, which will
>>>>> take care of the GC within the window bound.
>>>>>
>>>>> Sorry, it's not a yes / no answer :-)
>>>>>
>>>>> On Tue, 7 Apr 2020 at 09:03, Mohil Khare <[email protected]> wrote:
>>>>>
>>>>>> Thanks a lot Reza for your quick response. Yeah, saving the data in an
>>>>>> external system after timer expiry makes sense.
>>>>>> So do you suggest using a global window for maintaining state?
>>>>>>
>>>>>> Thanks and regards
>>>>>> Mohil
>>>>>>
>>>>>> On Mon, Apr 6, 2020 at 5:37 PM Reza Ardeshir Rokni <[email protected]> wrote:
>>>>>>
>>>>>>> Are you able to make use of the following pattern?
>>>>>>>
>>>>>>> Store StateA-metadata until there has been no activity for Duration X
>>>>>>> (you can use a Timer to check this), then expire the value, but store
>>>>>>> it in an external system. If you get a record that does want this
>>>>>>> value after expiry, call out to the external system and store the
>>>>>>> value again in key StateA-metadata.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Tue, 7 Apr 2020 at 08:03, Mohil Khare <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello all,
>>>>>>>> We are attempting to implement a use case where beam (java sdk)
>>>>>>>> reads two kinds of records from a data stream like Kafka:
>>>>>>>>
>>>>>>>> 1. Records of type A containing a key and corresponding metadata.
>>>>>>>> 2. Records of type B containing the same key, but no metadata. Beam
>>>>>>>> then needs to fill in metadata for records of type B by doing a
>>>>>>>> lookup for metadata using keys received in records of type A.
>>>>>>>>
>>>>>>>> The idea is to save metadata, or rather state, for keys received in
>>>>>>>> records of type A and then do a lookup when records of type B are
>>>>>>>> received.
>>>>>>>> I have implemented this using the "@State" construct of beam.
>>>>>>>> However, my problem is that we don't know when keys should expire. I
>>>>>>>> don't think keeping a global window will be a good idea, as there
>>>>>>>> could be many keys (maybe millions over a period of time) to be
>>>>>>>> saved in state.
>>>>>>>>
>>>>>>>> What is the best way to achieve this? I was reading about RedisIO,
>>>>>>>> but found that it is still in the experimental stage. Can someone
>>>>>>>> please recommend a solution to achieve this?
>>>>>>>>
>>>>>>>> Thanks and regards
>>>>>>>> Mohil
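
For anyone finding this thread later, here is a minimal plain-Java sketch of the expire-and-rehydrate pattern Reza describes above. It is only a model of the logic, not Beam code and not necessarily what Mohil implemented: the class ExpiringMetadataCache, its method names, and the Map used as the "external system" are all hypothetical. In a real stateful DoFn the in-memory map would roughly correspond to per-key state and onTimer to a timer callback. Beam timers fire per key, whereas this model scans all keys for simplicity.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;

/** Hypothetical plain-Java model of the pattern; no Beam dependency. */
class ExpiringMetadataCache {
    /** Metadata plus the time of the last activity for its key. */
    private static final class Cached {
        final String metadata;
        long lastSeenMillis;
        Cached(String metadata, long lastSeenMillis) {
            this.metadata = metadata;
            this.lastSeenMillis = lastSeenMillis;
        }
    }

    private final Map<String, Cached> state = new HashMap<>(); // stands in for per-key state
    private final Map<String, String> external;                // stands in for the external system
    private final long idleTimeoutMillis;                      // "Duration X" from the thread

    ExpiringMetadataCache(Map<String, String> external, long idleTimeoutMillis) {
        this.external = external;
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Type A record: remember the metadata and refresh the activity time. */
    void onTypeA(String key, String metadata, long nowMillis) {
        state.put(key, new Cached(metadata, nowMillis));
    }

    /** Type B record: look up metadata, falling back to the external system after expiry. */
    Optional<String> onTypeB(String key, long nowMillis) {
        Cached cached = state.get(key);
        if (cached == null) {
            String restored = external.get(key);
            if (restored == null) {
                return Optional.empty(); // never seen, nothing to enrich with
            }
            cached = new Cached(restored, nowMillis); // store the value in state again
            state.put(key, cached);
        } else {
            cached.lastSeenMillis = nowMillis; // a lookup counts as activity
        }
        return Optional.of(cached.metadata);
    }

    /** Stands in for the timer firing: persist idle entries, then drop them from state. */
    void onTimer(long nowMillis) {
        Iterator<Map.Entry<String, Cached>> it = state.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Cached> e = it.next();
            if (nowMillis - e.getValue().lastSeenMillis >= idleTimeoutMillis) {
                external.put(e.getKey(), e.getValue().metadata);
                it.remove();
            }
        }
    }

    boolean isInState(String key) {
        return state.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, String> external = new HashMap<>();
        ExpiringMetadataCache cache = new ExpiringMetadataCache(external, 1000L);
        cache.onTypeA("k1", "meta-1", 0L);
        System.out.println(cache.onTypeB("k1", 500L));  // served from state
        cache.onTimer(2000L);                           // idle for 1500 ms, so evicted
        System.out.println(cache.isInState("k1"));
        System.out.println(cache.onTypeB("k1", 2500L)); // restored from the external map
    }
}
```

The trade-off is the one discussed in the thread: state stays bounded by the number of recently active keys, at the cost of an external call when a type B record arrives for a key that has already expired.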
