Hi Reza and others,
As suggested, I have opened
https://issues.apache.org/jira/browse/BEAM-10019 which
I think might be a good addition to beam pipeline patterns.

Thanks
Mohil

On Mon, Apr 6, 2020 at 6:28 PM Mohil Khare <[email protected]> wrote:

> Sure thing.. I would love to contribute.
>
> Thanks
> Mohil
>
>
>
> On Mon, Apr 6, 2020 at 6:17 PM Reza Ardeshir Rokni <[email protected]>
> wrote:
>
>> Great! BTW if you get the time and wanted to contribute back to beam
>> there is a nice section to record cool patterns:
>>
>> https://beam.apache.org/documentation/patterns/overview/
>>
>> This would make a great one!
>>
>> On Tue, 7 Apr 2020 at 09:12, Mohil Khare <[email protected]> wrote:
>>
>>> No ... that's a valid answer. Since I wanted to have a long window size
>>> per key and since we can't use state with session windows, I am using a
>>> sliding window for let's say 72 hrs which starts every hour.
>>>
>>> Thanks a lot Reza for your input.
>>>
>>> Regards
>>> Mohil
>>>
>>> On Mon, Apr 6, 2020 at 6:09 PM Reza Ardeshir Rokni <[email protected]>
>>> wrote:
>>>
>>>> Depends on the use case, Global state comes with the technical debt of
>>>> having to do your own GC, but comes with more control. You could
>>>> implement the pattern above with a long FixedWindow as well, which will
>>>> take care of the GC within the window  bound.
>>>>
>>>> Sorry, its not a yes / no answer :-)
>>>>
>>>> On Tue, 7 Apr 2020 at 09:03, Mohil Khare <[email protected]> wrote:
>>>>
>>>>> Thanks a lot Reza for your quick response. Yeah saving the data in an
>>>>> external system after timer expiry makes sense.
>>>>> So do you suggest using a global window for maintaining state ?
>>>>>
>>>>> Thanks and regards
>>>>> Mohil
>>>>>
>>>>> On Mon, Apr 6, 2020 at 5:37 PM Reza Ardeshir Rokni <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Are you able to make use of the following pattern?
>>>>>>
>>>>>> Store StateA-metadata until no activity for Duration X, you can use a
>>>>>> Timer to check this, then expire the value, but store in an
>>>>>> external system. If you get a record that does want this value after
>>>>>> expiry, call out to the external system and store the value again in key
>>>>>> StateA-metadata.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Tue, 7 Apr 2020 at 08:03, Mohil Khare <[email protected]> wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>> We are attempting a implement a use case where beam (java sdk) reads
>>>>>>> two kind of records from data stream like Kafka:
>>>>>>>
>>>>>>> 1. Records of type A containing key and corresponding metadata.
>>>>>>> 2. Records of type B containing the same key, but no metadata. Beam
>>>>>>> then needs to fill metadata for records of type B  by doing a lookup for
>>>>>>> metadata using keys received in records of type A.
>>>>>>>
>>>>>>> Idea is to save metadata or rather state for keys received in
>>>>>>> records of type A and then do a lookup when records of type B are 
>>>>>>> received.
>>>>>>> I have implemented this using the "@State" construct of beam.
>>>>>>> However my problem is that we don't know when keys should expire. I 
>>>>>>> don't
>>>>>>> think keeping a global window will be a good idea as there could be many
>>>>>>> keys (may be millions over a period of time) to be saved in a state.
>>>>>>>
>>>>>>> What is the best way to achieve this? I was reading about RedisIO,
>>>>>>> but found that it is still in the experimental stage. Can someone please
>>>>>>> recommend any solution to achieve this.
>>>>>>>
>>>>>>> Thanks and regards
>>>>>>> Mohil
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>

Reply via email to