Nice!

On Mon, 18 May 2020, 05:52 Mohil Khare, <[email protected]> wrote:

> Hi Reza and others,
> As suggested, I have opened
> https://issues.apache.org/jira/browse/BEAM-10019 which I think might be a
> good addition to beam pipeline patterns.
>
> Thanks
> Mohil
>
> On Mon, Apr 6, 2020 at 6:28 PM Mohil Khare <[email protected]> wrote:
>
>> Sure thing.. I would love to contribute.
>>
>> Thanks
>> Mohil
>>
>>
>>
>> On Mon, Apr 6, 2020 at 6:17 PM Reza Ardeshir Rokni <[email protected]>
>> wrote:
>>
>>> Great! BTW, if you get the time and want to contribute back to Beam,
>>> there is a nice section to record cool patterns:
>>>
>>> https://beam.apache.org/documentation/patterns/overview/
>>>
>>> This would make a great one!
>>>
>>> On Tue, 7 Apr 2020 at 09:12, Mohil Khare <[email protected]> wrote:
>>>
>>>> No ... that's a valid answer. Since I wanted a long window size per
>>>> key, and since we can't use state with session windows, I am using a
>>>> sliding window of, let's say, 72 hours that starts every hour.
>>>>
>>>> Thanks a lot Reza for your input.
>>>>
>>>> Regards
>>>> Mohil
>>>>
>>>> On Mon, Apr 6, 2020 at 6:09 PM Reza Ardeshir Rokni <[email protected]>
>>>> wrote:
>>>>
>>>>> It depends on the use case. Global state comes with the technical debt
>>>>> of having to do your own GC, but gives you more control. You could also
>>>>> implement the pattern above with a long FixedWindow, which will take
>>>>> care of the GC within the window bound.
>>>>>
>>>>> Sorry, it's not a yes / no answer :-)
>>>>>
>>>>> On Tue, 7 Apr 2020 at 09:03, Mohil Khare <[email protected]> wrote:
>>>>>
>>>>>> Thanks a lot Reza for your quick response. Yeah, saving the data in an
>>>>>> external system after timer expiry makes sense.
>>>>>> So do you suggest using a global window for maintaining state?
>>>>>>
>>>>>> Thanks and regards
>>>>>> Mohil
>>>>>>
>>>>>> On Mon, Apr 6, 2020 at 5:37 PM Reza Ardeshir Rokni <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Are you able to make use of the following pattern?
>>>>>>>
>>>>>>> Store StateA-metadata until there has been no activity for Duration X
>>>>>>> (you can use a Timer to check this), then expire the value, but store
>>>>>>> it in an external system. If you get a record that does want this value
>>>>>>> after expiry, call out to the external system and store the value again
>>>>>>> in key StateA-metadata.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Tue, 7 Apr 2020 at 08:03, Mohil Khare <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello all,
>>>>>>>> We are attempting to implement a use case where Beam (Java SDK)
>>>>>>>> reads two kinds of records from a data stream such as Kafka:
>>>>>>>>
>>>>>>>> 1. Records of type A containing a key and the corresponding metadata.
>>>>>>>> 2. Records of type B containing the same key, but no metadata. Beam
>>>>>>>> then needs to fill in the metadata for records of type B by doing a
>>>>>>>> lookup for metadata using keys received in records of type A.
>>>>>>>>
>>>>>>>> The idea is to save metadata, or rather state, for keys received in
>>>>>>>> records of type A and then do a lookup when records of type B are
>>>>>>>> received. I have implemented this using the "@State" construct of Beam.
>>>>>>>> However, my problem is that we don't know when keys should expire. I
>>>>>>>> don't think keeping a global window will be a good idea, as there could
>>>>>>>> be many keys (maybe millions over a period of time) to be saved in
>>>>>>>> state.
>>>>>>>>
>>>>>>>> What is the best way to achieve this? I was reading about RedisIO,
>>>>>>>> but found that it is still in the experimental stage. Can someone
>>>>>>>> please recommend a solution?
>>>>>>>>
>>>>>>>> Thanks and regards
>>>>>>>> Mohil
>>>>>>>>
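
For reference, the pattern Reza describes in the thread (keep per-key metadata until the key has been idle for Duration X, push it to an external system on expiry, and reload it if a later record still needs it) could be sketched roughly as follows. This is a plain-Java model of the logic and deliberately not Beam code: `ExpiringMetadataStore`, `ExternalStore`, `InMemoryExternalStore`, `onTypeA`, `onTypeB` and `fireTimers` are all hypothetical names, the external system is just an in-memory map, and the "timer" is an explicit method call with a caller-supplied clock. In a real pipeline the in-memory map would presumably be per-key state in a stateful DoFn, and the eviction would run from a timer callback.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;

/** Plain-Java model of "expire idle state to an external store, reload on demand". */
class ExpiringMetadataStore {

    /** Hypothetical stand-in for the external system (for example a key-value store). */
    interface ExternalStore {
        void put(String key, String metadata);
        Optional<String> get(String key);
    }

    /** In-memory implementation, used here only so the sketch is self-contained. */
    static final class InMemoryExternalStore implements ExternalStore {
        private final Map<String, String> data = new HashMap<>();
        public void put(String key, String metadata) { data.put(key, metadata); }
        public Optional<String> get(String key) { return Optional.ofNullable(data.get(key)); }
    }

    /** Metadata plus the time the key was last touched. */
    private static final class Cached {
        final String metadata;
        long lastSeenMillis;
        Cached(String metadata, long lastSeenMillis) {
            this.metadata = metadata;
            this.lastSeenMillis = lastSeenMillis;
        }
    }

    private final Map<String, Cached> state = new HashMap<>();
    private final ExternalStore external;
    private final long idleTimeoutMillis; // "Duration X" from the thread

    ExpiringMetadataStore(ExternalStore external, long idleTimeoutMillis) {
        this.external = external;
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Type A record: remember the metadata and restart the idle clock. */
    void onTypeA(String key, String metadata, long nowMillis) {
        state.put(key, new Cached(metadata, nowMillis));
    }

    /** Type B record: look up metadata, falling back to the external store after expiry. */
    Optional<String> onTypeB(String key, long nowMillis) {
        Cached cached = state.get(key);
        if (cached == null) {
            Optional<String> restored = external.get(key);
            if (!restored.isPresent()) {
                return Optional.empty(); // never seen, or not yet persisted
            }
            cached = new Cached(restored.get(), nowMillis);
            state.put(key, cached); // store the value again in local state
        }
        cached.lastSeenMillis = nowMillis;
        return Optional.of(cached.metadata);
    }

    /** Models the timer firing: persist and evict every key idle for at least the timeout. */
    void fireTimers(long nowMillis) {
        Iterator<Map.Entry<String, Cached>> it = state.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Cached> e = it.next();
            if (nowMillis - e.getValue().lastSeenMillis >= idleTimeoutMillis) {
                external.put(e.getKey(), e.getValue().metadata);
                it.remove();
            }
        }
    }

    /** Number of keys currently held in local state. */
    int liveKeys() { return state.size(); }
}
```

The point of the design is that local state only holds keys that were active recently, so its size is bounded by activity within Duration X rather than by every key ever seen. The cost is an external lookup for the occasional type B record that arrives after its key has expired.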
