Hello,

What I'm looking for can actually be achieved by implementing one of the
caching strategies presented in a talk at Beam Summit 2022.

   - Strategies for caching data in Dataflow using Beam SDK
     <https://2022.beamsummit.org/sessions/strategies-for-caching-data-in-dataflow-using-beam-sdk/>

Of the 4 options covered there, I'd try a side input and the shared module
(with or without a side input) first.
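
To make the idea concrete, here is a minimal plain-Python sketch of the
caching behaviour I have in mind. It is not Beam code and it is not taken
from the talk: `RefreshingLookup`, `enrich`, the TTL value and the record
shapes are all hypothetical names I made up for illustration. In a real
pipeline I would expect the lookup to live inside a DoFn, possibly held via
`apache_beam.utils.shared.Shared`, but I haven't tried that yet.

```python
import time
from typing import Any, Callable, Dict, Optional


class RefreshingLookup:
    """Process-local lookup table that reloads itself once it is older than a TTL.

    Hypothetical helper for illustration only; not part of the Beam SDK.
    """

    def __init__(
        self,
        loader: Callable[[], Dict[str, Any]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader        # fetches the full customer table
        self._ttl = ttl_seconds      # how long a loaded snapshot stays valid
        self._clock = clock          # injectable so the refresh can be tested
        self._data: Dict[str, Any] = {}
        self._loaded_at: Optional[float] = None

    def get(self, key: str) -> Optional[Any]:
        now = self._clock()
        # Reload on first use, or when the snapshot has expired.
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._data = dict(self._loader())
            self._loaded_at = now
        return self._data.get(key)


def enrich(order: Dict[str, Any], customers: RefreshingLookup) -> Dict[str, Any]:
    """Return a copy of the order with the currently cached customer attached."""
    return {**order, "customer": customers.get(order["customer_id"])}
```

With this shape, an order that arrives after a customer attribute has
changed picks up the new value once the snapshot expires. The TTL is the
trade-off between freshness and how often the customer source is read.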

Cheers,
Jaehyeon


On Thu, 1 Aug 2024 at 13:30, Jaehyeon Kim <dott...@gmail.com> wrote:

> Thank you for letting me know. It is also available in the Python SDK -
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/util.py#L1656
>
> However, it doesn't seem to meet the requirement that the side input
> values can change over time, because the mentioned transform just seems to
> wait until the previous one completes. What I'm looking for is this: when,
> say, a customer attribute changes, order records should be enriched with
> the updated attribute.
>
> On Thu, 1 Aug 2024 at 13:14, LDesire <two_som...@icloud.com> wrote:
>
>> Hello. I found some similar example code.
>> You can use the `Wait` PTransform.
>> Wait (Apache Beam 2.13.0)
>> <https://beam.apache.org/releases/javadoc/2.13.0/org/apache/beam/sdk/transforms/Wait.html>
>>
>> Hope this helps.
>>
>> stateful-beam-realtime/pipeline/src/main/java/org/stjimmy/beam/LtvPipelineSqlLookup.java
>> at 2cc16a9cf8460c5b0e4d749e81654273c14ffb00 · Jimmyst/stateful-beam-realtime
>> <https://github.com/Jimmyst/stateful-beam-realtime/blob/2cc16a9cf8460c5b0e4d749e81654273c14ffb00/pipeline/src/main/java/org/stjimmy/beam/LtvPipelineSqlLookup.java#L92>
>>
>>
>> On 1 Aug 2024, at 11:52 AM, Jaehyeon Kim <dott...@gmail.com> wrote:
>>
>> Hello,
>>
>> I'm looking into side input patterns, especially slowly updating global
>> window side inputs -
>> https://beam.apache.org/documentation/patterns/side-inputs/
>>
>> This would be useful when we need to enrich, e.g., order records with
>> customer details, where the customer details are taken as a side input.
>>
>> Let's say we have two Kafka topics, one for customer records and the other
>> for order records. For the enrichment to work properly, consumption of
>> order records should wait until all customer records have been read.
>>
>> Could you please let me know whether this is achievable?
>>
>> Cheers,
>> Jaehyeon
>>
>>
>>
