thiagotnunes opened a new pull request #16514:
URL: https://github.com/apache/beam/pull/16514


   Adds ReadChangeStreamPartitionDoFn, which is an SDF to read partitions from 
change streams and process them accordingly. This component receives a change 
stream name, a partition, a start time and an end time to query. It then 
initiates a change stream query with the received parameters.
   
   Within a change stream, 3 types of records can be received:
   
   1. A Data record
   2. A Heartbeat record
   3. A Child partitions record
   
   Upon receiving #1, the function updates the watermark with the record's 
commit timestamp and emits the record into the output PCollection.
   Upon receiving #2, the function updates the watermark with the record's 
timestamp, but it does not emit any record into the PCollection.
   Upon receiving #3, the function updates the watermark with the record's 
timestamp and writes the new child partitions into the metadata table.
   These partitions will be later scheduled by the DetectNewPartitions 
component.
   
   Once the change stream query for the element partition finishes, it marks 
the partition as finished in the metadata table and terminates.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to