thiagotnunes opened a new pull request #16514: URL: https://github.com/apache/beam/pull/16514
Adds ReadChangeStreamPartitionDoFn, which is an SDF to read partitions from change streams and process them accordingly. This component receives a change stream name, a partition, a start time and an end time to query. It then initiates a change stream query with the received parameters. Within a change stream, 3 types of records can be received: 1. A Data record 2. A Heartbeat record 3. A Child partitions record Upon receiving #1, the function updates the watermark with the record's commit timestamp and emits the record into the output PCollection. Upon receiving #2, the function updates the watermark with the record's timestamp, but it does not emit any record into the PCollection. Upon receiving #3, the function updates the watermark with the record's timestamp and writes the new child partitions into the metadata table. These partitions will be later scheduled by the DetectNewPartitions component. Once the change stream query for the element partition finishes, it marks the partition as finished in the metadata table and terminates. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
