[ 
https://issues.apache.org/jira/browse/BEAM-12164?focusedWorklogId=717725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-717725
 ]

ASF GitHub Bot logged work on BEAM-12164:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 31/Jan/22 00:33
            Start Date: 31/Jan/22 00:33
    Worklog Time Spent: 10m 
      Work Description: thiagotnunes opened a new pull request #16655:
URL: https://github.com/apache/beam/pull/16655


   The original DetectNewPartitions algorithm is susceptible to failures 
because it is not idempotent: it produces side effects on every attempt. 
Specifically, it marks the partitions as SCHEDULED in the Spanner database 
and outputs them. If a bundle commit fails, the retry will not pick up the 
partitions that are already SCHEDULED.
   
   This PR changes the algorithm to always schedule partitions whose 
created-at timestamp is greater than the one saved in the DetectNewPartitions 
restriction. When scheduling partitions, this SDF also claims their 
created-at timestamps, advancing the saved timestamp. If a bundle commit 
fails, the restriction timestamp is not saved, so the partitions in the 
bundle are picked up again regardless of their state.
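
   The retry behaviour described above can be illustrated with a minimal 
in-memory sketch. It is not the code from this PR: the class, field, and 
method names are hypothetical, the metadata table is modelled as a plain 
list, and the bundle commit is reduced to a boolean flag. It only shows 
that selecting by created-at timestamp, rather than by state, makes a 
retried bundle see the same partitions again.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, in-memory model of retry-safe partition detection.
// All names are hypothetical; this is not the Beam or Spanner API.
class DetectNewPartitionsSketch {

  // A row of the partition metadata table, reduced to what the sketch needs.
  static final class Partition {
    final String token;
    final long createdAt;
    String state = "CREATED";

    Partition(String token, long createdAt) {
      this.token = token;
      this.createdAt = createdAt;
    }
  }

  final List<Partition> metadataTable = new ArrayList<>();

  // Timestamp saved in the restriction; it advances only on a successful commit.
  long committedTimestamp = -1L;

  // Processes one bundle and returns the partition tokens it scheduled.
  // On a commit failure the caller is assumed to discard the returned output.
  List<String> processBundle(boolean commitSucceeds) {
    long claimed = committedTimestamp;
    List<String> scheduled = new ArrayList<>();
    for (Partition p : metadataTable) {
      // Selection depends only on created-at, never on the partition state.
      if (p.createdAt > committedTimestamp) {
        p.state = "SCHEDULED"; // side effect that survives a failed commit
        scheduled.add(p.token);
        claimed = Math.max(claimed, p.createdAt); // claim the created-at timestamp
      }
    }
    if (commitSucceeds) {
      committedTimestamp = claimed; // the restriction is saved with the bundle
    }
    return scheduled;
  }
}
```

   In this sketch, calling processBundle(false) and then processBundle(true) 
schedules the same partitions both times, even though the first call already 
marked them as SCHEDULED. A further call after the successful commit 
schedules nothing, because the saved timestamp has advanced.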
   
   More information can be seen at: 
https://docs.google.com/document/d/1IQAOqLmGuIaOJc55NmfUckM4rDCXHAmxKNuRg6Ae07U/edit#heading=h.q3e0xrkg85ay


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 717725)
    Time Spent: 8h 40m  (was: 8.5h)

> SpannerIO Change Stream Connector
> ---------------------------------
>
>                 Key: BEAM-12164
>                 URL: https://issues.apache.org/jira/browse/BEAM-12164
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Thiago Nunes
>            Priority: P3
>          Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> We would like to augment the existing Google Cloud SpannerIO connector 
> ([https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java])
>  with support for Spanner Change Streams (CDC). CDC support is currently being 
> implemented in Spanner and will be exposed through a gRPC API. We will use 
> this API to create a new SpannerIO.readChangeStream(...) implementation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
