ianb-pomelo opened a new issue, #30167:
URL: https://github.com/apache/beam/issues/30167

   ### What would you like to happen?
   
   One of the known limitations of the Spanner change stream source is 
that it can't be drained 
[1](https://cloud.google.com/spanner/docs/change-streams/use-dataflow#draining).
 Is there a way to support draining for this connector? 
   
   Our use case is a job that consumes change stream values, but the 
structure of this job changes frequently. To handle this, we try to do 
in-place updates and, if those fail, drain and start a new job. This works with 
Pub/Sub sources, but to get around the fact that change streams can't be 
drained, we have an intermediate job that converts the Spanner changes into 
Pub/Sub messages, which the changing job then consumes. However, this has 
caused a large increase in latency: the time from commit to change stream read 
is pretty consistently 200ms, but adding the Pub/Sub layer increases it to 
~5s.
   
   ### Issue Priority
   
   Priority: 2 (default / most feature requests should be filed as P2)
   
   ### Issue Components
   
   - [ ] Component: Python SDK
   - [X] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam YAML
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner

