lostluck opened a new issue, #25454:
URL: https://github.com/apache/beam/issues/25454

   ### What would you like to happen?
   
As implemented, the Python SDK Datastore IO query path does not currently retry on
retryable RPC/HTTP errors, in particular DEADLINE_EXCEEDED.
   
   Per the [Datastore 
documentation](https://cloud.google.com/datastore/docs/concepts/errors), 
DEADLINE_EXCEEDED errors should be retried using exponential backoff. 
   
   
https://github.com/apache/beam/blob/v2.44.0/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py#L304
   
   Writes already do this; the same approach should apply to reads. 
https://github.com/apache/beam/blob/v2.44.0/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py#L397
   
   ----
   
   It does occur to me that this would need to be done carefully so as not to 
redundantly re-emit data that has already been read and processed. That may 
complicate the implementation of this resilience improvement.
   
   ### Issue Priority
   
   Priority: 3 (nice-to-have improvement)
   
   ### Issue Components
   
   - [X] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [X] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner

