damnMeddlingKid opened a new issue, #31373:
URL: https://github.com/apache/beam/issues/31373
### What happened?
We are attempting to use the STORAGE_WRITE_API with exactly-once guarantees
in our pipelines running on Runner V2. Our configuration uses dynamic
destinations and auto sharding, as detailed below:
```scala
BigQueryIO
  .write[TableRowWithTableId]
  .to(new DynamicDestinationImpl())
  .optimizedWrites()
  .withFormatFunction(_.tableRow)
  .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API)
  .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
  .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
  .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
  .withExtendedErrorInfo()
  .withAutoSharding()
  .withTriggeringFrequency(Duration.standardMinutes(15))
```
### Issue Encountered
When we run our pipeline on Runner V2 with the above BigQueryIO
configuration, we get the following error:
```
Error translating pipeline. Runner V2 doesn't support the following SDK
features: [Use STORAGE_WRITE_API].
```
The pipeline executes successfully when we modify the configuration to use a
static number of write streams (`withNumStorageWriteApiStreams(40)`) instead of
auto sharding.
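For reference, the working variant can be sketched as below. It mirrors the configuration above and only swaps `withAutoSharding()` for `withNumStorageWriteApiStreams(40)`. The object and method names (`StaticStreamWrite`, `build`) are made up for illustration, and the destinations and format function are passed in as parameters because the definitions of `DynamicDestinationImpl` and `TableRowWithTableId` are not part of this report. It shows what was changed and is not a statement that the workaround is safe.

```scala
import com.google.api.services.bigquery.model.TableRow
import org.apache.beam.sdk.io.gcp.bigquery.{BigQueryIO, DynamicDestinations, InsertRetryPolicy}
import org.apache.beam.sdk.transforms.SerializableFunction
import org.joda.time.Duration

// Hypothetical helper, named for illustration only.
object StaticStreamWrite {

  // Builds the same write transform as in the report, but with a fixed
  // number of Storage Write API streams instead of auto sharding.
  def build[T](
      destinations: DynamicDestinations[T, _],
      format: SerializableFunction[T, TableRow]
  ): BigQueryIO.Write[T] =
    BigQueryIO
      .write[T]()
      .to(destinations)
      .optimizedWrites()
      .withFormatFunction(format)
      .withMethod(BigQueryIO.Write.Method.STORAGE_WRITE_API)
      .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
      .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
      .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
      .withExtendedErrorInfo()
      // Replaces .withAutoSharding(); 40 is the value we used, not a recommendation.
      .withNumStorageWriteApiStreams(40)
      .withTriggeringFrequency(Duration.standardMinutes(15))
}
```

In our pipeline this corresponds to `StaticStreamWrite.build(new DynamicDestinationImpl(), _.tableRow)`.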
While looking for references on this issue, I found
https://partnerissuetracker.corp.google.com/issues/271105510, which claims that
auto sharding should work on Runner V2.
### Questions
1. Is it safe to use a static number of write streams as a workaround for
using the `STORAGE_WRITE_API` on Runner V2?
2. What is the current state of `STORAGE_WRITE_API` support on Runner V2?
I'm struggling to find an issue or documentation on this.
3. Is it possible to support auto sharding for BigQueryIO with the
`STORAGE_WRITE_API` on Runner V2?
### Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
### Issue Components
- [ ] Component: Python SDK
- [X] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner