[ https://issues.apache.org/jira/browse/BEAM-11006?focusedWorklogId=501671&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-501671 ]
ASF GitHub Bot logged work on BEAM-11006:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Oct/20 20:52
Start Date: 16/Oct/20 20:52
Worklog Time Spent: 10m
Work Description: dhercher commented on a change in pull request #13055:
URL: https://github.com/apache/beam/pull/13055#discussion_r506715613
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
##########
@@ -2002,6 +2007,11 @@ static String getExtractDestinationUri(String extractDestinationDir) {
 return toBuilder().setFormatFunction(formatFunction).build();
 }
+ /** Formats the user's type into a {@link TableRow} to be written to an error collector. */
+ public Write<T> withFailsafeFormatFunction(SerializableFunction<T, TableRow> formatFunction) {
Review comment:
Sounds good, renaming it to `withFormatRecordOnFailureFunction`
and adding more detail to the Javadoc.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 501671)
Remaining Estimate: 334h (was: 334h 10m)
Time Spent: 2h (was: 1h 50m)
> Allow Failsafe Handling of BigQuery Streaming Writes
> ----------------------------------------------------
>
> Key: BEAM-11006
> URL: https://issues.apache.org/jira/browse/BEAM-11006
> Project: Beam
> Issue Type: Improvement
> Components: extensions-java-gcp
> Reporter: Dylan Hercher
> Priority: P2
> Labels: Clarified, bigquery, google-cloud-bigquery
> Original Estimate: 336h
> Time Spent: 2h
> Remaining Estimate: 334h
>
> Supporting a generic failsafe value (of any type) would let a dead letter
> queue retain the original source data rather than the cleaned version, so
> failed records could be more easily understood and re-processed.
>
> BigQueryIO.Write currently supports `withFormatFunction`, which applies a
> serializable function to each data point (InputT -> TableRow). Ideally
> that same source value could be converted with a separate function,
> `withFailsafeFormatFunction`, taking (InputT -> TableRow) or possibly (InputT
> -> OutputT), though backwards compatibility is harder to keep with OutputT.
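
A rough sketch of the idea in the description above, in plain Java with no Beam dependency. Everything here is hypothetical and for illustration only: `Writer`, `toCleanRow`, `toRawRow`, and the simulated insert check are made-up names, and a `Map<String, Object>` stands in for `TableRow`. It only shows the shape of the proposal (two functions over the same input element, with the second applied only when the write fails) and is not how BigQueryIO implements it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class FailsafeFormatSketch {

  /** Hypothetical stand-in for a write configured with two formatters. */
  public static final class Writer<T> {
    private final Function<T, Map<String, Object>> formatFunction;
    private final Function<T, Map<String, Object>> failsafeFormatFunction;
    public final List<Map<String, Object>> written = new ArrayList<>();
    public final List<Map<String, Object>> deadLetters = new ArrayList<>();

    public Writer(
        Function<T, Map<String, Object>> formatFunction,
        Function<T, Map<String, Object>> failsafeFormatFunction) {
      this.formatFunction = formatFunction;
      this.failsafeFormatFunction = failsafeFormatFunction;
    }

    /** Formats the element; on a failed insert, re-formats the ORIGINAL element. */
    public void write(T element) {
      Map<String, Object> row = formatFunction.apply(element);
      if (insertSucceeds(row)) {
        written.add(row);
      } else {
        // The dead letter is built from the source value, not the cleaned row.
        deadLetters.add(failsafeFormatFunction.apply(element));
      }
    }

    /** Assumption: a row without an id is rejected. Simulates a failed insert. */
    private static boolean insertSucceeds(Map<String, Object> row) {
      return row.get("id") != null;
    }
  }

  /** Cleaning formatter (InputT -> row): parses "id,name" and trims fields. */
  public static Map<String, Object> toCleanRow(String raw) {
    String[] parts = raw.split(",", 2);
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("id", parseIdOrNull(parts[0].trim()));
    row.put("name", parts.length > 1 ? parts[1].trim() : null);
    return row;
  }

  /** Failsafe formatter (InputT -> row): keeps the untouched source string. */
  public static Map<String, Object> toRawRow(String raw) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("raw", raw);
    return row;
  }

  private static Integer parseIdOrNull(String s) {
    try {
      return Integer.valueOf(s);
    } catch (NumberFormatException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    Writer<String> writer =
        new Writer<>(FailsafeFormatSketch::toCleanRow, FailsafeFormatSketch::toRawRow);
    writer.write("7, Ada ");
    writer.write("?,Bob");
    System.out.println("written: " + writer.written);
    System.out.println("dead letters: " + writer.deadLetters);
  }
}
```

In this sketch the second element fails the simulated insert, so the dead-letter list receives a row built from the raw input string rather than from the partially cleaned row.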
--
This message was sent by Atlassian Jira
(v8.3.4#803005)