[ 
https://issues.apache.org/jira/browse/BEAM-11006?focusedWorklogId=498778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498778
 ]

ASF GitHub Bot logged work on BEAM-11006:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Oct/20 22:45
            Start Date: 09/Oct/20 22:45
    Worklog Time Spent: 10m 
      Work Description: dhercher commented on a change in pull request #13055:
URL: https://github.com/apache/beam/pull/13055#discussion_r502701806



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
##########
@@ -2002,6 +2006,11 @@ static String getExtractDestinationUri(String extractDestinationDir) {
       return toBuilder().setFormatFunction(formatFunction).build();
     }
 
+    /** Formats the user's type into a {@link TableRow} to be written to an error collector. */
+     public Write<T> withFailsafeFormatFunction(SerializableFunction<T, TableRow> formatFunction) {

Review comment:
   Currently the design is for the ErrorContainer to return TableRows (or 
BigQueryError<TableRow>).  My worry is that making this generic would not be 
backwards compatible, though I agree it would be more valuable.
   
   I think this question is important, and it is my only big open question 
about this design.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

            Worklog Id:     (was: 498778)
    Remaining Estimate: 334h 50m  (was: 335h)
            Time Spent: 1h 10m  (was: 1h)

> Allow Failsafe Handling of BigQuery Streaming Writes
> ----------------------------------------------------
>
>                 Key: BEAM-11006
>                 URL: https://issues.apache.org/jira/browse/BEAM-11006
>             Project: Beam
>          Issue Type: Improvement
>          Components: extensions-java-gcp
>            Reporter: Dylan Hercher
>            Priority: P2
>              Labels: bigquery, google-cloud-bigquery
>   Original Estimate: 336h
>          Time Spent: 1h 10m
>  Remaining Estimate: 334h 50m
>
> Allowing a generic failsafe value (of any type) would let a dead letter 
> queue retain the original source data rather than the cleaned version, 
> making failed records easier to understand and re-process.
>  
> BigQueryIO.Write currently supports `withFormatFunction`, which applies a 
> serializable function to each element (InputT -> TableRow).  Ideally that 
> same source value could be converted with a separate function, 
> `withFailsafeFormatFunction`, taking (InputT -> TableRow) or possibly 
> (InputT -> OutputT), though keeping OutputT backwards compatible is more 
> difficult.
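
The idea in the description can be sketched in plain Java, without any Beam
classes. Everything below is hypothetical and only illustrates the intent:
`FailsafeSketch`, `Result`, `write`, `toRow` and `toFailsafeRow` are made-up
names, a `Map<String, Object>` stands in for `TableRow`, and the
`insertSucceeds` predicate stands in for the outcome of a streaming insert.
It is not the implementation proposed in the pull request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch; a Map stands in for TableRow and no Beam API is used. */
public class FailsafeSketch {

  /** Rows that were accepted, plus failsafe rows for elements whose insert failed. */
  public static final class Result {
    public final List<Map<String, Object>> written = new ArrayList<>();
    public final List<Map<String, Object>> deadLetter = new ArrayList<>();
  }

  /**
   * Formats each element with formatFunction and attempts the (simulated) insert.
   * On failure, the ORIGINAL element is formatted with failsafeFormatFunction, so
   * the dead letter queue keeps the source data instead of the cleaned row.
   */
  public static <T> Result write(
      List<T> elements,
      Function<T, Map<String, Object>> formatFunction,
      Function<T, Map<String, Object>> failsafeFormatFunction,
      Predicate<Map<String, Object>> insertSucceeds) {
    Result result = new Result();
    for (T element : elements) {
      Map<String, Object> row = formatFunction.apply(element);
      if (insertSucceeds.test(row)) {
        result.written.add(row);
      } else {
        result.deadLetter.add(failsafeFormatFunction.apply(element));
      }
    }
    return result;
  }

  /** The "cleaned" conversion: parses "name,age" and drops an unparseable age. */
  public static Map<String, Object> toRow(String line) {
    String[] parts = line.split(",", 2);
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("name", parts[0].trim());
    Integer age = null;
    if (parts.length > 1) {
      try {
        age = Integer.valueOf(parts[1].trim());
      } catch (NumberFormatException e) {
        // Leave age as null; the cleaned row loses the original text here.
      }
    }
    row.put("age", age);
    return row;
  }

  /** The failsafe conversion: keeps the raw input untouched. */
  public static Map<String, Object> toFailsafeRow(String line) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("raw", line);
    return row;
  }

  public static void main(String[] args) {
    Result result =
        write(
            Arrays.asList("alice,30", "bob,unknown"),
            FailsafeSketch::toRow,
            FailsafeSketch::toFailsafeRow,
            // Simulated insert: rejects rows that lack a required "age" value.
            row -> row.get("age") != null);
    System.out.println("written:    " + result.written);
    System.out.println("deadLetter: " + result.deadLetter);
  }
}
```

In this sketch the rejected element reaches the dead letter list as its raw
input string rather than as the partially cleaned row, which is the behavior
the issue asks for. Fixing the failsafe output to a single row type mirrors
the (InputT -> TableRow) option; the (InputT -> OutputT) variant would need an
extra type parameter on the result, which is where the backwards
compatibility concern comes from.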



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
