Hi Gaurav, many thanks for your response. I'm currently using retry policies, but imagine the following scenario:
I'm trying to insert an existing field: even if we retry, it will still fail, but I'll never be able to detect that within the pipeline, as getFailedInserts()
https://beam.apache.org/documentation/sdks/javadoc/2.4.0/org/apache/beam/sdk/io/gcp/bigquery/WriteResult.html#getFailedInserts--
only contains the TableRows that failed, not the reason.

Adding the error as well shouldn't be very hard, as I understand it, because BigQueryServicesImpl.insertAll() actually knows about it:
https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L750

I think I would even volunteer to work on it if the community feels it makes sense as well.

Regards

On Fri, Apr 6, 2018 at 1:28 AM Gaurav Thakur <[email protected]> wrote:

> Hi Carlos,
>
> Would an insert retry policy help you?
> Please see this,
> https://beam.apache.org/documentation/sdks/javadoc/2.1.0/org/apache/beam/sdk/io/gcp/bigquery/InsertRetryPolicy.Context.html
>
> Thanks, Gaurav
>
> On Fri, Apr 6, 2018 at 8:13 AM, Pablo Estrada <[email protected]> wrote:
>
>> I'm adding Cham as he might be knowledgeable about BQ IO, or he might be
>> able to redirect to someone else.
>> Cham, do you have guidance for Carlos here?
>> Thanks
>> -P.
>>
>>
>> On Mon, Apr 2, 2018 at 11:08 AM Carlos Alonso <[email protected]>
>> wrote:
>>
>>> And... where could I catch that exception?
>>>
>>> Thanks!
>>> On Mon, 2 Apr 2018 at 16:58, Ted Yu <[email protected]> wrote:
>>>
>>>> Wouldn't the following code give you information about failed
>>>> insertions (around line 790 in BigQueryServicesImpl)?
>>>>
>>>> if (!allErrors.isEmpty()) {
>>>>   throw new IOException("Insert failed: " + allErrors);
>>>> }
>>>>
>>>> Cheers
>>>>
>>>> On Mon, Apr 2, 2018 at 7:16 AM, Carlos Alonso <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi everyone!!
>>>>>
>>>>> I was wondering if there's any way to get the error why an insert
>>>>> (streaming) failed.
>>>>> Looking at the code I think there's currently no way to
>>>>> do that, as the BigQueryServicesImpl insertAll seems to discard the errors
>>>>> and just add the failed TableRow instances into the failedInserts list.
>>>>>
>>>>> It would be very nice to have an "enriched" TableRow returned instead
>>>>> that contains the error information for further processing (in our use
>>>>> case we're saving the failed ones into a different table for further
>>>>> analysis)
>>>>>
>>>>> Could this be added as an enhancement or similar Issue in GH/Jira? Any
>>>>> other ideas?
>>>>>
>>>>> Thanks!
>>>>>
>>>>
>>>> --
>> Got feedback? go/pabloem-feedback
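
For illustration only, the "enriched" failed insert discussed in this thread could look roughly like the plain-Java sketch below. It deliberately uses no Beam or BigQuery classes: FailedInsert, toErrorRecord() and the insert_errors column are hypothetical names, a Map stands in for a TableRow, and a list of strings stands in for the per-row error details that insertAll() sees. It is a sketch of the idea under those assumptions, not the SDK's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pairing of a failed row with the reasons it failed; not a Beam class. */
final class FailedInsert {
  // Stand-in for the TableRow that could not be inserted.
  private final Map<String, Object> row;
  // Stand-in for the error details reported for that row.
  private final List<String> errors;

  FailedInsert(Map<String, Object> row, List<String> errors) {
    // Defensive copies so later changes by the caller do not leak in.
    this.row = Collections.unmodifiableMap(new LinkedHashMap<>(row));
    this.errors = Collections.unmodifiableList(new ArrayList<>(errors));
  }

  Map<String, Object> getRow() {
    return row;
  }

  List<String> getErrors() {
    return errors;
  }

  /**
   * Builds a record for a separate "failed rows" table: the original fields
   * plus one extra column (hypothetical name) holding the joined error messages.
   */
  Map<String, Object> toErrorRecord() {
    Map<String, Object> record = new LinkedHashMap<>(row);
    record.put("insert_errors", String.join("; ", errors));
    return record;
  }
}
```

With something of this shape, the step that currently writes the failed TableRows to a different table could write toErrorRecord() instead, so the failure reason travels with the row.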
