[ 
https://issues.apache.org/jira/browse/BEAM-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Pocreau updated BEAM-12863:
----------------------------------
    Description: 
When using the PubsubAvroToBigQuery Dataflow template, I noticed 
[this issue|https://github.com/GoogleCloudPlatform/DataflowTemplates/issues/287].
This seems to be related to the way 
[toBeamRow|https://github.com/apache/beam/blob/v2.32.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java#L593]
 handles a TableRow with a List containing null values.


The error trace has this path (I only added the relevant ones):

a. com.google.cloud.teleport.v2.transforms.BigQueryConverters$TableRowToGenericRecordFn.apply(BigQueryConverters.java:548)

b. org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:580)

c. org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRowFieldValue(BigQueryUtils.java:593)

d. org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamValue(BigQueryUtils.java:641)

At c., the "bqValue" object is checked for null; however, it appears that 
some of the values are of List type (for BigQuery Record type). The List 
object itself is therefore validated to be non-null, but the elements of the 
List are not.

At d., the method is executed recursively to process all the elements of the 
List object; however, it seems that some elements are null, so this method 
throws a NullPointerException.

The toBeamValue method should probably not call toBeamValue recursively, but 
call toBeamRowFieldValue instead.
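
To illustrate the suggestion above, here is a minimal, self-contained sketch. 
It is not the BigQueryUtils code: the class NullableListSketch and the methods 
convertNullable and convert are hypothetical stand-ins for 
toBeamRowFieldValue and toBeamValue, assuming only the behavior described in 
this report (a null check in the outer method, recursion over List elements 
in the inner one).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullableListSketch {

  // Hypothetical null-safe entry point (plays the role of toBeamRowFieldValue):
  // a null input is passed through instead of being converted.
  static Object convertNullable(Object value) {
    if (value == null) {
      return null;
    }
    return convert(value);
  }

  // Hypothetical conversion that assumes a non-null input
  // (plays the role of toBeamValue).
  static Object convert(Object value) {
    if (value instanceof List) {
      List<Object> out = new ArrayList<>();
      for (Object element : (List<?>) value) {
        // Recursing through the null-safe wrapper keeps null elements as null.
        // Calling convert(element) here instead would throw a
        // NullPointerException on the first null element.
        out.add(convertNullable(element));
      }
      return out;
    }
    // Stand-in for the scalar conversions; dereferences the value.
    return value.toString();
  }

  public static void main(String[] args) {
    List<Object> input = Arrays.asList("a", null, Arrays.asList("b", null));
    System.out.println(convertNullable(input)); // prints [a, null, [b, null]]
  }
}
```

With the recursive call routed through the null-safe wrapper, null elements 
survive at any nesting depth. Whether null elements should be preserved or 
rejected in the real method is a separate design decision.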

  was:
The error trace has this path (I only added the relevant ones): 
 
a. com.google.cloud.teleport.v2.transforms.BigQueryConverters$TableRowToGenericRecordFn.apply(BigQueryConverters.java:548)

b. org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:580)

c. org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRowFieldValue(BigQueryUtils.java:593)

d. org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamValue(BigQueryUtils.java:641)
 
 
At c., the "bqValue" object is checked for null; however, it appears that 
some of the values are of List type (for BigQuery Record type). The List 
object itself is therefore validated to be non-null, but the elements of the 
List are not.

At d., the method is executed recursively to process all the elements of the 
List object; however, it seems that some elements are null, so this method 
throws a NullPointerException.

The toBeamValue method should probably not call toBeamValue recursively, but 
call toBeamRowFieldValue instead.


> BigQueryUtils doesn't process List or Map with nullables
> --------------------------------------------------------
>
>                 Key: BEAM-12863
>                 URL: https://issues.apache.org/jira/browse/BEAM-12863
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.32.0
>            Reporter: Thomas Pocreau
>            Assignee: Matthew Ouyang
>            Priority: P3



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
