Sahith Nallapareddy created BEAM-7755:
-----------------------------------------

             Summary: BigQuery Repeated Records do not seem to work
                 Key: BEAM-7755
                 URL: https://issues.apache.org/jira/browse/BEAM-7755
             Project: Beam
          Issue Type: Bug
          Components: io-java-avro, io-java-gcp
    Affects Versions: 2.13.0, 2.12.0
            Reporter: Sahith Nallapareddy


When translating BigQuery rows to Beam rows, specifically with the
BigQueryUtils.toBeamRow(record, beamSchema) method, REPEATED RECORD fields cause
an error. The cause seems to be that array elements are assumed to be primitive
types, but these arrays have elements of type ROW:
{noformat}
Caused by: java.lang.RuntimeException: ROW is not primitive type.
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroPrimitiveTypes(BigQueryUtils.java:467)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroArray(BigQueryUtils.java:427)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroFormat(BigQueryUtils.java:373)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamRow$2(BigQueryUtils.java:222)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:223)
        at com.spotify.data.sql.RowSource.lambda$bigquery$120a5f9f$1(RowSource.java:55)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:242)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:235)
        at org.apache.beam.sdk.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:597)
        at org.apache.beam.sdk.io.BlockBasedSource$BlockBasedReader.readNextRecord(BlockBasedSource.java:209)
        at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:484)
        at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
        at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:249)
        at org.apache.beam.runners.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:601)
{noformat}
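
To illustrate the behaviour I would expect, here is a minimal, self-contained sketch of a converter that recurses into array elements instead of assuming they are primitive. It is not Beam's code: the Kind enum, the Type class, and the convert method are hypothetical stand-ins for the schema field types and the conversion in BigQueryUtils, and plain Java maps and lists stand in for Avro records and Beam rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types; none of this is the Beam API.
class RepeatedRecordSketch {
  enum Kind { INT64, STRING, ARRAY, ROW }

  static final class Type {
    final Kind kind;
    final Type element;             // element type, set only for ARRAY
    final Map<String, Type> fields; // field types, set only for ROW

    private Type(Kind kind, Type element, Map<String, Type> fields) {
      this.kind = kind;
      this.element = element;
      this.fields = fields;
    }

    static Type primitive(Kind kind) { return new Type(kind, null, null); }
    static Type arrayOf(Type element) { return new Type(Kind.ARRAY, element, null); }
    static Type rowOf(Map<String, Type> fields) { return new Type(Kind.ROW, null, fields); }
  }

  // Converts a value according to its type. ARRAY delegates each element back
  // to convert(), so an array of ROW values is handled like any other array.
  static Object convert(Type type, Object value) {
    if (value == null) {
      return null;
    }
    switch (type.kind) {
      case ARRAY: {
        List<Object> out = new ArrayList<>();
        for (Object v : (List<?>) value) {
          out.add(convert(type.element, v));
        }
        return out;
      }
      case ROW: {
        Map<?, ?> in = (Map<?, ?>) value;
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Type> f : type.fields.entrySet()) {
          out.put(f.getKey(), convert(f.getValue(), in.get(f.getKey())));
        }
        return out;
      }
      case INT64:
        return ((Number) value).longValue();
      case STRING:
        return value.toString();
      default:
        throw new RuntimeException(type.kind + " is not supported");
    }
  }
}
```

In this sketch a REPEATED RECORD corresponds to Type.arrayOf(Type.rowOf(...)). Because the ARRAY branch calls convert() on each element rather than a primitive-only helper, the ROW elements are converted field by field instead of raising an exception.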



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
