Sahith Nallapareddy created BEAM-7755:
-----------------------------------------
Summary: BigQuery Repeated Records do not seem to work
Key: BEAM-7755
URL: https://issues.apache.org/jira/browse/BEAM-7755
Project: Beam
Issue Type: Bug
Components: io-java-avro, io-java-gcp
Affects Versions: 2.13.0, 2.12.0
Reporter: Sahith Nallapareddy
When translating BigQuery rows to beam rows, specifically using theĀ
BigQueryUtils.toBeamRow(record, beamSchema) method, REPEATEDĀ RECORDS causes an
error. This seems to be caused that arrays are thought to only have primitive
types but these are arrays with a ROW type:
{noformat}
Caused by: java.lang.RuntimeException: ROW is not primitive type.
at
org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroPrimitiveTypes(BigQueryUtils.java:467)
at
org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroArray(BigQueryUtils.java:427)
at
org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroFormat(BigQueryUtils.java:373)
at
org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamRow$2(BigQueryUtils.java:222)
at
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at
java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at
java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at
java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at
org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:223)
at
com.spotify.data.sql.RowSource.lambda$bigquery$120a5f9f$1(RowSource.java:55)
at
org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:242)
at
org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:235)
at
org.apache.beam.sdk.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:597)
at
org.apache.beam.sdk.io.BlockBasedSource$BlockBasedReader.readNextRecord(BlockBasedSource.java:209)
at
org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:484)
at
org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
at
org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:249)
at
org.apache.beam.runners.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:601)
{noformat}
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)