[ https://issues.apache.org/jira/browse/BEAM-7755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ismaël Mejía updated BEAM-7755:
-------------------------------
Status: Open (was: Triage Needed)
> BigQuery Repeated Records do not seem to work
> ---------------------------------------------
>
> Key: BEAM-7755
> URL: https://issues.apache.org/jira/browse/BEAM-7755
> Project: Beam
> Issue Type: Bug
> Components: io-java-avro, io-java-gcp
> Affects Versions: 2.12.0, 2.13.0
> Reporter: Sahith Nallapareddy
> Priority: Major
>
> When translating BigQuery rows to Beam rows, specifically with the
> BigQueryUtils.toBeamRow(record, beamSchema) method, REPEATED RECORD fields
> cause an error. The cause seems to be that Avro arrays are assumed to contain
> only primitive types, but these arrays have ROW elements:
> {noformat}
> Caused by: java.lang.RuntimeException: ROW is not primitive type.
> at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroPrimitiveTypes(BigQueryUtils.java:467)
> at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroArray(BigQueryUtils.java:427)
> at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroFormat(BigQueryUtils.java:373)
> at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamRow$2(BigQueryUtils.java:222)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:223)
> at com.spotify.data.sql.RowSource.lambda$bigquery$120a5f9f$1(RowSource.java:55)
> at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:242)
> at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:235)
> at org.apache.beam.sdk.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:597)
> at org.apache.beam.sdk.io.BlockBasedSource$BlockBasedReader.readNextRecord(BlockBasedSource.java:209)
> at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:484)
> at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
> at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:249)
> at org.apache.beam.runners.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:601)
> {noformat}
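
The failure described in the report can be illustrated with a small, self-contained sketch. It does not use the Beam API: the `Kind` and `FieldType` types and every method name below are hypothetical stand-ins, and the recursive `convert` method is only one possible way to handle arrays of records, not the fix adopted in Beam. The sketch assumes that the array converter passes each element straight to a primitive-only converter, which is what the stack trace suggests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RepeatedRecordSketch {

    /** Hypothetical stand-in for a schema field kind; not Beam's Schema.FieldType. */
    enum Kind { STRING, INT64, ROW, ARRAY }

    static final class FieldType {
        final Kind kind;
        final FieldType element;              // element type, used by ARRAY only
        final Map<String, FieldType> fields;  // nested fields, used by ROW only

        private FieldType(Kind kind, FieldType element, Map<String, FieldType> fields) {
            this.kind = kind;
            this.element = element;
            this.fields = fields;
        }

        static FieldType primitive(Kind kind) { return new FieldType(kind, null, null); }
        static FieldType array(FieldType element) { return new FieldType(Kind.ARRAY, element, null); }
        static FieldType row(Map<String, FieldType> fields) { return new FieldType(Kind.ROW, null, fields); }
    }

    /** Handles primitive kinds only and rejects everything else, as in the report. */
    static Object convertPrimitive(FieldType type, Object value) {
        switch (type.kind) {
            case STRING: return value.toString();
            case INT64: return ((Number) value).longValue();
            default: throw new RuntimeException(type.kind + " is not primitive type.");
        }
    }

    /** Array conversion that assumes primitive elements (the assumed buggy shape). */
    static List<Object> convertArrayPrimitiveOnly(FieldType element, List<?> values) {
        List<Object> out = new ArrayList<>();
        for (Object v : values) {
            out.add(convertPrimitive(element, v));
        }
        return out;
    }

    /** Possible alternative: dispatch on the kind and recurse into arrays and rows. */
    static Object convert(FieldType type, Object value) {
        switch (type.kind) {
            case ARRAY: {
                List<Object> out = new ArrayList<>();
                for (Object v : (List<?>) value) {
                    out.add(convert(type.element, v));
                }
                return out;
            }
            case ROW: {
                // Records are modeled as maps here; null values are not handled.
                Map<?, ?> record = (Map<?, ?>) value;
                Map<String, Object> out = new LinkedHashMap<>();
                for (Map.Entry<String, FieldType> f : type.fields.entrySet()) {
                    out.put(f.getKey(), convert(f.getValue(), record.get(f.getKey())));
                }
                return out;
            }
            default:
                return convertPrimitive(type, value);
        }
    }

    public static void main(String[] args) {
        Map<String, FieldType> fields = new LinkedHashMap<>();
        fields.put("name", FieldType.primitive(Kind.STRING));
        fields.put("count", FieldType.primitive(Kind.INT64));
        FieldType rowType = FieldType.row(fields);

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("name", "a");
        record.put("count", 1);
        List<Object> repeated = new ArrayList<>();
        repeated.add(record);

        try {
            convertArrayPrimitiveOnly(rowType, repeated);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(convert(FieldType.array(rowType), repeated));
    }
}
```

In this sketch the primitive-only path throws as soon as the element kind is ROW, while the recursive path converts each nested record field by field. Whether the real converter should recurse the same way depends on how BigQueryUtils represents nested Avro records, which this sketch does not model.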