[ https://issues.apache.org/jira/browse/BEAM-8933?focusedWorklogId=599914&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-599914 ]
ASF GitHub Bot logged work on BEAM-8933:
----------------------------------------
Author: ASF GitHub Bot
Created on: 20/May/21 16:48
Start Date: 20/May/21 16:48
Worklog Time Spent: 10m
Work Description: MiguelAnzoWizeline commented on a change in pull
request #14586:
URL: https://github.com/apache/beam/pull/14586#discussion_r636279367
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
##########
@@ -601,8 +602,17 @@ public static Read read() {
     @Override
     public TableRow apply(SchemaAndRecord schemaAndRecord) {
-      return BigQueryAvroUtils.convertGenericRecordToTableRow(
-          schemaAndRecord.getRecord(), schemaAndRecord.getTableSchema());
+      // TODO(BEAM-9114): Implement a function to encapsulate row conversion logic.
+      try {
+        return BigQueryAvroUtils.convertGenericRecordToTableRow(
+            schemaAndRecord.getRecord(), schemaAndRecord.getTableSchema());
+      } catch (IllegalStateException i) {
+        if (schemaAndRecord.getRow() != null) {
+          return BigQueryUtils.toTableRow().apply(schemaAndRecord.getRow());
+        }
+        throw new IllegalStateException(
+            "Record should be of instance GenericRecord (for Avro format) or of instance Row (for Arrow format), but it is not.");
+      }
Review comment:
R: @TheNeuralBit
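
The TODO in the diff above asks for a function that encapsulates the row
conversion logic. One possible shape is sketched below. It is only an
illustration: plain Java maps stand in for GenericRecord, Row and TableRow,
and the names RowConversionSketch, Input, fromAvro and fromArrow are
hypothetical, not part of Beam or of this pull request. Unlike the diff, which
catches IllegalStateException, the sketch branches on which payload is present.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins only: none of these types are the real Beam or Avro classes.
final class RowConversionSketch {

  /** Holds either an Avro-style record or an Arrow-style row; the unused one is null. */
  static final class Input {
    final Map<String, Object> avroRecord; // stand-in for GenericRecord
    final Map<String, Object> arrowRow; // stand-in for Row

    Input(Map<String, Object> avroRecord, Map<String, Object> arrowRow) {
      this.avroRecord = avroRecord;
      this.arrowRow = arrowRow;
    }
  }

  /** Chooses the converter from whichever payload is present. */
  static Map<String, Object> toTableRow(Input input) {
    if (input.avroRecord != null) {
      return fromAvro(input.avroRecord);
    }
    if (input.arrowRow != null) {
      return fromArrow(input.arrowRow);
    }
    throw new IllegalStateException(
        "Record should be of instance GenericRecord (for Avro format) or of instance Row"
            + " (for Arrow format), but it is not.");
  }

  // Placeholder conversion: a real implementation would map Avro types to TableRow values.
  static Map<String, Object> fromAvro(Map<String, Object> record) {
    return new LinkedHashMap<>(record);
  }

  // Placeholder conversion: a real implementation would map Row fields to TableRow values.
  static Map<String, Object> fromArrow(Map<String, Object> row) {
    return new LinkedHashMap<>(row);
  }
}
```

With a helper of this kind, the apply method would reduce to a single call, and
the error message would be raised only when neither payload is available.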
Issue Time Tracking
-------------------
Worklog Id: (was: 599914)
Time Spent: 36h (was: 35h 50m)
> BigQuery IO should support reading Arrow format over Storage API
> ----------------------------------------------------------------
>
> Key: BEAM-8933
> URL: https://issues.apache.org/jira/browse/BEAM-8933
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Reporter: Kirill Kozlov
> Assignee: Miguel Anzo
> Priority: P3
> Time Spent: 36h
> Remaining Estimate: 0h
>
> BigQueryIO currently uses the Avro format for reading and writing.
> We should add a config option to BigQueryIO that specifies which format to
> use, Arrow or Avro, with Avro as the default.