[ 
https://issues.apache.org/jira/browse/BEAM-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150546#comment-17150546
 ] 

Sreekanth Nutulapati edited comment on BEAM-876 at 7/2/20, 7:29 PM:
--------------------------------------------------------------------

[~ziel] / [~monicaPC] - I am using Apache Beam 2.21.0, and I cannot get the 
table alteration (adding a new column to an existing table) to work with 
SchemaUpdateOption.ALLOW_FIELD_ADDITION. The record is persisted in the 
BigQuery table, but the new column is not added to it, and I do not see any 
errors in the Apache Beam job logs. My code is below. One difference I see from 
the test in your commits is that the test uses the writeTableRows() API, whereas 
my code uses write(); however, writeTableRows() internally calls write(). Please 
let me know if you see any issue, or advise if I should open a new Jira ticket. 
Also, this ticket is still Open even though the pull request is merged :).

*****Code*******

WriteResult writeResult = myPCollection.apply("Writing my Events to BigQuery",
    BigQueryIO.<MyEvent>write()
        .to(new TableReference()
            .setProjectId(<projectId>)
            .setDatasetId(<dataSetId>)
            .setTableId(<tableId>))
        .withSchema(mySchema)
        .withFormatFunction((myEvent) -> {
            TableRow row = new TableRow();
            row.set("column1", myEvent.getColumn1());
            row.set("column2", myEvent.getColumn2());
            ....
            return row;
        })
        .optimizedWrites()
        .ignoreUnknownValues()
        .withExtendedErrorInfo()
        .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withSchemaUpdateOptions(Set.of(SchemaUpdateOption.ALLOW_FIELD_ADDITION)));

***************
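
For reference, the effect that ALLOW_FIELD_ADDITION is expected to have can be 
stated as a plain diff between the table's current field names and the field 
names in the schema passed to withSchema(). The following is a minimal, 
self-contained sketch of that check. SchemaDiff, addedFields, isAdditiveOnly, 
and "column3" are hypothetical names used only for illustration; they are not 
part of Beam or the BigQuery client, and the comparison is a simplified 
case-sensitive match on top-level field names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Beam or the BigQuery client. */
final class SchemaDiff {

    private SchemaDiff() {}

    /** Returns the names present in {@code updated} but absent from {@code existing}, in order. */
    static List<String> addedFields(List<String> existing, List<String> updated) {
        Set<String> known = new HashSet<>(existing);
        List<String> added = new ArrayList<>();
        for (String name : updated) {
            if (!known.contains(name)) {
                added.add(name);
            }
        }
        return added;
    }

    /** True when every existing field is still present, i.e. the change only adds fields. */
    static boolean isAdditiveOnly(List<String> existing, List<String> updated) {
        return updated.containsAll(existing);
    }

    public static void main(String[] args) {
        // Assumed example data: the table has two columns, the pipeline schema has three.
        List<String> tableFields = List.of("column1", "column2");
        List<String> pipelineFields = List.of("column1", "column2", "column3");

        System.out.println("Fields to add: " + addedFields(tableFields, pipelineFields));
        System.out.println("Additive only: " + isAdditiveOnly(tableFields, pipelineFields));
    }
}
```

If addedFields returns an empty list for the real table and mySchema, the 
pipeline schema does not actually carry a new column, which would rule out the 
schema update option as the cause.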


> Support schemaUpdateOption in BigQueryIO
> ----------------------------------------
>
>                 Key: BEAM-876
>                 URL: https://issues.apache.org/jira/browse/BEAM-876
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>            Reporter: Eugene Kirpichov
>            Assignee: canaan silberberg
>            Priority: P2
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> BigQuery recently added support for updating the schema as a side effect of 
> the load job.
> Here is the relevant API method in JobConfigurationLoad: 
> https://developers.google.com/resources/api-libraries/documentation/bigquery/v2/java/latest/com/google/api/services/bigquery/model/JobConfigurationLoad.html#setSchemaUpdateOptions(java.util.List)
> BigQueryIO should support this too. See user request for this: 
> http://stackoverflow.com/questions/40333245/is-it-possible-to-update-schema-while-doing-a-load-into-an-existing-bigquery-tab



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
