[
https://issues.apache.org/jira/browse/BEAM-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eugene Kirpichov updated BEAM-2870:
-----------------------------------
Fix Version/s: 2.3.0
> BQ Partitioned Table Write Fails When Destination has Partition Decorator
> -------------------------------------------------------------------------
>
> Key: BEAM-2870
> URL: https://issues.apache.org/jira/browse/BEAM-2870
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.2.0
> Environment: Dataflow Runner, Streaming, 10 x (n1-highmem-8 & 500gb
> SSD)
> Reporter: Steven Jon Anderson
> Assignee: Reuven Lax
> Labels: bigquery, dataflow, google, google-cloud-bigquery,
> google-dataflow
> Fix For: 2.2.0, 2.3.0
>
>
> Dataflow Job ID:
> https://console.cloud.google.com/dataflow/job/2017-09-08_23_03_14-14637186041605198816
> Tagging [~reuvenlax] as I believe he built the time partitioning integration
> that was merged into master.
> *Background*
> Our production pipeline ingests millions of events per day and routes events
> into our clients' numerous tables. To keep costs down, all of our tables are
> partitioned. However, this requires that we create the tables before we allow
> events to process as creating partitioned tables isn't supported in 2.1.0.
> We've been waiting for [~reuvenlax]'s partitioned table write feature
> ([#3663|https://github.com/apache/beam/pull/3663]) to be merged into master
> for some time now, as it'll allow us to launch our client platforms much,
> much faster. Today we got around to testing the 2.2.0 nightly and discovered
> this bug.
> *Issue*
> Our pipeline writes to a table using a partition decorator. Writing with a
> decorator to a partitioned table that already exists succeeds. Writing
> without a decorator to a partitioned table that doesn't exist yet also
> succeeds. *However, writing with a decorator to a partitioned table that
> doesn't exist yet fails*.
> *Example Implementation*
> {code:java}
> BigQueryIO.writeTableRows()
>     .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>     .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>     .withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
>     .to(new DynamicDestinations<TableRow, String>() {
>         @Override
>         public String getDestination(ValueInSingleWindow<TableRow> element) {
>             return "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
>         }
>         @Override
>         public TableDestination getTable(String destination) {
>             TimePartitioning DAY_PARTITION = new TimePartitioning().setType("DAY");
>             return new TableDestination(destination, null, DAY_PARTITION);
>         }
>         @Override
>         public TableSchema getSchema(String destination) {
>             return TABLE_SCHEMA;
>         }
>     })
> {code}
> *Relevant Logs & Errors in StackDriver*
> {code:none}
> 23:06:26.790
> Trying to create BigQuery table: PROJECT_ID:DATASET_ID.TABLE_ID$20170902
> 23:06:26.873
> Invalid table ID \"TABLE_ID$20170902\". Table IDs must be alphanumeric (plus
> underscores) and must be at most 1024 characters long. Also, Table decorators
> cannot be used.
> {code}
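
The log above suggests the table-creation request is sent with the decorated ID (`TABLE_ID$20170902`), which the table ID rule quoted in the error rejects. One possible workaround, sketched below in plain Java, is to strip everything from the `$` onward before creating the table and keep the decorated spec only for the write. This is an illustration only, not the change made in Beam: the class `PartitionDecorators` and both helper methods are hypothetical names, and the validity check assumes "alphanumeric" means ASCII letters and digits.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Beam or the BigQuery client library.
final class PartitionDecorators {
    // Rule quoted in the error message: alphanumeric plus underscores,
    // at most 1024 characters. ASCII-only is an assumption.
    private static final Pattern VALID_TABLE_ID =
        Pattern.compile("[A-Za-z0-9_]{1,1024}");

    // Returns the spec without a trailing "$..." partition decorator,
    // or the spec unchanged if it has no decorator.
    static String stripPartitionDecorator(String tableSpec) {
        int dollar = tableSpec.lastIndexOf('$');
        return dollar < 0 ? tableSpec : tableSpec.substring(0, dollar);
    }

    // Checks a bare table ID (no project or dataset prefix) against the rule.
    static boolean isValidTableId(String tableId) {
        return VALID_TABLE_ID.matcher(tableId).matches();
    }

    public static void main(String[] args) {
        String spec = "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
        String tableId = spec.substring(spec.lastIndexOf('.') + 1);

        System.out.println(isValidTableId(tableId));   // prints "false"
        System.out.println(stripPartitionDecorator(spec));
        // prints "PROJECT_ID:DATASET_ID.TABLE_ID"
        System.out.println(isValidTableId(stripPartitionDecorator(tableId)));
        // prints "true"
    }
}
```

Under these assumptions, the stripped spec would be used when creating the table, and the original decorated spec would still be used as the write destination so rows land in the intended partition.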
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)