[
https://issues.apache.org/jira/browse/BEAM-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16340173#comment-16340173
]
Eugene Kirpichov commented on BEAM-3501:
----------------------------------------
I tried running a similar pipeline myself:
* Writing to a table with partition decorators using DynamicDestinations, where
the table does not exist. It succeeds if I specify the TimePartitioning in the
TableDestination returned from getTable(); if I don't, it fails with a
different error (asking me to specify TimePartitioning).
* Same, but the table exists and is unpartitioned. Then I get yet another
error: "Cannot add storage to a non-partitioned table with a partition
reference:..."
So far I'm unable to reproduce your issue. I'd really appreciate more details
here.
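For reference, the partition decorators in my tests are the table ID followed
by "$" and the UTC day as yyyyMMdd (a "_" separator would instead name a
separate table rather than a partition). Below is a minimal sketch of building
such an ID using only java.time; the class and method names are hypothetical
and this is not code from Beam or from the pipeline in question.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, for illustration only.
class PartitionDecorator {
    // Day-granularity formatter pinned to UTC.
    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    // Returns "<table>$<yyyyMMdd>" for the UTC day containing the timestamp.
    static String withDecorator(String table, Instant timestamp) {
        return table + "$" + DAY.format(timestamp);
    }

    public static void main(String[] args) {
        // prints "events$19730522"
        System.out.println(
            withDecorator("events", Instant.parse("1973-05-22T10:15:30Z")));
    }
}
```

A string like this would go into the table ID of the TableReference; whether
the write then succeeds still depends on the TimePartitioning setup described
above.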
One issue I see is that you're calling .expand() explicitly. This is incorrect
and can lead to all sorts of issues: expand() is an implementation detail of
every transform, and users MUST use .apply() instead.
> BigQuery Partitioned table creation/write fails when destination has
> partition decorator
> ----------------------------------------------------------------------------------------
>
> Key: BEAM-3501
> URL: https://issues.apache.org/jira/browse/BEAM-3501
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-gcp
> Reporter: Darshan Mehta
> Assignee: Chamikara Jayalath
> Priority: Major
>
> Following is the code that writes to BigQuery:
> {code:java}
> BigQueryIO.writeTableRows()
> .to(destination)
> .withCreateDisposition(CREATE_IF_NEEDED)
> .withWriteDisposition(WRITE_APPEND)
> .withSchema(tableSchema)
> .expand(tableRows);{code}
>
> Here's the destination's implementation:
> {code:java}
> public TableDestination apply(ValueInSingleWindow<TableRow> input) {
> String partition = timestampExtractor.apply(input.getValue())
> .toString(DateTimeFormat.forPattern("yyyyMMdd").withZoneUTC());
> TableReference tableReference = new TableReference();
> tableReference.setDatasetId(dataset);
> tableReference.setProjectId(projectId);
> tableReference.setTableId(String.format("%s_%s", table, partition));
> log.debug("Will write to BigQuery table: %s", tableReference);
> return new TableDestination(tableReference, null);
> }{code}
>
> When the dataflow tries to write to this table, I see the following message:
> {code:java}
> "errors" : [ {
> "domain" : "global",
> "message" : "Cannot read partition information from a table that is not partitioned: <project_id>:<dataset>.<table>$19730522",
> "reason" : "invalid"
> } ]{code}
> So, it looks like it's not creating partitioned tables in the first place?
> Apache Beam version: 2.2.0