[
https://issues.apache.org/jira/browse/BEAM-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931600#comment-16931600
]
Jose Puertos edited comment on BEAM-3772 at 9/17/19 7:31 PM:
-------------------------------------------------------------
I'm having the same issue with 2.12.0 and 2.15.0. Looking at the BigQuery
jobs, it seems that the jobs that try to upload partitions after the first
day use CREATE_NEVER, even though the code sets WRITE_APPEND and
CREATE_IF_NEEDED:
.withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
!image-2019-09-17-12-01-42-764.png|width=487,height=116!
Something worth mentioning is that I'm using dynamic partitions. Checking the
code of BatchLoads.java, it seems that expandTriggered uses a WritePartitions
that doesn't pass the CreateDisposition the way WriteTables does.
Line 191 of
[WriteTables|https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java]
has the code that apparently changes it if it's not the first partition.
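To make the suspected behavior concrete, here is a minimal sketch of the logic as I understand it from the description above. It is not the Beam source: the class, the enum, and the method name are hypothetical stand-ins, and the rule "anything after the first partition gets CREATE_NEVER" is an assumption based on what the BigQuery jobs show.

```java
public class DispositionSketch {
    // Hypothetical stand-in for BigQueryIO.Write.CreateDisposition; not the Beam enum.
    public enum CreateDisposition { CREATE_IF_NEEDED, CREATE_NEVER }

    // Assumption: only the first partition keeps the configured disposition;
    // every later partition is submitted with CREATE_NEVER.
    public static CreateDisposition effectiveCreateDisposition(
            int partitionIndex, CreateDisposition configured) {
        return partitionIndex == 0 ? configured : CreateDisposition.CREATE_NEVER;
    }

    public static void main(String[] args) {
        CreateDisposition configured = CreateDisposition.CREATE_IF_NEEDED;
        // First partition: the configured value is used.
        System.out.println(effectiveCreateDisposition(0, configured));
        // Later partition: the configured value is overridden.
        System.out.println(effectiveCreateDisposition(1, configured));
    }
}
```

If the decision really is made this way, a dynamic destination that first shows up in a later partition would be loaded with CREATE_NEVER while its table does not exist yet, which would match the failure seen here.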
> BigQueryIO - Can't use DynamicDestination with CREATE_IF_NEEDED for unbounded
> PCollection and FILE_LOADS
> --------------------------------------------------------------------------------------------------------
>
> Key: BEAM-3772
> URL: https://issues.apache.org/jira/browse/BEAM-3772
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.2.0, 2.3.0
> Environment: Dataflow streaming pipeline
> Reporter: Benjamin BENOIST
> Assignee: Reuven Lax
> Priority: Major
> Attachments: bigquery-fail.png, bigquery-success.png,
> image-2019-09-17-12-01-42-764.png
>
>
> My workflow: Kafka -> Dataflow streaming -> BigQuery
> Given that low latency isn't important in my case, I use FILE_LOADS to
> reduce costs. I'm using _BigQueryIO.Write_ with a _DynamicDestination_,
> which is a table with the current hour as a suffix.
> This _BigQueryIO.Write_ is configured like this:
> {code:java}
> .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
> .withMethod(Method.FILE_LOADS)
> .withTriggeringFrequency(triggeringFrequency)
> .withNumFileShards(100)
> {code}
> The first table is successfully created and is written to. But then the
> following tables are never created and I get these exceptions:
> {code:java}
> (99e5cd8c66414e7a): java.lang.RuntimeException: Failed to create load job
> with id prefix
> 5047f71312a94bf3a42ee5d67feede75_5295fbf25e1a7534f85e25dcaa9f4986_00001_00023,
> reached max retries: 3, last failed load job: {
> "configuration" : {
> "load" : {
> "createDisposition" : "CREATE_NEVER",
> "destinationTable" : {
> "datasetId" : "dev_mydataset",
> "projectId" : "myproject-id",
> "tableId" : "mytable_20180302_16"
> },
> {code}
> The _CreateDisposition_ used is _CREATE_NEVER_, contrary to the
> _CREATE_IF_NEEDED_ that was specified.
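For readers unfamiliar with the setup, the hourly table naming in the description can be sketched roughly as below. This is only an illustration: the class and method names are hypothetical, and both the `yyyyMMdd_HH` pattern and the use of UTC are assumptions inferred from the table id `mytable_20180302_16` in the failed load job, not taken from the reporter's pipeline.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class HourlyTableName {
    // Assumed suffix format (date, underscore, hour of day), evaluated in UTC.
    private static final DateTimeFormatter HOUR_SUFFIX =
            DateTimeFormatter.ofPattern("yyyyMMdd_HH").withZone(ZoneOffset.UTC);

    // Builds a table id such as "mytable_20180302_16" for the hour containing the timestamp.
    public static String tableIdFor(String baseName, Instant timestamp) {
        return baseName + "_" + HOUR_SUFFIX.format(timestamp);
    }

    public static void main(String[] args) {
        Instant ts = Instant.parse("2018-03-02T16:05:00Z");
        System.out.println(tableIdFor("mytable", ts)); // prints "mytable_20180302_16"
    }
}
```

With a scheme like this, a new destination table is needed every hour, so any load job after the first that runs with CREATE_NEVER will target a table that has not been created.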