[ 
https://issues.apache.org/jira/browse/BEAM-8257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169009#comment-17169009
 ] 

Beam JIRA Bot commented on BEAM-8257:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it 
has been labeled "stale-P2". If this issue is still affecting you, we care! 
Please comment and remove the label. Otherwise, in 14 days the issue will be 
moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed 
explanation of what these priorities mean.


> BigQueryIO - only first day table can be created, despite having 
> CreateDisposition.CREATE_IF_NEEDED
> ---------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-8257
>                 URL: https://issues.apache.org/jira/browse/BEAM-8257
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.21.0
>         Environment:  Dataflow streaming pipeline 
>            Reporter: D Bodych
>            Priority: P2
>              Labels: stale-P2
>
> I have a dataflow job processing data from pub/sub defined like this:
> *read from pub/sub -> process (my function) -> group into day windows -> 
> write to BQ*
> I'm using *Write.Method.FILE_LOADS* because of bounded input.
> My job works fine, processing lots of GBs of data but it fails and tries to 
> retry forever when it gets to create another table. The job is meant to run 
> continuously and create day tables on its own, it does fine on the first few 
> ones but then gives me indefinitely:
> {code:java}
> Processing stuck in step 
> write-bq/BatchLoads/SinglePartitionWriteTables/ParMultiDo(WriteTables) for at 
> least 05h30m00s without outputting or completing in state finish{code}
> Before this happens it also throws: 
> {code:java}
> Load job <job_id> failed, will retry: {"errorResult": {"message":"Not found: 
> Table <name_of_table> was not found in location US","reason":"notFound"}
> {code}
> It is indeed a right error because this table doesn't exists. Problem is that 
> the job should create it on its own because of defined option 
> *CreateDisposition.CREATE_IF_NEEDED*.
> The number of day tables that it creates correctly without a problem depends 
> on number of workers. It seems that when some worker creates one table its 
> *CreateDisposition* changes to *CREATE_NEVER* causing the problem, but it's 
> only my guess.
> The similar problem was reported here but without any definite answer:
>  
> https://issues.apache.org/jira/browse/BEAM-3772?focusedCommentId=16387609&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16387609
> ProcessElement definition here seems to give some clues but I cannot really 
> say how it works with multiple workers: 
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L138]
> I use 2.15.0 Apache SDK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to