ahmedabu98 commented on PR #27434:
URL: https://github.com/apache/beam/pull/27434#issuecomment-1629658908

   >That's awesome! curious about what was wrong and how it is fixed 
   
   Thanks! Yeah, it took a while to figure out because our fake testing service 
doesn't propagate an error. I ran the same pipeline many times against a real 
BigQuery table and hit the same behavior (copy job stuck retrying) in one run. 
BigQuery returned a "table already exists" error, and the copy job kept being 
retried. 
   
   Write disposition defaults to `WRITE_EMPTY`, so I set it to `WRITE_APPEND` 
and reran many times without running into the same behavior. 
   We're supposed to already be covering this here (note this test is a 
streaming pipeline): 
https://github.com/apache/beam/blob/4c66866aa9544d1796c7c3880192cb57d2a8dcc0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteRename.java#L187-L196
   i.e., we always set it to `WRITE_APPEND` after the first trigger of copy jobs.
   
   Somehow this doesn't always work, and the user-specified disposition 
continues to be used.
   
   I've run the tests in a few different ways, and the common denominator seems 
to be that this happens when the table is created beforehand (as opposed to 
letting the pipeline create it).
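
   For reference, the intended behavior described above (user-specified 
disposition on the first trigger only, `WRITE_APPEND` afterwards) can be sketched 
roughly as below. This is a minimal standalone illustration, not the actual 
`WriteRename` code: the class `DispositionSketch`, its enum, and both method 
names are hypothetical, and the pane index is assumed to start at 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical names, not the real WriteRename implementation. */
final class DispositionSketch {

  /** Stand-in for the three BigQuery write dispositions. */
  enum WriteDisposition { WRITE_EMPTY, WRITE_APPEND, WRITE_TRUNCATE }

  /**
   * Picks the disposition for the copy jobs fired by one trigger pane.
   * Assumption: pane 0 is the first firing and honors the user's choice;
   * every later pane appends to the table the first pane already wrote.
   */
  static WriteDisposition effectiveDisposition(WriteDisposition userSpecified, long paneIndex) {
    return paneIndex == 0 ? userSpecified : WriteDisposition.WRITE_APPEND;
  }

  /** Convenience helper: dispositions for panes 0..paneCount-1, in order. */
  static List<WriteDisposition> dispositionsFor(WriteDisposition userSpecified, int paneCount) {
    List<WriteDisposition> result = new ArrayList<>();
    for (int pane = 0; pane < paneCount; pane++) {
      result.add(effectiveDisposition(userSpecified, pane));
    }
    return result;
  }
}
```

   If a later pane kept the user-specified `WRITE_EMPTY` instead, its copy job 
would presumably be rejected because the destination already holds data from 
the first pane, which would be consistent with the stuck-retrying symptom.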

