tomaslink commented on issue #38017:
URL: https://github.com/apache/beam/issues/38017#issuecomment-4244572703

   > Hey [@tomaslink](https://github.com/tomaslink), the BQ sink takes the copy
   > job route when it accumulates either 10K files or 15TB of data (check
   > [here](https://github.com/apache/beam/blob/e1f02622f364005de4236f8024355fc4236c6e97/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L67-L71)).
   > 
   > If the time interval is just 60 seconds and it still triggers copy jobs, my
   > guess is that we're just creating too many files. Can you try reducing
   > `max_files_per_bundle` (default is 20)? That will reduce the number of writers
   > created per bundle and hopefully get you below the 10K limit.
   
   @ahmedabu98 Hi! Thanks for the suggestion; I'll try it.
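
   For reference, the threshold logic described above can be sketched in plain Python. This is only an illustration of the decision (constant values taken from the comment and the linked `bigquery_file_loads.py` lines; the exact constant names and the helper function are mine, not Beam's API):

   ```python
   # Limits per the linked bigquery_file_loads.py (names paraphrased):
   MAX_FILES_PER_LOAD_JOB = 10 * 1000        # ~10K source files per load job
   MAX_BYTES_PER_LOAD_JOB = 15 * (1 << 40)   # ~15 TB per load job

   def needs_copy_jobs(num_files: int, total_bytes: int) -> bool:
       """Hypothetical helper: True when the accumulated temp files exceed
       either limit, pushing the sink down the multi-load-job + copy-job
       route instead of a single load job."""
       return (num_files > MAX_FILES_PER_LOAD_JOB
               or total_bytes > MAX_BYTES_PER_LOAD_JOB)

   # e.g. 12,000 small files in one window already exceeds the file limit,
   # even though the total size (here 1 GiB) is far below 15 TB:
   print(needs_copy_jobs(12_000, 1 << 30))  # True
   print(needs_copy_jobs(500, 1 << 30))     # False
   ```

   Per the suggestion above, lowering `max_files_per_bundle` on the sink shrinks the number of files produced per bundle, which is what brings `num_files` back under the 10K limit.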


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
