Heejong Lee created BEAM-6443:
---------------------------------
Summary: decrease the number of threads for BigQuery streaming
insertAll
Key: BEAM-6443
URL: https://issues.apache.org/jira/browse/BEAM-6443
Project: Beam
Issue Type: Improvement
Components: io-java-gcp
Reporter: Heejong Lee
Assignee: Heejong Lee
When inserting a large number of very small elements into BigQuery via
streaming insertAll, BigQueryIO triggers many quota-exceeded errors. This
means that 1) BigQueryIO puts unnecessary load on the BigQuery API layer by
sending requests too fast, and 2) the log file grows very large because the
same error message is repeated. Currently we use 50 shards for writing data
into BigQuery, and in each bundle 20-30 futures are executed simultaneously on
an unbounded thread pool. It would be worth investigating whether just a
single thread pool is sufficient for running concurrent insertAll calls.
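
To make the proposal concrete, here is a minimal sketch of how capping the
pool size limits the number of in-flight insert calls. It is not BigQueryIO's
actual code: the class BoundedInsertSketch, the fakeInsertAll method (a stand-in
for the real insertAll RPC), the 5 ms sleep, and the pool size of 4 are all
assumptions chosen for illustration. Executors.newCachedThreadPool is used here
only as one example of an unbounded pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedInsertSketch {
  // Tracks how many simulated inserts are running at once, and the peak seen.
  static final AtomicInteger inFlight = new AtomicInteger();
  static final AtomicInteger peak = new AtomicInteger();

  // Hypothetical stand-in for one streaming insertAll request.
  // It only sleeps; it does not call any BigQuery or Beam API.
  static void fakeInsertAll(int batchId) throws InterruptedException {
    int now = inFlight.incrementAndGet();
    peak.accumulateAndGet(now, Math::max);
    Thread.sleep(5);
    inFlight.decrementAndGet();
  }

  // Submits `batches` simulated inserts to `pool`, waits for all of them,
  // shuts the pool down, and returns the peak number of concurrent inserts.
  static int runBatches(ExecutorService pool, int batches) {
    inFlight.set(0);
    peak.set(0);
    List<Future<?>> futures = new ArrayList<>();
    try {
      for (int i = 0; i < batches; i++) {
        final int id = i;
        futures.add(pool.submit(() -> {
          fakeInsertAll(id);
          return null;
        }));
      }
      for (Future<?> f : futures) {
        f.get(); // propagate any failure from the simulated insert
      }
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) {
    // Unbounded pool: every pending future may get its own thread.
    int unbounded = runBatches(Executors.newCachedThreadPool(), 30);
    // Bounded pool: at most 4 inserts run at the same time (4 is arbitrary).
    int bounded = runBatches(Executors.newFixedThreadPool(4), 30);
    System.out.println("peak concurrency, unbounded pool: " + unbounded);
    System.out.println("peak concurrency, fixed pool of 4: " + bounded);
  }
}
```

With a fixed pool the peak concurrency can never exceed the pool size, so the
request rate per worker is capped regardless of how many futures a bundle
creates. Whether a cap of one thread or a small fixed number gives acceptable
throughput is exactly what this issue proposes to investigate.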