lukecwik commented on code in PR #22953:
URL: https://github.com/apache/beam/pull/22953#discussion_r964007253

##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BatchLoads.java:
##########
@@ -113,6 +113,9 @@
   // If user triggering is supplied, we will trigger the file write after this many records are
   // written.
   static final int FILE_TRIGGERING_RECORD_COUNT = 500000;
+  // If user triggering is supplied, we will trigger the file write after this many bytes are
+  // written.
+  static final long FILE_TRIGGERING_BYTE_COUNT = 100 * (1L << 20); // 100MiB

Review Comment:
   The default comes from this constant: https://www.javadoc.io/static/com.google.cloud.bigdataoss/util/1.9.17/com/google/cloud/hadoop/util/AsyncWriteChannelOptions.html#UPLOAD_CHUNK_SIZE_DEFAULT

   The user can override the default using this pipeline option: https://github.com/apache/beam/blob/b2a6f46fb21709cc1927ce1950a38a922dce0a35/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java#L91
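For reference, a minimal standalone sketch of how the two thresholds in the hunk above could interact. The constant names and values are copied from the diff. The class name `FileTriggerSketch`, the helper `shouldTrigger`, and the "whichever limit is reached first" rule are assumptions made for illustration; this is not the code in BatchLoads.java.

```java
public class FileTriggerSketch {
    // Values copied from the diff under review.
    static final int FILE_TRIGGERING_RECORD_COUNT = 500000;
    static final long FILE_TRIGGERING_BYTE_COUNT = 100 * (1L << 20); // 100 MiB

    // Hypothetical helper (not part of Beam): assumes a file write is
    // triggered as soon as either the record limit or the byte limit is reached.
    static boolean shouldTrigger(long recordsWritten, long bytesWritten) {
        return recordsWritten >= FILE_TRIGGERING_RECORD_COUNT
            || bytesWritten >= FILE_TRIGGERING_BYTE_COUNT;
    }

    public static void main(String[] args) {
        // 1L << 20 is 1 MiB (1,048,576 bytes), so the byte limit is 104,857,600.
        System.out.println("byte limit = " + FILE_TRIGGERING_BYTE_COUNT);
        // Few records, but large ones: the byte limit is reached first.
        System.out.println(shouldTrigger(1_000, 200L << 20));
        // Both counters still below their limits: no trigger yet.
        System.out.println(shouldTrigger(1_000, 1L << 20));
    }
}
```

The `1L` literal keeps the shift and the multiplication in `long` arithmetic, which matches the `long` type declared for the constant in the diff.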
