nielm commented on a change in pull request #11532:
URL: https://github.com/apache/beam/pull/11532#discussion_r427485081
##########
File path:
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -1066,7 +1079,12 @@ public SpannerWriteResult expand(PCollection<MutationGroup> input) {
spec.getBatchSizeBytes(),
spec.getMaxNumMutations(),
spec.getMaxNumRows(),
- spec.getGroupingFactor(),
+ // Do not group on streaming unless explicitly set.
+ spec.getGroupingFactor()
+ .orElse(
+ input.isBounded() == IsBounded.BOUNDED
Review comment:
They can already set groupingFactor to 1 if they want.
Breaking backward compatibility is unlikely.
The default of 1000 causes OOMs in streaming pipelines with wide windows and
high throughput. When this happens, it is not always obvious that grouping is
the cause.
With smaller windows or lower throughput, a group is much less likely to
fill (groups are bounded by bundles, which are bounded by windows),
so it is unlikely that anyone ever filled a group with 1000 batches.
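
To make the proposed default concrete, here is a minimal, self-contained sketch of the selection logic in the hunk above. It does not use Beam: `Boundedness` is a hypothetical stand-in for `IsBounded`, `GroupingFactorSketch` and `effectiveGroupingFactor` are illustrative names, and the streaming default of 1 is an assumption based on the "do not group on streaming" comment. Only the batch default of 1000 comes from the discussion.

```java
import java.util.OptionalInt;

// Hypothetical stand-in for Beam's IsBounded, so the sketch compiles without Beam.
enum Boundedness { BOUNDED, UNBOUNDED }

final class GroupingFactorSketch {
  // 1000 is the default mentioned in the review comment.
  static final int BATCH_DEFAULT = 1000;
  // Assumed streaming default: a factor of 1, i.e. no grouping.
  static final int STREAMING_DEFAULT = 1;

  // An explicitly configured value always wins; otherwise the default
  // depends on whether the input is bounded (batch) or unbounded (streaming).
  static int effectiveGroupingFactor(OptionalInt configured, Boundedness boundedness) {
    return configured.orElse(
        boundedness == Boundedness.BOUNDED ? BATCH_DEFAULT : STREAMING_DEFAULT);
  }

  public static void main(String[] args) {
    System.out.println(effectiveGroupingFactor(OptionalInt.empty(), Boundedness.BOUNDED));
    System.out.println(effectiveGroupingFactor(OptionalInt.empty(), Boundedness.UNBOUNDED));
    System.out.println(effectiveGroupingFactor(OptionalInt.of(50), Boundedness.UNBOUNDED));
  }
}
```

Under these assumptions, a streaming pipeline that wants grouping still gets it by setting the factor explicitly, and batch pipelines keep their current behavior.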
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]