Abacn commented on issue #26395: URL: https://github.com/apache/beam/issues/26395#issuecomment-1532313572
Thanks for reporting this. This `if` block has been there since `GroupIntoBatches` was first implemented (#2610), so I assume there was some reason for it. `WithShardedKey` was added some time later (still 3 years ago). I haven't looked into the details yet, but I would like to dig into it.

> "Total streaming data processed" metric was so much higher (2-4× depending on the pipeline) than the actual data

Have you tested that this is caused by the code path you pointed to? It would be nice if there is some benchmark data to share.
