nbali commented on issue #26395: URL: https://github.com/apache/beam/issues/26395#issuecomment-1831002767
@Abacn I see how that would cause a similar pattern of unexpectedly large shuffled data volume, and I also see how that change fixes the issue. What I can't see is where the tested scenarios use an `Iterable` that would cause this. In `GroupIntoBatchesDoFn` the obvious candidate would be `BagState`, but that is what this test was about, and we clearly didn't read it. Looking at the other states (`CombiningState`, `ValueState`), nothing really stands out to me. Or is it that the framework regularly called `registerByteObserver()` on the `BagState`'s `Iterable` content, so although we never read it explicitly, it was read as an unintended side effect, which caused the costs?
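
To make the suspected mechanism concrete, here is a minimal sketch of how a size estimate could end up reading a lazily backed `Iterable` that user code never touches. It does not use Beam: `CountingIterable` and `estimateEncodedSize` are hypothetical stand-ins (for state contents fetched on iteration and for a byte-size observer, respectively), and whether the runner actually behaves this way is exactly the open question above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class LazyReadDemo {

    /** Hypothetical stand-in for lazily fetched state contents; not a Beam class. */
    static final class CountingIterable implements Iterable<String> {
        private final List<String> backing;
        private int fetches = 0;

        CountingIterable(List<String> backing) {
            this.backing = backing;
        }

        @Override
        public Iterator<String> iterator() {
            // Each call simulates one full fetch from the state backend.
            fetches++;
            return backing.iterator();
        }

        int fetches() {
            return fetches;
        }
    }

    /** Hypothetical size estimator: it has to walk every element to sum the bytes. */
    static long estimateEncodedSize(Iterable<String> values) {
        long total = 0;
        for (String value : values) {
            total += value.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    public static void main(String[] args) {
        CountingIterable bag = new CountingIterable(Arrays.asList("abc", "de"));

        // "User code" never iterates the bag, so nothing has been fetched yet.
        System.out.println("fetches before estimate: " + bag.fetches());

        // A framework-side size estimate iterates it anyway.
        long size = estimateEncodedSize(bag);
        System.out.println("estimated bytes: " + size);
        System.out.println("fetches after estimate: " + bag.fetches());
    }
}
```

If something like this happens on every element or bundle, the read cost would show up even though the `DoFn` itself never calls `read()` on the state.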
