[GitHub] [beam] nbali commented on issue #26395: [Bug]: Possibly unnecessary prefetch during GroupIntoBatches

via GitHub Mon, 15 May 2023 10:33:33 -0700


nbali commented on issue #26395:
URL: https://github.com/apache/beam/issues/26395#issuecomment-1548268836


   I haven't had the time yet (I will try to make time in the coming days), 
but your example has a theoretical maximum of 100k elements/sec and ran for 
900s, yet the totals are only 28M and 54M, which means the actual rate was 
30-60k/sec. The hashing split it into 10 keys, so a single key received about 
3-6k msg/sec. The windowing essentially closed every batch after 10s, so a 
window for a key received 30-60k elements. That indicates a totally different 
amount of prefetching (even if both are using the master version), yet the 
data amount is the same. Seems odd to me. Do you have the input/output stats 
of the GIB transform? Also, delays of over a minute in both cases with 10s 
windowing? It seems CPU limited. I mean, isn't this essentially 
single-threaded?
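
As a rough sanity check of the arithmetic above, here is a minimal sketch. It 
assumes the elements are spread evenly across the 10 keys and over the whole 
900s run, which the real pipeline may not do; the function and constant names 
are only illustrative and are not taken from the example pipeline.

```python
# Back-of-the-envelope check of the figures quoted in the comment.
# Assumption: elements are distributed evenly across keys and over time.
RUN_SECONDS = 900        # duration of the run
NUM_KEYS = 10            # number of keys produced by the hashing
WINDOW_SECONDS = 10      # window length that closes each batch
MAX_RATE = 100_000       # theoretical maximum, elements/sec


def per_key_window_load(total_elements):
    """Return (overall rate, per-key rate, elements per key per window)."""
    rate = total_elements / RUN_SECONDS
    per_key_rate = rate / NUM_KEYS
    per_window = per_key_rate * WINDOW_SECONDS
    return rate, per_key_rate, per_window


for total in (28_000_000, 54_000_000):
    rate, per_key_rate, per_window = per_key_window_load(total)
    print(
        f"{total:>11,} elements: {rate:,.0f}/sec overall "
        f"({rate / MAX_RATE:.0%} of max), {per_key_rate:,.0f}/sec per key, "
        f"{per_window:,.0f} per key per window"
    )
```

Under those assumptions the two runs come out at roughly 31k and 60k 
elements/sec overall, so about 3-6k/sec per key and 30-60k elements per key 
per 10s window, both well below the 100k/sec theoretical maximum.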


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
