Ganeshsivakumar commented on PR #38280:
URL: https://github.com/apache/beam/pull/38280#issuecomment-4342037940

   > but why? There is GroupIntoBatches transform already.
   
   Hi @stankiewicz, GroupIntoBatches is keyed: it batches elements per key. 
It takes an input of [KV.of("key", "value"), ...] and returns 
[KV.of("key", [batch of values associated with only that key]), ...]. 
   
   BatchElements is a generic batching transform that buffers a 
```PCollection<T>``` and returns a ```PCollection<[batch of T]>```. It is 
primarily useful for ML inference, such as the RemoteInference transform, where 
inputs are typically individual elements and we need to batch them before 
running inference for efficiency. With GroupIntoBatches we would need to key 
the elements before batching, which is an unnecessary step for that use case. 
Plus, BatchElements can determine the batch size dynamically at runtime. 
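
   To make the difference in output shape concrete, here is a minimal 
plain-Java sketch. It does not use Beam: `BatchingSketch`, `groupIntoBatches`, 
and `batchElements` are hypothetical helpers written only for illustration, and 
they use a fixed batch size over an in-memory list, whereas the real transforms 
work on PCollections and BatchElements can adjust the batch size at runtime. 

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical helpers, not Beam APIs.
class BatchingSketch {

    // Keyed batching (the GroupIntoBatches shape): every batch holds values of one key.
    static <K, V> List<Map.Entry<K, List<V>>> groupIntoBatches(
            List<Map.Entry<K, V>> input, int batchSize) {
        Map<K, List<V>> open = new LinkedHashMap<>();
        List<Map.Entry<K, List<V>>> out = new ArrayList<>();
        for (Map.Entry<K, V> kv : input) {
            List<V> buffer = open.computeIfAbsent(kv.getKey(), k -> new ArrayList<>());
            buffer.add(kv.getValue());
            if (buffer.size() == batchSize) {
                // Emit a full batch for this key and start a new one.
                List<V> batch = new ArrayList<>(buffer);
                out.add(Map.entry(kv.getKey(), batch));
                buffer.clear();
            }
        }
        // Flush the partially filled batches that remain for each key.
        for (Map.Entry<K, List<V>> e : open.entrySet()) {
            if (!e.getValue().isEmpty()) {
                List<V> batch = new ArrayList<>(e.getValue());
                out.add(Map.entry(e.getKey(), batch));
            }
        }
        return out;
    }

    // Generic batching (the BatchElements shape): no key is needed.
    static <T> List<List<T>> batchElements(List<T> input, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < input.size(); i += batchSize) {
            int end = Math.min(i + batchSize, input.size());
            out.add(new ArrayList<>(input.subList(i, end)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> keyed = List.of(
                Map.entry("a", "1"), Map.entry("b", "2"), Map.entry("a", "3"));
        System.out.println(groupIntoBatches(keyed, 2));
        System.out.println(batchElements(List.of("1", "2", "3"), 2));
    }
}
```

   In the keyed variant the caller has to invent a key before batching, and 
elements with different keys never share a batch. The generic variant batches 
whatever arrives, which is the shape an inference call usually wants. 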
   
   This is also a direct Java port of the Python 
[BatchElements](https://beam.apache.org/documentation/transforms/python/aggregation/batchelements/) transform.
  


