Thanks Robert and Danny!

@Robert - I'll definitely look into the weighted BatchElements approach. That sounds like the right direction for handling variable-length inputs such as token sequences.

@Danny - That would be great! I'll tag you (@damccorm) once I have the PR ready. I'm starting the implementation now.
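To make sure I'm on the right track, here is a rough, self-contained sketch of the batching behavior I have in mind. This is purely illustrative: TokenBudgetBatchFn, token_count_fn, and max_batch_tokens are placeholder names for this email, not existing BatchElements parameters, and the real implementation would presumably extend BatchElements itself along the lines Robert suggested.

# Illustrative sketch only -- not existing Beam API. Groups elements into
# batches capped by a total-token budget instead of an element count.
import apache_beam as beam
from apache_beam.transforms.window import GlobalWindow
from apache_beam.utils.timestamp import MAX_TIMESTAMP
from apache_beam.utils.windowed_value import WindowedValue


class TokenBudgetBatchFn(beam.DoFn):
  """Buffers elements and emits a batch once a token budget would be exceeded."""

  def __init__(self, token_count_fn, max_batch_tokens):
    self._token_count_fn = token_count_fn  # e.g. lambda s: len(s.split())
    self._max_batch_tokens = max_batch_tokens

  def start_bundle(self):
    self._batch = []
    self._batch_tokens = 0

  def process(self, element):
    cost = self._token_count_fn(element)
    # Flush the current batch if adding this element would exceed the budget.
    if self._batch and self._batch_tokens + cost > self._max_batch_tokens:
      yield self._batch
      self._batch = []
      self._batch_tokens = 0
    self._batch.append(element)
    self._batch_tokens += cost

  def finish_bundle(self):
    # Emit whatever is left over at the end of the bundle.
    if self._batch:
      yield WindowedValue(self._batch, MAX_TIMESTAMP, [GlobalWindow()])
      self._batch = []


# Usage (the lambda is a stand-in for a real tokenizer's length function):
#   batches = texts | beam.ParDo(
#       TokenBudgetBatchFn(token_count_fn=lambda s: len(s.split()),
#                          max_batch_tokens=512))

Like BatchElements, a DoFn written this way only batches within a bundle; the design document will cover how the token budget interacts with the existing BatchElements heuristics and with RunInference's ModelHandler.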
On 2026/01/26 17:06:25 Danny McCormick via dev wrote:
> +1 - I think this is a good idea and had been considering something
> similar. I'm happy to help with reviews here, feel free to tag me (my
> GitHub handle is damccorm).
>
> Thanks,
> Danny
>
> On Mon, Jan 26, 2026 at 12:00 PM Robert Bradshaw via dev <
> [email protected]> wrote:
>
> > +1, a weighted BatchElements would help this case a lot.
> >
> > On Sun, Jan 25, 2026 at 1:23 AM Elia LIU <[email protected]> wrote:
> >
> >> Dear Beam Community,
> >>
> >> My name is Elia, and I am a final-year student interested in contributing
> >> to Apache Beam's AI/ML infrastructure for GSoC 2026.
> >>
> >> I have been exploring RunInference for variable-length workloads,
> >> specifically within NLP and LLMs. I noticed that the current batching
> >> strategy in BatchElements is primarily count-based, which can lead to
> >> inefficient padding, compute waste on GPU cycles, and unpredictable memory
> >> usage (OOMs) when processing variable-length sequences.
> >>
> >> I propose introducing Content-Aware Batching (or Token-Based Batching) to
> >> the ML transform. This would allow batching based on a computational cost
> >> metric, such as total tokens, rather than element count. I intend to
> >> integrate this with dynamic padding in ModelHandler.
> >>
> >> I have opened a Feature Request with a conceptual API design for further
> >> context here: [Feature Request]: RunInference: Content-Aware Dynamic
> >> Batching for NLP/LLM Workloads · Issue #37414 · apache/beam
> >> <https://github.com/apache/beam/issues/37414>
> >>
> >> I am planning to draft a design document for this feature and would
> >> appreciate any feedback on this approach or information regarding existing
> >> efforts in this direction.
> >>
> >> Best regards,
> >>
> >> Elia
> >>
> >
>
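One more note on the second half of the quoted proposal: by dynamic padding in ModelHandler I mean padding each batch to the longest sequence in that batch rather than to a fixed model maximum. A tiny illustration below, where pad_batch and pad_id are made-up names for this email, not part of any ModelHandler today.

from typing import List

def pad_batch(token_id_batch: List[List[int]], pad_id: int = 0) -> List[List[int]]:
  # Pad to the longest sequence in this batch, not a global max length.
  max_len = max(len(seq) for seq in token_id_batch)
  return [seq + [pad_id] * (max_len - len(seq)) for seq in token_id_batch]

# A batch with sequence lengths 3 and 5 is padded to 5, not to e.g. 512.
print(pad_batch([[101, 7592, 102], [101, 7592, 2088, 999, 102]]))

Combined with a token budget at batching time, the padded size of each batch stays roughly bounded, which is what should make GPU memory usage more predictable.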
