Eliaaazzz opened a new pull request, #37565: URL: https://github.com/apache/beam/pull/37565
[Stateful] Implement length-aware keying to minimize padding in BatchElements (Part 2/3)

Rationale

Issue: #37531 (Stateful Core - Part 2)
Part 1: https://github.com/apache/beam/pull/37532

This PR adds length-aware keying to BatchElements to improve batching efficiency for variable-length inputs (for example, NLP inference workloads). Today, stateful BatchElements uses one shared key (WithSharedKey). That causes short and long sequences to be mixed in the same batch, so padding is dictated by the longest item and compute is wasted. This PR addresses that by routing elements into length buckets before stateful batching.

What changed

1. New DoFn: WithLengthBucketKey
   * Implemented in apache_beam/transforms/util.py
   * Uses bisect-based bucket lookup
   * Routes elements into length buckets (for example, 0-16, 16-32, etc.) so similarly sized elements are batched together
   * Uses a composite key: (worker_uuid, bucket_index)
2. API updates
   * BatchElements now accepts length_fn and bucket_boundaries
   * `ModelHandler.__init__` now accepts length_fn and bucket_boundaries
   * Default boundaries: [16, 32, 64, 128, 256, 512]
3. Stateful-path integration
   * Length-aware routing is enabled automatically on the stateful path when max_batch_duration_secs is set and length_fn is provided

Testing and results

Added test_padding_efficiency_bimodal in util_test.py to represent a bimodal workload:
* 500 short elements (length 5-30)
* 500 long elements (length 200-512)

Observed result:
* Unbucketed baseline padding efficiency: about 68%
* Bucketed padding efficiency (this PR): about 77%
* Improvement: about +9 percentage points

Interpretation:
* Unbucketed path mixes short and long elements, increasing padding waste
* Bucketed path separates short/long cohorts, reducing wasted compute and memory
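For readers who want to see the idea without opening the diff, here is a minimal standalone sketch of bisect-based bucket lookup and of one way to measure padding efficiency. It does not use Beam and is not the code in this PR. The helper names (`bucket_index`, `length_bucket_key`, `chunk`, `padding_efficiency`, `simulate`) are hypothetical, as are the fixed batch size of 32 and the choice of `bisect_right` for boundary handling. Padding efficiency is assumed here to mean real tokens divided by padded tokens, with every batch padded to its longest element. Only the default boundaries and the bimodal length ranges come from the description above.

```python
import bisect
import random
from collections import defaultdict

# Default boundaries as listed in the PR description.
DEFAULT_BOUNDARIES = [16, 32, 64, 128, 256, 512]


def bucket_index(length, boundaries=DEFAULT_BOUNDARIES):
    """Return the bucket for a length using binary search.

    Assumption: a length equal to a boundary falls into the upper bucket
    (bisect_right). The PR may treat edges differently.
    """
    return bisect.bisect_right(boundaries, length)


def length_bucket_key(element, length_fn, worker_id,
                      boundaries=DEFAULT_BOUNDARIES):
    """Pair an element with a composite (worker_id, bucket_index) key."""
    return (worker_id, bucket_index(length_fn(element), boundaries)), element


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def padding_efficiency(batches):
    """Real tokens / padded tokens, padding each batch to its longest item."""
    real = sum(sum(batch) for batch in batches)
    padded = sum(len(batch) * max(batch) for batch in batches)
    return real / padded


def simulate(batch_size=32, seed=0):
    """Compare unbucketed and bucketed batching on a bimodal workload."""
    rng = random.Random(seed)
    lengths = [rng.randint(5, 30) for _ in range(500)]
    lengths += [rng.randint(200, 512) for _ in range(500)]
    rng.shuffle(lengths)

    # Baseline: batch in arrival order, mixing short and long elements.
    unbucketed = chunk(lengths, batch_size)

    # Bucketed: group by length bucket first, then batch within each group.
    by_bucket = defaultdict(list)
    for n in lengths:
        by_bucket[bucket_index(n)].append(n)
    bucketed = [
        batch
        for group in by_bucket.values()
        for batch in chunk(group, batch_size)
    ]
    return padding_efficiency(unbucketed), padding_efficiency(bucketed)


if __name__ == "__main__":
    base, bucketed = simulate()
    print(f"unbucketed: {base:.2%}, bucketed: {bucketed:.2%}")
```

This toy batching differs from the stateful BatchElements path (fixed batch size, no timers, single worker), so it is not expected to reproduce the 68% and 77% figures reported above. It only illustrates why grouping by length before batching reduces padding.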
Follow this checklist to help us incorporate your contribution quickly and easily:

- [x] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
