Eliaaazzz opened a new pull request, #37565:
URL: https://github.com/apache/beam/pull/37565

   [Stateful] Implement length-aware keying to minimize padding in 
BatchElements (Part 2/3)
   
   Rationale
   
   Issue: #37531 (Stateful Core - Part 2)
   Part 1: https://github.com/apache/beam/pull/37532
   
   This PR adds length-aware keying to BatchElements to improve batching 
efficiency for variable-length inputs (for example, NLP inference workloads).
   
   Today, stateful BatchElements uses one shared key (WithSharedKey). That 
causes short and long sequences to be mixed in the same batch, so padding is 
dictated by the longest item and compute is wasted. This PR addresses that by 
routing elements into length buckets before stateful batching.
   
   What changed
   
   1. New DoFn: WithLengthBucketKey
   
   * Implemented in apache_beam/transforms/util.py
   * Uses bisect-based bucket lookup
   * Routes elements into length buckets (for example, 0-16, 16-32, etc.) so 
similarly sized elements are batched together
   * Uses a composite key: (worker_uuid, bucket_index)
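
   A minimal sketch of how such a bucketing DoFn could look, based only on the
bullets above (the class name matches the PR, but the signature, boundary
handling, and worker-UUID details here are assumptions for illustration, not
the actual implementation):

   ```python
   import bisect
   import uuid

   import apache_beam as beam


   class WithLengthBucketKey(beam.DoFn):
     """Keys each element by (worker_uuid, bucket_index) of its length."""

     def __init__(self, length_fn, bucket_boundaries=(16, 32, 64, 128, 256, 512)):
       self._length_fn = length_fn
       self._boundaries = sorted(bucket_boundaries)

     def setup(self):
       # One key component per worker so stateful batching is not funneled
       # through a single global key.
       self._worker_uuid = uuid.uuid4().hex

     def process(self, element):
       # bisect returns the index of the first boundary >= the element's
       # length, which serves as the bucket index.
       bucket = bisect.bisect_left(self._boundaries, self._length_fn(element))
       yield (self._worker_uuid, bucket), element
   ```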
   
   2. API updates
   
   * BatchElements now accepts length_fn and bucket_boundaries
   * ModelHandler.__init__ now accepts length_fn and bucket_boundaries
   * Default boundaries: [16, 32, 64, 128, 256, 512]
   
   3. Stateful-path integration
   
   * Length-aware routing is enabled automatically on the stateful path when 
max_batch_duration_secs is set and length_fn is provided
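
   As a usage sketch, the new parameters compose with the existing stateful
batching controls roughly like this (parameter names follow the description
above; defaults and exact behavior are whatever the PR implements):

   ```python
   import apache_beam as beam
   from apache_beam.transforms import util

   with beam.Pipeline() as p:
     _ = (
         p
         | beam.Create(['short text', 'a much longer sequence of tokens ...'])
         | util.BatchElements(
             min_batch_size=8,
             max_batch_size=64,
             max_batch_duration_secs=5,   # selects the stateful path
             length_fn=len,               # enables length-aware routing
             bucket_boundaries=[16, 32, 64, 128, 256, 512]))
   ```

   With both `max_batch_duration_secs` and `length_fn` set, elements are keyed
into length buckets before the stateful batching logic runs, per the
integration note above.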
   
   Testing and results
   
   Added test_padding_efficiency_bimodal in util_test.py to represent a bimodal 
workload:
   
   * 500 short elements (length 5-30)
   * 500 long elements (length 200-512)
   
   Observed result:
   
   * Unbucketed baseline padding efficiency: about 68%
   * Bucketed padding efficiency (this PR): about 77%
   * Improvement: about +9 percentage points
   
   Interpretation:
   
   * Unbucketed path mixes short and long elements, increasing padding waste
   * Bucketed path separates short/long cohorts, reducing wasted compute and 
memory
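
   For context on the numbers above, padding efficiency is commonly defined as
the fraction of padded slots that hold real data; the exact metric used by
test_padding_efficiency_bimodal may differ, but a simple version looks like
this:

   ```python
   def padding_efficiency(batches):
     """batches: iterable of lists of per-element lengths."""
     used = sum(sum(batch) for batch in batches)
     padded = sum(len(batch) * max(batch) for batch in batches)
     return used / padded

   # Mixing a short and a long element pads everything to the longest:
   print(padding_efficiency([[5, 512]]))    # ~0.50
   # Bucketing keeps similar lengths together, so little padding is wasted:
   print(padding_efficiency([[5], [512]]))  # 1.0
   ```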
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
    - [x] Mention the appropriate issue in your description (for example: 
`addresses #123`), if applicable. This will automatically add a link to the 
pull request in the issue. If you would like the issue to automatically close 
on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    
   

