avantgardnerio commented on PR #23094:
URL: https://github.com/apache/datafusion/pull/23094#issuecomment-4770368684

   Pushed `e6f846ed9b` — adds `HaloSpec` and a `halo: Option<HaloSpec>` field 
on `DynamicRangePartitioning`.
   
   **Why the scope grew:** an honest answer to "what do we need before a 
runtime range repartitioner is implementable" surfaced halo as a missing piece. 
The routing operator and the downstream halo-strip operator need to agree at 
plan time on how far each bucket extends beyond its primary range; the field on 
the partitioning type is the natural carrier for that agreement. Without it, 
the two operators would need a side channel.
   
   **Shape:**
   
   ```rust
   pub struct HaloSpec {
       preceding: ScalarValue,
       following: ScalarValue,
   }
   
   pub struct DynamicRangePartitioning {
       ordering: LexOrdering,
       partition_count: usize,
       halo: Option<HaloSpec>,
   }
   ```
   
   Distances are expressed in the domain of the leading sort key. The builder 
keeps the common (no-halo) case terse: 
`DynamicRangePartitioning::new(ordering, k)` vs 
`…::new(ordering, k).with_halo(halo)`.
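
   For readers skimming the thread, here is a minimal, dependency-free sketch of
   that builder shape. It is not the code in the commit: `i64` and `Vec<String>`
   stand in for `ScalarValue` and `LexOrdering`, and `HaloSpec::new` and the
   `halo()` accessor are hypothetical names added only so the sketch can be
   exercised.

   ```rust
   // Stand-ins so the sketch compiles without DataFusion (assumptions):
   // the real fields use `ScalarValue` and `LexOrdering`.
   type Distance = i64;
   type Ordering = Vec<String>;

   #[derive(Debug, Clone, PartialEq)]
   pub struct HaloSpec {
       preceding: Distance,
       following: Distance,
   }

   impl HaloSpec {
       // Hypothetical constructor; the PR may expose something different.
       pub fn new(preceding: Distance, following: Distance) -> Self {
           Self { preceding, following }
       }
   }

   #[derive(Debug, Clone, PartialEq)]
   pub struct DynamicRangePartitioning {
       ordering: Ordering,
       partition_count: usize,
       halo: Option<HaloSpec>,
   }

   impl DynamicRangePartitioning {
       // Common case: no halo, so callers pass only ordering and bucket count.
       pub fn new(ordering: Ordering, partition_count: usize) -> Self {
           Self { ordering, partition_count, halo: None }
       }

       // Opt-in halo, chained onto `new`.
       pub fn with_halo(mut self, halo: HaloSpec) -> Self {
           self.halo = Some(halo);
           self
       }

       // Hypothetical read accessor used by the usage note below.
       pub fn halo(&self) -> Option<&HaloSpec> {
           self.halo.as_ref()
       }
   }
   ```

   With these stand-ins, `DynamicRangePartitioning::new(vec!["ts".into()], 4)`
   reports `halo()` as `None`, and chaining `.with_halo(HaloSpec::new(5, 10))`
   makes it `Some`.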
   
   This is the API hook the runtime range repartitioner needs to publish 
`ExtremaKind::Expanded` extrema (proposed in #23089 / implemented in #23090) to 
a downstream halo-strip filter. With halo unset, the partitioning produces 
disjoint buckets and downstream sees `Observed` extrema.
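
   To make the routing/strip agreement concrete, here is a toy sketch of one
   possible reading, not the operators themselves. It assumes an ascending `i64`
   leading key, half-open primary ranges, and that `preceding` widens a bucket's
   lower bound while `following` widens its upper bound. `KeyRange`, `route`,
   and `strip_halo` are hypothetical names.

   ```rust
   /// Half-open range [lo, hi) over the leading sort key (i64 stand-in).
   #[derive(Debug, Clone, Copy, PartialEq)]
   struct KeyRange {
       lo: i64,
       hi: i64,
   }

   impl KeyRange {
       fn contains(&self, key: i64) -> bool {
           self.lo <= key && key < self.hi
       }

       /// Widen the range by the halo distances (assumed semantics).
       fn expand(&self, preceding: i64, following: i64) -> KeyRange {
           KeyRange {
               lo: self.lo.saturating_sub(preceding),
               hi: self.hi.saturating_add(following),
           }
       }
   }

   /// Routing side: return every bucket whose (possibly expanded) range
   /// contains the key. Without a halo, disjoint primaries yield at most one.
   fn route(key: i64, primaries: &[KeyRange], halo: Option<(i64, i64)>) -> Vec<usize> {
       primaries
           .iter()
           .enumerate()
           .filter_map(|(i, primary)| {
               let accepted = match halo {
                   Some((p, f)) => primary.expand(p, f),
                   None => *primary,
               };
               accepted.contains(key).then_some(i)
           })
           .collect()
   }

   /// Strip side: keep only the keys that fall in the bucket's primary range.
   fn strip_halo(keys: &[i64], primary: KeyRange) -> Vec<i64> {
       keys.iter().copied().filter(|k| primary.contains(*k)).collect()
   }
   ```

   Under these assumptions, with primaries `[0, 10)` and `[10, 20)` and a halo of
   `(2, 3)`, key `9` is routed to both buckets, and stripping the second bucket
   drops it again. Both sides need the same halo values for that to line up,
   which is the agreement the new field carries.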
   
   ROWS-frame halo (a count of neighbor rows rather than a domain distance) is 
intentionally not represented; a separate variant can be added later if 
motivated.
   
   `compatible_with` requires halo equality; `project` passes halo through 
unchanged; `Display` adds the `halo(preceding=…, following=…)` suffix when set. 
Three new tests cover metadata, compatibility, and projection preservation.
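
   A rough sketch of the compatibility and `Display` behavior, again with `i64`
   and `Vec<String>` standing in for the real types. Only the
   `halo(preceding=…, following=…)` suffix comes from the commit; the
   `DynamicRange(...)` prefix is made up for illustration, and comparing
   ordering and partition count in `compatible_with` is an assumption about
   the rest of the check.

   ```rust
   use std::fmt;

   #[derive(Debug, Clone, PartialEq)]
   pub struct HaloSpec {
       preceding: i64, // stand-in for ScalarValue
       following: i64, // stand-in for ScalarValue
   }

   #[derive(Debug, Clone, PartialEq)]
   pub struct DynamicRangePartitioning {
       ordering: Vec<String>, // stand-in for LexOrdering
       partition_count: usize,
       halo: Option<HaloSpec>,
   }

   impl DynamicRangePartitioning {
       // Halo equality is required; the other two comparisons are assumed.
       pub fn compatible_with(&self, other: &Self) -> bool {
           self.ordering == other.ordering
               && self.partition_count == other.partition_count
               && self.halo == other.halo
       }
   }

   impl fmt::Display for DynamicRangePartitioning {
       fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
           // Illustrative prefix only; the real format may differ.
           write!(
               f,
               "DynamicRange([{}], {})",
               self.ordering.join(", "),
               self.partition_count
           )?;
           // The suffix is emitted only when a halo is set.
           if let Some(h) = &self.halo {
               write!(f, " halo(preceding={}, following={})", h.preceding, h.following)?;
           }
           Ok(())
       }
   }
   ```

   In this sketch two partitionings that differ only in `halo` are reported as
   incompatible, and the no-halo case prints without any suffix.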
   
   I'll update the discussion at #23093 to mirror this scope shift.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

