clintropolis opened a new pull request, #19557:
URL: https://github.com/apache/druid/pull/19557

   ### Description
   This PR adds a post-processing step to `RunRules` so that partial-load 
matchers that resolve differently across segments of a shard group (e.g. 
`ClusterGroupPartialLoadMatcher` over range-partitioned segments) don't leave 
the broker with an incomplete `PartitionHolder`. `Rule.run` now returns a new 
`RuleRunResult` object that `RunRules` can act on after the rule has run, though 
most implementations today simply return `RuleRunResult.OK`. For these 
asymmetric partial matchers, there is a `ShardGroupFollowup` implementation of 
`RuleRunResult` that `RunRules` uses to check the siblings of a shard group and 
perform an 'empty' partial load for any that did not match, ensuring that the 
group is fully available. I am unsure if there is enough of a pattern to make 
`RunRules` handle `RuleRunResult` more generically; I was planning to wait and 
see if any other use cases pop up before trying to generalize this handling.
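
   To illustrate the shape of this pattern, here is a minimal, self-contained 
sketch. It is not the code in this PR: the class `RuleRunResultSketch`, the 
`runRule` and `followupsOf` helpers, and all field names are hypothetical, and 
the real `Rule.run` signature is different. It only shows a rule returning 
either a shared `OK` constant or a followup object that the caller inspects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the result-type pattern; not the actual Druid classes. */
final class RuleRunResultSketch {

  /** What a rule hands back to its caller. OK means there is nothing further to do. */
  interface RuleRunResult {
    RuleRunResult OK = new RuleRunResult() {
      @Override
      public String toString() {
        return "OK";
      }
    };
  }

  /** Asks the caller to check the shard group that the matched segment belongs to. */
  static final class ShardGroupFollowup implements RuleRunResult {
    final String dataSource;
    final String interval;
    final String version;
    final int matchedPartitionNum;

    ShardGroupFollowup(String dataSource, String interval, String version, int matchedPartitionNum) {
      this.dataSource = dataSource;
      this.interval = interval;
      this.version = version;
      this.matchedPartitionNum = matchedPartitionNum;
    }
  }

  /** Stand-in for a rule's run method; the parameter list is an assumption. */
  static RuleRunResult runRule(boolean partialMatch, int numCorePartitions,
                               String dataSource, String interval, String version,
                               int partitionNum) {
    // Only a positive partial match on a segment with core partitions needs a followup.
    if (partialMatch && numCorePartitions > 0) {
      return new ShardGroupFollowup(dataSource, interval, version, partitionNum);
    }
    return RuleRunResult.OK;
  }

  /** Caller side: keep only the results that need post-processing. */
  static List<ShardGroupFollowup> followupsOf(List<RuleRunResult> results) {
    List<ShardGroupFollowup> out = new ArrayList<>();
    for (RuleRunResult r : results) {
      if (r instanceof ShardGroupFollowup) {
        out.add((ShardGroupFollowup) r);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<RuleRunResult> results = Arrays.asList(
        runRule(true, 2, "wiki", "2024-01-01/2024-01-02", "v1", 0),
        runRule(false, 2, "wiki", "2024-01-01/2024-01-02", "v1", 1),
        runRule(true, 0, "wiki", "2024-01-02/2024-01-03", "v1", 0));
    System.out.println(followupsOf(results).size() + " followup(s) to process");
  }
}
```

   Returning a shared constant for the common case keeps existing rules 
trivial to adapt, while the followup type carries just enough identity for the 
caller to locate the rest of the shard group.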
   
   Doing an empty load like this seemed cheaper than trying to figure out how 
to let the timeline sometimes treat incomplete groups as complete, since the 
empty partial load should be cheap for the target historicals.
   
   changes:
   * adds `PartialLoadMatcher.emptyMatch` (default null) as an opt-in way to do 
a zero-content "load" for matchers that can resolve asymmetrically.
   * `Rule.run` returns a `RuleRunResult` (the `OK` constant for most rules). 
`PartialLoadRule` returns a `ShardGroupFollowup` on a positive match, but only 
when `numCorePartitions` > 0.
   * `RunRules` buffers followups keyed by (dataSource, interval, version) and 
flushes the buffer at the boundary between groups during iteration 
(NEWEST_SEGMENT_FIRST orders segments so that each such triple is contiguous). 
A flush dispatches `emptyMatch` loads to unmatched core siblings; siblings that 
are not part of the core partition group are skipped.
   * adds `TimelineLookup.findChunks` which returns a defensive copy via 
`PartitionHolder.copyWithOnlyVisibleChunks` so iteration is safe outside the 
timeline lock.
   * `PartialClusterGroupLoadSpec` allows an empty `clusterGroupIndices` list 
as the wire form of an empty match.
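
   The buffer-and-flush behavior described in the list above can be sketched 
roughly as follows. This is an illustration under stated assumptions, not the 
PR's implementation: `FollowupBufferSketch`, `Segment`, `accept`, `flush`, and 
`dispatchedEmptyLoads` are hypothetical names, siblings are taken from the 
segments seen during iteration rather than from `TimelineLookup.findChunks`, 
and a "core" sibling is assumed to be one whose partition number is below 
`numCorePartitions`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of buffering followups per (dataSource, interval, version). */
final class FollowupBufferSketch {

  /** Minimal stand-in for a segment; fields are assumptions for illustration. */
  static final class Segment {
    final String dataSource;
    final String interval;
    final String version;
    final int partitionNum;
    final int numCorePartitions;

    Segment(String dataSource, String interval, String version,
            int partitionNum, int numCorePartitions) {
      this.dataSource = dataSource;
      this.interval = interval;
      this.version = version;
      this.partitionNum = partitionNum;
      this.numCorePartitions = numCorePartitions;
    }

    String groupKey() {
      return dataSource + "|" + interval + "|" + version;
    }
  }

  private String currentKey = null;
  private final List<Segment> group = new ArrayList<>();
  private final Set<Integer> matched = new TreeSet<>();
  final List<String> dispatchedEmptyLoads = new ArrayList<>();

  /** Called once per segment, in an order where each group key is contiguous. */
  void accept(Segment segment, boolean positiveMatch) {
    String key = segment.groupKey();
    if (!key.equals(currentKey)) {
      flush(); // the previous group is complete once the key changes
      currentKey = key;
    }
    group.add(segment);
    if (positiveMatch && segment.numCorePartitions > 0) {
      matched.add(segment.partitionNum);
    }
  }

  /** Sends an "empty" load to every unmatched core sibling of a matched group. */
  void flush() {
    if (!matched.isEmpty()) {
      for (Segment s : group) {
        boolean core = s.partitionNum < s.numCorePartitions;
        if (core && !matched.contains(s.partitionNum)) {
          dispatchedEmptyLoads.add(s.groupKey() + "#" + s.partitionNum);
        }
      }
    }
    group.clear();
    matched.clear();
  }

  public static void main(String[] args) {
    FollowupBufferSketch buffer = new FollowupBufferSketch();
    buffer.accept(new Segment("wiki", "day2", "v1", 0, 2), true);
    buffer.accept(new Segment("wiki", "day2", "v1", 1, 2), false);
    buffer.accept(new Segment("wiki", "day2", "v1", 2, 2), false); // non-core sibling
    buffer.accept(new Segment("wiki", "day1", "v1", 0, 1), false);
    buffer.flush(); // final flush after the last segment
    System.out.println(buffer.dispatchedEmptyLoads);
  }
}
```

   In this example only partition 1 of the `day2` group receives an empty 
load: partition 0 already matched, partition 2 lies outside the core 
partitions, and the `day1` group had no positive match at all.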


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

