[I] [WIP]: Broker Segment Unavailability Detection (druid)

via GitHub Tue, 04 Nov 2025 11:41:59 -0800


jtuglu1 opened a new issue, #18716:
URL: https://github.com/apache/druid/issues/18716


   ### Motivation
   
   Currently, Druid can serve partial result sets without the user knowing. 
This can happen for several reasons:
   - Data node crash/failure/unavailability
   - Broker missing announcements from historicals
   - Other synchronization issues
   
   ### Proposed changes
   
   The core issue is that Brokers automatically remove segments from their 
timeline when they receive drop notifications from the data nodes. Once a 
segment is gone from the timeline, the Broker can no longer "audit" whether a 
query is served from a complete timeline, because it has no record of the 
missing parts.
   
   #### Segment-level changes
   
   To solve this, introduce the concept of a "queryable" segment in the Broker 
timeline. A segment is marked as "queryable" once it is announced by at least 
one data node. After that, it stays queryable until it is marked as unused, the 
metadata is reset, the cluster is re-deployed with downtime, or a similar event 
occurs. A "queryable" segment is necessarily used, but a used segment is not 
necessarily queryable. A diagram of the segment lifetime within a Broker 
timeline is below:
   
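   The lifecycle described above can be sketched roughly as follows. This is a 
minimal illustration, not Druid code: the class `QueryableTimelineSketch` and 
all of its methods are hypothetical names, and it assumes announcements only 
arrive for used segments. The point it shows is that losing every server keeps 
the segment in the timeline, which is what makes the gap detectable.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names exist in Druid.
class QueryableTimelineSketch {
  // Segment id -> servers currently announcing it. A key is present for every
  // queryable segment, even when its server set is empty.
  private final Map<String, Set<String>> serversBySegment = new HashMap<>();

  // The first announcement makes the segment queryable.
  void onAnnounce(String segmentId, String server) {
    serversBySegment.computeIfAbsent(segmentId, id -> new HashSet<>()).add(server);
  }

  // A server going away shrinks the server set but keeps the segment queryable.
  void onServerLost(String segmentId, String server) {
    Set<String> servers = serversBySegment.get(segmentId);
    if (servers != null) {
      servers.remove(server);
    }
  }

  // Only an explicit "unused" mark removes the segment from the timeline.
  void onMarkedUnused(String segmentId) {
    serversBySegment.remove(segmentId);
  }

  // Queryable segments with no server: a query touching them would be partial.
  Set<String> unavailableSegments() {
    Set<String> result = new TreeSet<>();
    for (Map.Entry<String, Set<String>> entry : serversBySegment.entrySet()) {
      if (entry.getValue().isEmpty()) {
        result.add(entry.getKey());
      }
    }
    return result;
  }
}
```

   With this shape, a Broker could check `unavailableSegments()` against the 
segments a query needs and fail or flag the query instead of silently returning 
a partial result.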
   
   #### Broker/Coordinator changes
   
   To fetch the latest used status for segments, the Broker will do an initial 
full sync with the Coordinator, followed by periodic delta syncs, to keep its 
knowledge of which segments are used/queryable up to date.
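   As a rough sketch of the full-then-delta pattern, assuming segments are 
identified by string ids and that a delta carries two lists (newly used and 
newly unused). The class `UsedSegmentView` and the payload shape are 
assumptions made for illustration; the proposal does not define them.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the class and the shape of the sync payloads are assumptions.
class UsedSegmentView {
  private final Set<String> used = new HashSet<>();
  private boolean initialized = false;

  // Full sync: replace whatever the Broker knew with the Coordinator's snapshot.
  void applyFullSync(Collection<String> snapshot) {
    used.clear();
    used.addAll(snapshot);
    initialized = true;
  }

  // Delta sync: apply only the changes since the previous sync.
  void applyDelta(Collection<String> newlyUsed, Collection<String> newlyUnused) {
    if (!initialized) {
      throw new IllegalStateException("delta sync received before the initial full sync");
    }
    used.addAll(newlyUsed);
    used.removeAll(newlyUnused);
  }

  boolean isUsed(String segmentId) {
    return used.contains(segmentId);
  }
}
```

   Rejecting a delta that arrives before the first full sync is one possible 
design choice; falling back to a fresh full sync would be another.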
   
   To mark a segment as queryable in the timeline, the Broker needs to receive 
both sync callbacks: one from a data node confirming that the segment is 
loaded, and one from the Coordinator confirming that the segment is currently 
marked as used in the cluster.
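   The two-confirmation rule can be sketched as a small gate that promotes a 
segment only once both callbacks have been seen, in either order. The class 
`QueryableGate` and its callback names are hypothetical and only illustrate the 
rule stated above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not Druid APIs.
class QueryableGate {
  private final Set<String> usedPerCoordinator = new HashSet<>();
  private final Set<String> loadedPerDataNode = new HashSet<>();
  private final Set<String> queryable = new HashSet<>();

  // Coordinator sync reports the segment as used.
  void onCoordinatorUsed(String segmentId) {
    usedPerCoordinator.add(segmentId);
    promoteIfConfirmed(segmentId);
  }

  // Data node sync reports the segment as loaded.
  void onDataNodeLoaded(String segmentId) {
    loadedPerDataNode.add(segmentId);
    promoteIfConfirmed(segmentId);
  }

  // Promote only when both sources have confirmed the segment.
  private void promoteIfConfirmed(String segmentId) {
    if (usedPerCoordinator.contains(segmentId) && loadedPerDataNode.contains(segmentId)) {
      queryable.add(segmentId);
    }
  }

  boolean isQueryable(String segmentId) {
    return queryable.contains(segmentId);
  }
}
```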
   
   <TODO | Fill out>
   
   Cases:
   ##### Segment Load
   1. Newly created segment S loading for the first time on server A
     - Broker waits until it receives the used callback from the Coordinator
   2. Previously created segment S loading on server A
   
   ##### Segment Deletion/Drop on Data Node
   1. Server A removed the segment because it was dropped (unused) => remove 
from timeline
   2. Server A removed the segment because it was moved (used) => keep in 
timeline (assert there are n > 0 servers serving it, otherwise that's a race 
and should be fixed).
   3. Server removed the segment because its replication factor was changed 
(used) => keep in timeline (assert there are n > 0 servers serving it, 
otherwise that's a race and should be fixed).
   4. Server removed the segment because the server died/stopped responding 
(used) => keep in timeline
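   The four deletion/drop cases can be written down as a small decision 
function. This is only a sketch: the `RemovalReason` enum is hypothetical, and 
how the Broker would actually learn the reason for a removal is not specified 
in this proposal.

```java
// Hypothetical sketch: the enums and the decide() helper are illustrative only.
class DropHandlingSketch {
  enum RemovalReason { DROPPED_UNUSED, MOVED, REPLICATION_CHANGED, SERVER_UNRESPONSIVE }

  enum Action { REMOVE_FROM_TIMELINE, KEEP_IN_TIMELINE }

  // remainingServers: servers still serving the segment after this removal.
  static Action decide(RemovalReason reason, int remainingServers) {
    switch (reason) {
      case DROPPED_UNUSED:
        // Case 1: the segment is unused, so it leaves the timeline.
        return Action.REMOVE_FROM_TIMELINE;
      case MOVED:
      case REPLICATION_CHANGED:
        // Cases 2 and 3: still used, so some other server must be serving it.
        if (remainingServers <= 0) {
          throw new IllegalStateException(
              "used segment has no remaining server after " + reason + " (race)");
        }
        return Action.KEEP_IN_TIMELINE;
      case SERVER_UNRESPONSIVE:
        // Case 4: still used; keep it even if no server remains.
        return Action.KEEP_IN_TIMELINE;
      default:
        throw new AssertionError("unknown reason " + reason);
    }
  }
}
```

   Case 4 is the only one where a used segment may legitimately remain in the 
timeline with zero servers, which is the state the proposal wants to detect.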
   
   This section should provide a detailed description of the changes being 
proposed. This will usually be the longest section; please feel free to split 
this section or other sections into subsections if needed.
   
   This section should include any changes made to user-facing interfaces, for 
example:
   - Parameters
   - JSON query/ingest specs
   - SQL language
   - Emitted metrics
   
   ### Rationale
   
   A discussion of why this particular solution is the best one. One good way 
to approach this is to discuss other alternative solutions that you considered 
and decided against. This should also include a discussion of any specific 
benefits or drawbacks you are aware of.
   
   ### Operational impact
   
   This section should describe how the proposed changes will impact the 
operation of existing clusters. It should answer questions such as:
   
   - Is anything going to be deprecated or removed by this change? How will we 
phase out old behavior?
   - Is there a migration path that cluster operators need to be aware of?
   - Will there be any effect on the ability to do a rolling upgrade, or to do 
a rolling _downgrade_ if an operator wants to switch back to a previous version?
   
   ### Test plan (optional)
   
   An optional discussion of how the proposed changes will be tested. This 
section should focus on higher level system test strategy and not unit tests 
(as UTs will be implementation dependent). 
   
   ### Future work (optional)
   
   An optional discussion of things that you believe are out of scope for the 
particular proposal but would be nice follow-ups. It helps show where a 
particular change could be leading us. There isn't any commitment that the 
proposal author will actually work on the items discussed in this section.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

