findingrish opened a new pull request, #13967:
URL: https://github.com/apache/druid/pull/13967

   ### Description
   Broker maintains a timeline of segments which it builds overtime upon 
receiving updates from historical server and it uses this timeline to answer 
queries. Broker isn’t aware of what segments actually exists in the druid 
system. The result of this gap is incomplete query responses on some occasions. 
   
   With this feature the goal is to ensure, if a segment was queryable at one 
point in time, any future query over that segment would either include that 
segment or fail (unless it's replication factor is changed to 0).  
   
   ### Design 
   This change introduces a state for segment metadata called `loaded`, it is 
set to true, when a segment has been loaded onto some historical. 
   Broker polls the coordinator periodically to get all the used segments in 
the system, it merges all the `loaded` segment from this set into its timeline 
of segments. This timeline now consists of segments which are available on some 
historical server and which aren’t available on any server, this information 
helps the broker identify unavailable segments for the query. 
   
   This approach also ensures that any segment which has just been published 
but not loaded by any historical server doesn’t cause query failure. 
   
   ### Major changes
   
   #### Coordinator changes
   - Add a new column `has_loaded` in the druid_segments metadata table to 
represent if a segment has ever been loaded on a historical (changes in 
`SQLMetadataConnector`).
   
   - Set the `has_loaded` field for a segment to true when the Coordinator 
receives notification from the historical. 
   
   - Update `DataSourcesSnapshot` to maintain diff of the segments from the 
previous poll.  
   
   - Add coordinator API  `MetadataResource#getChangedSegments` to send either 
full snapshot or delta changes to the broker using the information present in 
`DataSourcesSnapshot`
   
   - Changed classes: `CoordinatorServerView`, `SqlSegmentsMetadataManager`, 
`SqlSegmentsMetadataQuery`, `MetadataResource`, `DataSourcesSnapshot`
   
   #### Broker changes
   - `MetadataSegmentView` polls the coordinator to fetch the list of all used 
segments along with their `overshadowed` and `loaded` status, on the very first 
poll it receives a full snapshot thereafter it receives delta updates. 
   
   - After the finish of every poll, notify BrokerServerView to update its 
timeline with all segments that have been loaded
      - Remove segments that are not used anymore i.e. segments that are not 
present in the list polled from the coordinator
      - Add segments that are used and loaded to the timeline, if they don’t 
already exist
   
   - While handling a query on the broker, lookup the segments required for the 
query from the timeline. If any of these segments is unavailable, throw an 
error.
   
   - Changed classes: `CachingClusteredClient`, `BrokerServerView`, 
`MetadataSegmentView`
   
   
   ### Segment lifecycle 
   
   The lifecycle of a segment `s` can be described as follows:
   
   `Creation`: Segment metadata is published in the database at time `t1`.
   `Coordinator Polling`: The Coordinator polls for new segments at time `t2` 
but does not consider `s` loaded yet.
   `Broker Polling`: The Broker polls the Coordinator at time `t3` and finds 
segment `s` but doesn't add it to the timeline as it's not loaded.
   `Historical Loading`: The segment `s` is loaded by a Historical process at 
time `t4`.
   `Coordinator Update`: The Coordinator receives a callback from the 
Historical and marks the segment as `loaded` at time `t5`.
   
   #### Segment Availability
   
   Once marked as `loaded`, a segment is considered available for querying. 
However, there are potential scenarios that can impact segment availability:
   
   ##### Scenario 1: Broker Receives Segment Metadata Before Historical Callback
   
   * The Broker adds `s` to the timeline as `queryable` at time `t6`.
   * Queries for `s` at this point might fail as it's not fully loaded.
   * Once the Broker receives the callback from the Historical at time `t7`, 
the segment becomes available for querying.
   * If all Historicals serving `s` go down, the segment remains in the 
timeline as `queryable` but becomes unavailable for querying.
   
   ##### Scenario 2: Broker Receives Historical Callback Before Segment Metadata
   
   * The Broker adds `s` to the timeline but doesn't mark it as `queryable` at 
time `t6`.
   * Queries for `s` will still be processed, but the segment might not be 
fully available.
   * Once the Broker receives metadata from the Coordinator at time `t7`, the 
segment is marked as `queryable`.
   * Similar to Scenario 1, if all Historicals serving `s` go down, the segment 
remains in the timeline as `queryable` but becomes unavailable for querying.
   
   ### Synchronisation issues 
   Following synchronisation conditions could cause temporary query failure,
   - If the broker isn’t able to sync its timeline with the coordinator, this 
would cause broker to be unaware of recently removed segments from the 
historical 
   
   - If the broker is behind historical server, sync with coordinator makes it 
aware of recently loaded segments but the broker would think that they are 
unavailable 
   
   <ol>
   <li>Broker lags behind Historical</li>
   <ol>
     <li>a. Coordinator tells broker that segment is used and loaded </li>
    <ul> 
    <li>start failing if segment is unavailable immediately </li> 
    <li>maybe also log historical sync times </li> 
     </ul>
     <li>b. Coordinator says that segment is unused </li>
   <ul>
     <li>can't do anything but remove from timeline, stop serving that data 
</li> 
   </ul>
   </ol>
   <li>Broker lags behind Coordinator</li>
   <ol>
     <li>a. Historical has removed a loaded segment </li>
   <ul>  
   <li> fail </li>
   </ul>
   <li>b. historical has added an unknown segment </li>
   <ul>   
   <li>current behaviour (feature off): can't do anything, start serving it? 
</li> 
   <li>new behaviour: don't serve it unless coordinator says it is loaded </li> 
   </ul>
     <li>c. historical has removed a non loaded segment </li>
   <ul>
   <li>If we don't do 2b, we might miss out the problem scenario explained 
later </li> 
   </ul>
   </ol>
   <li>Coordinator lags behind historical</li>
   <ol>
     <li>a. coordinator asked historical to remove but doesn't know unload is 
completed </li>
   <ul>  
   <li>this would have same behaviour as 1b </li>
   </ul>
     <li>b. coordinator asked historical to load but doesn't know load is 
complete </li>
   <ul>
     <li>same behaviour as 2b </li>
     </ul>
   </ol>
   
   <li>Coordinator lags behind metadata</li>
   <ul>
   <li>nothing to do because historical wouldn't have made any decision, 
neither broker</li>
   <li>coordinator is source of truth </li>
   </ul>
   </ol>
   
   Problem scenario: Broker might serve something and then not serve it anymore 
without failing query.
   <ul>
   <li>coordinator tells historical to load segment</li>
   <li>historical loads segment</li>
   <li>coordinator doesn't know segment is loaded</li>
   <li>broker knows segment is loaded and starts serving it</li>
   <li>segment disappears</li>
   <li>broker stops serving segment, but doesn't fail</li>
   <li>to avoid this, we have to do 2b</li>
   </ul>
   
   #### Pending items 
   
   * If `druid.sql.planner.detectUnavailableSegments` is set then 
`druid.sql.planner.metadataSegmentCacheEnable` takes effect automatically. 
Verify if this behaviour is as expected or needs change. 
   * MetadataResource#getChangedSegments should also return realtime segments 
if `CentralizedDatasourceSchema` is enabled on the Coordinator. 
   * Handle the scenario when the replication factor for a segment is updated 
to 0. 
   * Verify the potential synchronisation issues. 
   * Testing (unit tests, performance testing). 
   * Document new API, metrics. 
    
   #### Usage 
   - `druid.sql.planner.detectUnavailbleSegments` needs to be set in broker 
runtime properties 
   - `unavailableSegmentsAction` query context can be set to `allow` or `fail`, 
accordingly the queries would fail, in either case the unavailable segments 
will be logged. 
   
   #### Upgrade considerations 
   TBA
   
   #### Release note
   TBA 
   
   <hr>
   
   #### Pending items
   
   This PR has:
   
   - [x] been self-reviewed.
      - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to