kfaraz opened a new pull request, #15459:
URL: https://github.com/apache/druid/pull/15459

   ### Background
   
   The segment allocation algorithm reuses an already allocated pending segment 
if the new allocation request is made with the same:
   - datasource
   - sequence name
   - interval
   - value of `skipSegmentLineageCheck` (`false` for batch append, `true` 
for streaming append)
   - previous segment id (used only when `skipSegmentLineageCheck = false`)
   
   The above parameters thus uniquely identify a pending segment (enforced 
by the UNIQUE constraint on the `sequence_name_prev_id_sha1` column in the 
`druid_pendingSegments` metadata table).
   
   This reuse is done in order to:
   - allow replica tasks (in the case of streaming ingestion) to use the same 
set of segment IDs.
   - allow a re-run of a failed batch task to use the same segment ID and 
avoid unnecessary allocations.
   
   ### Breaking scenario
   - Append task T1 allocates a pending segment against version V1. Let's call 
this pending segment `pendingV11`.
   - T1 fails and is unable to commit `pendingV11`.
   - Replace task T2 commits a new version V2, which overshadows V1.
   - Append task T3 picks up from where T1 failed and thus tries to reuse 
`pendingV11`.
   - T3 completes and commits `pendingV11` as `segmentV11`, even though it 
belongs to the already overshadowed version V1.
   - `segmentV11` is immediately marked as unused and eventually deleted, 
causing loss of the appended data.
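   The sequence above can be mimicked with a tiny in-memory model (class and method names are invented for illustration; this is not Druid code). Because the pending-segment key ignores the version, T3's retry silently reuses T1's stale allocation:

   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Minimal model of the breaking scenario: pending segments are keyed
   // only by sequence name, with no notion of version (pre-fix behaviour).
   public class OvershadowScenario {
       static final Map<String, String> pendingByKey = new HashMap<>();

       static String allocate(String sequenceName, String version) {
           // Reuse an existing pending segment for the same key, even if it
           // was allocated against a version that is now overshadowed.
           return pendingByKey.computeIfAbsent(sequenceName, k -> "pending_" + version);
       }

       public static void main(String[] args) {
           // T1 allocates against V1, then fails before committing.
           String p1 = allocate("seq", "V1");
           // T2 (replace task) commits V2; V1 is now overshadowed.
           String committedVersion = "V2";
           // T3 retries the append and gets T1's stale pending segment back.
           String p2 = allocate("seq", committedVersion);
           System.out.println(p2);                            // pending_V1
           System.out.println(p2.endsWith(committedVersion)); // false
       }
   }
   ```

   The second `allocate` call returns the V1 pending segment, so anything T3 commits through it lands in the overshadowed version.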
   
   ### Changes
   - Add the version of the allocated segment to the set of parameters that 
uniquely identify a pending segment.
   - Thus, for a given datasource, interval, sequence name, and previous 
segment id, there can be multiple pending segments, but each must have a 
different version.
   - Include the version when computing the hash 
`sequence_name_prev_id_sha1`, thus preserving the UNIQUE constraint.
   - Add a test to assert the above behaviour. This test fails without the 
changes in this PR.
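   The effect of the change can be sketched as a composite key in which the version participates in equality (class and field names are invented for illustration; the actual change lives in Druid's metadata storage code):

   ```java
   import java.util.Objects;

   // Illustrative sketch of the fix: the segment version becomes part of
   // the identity of a pending segment, so allocations against different
   // versions never collide.
   public class VersionedPendingKey {
       final String dataSource;
       final String sequenceName;
       final String interval;
       final String previousSegmentId; // may be null when lineage check is skipped
       final String version;           // newly added to the identity

       VersionedPendingKey(String dataSource, String sequenceName, String interval,
                           String previousSegmentId, String version) {
           this.dataSource = dataSource;
           this.sequenceName = sequenceName;
           this.interval = interval;
           this.previousSegmentId = previousSegmentId;
           this.version = version;
       }

       @Override
       public boolean equals(Object o) {
           if (this == o) return true;
           if (!(o instanceof VersionedPendingKey)) return false;
           VersionedPendingKey that = (VersionedPendingKey) o;
           return Objects.equals(dataSource, that.dataSource)
               && Objects.equals(sequenceName, that.sequenceName)
               && Objects.equals(interval, that.interval)
               && Objects.equals(previousSegmentId, that.previousSegmentId)
               && Objects.equals(version, that.version);
       }

       @Override
       public int hashCode() {
           return Objects.hash(dataSource, sequenceName, interval,
                               previousSegmentId, version);
       }

       public static void main(String[] args) {
           VersionedPendingKey forV1 =
               new VersionedPendingKey("wiki", "seq", "2023-12-01/P1D", null, "V1");
           VersionedPendingKey forV2 =
               new VersionedPendingKey("wiki", "seq", "2023-12-01/P1D", null, "V2");
           // Same datasource, sequence, interval, and previous id, but
           // different versions: distinct keys, so a stale pending segment
           // allocated against an overshadowed version is not reused.
           System.out.println(forV1.equals(forV2)); // false
       }
   }
   ```

   With the version in the key, T3's allocation against V2 no longer matches T1's pending segment for V1, which closes the breaking scenario described above.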
   
   ### Alternate approach
   Clean up pending segments as soon as they are no longer needed. This 
approach was not taken because it is difficult to ensure that a pending 
segment is not currently in use, as multiple tasks might be using the same 
segment ID.
   
   ### Release note
   Fixes a bug in segment allocation that could cause loss of appended data 
when running interleaved append and replace tasks.
   
   <hr>
   
   This PR has:
   
   - [ ] been self-reviewed.
      - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
