clintropolis commented on a change in pull request #8173: Add a cluster-wide 
configuration to force timeChunk lock and add a doc for segment locking
URL: https://github.com/apache/incubator-druid/pull/8173#discussion_r308914071
 
 

 ##########
 File path: docs/content/ingestion/locking-and-priority.md
 ##########
 @@ -24,30 +24,73 @@ title: "Task Locking & Priority"
 
 # Task Locking & Priority
 
+This document explains the task locking system in Druid. Druid's locking and versioning systems are tightly coupled to guarantee the correctness of ingested data.
+
+## Overshadow Relation between Segments
+
+You can run a task to overwrite existing data. The segments created by an overwriting task _overshadow_ existing segments.
+Note that the overshadow relation holds only within the same time chunk of the same data source.
+Overshadowed segments are excluded from query processing, so that queries do not return stale data.
+
+A segment `s1` overshadows another segment `s2` if either
+
+- `s1` has a higher major version than `s2`.
+- `s1` has the same major version and a higher minor version than `s2`.
+
+Here are some examples.
+
+- A segment with major version `2019-01-01T00:00:00.000Z` and minor version `0` overshadows another with major version `2018-01-01T00:00:00.000Z` and minor version `1`.
+- A segment with major version `2019-01-01T00:00:00.000Z` and minor version `1` overshadows another with major version `2019-01-01T00:00:00.000Z` and minor version `0`.
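
The overshadow rule above can be sketched in a few lines of Python. The class and field names here are illustrative only, not Druid's actual implementation; it relies on the fact that ISO-8601 timestamps sort lexicographically in time order.

```python
# Illustrative sketch of the overshadow rule (not Druid's real code).
from dataclasses import dataclass

@dataclass(frozen=True)
class SegmentVersion:
    major: str  # ISO-8601 timestamp; lexicographic order matches time order
    minor: int

def overshadows(s1: SegmentVersion, s2: SegmentVersion) -> bool:
    """True if s1 overshadows s2 (same data source and time chunk assumed)."""
    if s1.major > s2.major:
        return True
    return s1.major == s2.major and s1.minor > s2.minor

# The two examples from the text:
a = SegmentVersion("2019-01-01T00:00:00.000Z", 0)
b = SegmentVersion("2018-01-01T00:00:00.000Z", 1)
c = SegmentVersion("2019-01-01T00:00:00.000Z", 1)
print(overshadows(a, b))  # True: higher major version wins
print(overshadows(c, a))  # True: same major, higher minor version
```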
+
 ## Locking
 
-Once an Overlord process accepts a task, the task acquires locks for the data 
source and intervals specified in the task.
+If you are running two or more [Druid tasks](./tasks.html) that generate segments for the same data source and the same time chunk,
+the generated segments could overshadow each other, which could lead to incorrect query results.
 
-There are two lock types, i.e., _shared lock_ and _exclusive lock_.
+To avoid this problem, tasks must acquire locks before creating any segments in Druid.
+There are two types of locks, i.e., _time chunk lock_ and _segment lock_, and different tasks can use different lock types.
 
-- A task needs to acquire a shared lock before it reads segments of an 
interval. Multiple shared locks can be acquired for the same dataSource and 
interval. Shared locks are always preemptable, but they don't preempt each 
other.
-- A task needs to acquire an exclusive lock before it writes segments for an 
interval. An exclusive lock is also preemptable except while the task is 
publishing segments.
+When the time chunk lock is used, a task locks the entire time chunk of a data source into which the generated segments will be written.
+For example, suppose we have a task ingesting data into the time chunk of `2019-01-01T00:00:00.000Z/2019-01-02T00:00:00.000Z` of the `wikipedia` data source.
+With time chunk locking, this task must lock the entire time chunk of `2019-01-01T00:00:00.000Z/2019-01-02T00:00:00.000Z` of the `wikipedia` data source
+before it creates any segments. As long as it holds the lock, no other task can create segments for the same time chunk of the same data source.
+Segments created with time chunk locking have a _higher_ major version than existing segments; their minor version is always `0`.
 
-Each task can have different lock priorities. The locks of higher-priority 
tasks can preempt the locks of lower-priority tasks. The lock preemption works 
based on _optimistic locking_. When a lock is preempted, it is not notified to 
the owner task immediately. Instead, it's notified when the owner task tries to 
acquire the same lock again. (Note that lock acquisition is idempotent unless 
the lock is preempted.) In general, tasks don't compete for acquiring locks 
because they usually targets different dataSources or intervals.
+When the segment lock is used, a task locks individual segments instead of the entire time chunk.
+As a result, two or more tasks can create segments for the same time chunk of the same data source simultaneously,
+as long as they are reading different segments.
+For example, a Kafka indexing task and a compaction task can always write segments into the same time chunk of the same data source simultaneously,
+because the Kafka indexing task always appends new segments while the compaction task always overwrites existing segments.
+Segments created with segment locking have the _same_ major version as the segments they overwrite and a _higher_ minor version.
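
The difference in version assignment between the two lock types can be sketched as follows. This is an assumption-laden simplification, not Druid's real versioning code; the function names and the idea of deriving the new major version from the lock acquisition time are illustrative.

```python
# Illustrative sketch (not Druid internals) of how new segment versions
# differ by lock type.

def next_version_time_chunk_lock(lock_time: str) -> tuple:
    # With a time chunk lock, the new major version is assumed here to come
    # from the lock acquisition time, so it is higher than existing versions;
    # the minor version is always 0.
    return (lock_time, 0)

def next_version_segment_lock(existing_major: str, existing_minors: list) -> tuple:
    # With a segment lock, the major version is preserved and the minor
    # version is incremented past the highest existing one.
    return (existing_major, max(existing_minors) + 1)

print(next_version_time_chunk_lock("2019-02-01T00:00:00.000Z"))
print(next_version_segment_lock("2019-01-01T00:00:00.000Z", [0, 1]))
```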
 
-A task writing data into a dataSource must acquire exclusive locks for target 
intervals. Note that exclusive locks are still preemptable. That is, they also 
be able to be preempted by higher priority locks unless they are _publishing 
segments_ in a critical section. Once publishing segments is finished, those 
locks become preemptable again.
+Lock requests can conflict with each other if two or more tasks try to acquire locks for overlapping time chunks of the same data source.
+Note that lock conflicts can happen between different lock types.
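
The conflict condition described above reduces to an interval-overlap check. Below is a minimal sketch under that assumption; the `LockRequest` type and field names are hypothetical, and the lexicographic comparison again relies on ISO-8601 formatting.

```python
# Hedged sketch of when two lock requests conflict: same data source and
# overlapping time chunks. Names are illustrative, not Druid internals.
from dataclasses import dataclass

@dataclass(frozen=True)
class LockRequest:
    datasource: str
    start: str  # ISO-8601 interval bounds; lexicographic order works here
    end: str

def conflicts(a: LockRequest, b: LockRequest) -> bool:
    """True if the data sources match and the intervals overlap."""
    return a.datasource == b.datasource and a.start < b.end and b.start < a.end

kafka = LockRequest("wikipedia", "2019-01-01T00:00:00.000Z", "2019-01-02T00:00:00.000Z")
compact = LockRequest("wikipedia", "2019-01-01T12:00:00.000Z", "2019-01-03T00:00:00.000Z")
other = LockRequest("twitter", "2019-01-01T00:00:00.000Z", "2019-01-02T00:00:00.000Z")
print(conflicts(kafka, compact))  # True: same data source, overlapping chunks
print(conflicts(kafka, other))    # False: different data source
```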
 
-Tasks do not need to explicitly release locks, they are released upon task 
completion. Tasks may potentially release 
-locks early if they desire. Task ids are unique by naming them using UUIDs or 
the timestamp in which the task was created. 
-Tasks are also part of a "task group", which is a set of tasks that can share 
interval locks.
+The behavior on lock conflicts depends on the [task priority](#priority).
+If all tasks with conflicting lock requests have the same priority, the task that requested the lock first will get it.
+The other tasks will wait for that task to release the lock.
 
-## Priority
+If a lower-priority task requests a lock after a higher-priority task,
+it will likewise wait for the higher-priority task to release the lock.
+If a higher-priority task requests a lock after a lower-priority task,
+it will _preempt_ the lower-priority task: the lock of the lower-priority task will be revoked and the higher-priority task will acquire a new lock.
+
+This lock preemption can happen at any time while a task is running, except
+while it is _publishing segments_ in a critical section. Its locks become preemptable again once segment publishing is finished.
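
The rules above (FIFO waiting for equal priorities, preemption by higher priority, no preemption while publishing) can be condensed into a small simulation. This is a simplified model under stated assumptions, not Druid's lock manager; the class and method names are invented for illustration.

```python
# Simplified simulation (assumptions, not Druid code) of priority-based lock
# handling for a single time chunk.

class TimeChunkLock:
    def __init__(self):
        self.holder = None        # (task_id, priority) of the current holder
        self.publishing = False   # True while in the segment-publishing critical section
        self.waiting = []         # FIFO queue of waiting (task_id, priority) requests

    def request(self, task_id, priority):
        if self.holder is None:
            self.holder = (task_id, priority)
            return "acquired"
        held_id, held_priority = self.holder
        # A strictly higher-priority request preempts the holder, unless the
        # holder is publishing segments in its critical section.
        if priority > held_priority and not self.publishing:
            self.holder = (task_id, priority)
            return "acquired (preempted {})".format(held_id)
        # Equal or lower priority: wait in FIFO order.
        self.waiting.append((task_id, priority))
        return "waiting"

lock = TimeChunkLock()
print(lock.request("index_a", 50))    # acquired
print(lock.request("index_b", 50))    # waiting (same priority, FIFO)
print(lock.request("compact_c", 75))  # acquired (preempted index_a)
```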
 
-Apache Druid (incubating)'s indexing tasks use locks for atomic data 
ingestion. Each lock is acquired for the combination of a dataSource and an 
interval. Once a task acquires a lock, it can write data for the dataSource and 
the interval of the acquired lock unless the lock is released or preempted. 
Please see [the below Locking section](#locking)
+Note that locks are shared by tasks with the same groupId.
+For example, Kafka indexing tasks under the same supervisor have the same groupId and share all locks with each other.
+
+## Priority
 
-Each task has a priority which is used for lock acquisition. The locks of 
higher-priority tasks can preempt the locks of lower-priority tasks if they try 
to acquire for the same dataSource and interval. If some locks of a task are 
preempted, the behavior of the preempted task depends on the task 
implementation. Usually, most tasks finish as failed if they are preempted.
+Each task has a priority which is used for lock acquisition.
 
 Review comment:
   This seems like a redundant description of priority, which was just 
mentioned in the preemption section

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
