[ 
https://issues.apache.org/jira/browse/HDDS-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jyotinder Singh updated HDDS-6682:
----------------------------------
    Description: 
In high-concurrency scenarios (which will become more common once we introduce 
prefix-based locking), the following race condition is possible.
Consider three concurrent write requests against a bucket {{vol/buck1}} that 
exists with {{LEGACY}} layout.

{*}Request 1{*}: {{CreateKey}} by an older client (pre-bucket-layout) on the 
bucket {{vol/buck1}}.
{*}Request 2{*}: {{DeleteBucket}} by a new client on the bucket 
{{vol/buck1}}.
{*}Request 3{*}: {{CreateBucket}} by a new client on the bucket {{vol/buck1}} 
with layout {{FILE_SYSTEM_OPTIMIZED}}.

Let's say that these requests are processed in the following order:
 # {{Request 1}} is picked up by one of the threads, which proceeds to run the 
{{PRE_PROCESS}} validations on this request. The validator we are interested in 
is called {{blockCreateKeyWithBucketLayoutFromOldClient}}. This validator 
makes sure that the bucket associated with this request is a {{LEGACY}} 
bucket, which is the pre-defined behavior for old client/new cluster 
interactions, since we do not want an old client operating on buckets that 
use a new metadata layout.

Note that at this stage the OM does not hold a bucket lock. The lock is only 
acquired inside the {{validateAndUpdateCache}} method of the write request's 
handler class.
 # While {{Request 1}} is being processed, another thread processes 
{{Request 2}}. Suppose {{Request 2}} acquires the bucket lock and successfully 
completes the bucket deletion.

 # Before {{Request 1}} gets a chance to acquire the bucket lock, 
{{Request 3}} acquires it. It proceeds with the bucket creation and creates a 
new bucket {{vol/buck1}} with {{FILE_SYSTEM_OPTIMIZED}} bucket layout.

 # Finally, {{Request 1}} is able to acquire the bucket lock and proceeds to 
enter its {{validateAndUpdateCache}} method. However, even though it is able to 
find the bucket it is looking for, this is not the same bucket that was 
validated in its pre-processing hook. This new bucket has the same name, but a 
different bucket layout. The request ends up modifying a bucket that it should 
not be allowed to touch.

This race condition can lead to undefined behavior of the Ozone cluster, where 
older clients might be modifying information they do not understand.

This PR aims to add bucket ID validation to the request processing flow, which 
would make sure that the bucket that ends up being processed is the same one 
that was validated.
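
To make the idea concrete, here is a minimal sketch of how such a check could work. It is not the code from the PR: the class {{BucketIdValidationSketch}}, its in-memory bucket table, the single lock, and every method name in it are hypothetical stand-ins, and the real OM request classes and lock manager are not modeled. The sketch only assumes that each bucket creation gets a fresh ID, that pre-processing remembers the ID it validated, and that the locked phase compares that ID against the bucket it finds.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch; none of these names are Ozone's real classes. */
final class BucketIdValidationSketch {

  enum Layout { LEGACY, FILE_SYSTEM_OPTIMIZED }

  /** Immutable bucket record; every creation is given a fresh ID. */
  static final class Bucket {
    final long id;
    final Layout layout;

    Bucket(long id, Layout layout) {
      this.id = id;
      this.layout = layout;
    }
  }

  private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();
  private final AtomicLong nextId = new AtomicLong(1);
  // A single lock stands in for the per-bucket lock.
  private final ReentrantLock bucketLock = new ReentrantLock();

  void createBucket(String name, Layout layout) {
    bucketLock.lock();
    try {
      buckets.put(name, new Bucket(nextId.getAndIncrement(), layout));
    } finally {
      bucketLock.unlock();
    }
  }

  void deleteBucket(String name) {
    bucketLock.lock();
    try {
      buckets.remove(name);
    } finally {
      bucketLock.unlock();
    }
  }

  /**
   * Stand-in for the PRE_PROCESS validator. It runs without the bucket lock
   * and returns the ID of the bucket it validated.
   */
  long preProcessCreateKeyFromOldClient(String name) {
    Bucket bucket = buckets.get(name);
    if (bucket == null) {
      throw new IllegalStateException("Bucket not found: " + name);
    }
    if (bucket.layout != Layout.LEGACY) {
      throw new IllegalStateException("Old clients may only use LEGACY buckets");
    }
    return bucket.id;
  }

  /**
   * Stand-in for the locked phase. Returns true only if the bucket found
   * under the lock is the same one that pre-processing validated.
   */
  boolean validateAndUpdateCache(String name, long validatedBucketId) {
    bucketLock.lock();
    try {
      Bucket bucket = buckets.get(name);
      return bucket != null && bucket.id == validatedBucketId;
    } finally {
      bucketLock.unlock();
    }
  }

  /** Replays the interleaving from the description on a single thread. */
  static boolean replayRace() {
    BucketIdValidationSketch om = new BucketIdValidationSketch();
    om.createBucket("vol/buck1", Layout.LEGACY);
    long validatedId = om.preProcessCreateKeyFromOldClient("vol/buck1"); // Request 1, pre-process
    om.deleteBucket("vol/buck1");                                        // Request 2
    om.createBucket("vol/buck1", Layout.FILE_SYSTEM_OPTIMIZED);          // Request 3
    return om.validateAndUpdateCache("vol/buck1", validatedId);          // Request 1, locked phase
  }

  public static void main(String[] args) {
    System.out.println("Request 1 accepted after race: " + replayRace());
  }
}
```

In this replay the recreated bucket carries a different ID than the one seen during pre-processing, so {{replayRace()}} returns false and {{Request 1}} is rejected instead of writing into the {{FILE_SYSTEM_OPTIMIZED}} bucket. A name-only lookup in the locked phase would have accepted it.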

> Validate Bucket ID of bucket associated with in-flight requests.
> ----------------------------------------------------------------
>
>                 Key: HDDS-6682
>                 URL: https://issues.apache.org/jira/browse/HDDS-6682
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Jyotinder Singh
>            Assignee: Jyotinder Singh
>            Priority: Major
>              Labels: pull-request-available


