[
https://issues.apache.org/jira/browse/HDDS-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679900#comment-17679900
]
István Fajth commented on HDDS-6682:
------------------------------------
:) for 2, I would argue that the readability there is not that good already...
I support utilising AOP and adding certain things to preExecute and
validateAndUpdateCache via annotations; however, yepp, I am not sure how hard it
is to properly define the context for the advice :D This one does not seem to be
that frequent, so we probably have some time to experiment and figure it out if
we want to. But I feel good that we are pretty much on the same side on how to
handle this [~erose].
> Validate Bucket ID of bucket associated with in-flight requests.
> ----------------------------------------------------------------
>
> Key: HDDS-6682
> URL: https://issues.apache.org/jira/browse/HDDS-6682
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Jyotinder Singh
> Assignee: Jyotinder Singh
> Priority: Major
> Labels: pull-request-available
>
> In high-concurrency scenarios (which will become more common once we
> introduce prefix-based locking), there is a possibility of the following
> race condition:
> Take for instance the following scenario and 3 concurrent write requests:
> Bucket {{vol/buck1}} exists with {{LEGACY}} layout.
> {*}Request 1{*}: {{CreateKey}} by an older client (pre-bucket-layout) on the
> bucket {{vol/buck1}}.
> {*}Request 2{*}: {{DeleteBucket}} by a new client on the bucket
> {{vol/buck1}}.
> {*}Request 3{*}: {{CreateBucket}} by a new client on the bucket {{vol/buck1}}
> with layout {{FILE_SYSTEM_OPTIMIZED}}.
> Let's say that these requests are processed in the following order:
> # {{Request 1}} is picked up by one of the threads, which proceeds to run
> the {{PRE_PROCESS}} validations on this request. The validator we are
> interested in is called {{blockCreateKeyWithBucketLayoutFromOldClient}}.
> This validator will make sure that the bucket associated with this request is
> a {{LEGACY}} bucket - which is the pre-defined behavior in the case of old
> client/new cluster interactions since we do not want an old client operating
> on buckets using a new metadata layout.
> One thing to know here is that at this stage, the OM does not hold a bucket
> lock (the lock is only acquired inside the {{validateAndUpdateCache}} method
> associated with the write request's handler class).
> # While {{Request 1}} was being processed, another thread was processing
> {{Request 2}}. Let's say {{Request 2}} managed to get hold of the bucket
> lock and successfully completed the bucket deletion.
> # Now before {{Request 1}} got a chance to acquire the bucket lock,
> {{Request 3}} manages to acquire it. It proceeds with the bucket creation and
> creates a new bucket {{vol/buck1}} with {{FILE_SYSTEM_OPTIMIZED}} bucket
> layout.
> # Finally, {{Request 1}} is able to acquire the bucket lock and proceeds to
> enter its {{validateAndUpdateCache}} method. However, even though it is able
> to find the bucket it is looking for, this is not the same bucket that was
> validated in its pre-processing hook. This new bucket has the same name, but
> a different bucket layout. The request ends up modifying a bucket that it
> should not be allowed to touch.
> This race condition can lead to undefined behavior of the Ozone cluster,
> where older clients might be modifying information they do not understand.
> This PR aims to add bucket ID validation to the request processing flow,
> which would make sure that the bucket that ends up being processed is the
> same one that was validated.
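
A minimal sketch of the idea described above, for illustration only. It is not the code from the PR and does not use Ozone's real classes: the class, the method names, the in-memory bucket table and the {{objectId}} field are all hypothetical. The assumption is that every bucket gets a fresh ID on creation, that the pre-process step records the ID of the bucket it validated, and that the locked step compares that ID against the bucket it finds.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the OM request flow; not Ozone's actual API.
class BucketIdValidationSketch {
    public enum Layout { LEGACY, FILE_SYSTEM_OPTIMIZED }

    public static final class BucketInfo {
        final long objectId;   // assumed: unique per bucket incarnation
        final Layout layout;
        BucketInfo(long objectId, Layout layout) {
            this.objectId = objectId;
            this.layout = layout;
        }
    }

    public static final class StaleBucketException extends RuntimeException {
        StaleBucketException(String msg) { super(msg); }
    }

    private final Map<String, BucketInfo> bucketTable = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);
    private final ReentrantLock bucketLock = new ReentrantLock();

    public void createBucket(String name, Layout layout) {
        bucketLock.lock();
        try {
            bucketTable.put(name, new BucketInfo(nextId.getAndIncrement(), layout));
        } finally {
            bucketLock.unlock();
        }
    }

    public void deleteBucket(String name) {
        bucketLock.lock();
        try {
            bucketTable.remove(name);
        } finally {
            bucketLock.unlock();
        }
    }

    // Pre-process validation for an old client. No bucket lock is held here,
    // so the ID of the validated bucket is returned for a later re-check.
    public long preProcessOldClientCreateKey(String name) {
        BucketInfo bucket = bucketTable.get(name);
        if (bucket == null) {
            throw new IllegalStateException("Bucket not found: " + name);
        }
        if (bucket.layout != Layout.LEGACY) {
            throw new IllegalStateException("Old client cannot use layout " + bucket.layout);
        }
        return bucket.objectId;
    }

    // Runs under the bucket lock and rejects the request if the bucket was
    // deleted or recreated since the pre-process validation.
    public void validateAndUpdateCache(String name, long validatedBucketId) {
        bucketLock.lock();
        try {
            BucketInfo bucket = bucketTable.get(name);
            if (bucket == null || bucket.objectId != validatedBucketId) {
                throw new StaleBucketException("Bucket changed after validation: " + name);
            }
            // The key creation itself would happen here.
        } finally {
            bucketLock.unlock();
        }
    }

    public static void main(String[] args) {
        BucketIdValidationSketch om = new BucketIdValidationSketch();
        om.createBucket("vol/buck1", Layout.LEGACY);
        long validatedId = om.preProcessOldClientCreateKey("vol/buck1"); // Request 1, pre-process
        om.deleteBucket("vol/buck1");                                    // Request 2
        om.createBucket("vol/buck1", Layout.FILE_SYSTEM_OPTIMIZED);      // Request 3
        try {
            om.validateAndUpdateCache("vol/buck1", validatedId);         // Request 1, locked step
        } catch (StaleBucketException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Comparing IDs rather than names is what lets the locked step tell the recreated bucket apart from the one that was validated, even though both are called {{vol/buck1}}.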