sadanand48 commented on PR #6992:
URL: https://github.com/apache/ozone/pull/6992#issuecomment-2257547330

   > in case the client initializes a FileSystem object, and issues subsequent 
calls that should fail in a particular way after a call removes the bucket, it 
might fail in a very different way, which might affect retries as well, as in 
this case it will be the server side that fails, and the failure handling in 
getBucket won't be effective as we get the bucket from the cache, and there we 
can not get the error
   
   @fapifta IMO, we should use the cache only for existing sanity checks and 
bucket type checks inside write calls such as rename/delete. For read calls 
such as getFileStatus/getBucketInfo, the cache should not be used, because its 
results can be stale and consistency is not guaranteed.
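   A minimal sketch of that split, under stated assumptions: the class and 
method names below (`BucketLookupPolicySketch`, `layoutForWriteCheck`) are 
hypothetical and are not Ozone client APIs, and a plain map stands in for the 
server. It only illustrates that the write-path check may see a stale entry 
while the read path always asks the server.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; names are illustrative, not Ozone client APIs.
class BucketLookupPolicySketch {
  // Stand-in for the server-side bucket table: bucket name -> layout.
  private final Map<String, String> server;
  // Client-side cache, consulted only on the write path.
  private final Map<String, String> cache = new ConcurrentHashMap<>();

  BucketLookupPolicySketch(Map<String, String> server) {
    this.server = server;
  }

  // Write path (rename/delete): a cached layout is enough to pick the
  // client-side code path; the server still validates the request.
  String layoutForWriteCheck(String bucket) {
    return cache.computeIfAbsent(bucket, server::get);
  }

  // Read path: always ask the server; returns null if the bucket is gone.
  String getBucketInfo(String bucket) {
    return server.get(bucket);
  }

  public static void main(String[] args) {
    Map<String, String> server = new HashMap<>();
    server.put("buck", "FILE_SYSTEM_OPTIMIZED");
    BucketLookupPolicySketch client = new BucketLookupPolicySketch(server);

    client.layoutForWriteCheck("buck");  // populates the cache
    server.remove("buck");               // bucket deleted on the server

    System.out.println(client.layoutForWriteCheck("buck")); // stale cached layout
    System.out.println(client.getBucketInfo("buck"));       // null, fresh from server
  }
}
```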
   For the case you described, if the cache gives out a bucket that has been 
deleted, here are the steps we should follow:
   
   Say the operation is a rename from `vol/buck/dir/a` to `vol/buck/dir/b`.
   
   Step 1: The rename API on the client needs to know the type of the bucket on 
which the rename is being run, so it invokes `getBucket(buck)`.
   Step 2: Say someone deletes `buck` at this point, so OM no longer maintains 
`buck` on the server.
   Step 3: The cache happens to hold the details of `buck`, so the rename logic 
moves forward after getting the bucket type of `buck`.
   Step 4: The rename request is sent to the server.
   Step 5: This is the crucial step. The server also calls `getBucketInfo(buck)` 
while handling the request and checks that the response is non-null before 
actually processing the rename. The response will be null because the bucket 
has been deleted, and OM should send a `BUCKET_NOT_FOUND` response to the rename.
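
   The five steps can be sketched as a small self-contained simulation. 
Everything here is an assumption made for illustration: `Server`, `Client`, 
`Status` and `BucketLayout` are hypothetical stand-ins, not the actual OM or 
client classes, and the real key move is omitted. The point is only that a 
stale cache hit on the client still ends in `BUCKET_NOT_FOUND`, because the 
server re-checks the bucket.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are Ozone's actual classes.
class StaleBucketCacheSketch {

  enum BucketLayout { FILE_SYSTEM_OPTIMIZED, LEGACY }

  enum Status { OK, BUCKET_NOT_FOUND }

  /** Stand-in for the OM: the authoritative bucket table. */
  static class Server {
    private final Map<String, BucketLayout> buckets = new ConcurrentHashMap<>();

    void createBucket(String name, BucketLayout layout) {
      buckets.put(name, layout);
    }

    void deleteBucket(String name) {
      buckets.remove(name);
    }

    Optional<BucketLayout> getBucketInfo(String name) {
      return Optional.ofNullable(buckets.get(name));
    }

    // Step 5: the server re-checks the bucket before processing the rename.
    Status rename(String bucket, String from, String to) {
      if (!getBucketInfo(bucket).isPresent()) {
        return Status.BUCKET_NOT_FOUND;
      }
      return Status.OK; // the actual key move is omitted in this sketch
    }
  }

  /** Stand-in for a client that caches bucket details. */
  static class Client {
    private final Server server;
    private final Map<String, BucketLayout> cache = new ConcurrentHashMap<>();

    Client(Server server) {
      this.server = server;
    }

    // Steps 1 and 3: served from the cache when present, so possibly stale.
    BucketLayout getBucket(String name) {
      return cache.computeIfAbsent(name,
          n -> server.getBucketInfo(n).orElseThrow(
              () -> new IllegalStateException("BUCKET_NOT_FOUND")));
    }

    Status rename(String bucket, String from, String to) {
      getBucket(bucket);                      // bucket type check, may hit stale cache
      return server.rename(bucket, from, to); // Step 4: request goes to the server
    }
  }

  static Status renameAfterConcurrentDelete() {
    Server om = new Server();
    om.createBucket("buck", BucketLayout.FILE_SYSTEM_OPTIMIZED);
    Client client = new Client(om);
    client.getBucket("buck");  // warms the cache
    om.deleteBucket("buck");   // Step 2: bucket deleted behind the client's back
    return client.rename("buck", "dir/a", "dir/b");
  }

  public static void main(String[] args) {
    System.out.println(renameAfterConcurrentDelete()); // prints BUCKET_NOT_FOUND
  }
}
```

   A test for this scenario could follow the same shape: warm the cache, 
delete the bucket, then assert on the status returned by the rename.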
   
   This case also applies today without the cache, since the bucket can be 
deleted before the actual rename request reaches the server: we don't hold a 
lock between these operations. We already have that handling today, so we 
should be good. I'd say we should add some tests to cover this scenario.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

