Re: [PR] HDDS-11237. Implement BucketCache with expiry for FileSystem client rename and delete [ozone]

via GitHub Mon, 29 Jul 2024 23:08:55 -0700


tanvipenumudy commented on PR #6992:
URL: https://github.com/apache/ozone/pull/6992#issuecomment-2257540538


   Thank you @fapifta for taking the time to review the patch and sharing your 
thoughts, I do agree with most of the points shared.
   
   Regarding points 2. & 3., 
   
   > in case the client initializes a FileSystem object, and issues subsequent 
calls that should succeed, but it modifies a bucket (e.g.: changing a property) 
then it goes undetected from the client's point of view unless the change 
causes a subsequent call to fail. (I am not sure if there is such a scenario at 
the moment though).
   
   Currently, the properties being cached are the bucket layout semantics and 
the bucket link information (for the FS client rename and delete flow). Both 
these properties cannot be changed through Ozone shell on an existing bucket 
unless the bucket has been deleted and recreated using the same name. In such a 
scenario, the client would be using the stale information that has been cached 
until the cache has been invalidated after the configured expiration time. 
   
   In such a case, we might also have to bring in the objectID that is unique 
(or the creation/modification time) as part of the cache key and figure out a 
way to retrieve cache entries on the client-side based on this new key 
information.
   
   Regarding point 4.,
   
   > in case the client initializes a FileSystem object, and issues subsequent 
calls that should fail in a particular way after a call removes the bucket, it 
might fail in a very different way, that might affects retries as well, as in 
this case it will be the server side that fails, and the failure handling in 
getBucket won't be effective as we get the bucket from the cache, and there we 
can not get the error.
   
   Right, this is a valid point! The current code does not handle the retry 
behaviour, which we might need to analyze and compare with the current 
behaviour if we were to move ahead with this approach.
   
   > I believe our first goal should be to change our code to use at most one 
VolumeInfo, at most one BucketInfo, and then at most the required RPC calls to 
handle the actual operation, for example 3 in createkey (openkey and commitkey, 
and renameKey (from x.COPYING to x), or 1 for RenameKey, or 1 for DeleteKey and 
so on.
   > Once that is done, we can introduce the cache, and cache VolumeInfo and 
BucketInfo objects, in order to reduce the overall volume of RPC calls that are 
caused by long running clients.
   
   This makes sense. We may revisit this as needed once we converge all RPC 
calls into at most one for every operation.
   
   As @kerneltime suggested, in the long run, we may have to allow OM to 
resolve the getBucketInfo() and getVolumeInfo() calls internally and not have 
the client resolve this information - however, this might involve some major 
API redesigning (introducing newer OM APIs and modifying a lot of the existing 
code) and with that compatibility handling.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] HDDS-11237. Implement BucketCache with expiry for FileSystem client rename and delete [ozone]

Reply via email to