tanvipenumudy commented on PR #6992: URL: https://github.com/apache/ozone/pull/6992#issuecomment-2257540538
Thank you @fapifta for taking the time to review the patch and sharing your thoughts, I do agree with most of the points shared. Regarding points 2. & 3., > in case the client initializes a FileSystem object, and issues subsequent calls that should succeed, but it modifies a bucket (e.g.: changing a property) then it goes undetected from the client's point of view unless the change causes a subsequent call to fail. (I am not sure if there is such a scenario at the moment though). Currently, the properties being cached are the bucket layout semantics and the bucket link information (for the FS client rename and delete flow). Both these properties cannot be changed through Ozone shell on an existing bucket unless the bucket has been deleted and recreated using the same name. In such a scenario, the client would be using the stale information that has been cached until the cache has been invalidated after the configured expiration time. In such a case, we might also have to bring in the objectID that is unique (or the creation/modification time) as part of the cache key and figure out a way to retrieve cache entries on the client-side based on this new key information. Regarding point 4., > in case the client initializes a FileSystem object, and issues subsequent calls that should fail in a particular way after a call removes the bucket, it might fail in a very different way, that might affects retries as well, as in this case it will be the server side that fails, and the failure handling in getBucket won't be effective as we get the bucket from the cache, and there we can not get the error. Right, this is a valid point! The current code does not handle the retry behaviour, which we might need to analyze and compare with the current behaviour if we were to move ahead with this approach. > I believe our first goal should be to change our code to use at most one VolumeInfo, at most one BucketInfo, and then at most the required RPC calls to handle the actual operation, for example 3 in createkey (openkey and commitkey, and renameKey (from x.COPYING to x), or 1 for RenameKey, or 1 for DeleteKey and so on. > Once that is done, we can introduce the cache, and cache VolumeInfo and BucketInfo objects, in order to reduce the overall volume of RPC calls that are caused by long running clients. This makes sense. We may revisit this as needed once we converge all RPC calls into at most one for every operation. As @kerneltime suggested, in the long run, we may have to allow OM to resolve the getBucketInfo() and getVolumeInfo() calls internally and not have the client resolve this information - however, this might involve some major API redesigning (introducing newer OM APIs and modifying a lot of the existing code) and with that compatibility handling. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
