cshannon commented on PR #3385: URL: https://github.com/apache/accumulo/pull/3385#issuecomment-1544820857
Thanks for the detailed review and information @keith-turner . Based on that (and after talking to Keith more about it in detail) I'm going to close this PR as it isn't going to help the way I thought as the normalization is going to break things and this extra object isn't buying much over just using the existing `StoredTabletFile` and `TabletFile` classes. Instead, I'm going to create a new PR that just goes ahead and starts using `StoredTabletFile` and `TabletFile` in the places in the code where we are only tracking the tablet file by a Path or a String. This is necessary because when we add Range information in the future we will need to also access the Range and not just the file path when reading. The most notable example of where things will need to change is the [FileOperations](https://github.com/apache/accumulo/blob/9c6db1f3b63f4acfcd85b83a7ec3c119c7f6f0c8/core/src/main/java/org/apache/accumulo/core/file/FileOperations.java) class (and subclasses) and [FileManager](https://github.com/apache/accumulo/blob/9c6db1f3b63f4acfcd85b83a7ec3c119c7f6f0c8/server/base/src/main/java/org/apache/accumulo/server/fs/FileManager.java) but there will be others as well. Those classes need to track the TabletFile so when we add the Range in a future PR it can use it to build the readers. I have actually already prototyped this a bit in my other [PR](https://github.com/apache/accumulo/pull/3332/files#diff-a98b98051dbb7d70a19025b944c64e4b74a08d7a9aaecc9f72b2b17be55a3008) but that PR also includes the Range changes which haven't been settled yet. Instead it makes sense to just do the initial refactoring first without the Range to keep the changes smaller. Also note that even with tracking a `TabletFile` we still need to get the metadata string representation for serialization when dealing with the metadata table. This is unavoidable of course and the format will change in the future (likely to Json instead of just a file path as it will have a serialized range) so we will need to handle that. A good example of this is ExternalCompactionMetadata. This class already [tracks](https://github.com/apache/accumulo/blob/9c6db1f3b63f4acfcd85b83a7ec3c119c7f6f0c8/core/src/main/java/org/apache/accumulo/core/metadata/schema/ExternalCompactionMetadata.java#L41) a `StoredTabletFile` but when data is serialized it calls to [getMetaUpdateDelete()](https://github.com/apache/accumulo/blob/9c6db1f3b63f4acfcd85b83a7ec3c119c7f6f0c8/core/src/main/java/org/apache/accumulo/core/metadata/schema/ExternalCompactionMetadata.java#L132) and that will be a new format. The format won't change in this first PR but it will later so just noting that we will need to be aware of that and make sure we can handle the change when the Range is added. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
