cshannon opened a new pull request, #3385:
URL: https://github.com/apache/accumulo/pull/3385

   The goal of this PR is to start to move away from using a String 
representation everywhere for the file path reference in the metadata table in 
order to support easier changes in the future when adding more information 
besides just a path. This is to support the future changes #1327 
   
   The current plan for supporting no chop merges is to start associating a 
range with a file and treating each file that is fenced off by a range as 
unique. This is going to lead to storing more than one metadata entry for the 
same file if there is multiple ranges so we will need to start handling the 
combination of the file reference and Range in the code.
   
   This PR is an incremental step and introduces an object to encapsulate the 
file metadata instead of using a String everywhere. There is a new object that 
is now used as part of `TabletFile` and `StoredTabletFile` called 
`TabletFileMetadataEntry` which will now represent the value that is stored in 
the file column. Currently the only thing it has is just a file path like 
before but now with the addition of this class we can add a field to it (a 
range or other information) and it will be much easier to update since we are 
not using a String everywhere.
   
   This PR doesn't go too crazy yet, it introduces the new object and updates 
the `StoredTabletFile` and `TabletFile` objects but many places in the code 
still just call the 
[getter](https://github.com/cshannon/accumulo/blob/991fe838c059e5f2675e4e89c2d8a33b80e58e9e/core/src/main/java/org/apache/accumulo/core/metadata/schema/TabletFileMetadataEntry.java#L55)
 on the new object to get the string representation. Besides a [new 
constructor](https://github.com/cshannon/accumulo/blob/991fe838c059e5f2675e4e89c2d8a33b80e58e9e/core/src/main/java/org/apache/accumulo/core/metadata/StoredTabletFile.java#L44)
 for the new object type, a string is still allowed to be passed to a 
[constructor](https://github.com/cshannon/accumulo/blob/991fe838c059e5f2675e4e89c2d8a33b80e58e9e/core/src/main/java/org/apache/accumulo/core/metadata/StoredTabletFile.java#L53)
 for StoredTabletFile and then just gets converted to the new object but we 
could stop doing that in the future to prevent errors. I wasn't sure how f
 ar to take this in this PR. We could go much further with more changes (at a 
higher risk) and push the object as far as we can and stop using Strings as 
much as possible. The main issue with doing that now is it would cause a lot of 
cascading changes to the code from everything to file handling to external 
compactions, basically anything using a file reference and I'm not quite sure 
yet how things are going to progress with the no chop merge changes.
   
   A notable example would be changes to in `FileManager`.  Instead of 
[tracking](https://github.com/apache/accumulo/blob/56d49f15a05db9a46dbceb845918497760601c11/server/base/src/main/java/org/apache/accumulo/server/fs/FileManager.java#L484)
 Strings for the files when range information is added we will need to  track 
the new object `TabletFileMetadataEntry`. However, I think that the changes for 
that may be better done in my upcoming PR for adding range information because 
I haven't decided yet if I plan to use `TabletFileMetadataEntry` or actually 
just `TabletFile` itself. It might make more sense to just track `TabletFiles` 
(which will now contain `TabletFileMetadataEntry`). 
   
   Anyways, I think we could probably merge this into 3.0 but could also wait 
to 3.1. Either way I plan to make my other no-chop merge PRs based on this and 
continue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to