cshannon commented on issue #5014: URL: https://github.com/apache/accumulo/issues/5014#issuecomment-2537332133
Several of us talked and came up with a path forward for now. The tentative plan to try is:

1. There will be a new property that is a percentage or ratio of the split threshold and is used for automatic merges. The intent is that tablets can be merged together once they are below that threshold (i.e., maybe 10% or 25% of the split threshold). This prevents automatic merging from creating a tablet so big that it just immediately splits again.
2. A new column in metadata for the tablet will store the mergeability state of the tablet (whether the tablet is eligible to be automatically merged). There will be 3 possible states, and these will be represented by a duration that is relative to manager steady time. The exact names are still to be determined, but we could represent it with something like NEVER (-1), NOW (0), FUTURE (timestamp). FUTURE would mean that the tablet becomes eligible for merging after some period of time relative to the steady time.
3. There will need to be an updated API to set the mergeability information. A new method in the API called putSplits() can be created that will either create splits or update existing splits to set the mergeability. The existing addSplits() method can be deprecated and just default to NEVER. Tables can also be created with splits through the create table API, and that path will also be able to set the mergeability.
4. There would need to be some way for a user to read the mergeability state of a tablet, so it needs to be returned as part of a new API call or by modifying an existing API.
5. By default, all user-created tablets could be marked with a state of NEVER, and system splits would be marked as NOW (or whatever we call it) so they can automatically merge. Pre-created splits for system tables like Fate, Metadata, etc. would be marked as NEVER by default.
6. On upgrade we would mark all existing tablets as NEVER.
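
To make items 1 and 2 a bit more concrete, here is a rough Java sketch of how the state encoding and the ratio check could fit together. Nothing here is existing Accumulo API: `TabletMergeability`, `MergePolicy`, the nanosecond encoding, and the choice to compare the combined size of two tablets against the merge threshold are all assumptions made for illustration.

```java
import java.time.Duration;

/** Hypothetical holder for the three states in item 2: NEVER (-1), NOW (0), FUTURE (steady time). */
final class TabletMergeability {
  private static final long NEVER = -1L;

  // Steady-time point (in nanoseconds, an assumed unit) at which the tablet
  // becomes eligible for automatic merging; -1 means it never does.
  private final long eligibleAtNanos;

  private TabletMergeability(long eligibleAtNanos) {
    this.eligibleAtNanos = eligibleAtNanos;
  }

  static TabletMergeability never() {
    return new TabletMergeability(NEVER);
  }

  static TabletMergeability now() {
    return new TabletMergeability(0L);
  }

  static TabletMergeability after(Duration steadyTime) {
    if (steadyTime.isNegative()) {
      throw new IllegalArgumentException("steady time must not be negative");
    }
    return new TabletMergeability(steadyTime.toNanos());
  }

  /** True if the tablet is not NEVER and the current steady time has reached the stored point. */
  boolean isMergeable(Duration currentSteadyTime) {
    return eligibleAtNanos != NEVER && currentSteadyTime.toNanos() >= eligibleAtNanos;
  }
}

/** Hypothetical check for item 1: the merge threshold is a ratio of the split threshold. */
final class MergePolicy {
  private final long splitThresholdBytes;
  private final double mergeRatio; // e.g. 0.10 or 0.25

  MergePolicy(long splitThresholdBytes, double mergeRatio) {
    if (splitThresholdBytes <= 0 || mergeRatio <= 0.0 || mergeRatio > 1.0) {
      throw new IllegalArgumentException("invalid threshold or ratio");
    }
    this.splitThresholdBytes = splitThresholdBytes;
    this.mergeRatio = mergeRatio;
  }

  long mergeThresholdBytes() {
    return (long) (splitThresholdBytes * mergeRatio);
  }

  /**
   * Assumed rule: both tablets must be mergeable and their combined size must stay
   * at or below the merge threshold, so the merged tablet does not immediately split.
   */
  boolean canMerge(long sizeA, TabletMergeability a, long sizeB, TabletMergeability b,
      Duration currentSteadyTime) {
    return a.isMergeable(currentSteadyTime) && b.isMergeable(currentSteadyTime)
        && sizeA + sizeB <= mergeThresholdBytes();
  }
}
```

With a split threshold of 1000 bytes and a ratio of 0.25, two NOW tablets of 100 bytes each would be eligible under this sketch, while a 200-byte and a 100-byte tablet would not, and any pair containing a NEVER tablet is skipped regardless of size.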

One thing I am still trying to figure out is how the automatic merging should work once we reach the threshold we set (say, smaller than 10% of the split threshold), and whether we should only ever merge 2 tablets together or try to merge as many as possible.

@keith-turner - Is there anything I missed or got wrong after the conversation and conclusions we came to?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
