keith-turner opened a new issue, #3397: URL: https://github.com/apache/accumulo/issues/3397
**Is your feature request related to a problem? Please describe.**

With the introduction of scan servers and eventually consistent scans, users can set the property `sserver.cache.metadata.expiration` to determine how long scan servers will cache the list of files for a tablet. This property sets a rough upper bound on how old the tablet files will be when scanning a tablet on a scan server. However, unwritten data in tablet server memory can persist for long periods of time without ever being flushed to a file (which is what makes it visible to a scan server). There is currently a property `table.compaction.minor.idle` that causes a minor compaction if a tablet has not been written to in that time period. However, if the tablet is constantly being written to at a slow rate, it will not hit the idle time and may not hit the size threshold for a long time, so data could be held in memory and not visible to the scan server for long periods of time.

**Describe the solution you'd like**

A new tablet property that forces tablets to write out their data after a specified amount of time. The implementation could track the time when the first write is made to tablet memory and then force a minor compaction when the time since that first write exceeds the configured value. A possible name for the new property is `table.compaction.minor.maxAge`. With this new property, `sserver.cache.metadata.expiration` + `table.compaction.minor.maxAge` gives an upper bound on how old the data for an eventually consistent scan would be expected to be.

I am also wondering whether `sserver.cache.metadata.expiration` should be a per-table property. Tablet metadata could then be cached for different time periods in scan servers for different tables. While it is a scan-server-wide property, it has to be set to the needs of the table with the lowest tolerance.
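The first-write tracking idea proposed above could be sketched roughly as follows. This is only an illustration and is not Accumulo code: the class `MaxAgeFlushTracker`, its methods, and the injected clock are all hypothetical names, and a real implementation would also have to handle writes that arrive while a minor compaction is in progress.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch (not an Accumulo class): remembers when the first
 * unflushed write reached a tablet's memory and reports when that data
 * is older than a configured maximum age.
 */
class MaxAgeFlushTracker {
  private final long maxAgeNanos;       // stands in for table.compaction.minor.maxAge
  private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime

  private boolean hasUnflushedData = false;
  private long firstWriteNanos;

  MaxAgeFlushTracker(Duration maxAge, LongSupplier nanoClock) {
    this.maxAgeNanos = maxAge.toNanos();
    this.nanoClock = nanoClock;
  }

  /** Call on every write; only the first write after a flush sets the timestamp. */
  synchronized void recordWrite() {
    if (!hasUnflushedData) {
      hasUnflushedData = true;
      firstWriteNanos = nanoClock.getAsLong();
    }
  }

  /** True when unflushed data exists and its oldest write has reached the max age. */
  synchronized boolean shouldFlush() {
    return hasUnflushedData && nanoClock.getAsLong() - firstWriteNanos >= maxAgeNanos;
  }

  /** Call after a minor compaction has written the in-memory data to a file. */
  synchronized void flushed() {
    hasUnflushedData = false;
  }

  /** Rough staleness bound for an eventually consistent scan: cache expiration + max age. */
  static Duration stalenessBound(Duration metadataCacheExpiration, Duration maxAge) {
    return metadataCacheExpiration.plus(maxAge);
  }
}
```

Unlike an idle timer, the timestamp here is not reset by later writes, so a tablet that receives a slow but steady trickle of writes still becomes eligible for a flush once its oldest unflushed write reaches the maximum age.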
