wchevreuil commented on pull request #3673: URL: https://github.com/apache/hbase/pull/3673#issuecomment-916988843
> +1 for the changes. Btw @wchevreuil, is there any special MapReduce application use case that requires scanning delete markers? I am just curious because, since we don't have control over major compaction, the use case might not be able to get an accurate history of delete markers.

It's an old issue involving delete markers, multiple versions, and scans when compaction has not yet run. You probably don't remember it anymore, but we discussed a candidate PR for it in HBASE-21596 a while ago. We ended up not merging that PR because it could have had performance implications, and there had already been an attempt to fix the issue in HBASE-15968 (NEW_VERSION_BEHAVIOR). The problem with that approach is that the order of operations matters, which is unfeasible when standard, non-serial replication is in place.

The main goal here is not to count and match delete markers, and we don't care about delete markers that have already been removed. The motivation is to allow the scan to find versions above the max versions limit (this is only possible with raw scans and read versions > 1). This way, synctable will produce a delete marker for every single version, which prevents old versions from reappearing (the issue reported back in HBASE-21596).
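To make the reasoning above concrete, here is a minimal toy model in plain Java. It is not HBase code: the class `VersionMaskingSketch` and its methods are hypothetical names invented for illustration, and the model assumes a single cell, version-specific delete markers only, and no compaction. It sketches why deleting only the newest version lets an older one resurface, and why deleting every version found by a raw, all-versions scan avoids that.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Toy model of one cell before any compaction has run (illustrative only, not the HBase API).
class VersionMaskingSketch {
    // Timestamps of all stored puts, newest first.
    private final TreeSet<Long> puts = new TreeSet<>(Comparator.reverseOrder());
    // Timestamps masked by version-specific delete markers.
    private final TreeSet<Long> versionDeletes = new TreeSet<>();
    private final int maxVersions;

    public VersionMaskingSketch(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    public void put(long ts) {
        puts.add(ts);
    }

    public void deleteVersion(long ts) {
        versionDeletes.add(ts);
    }

    // Normal scan: skip masked versions, then return at most maxVersions of what is left.
    public List<Long> normalScan() {
        List<Long> out = new ArrayList<>();
        for (long ts : puts) {
            if (versionDeletes.contains(ts)) {
                continue;
            }
            if (out.size() == maxVersions) {
                break;
            }
            out.add(ts);
        }
        return out;
    }

    // Raw scan reading all versions: every stored put, regardless of the
    // max versions limit or delete markers (markers themselves are omitted in this model).
    public List<Long> rawScanAllVersions() {
        return new ArrayList<>(puts);
    }

    public static void main(String[] args) {
        // Deleting only the newest version lets an older one resurface.
        VersionMaskingSketch partial = new VersionMaskingSketch(1);
        partial.put(1L);
        partial.put(2L);
        partial.put(3L);
        System.out.println(partial.normalScan()); // [3]
        partial.deleteVersion(3L);
        System.out.println(partial.normalScan()); // [2]

        // Deleting every version seen by the raw, all-versions scan leaves nothing visible.
        VersionMaskingSketch full = new VersionMaskingSketch(1);
        full.put(1L);
        full.put(2L);
        full.put(3L);
        for (long ts : full.rawScanAllVersions()) {
            full.deleteVersion(ts);
        }
        System.out.println(full.normalScan()); // []
    }
}
```

In the real client, the equivalent scan would presumably be configured with `Scan.setRaw(true)` and `Scan.readAllVersions()`; how synctable wires that up is in the PR itself, not in this sketch.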
