virajjasani commented on PR #5545: URL: https://github.com/apache/hbase/pull/5545#issuecomment-2000634670
> @Apache9, @virajjasani, @bbeaudreault, this PR is not done yet. I just realized that I need to add testing for newVersionBehavior.
>
> Also, I need to discuss the following case:
>
> Assume that for a given cell, two versions are inserted and max versions is set to 1. If memory compaction is not enabled, then I expect that both versions will be written to a new hfile (hf1) during flush even though the second version is redundant (is that true? I need to verify this). Now during minor compaction, the latest version will be written to a new live file (hf2) and the redundant version to a new historical file (hf3). Assume that a delete version marker is inserted for the latest version. This delete marker will be written to a new hfile (hf4). This delete marker will mask the latest version, and regular scans for the latest versions will not return any version of this cell, because the latest version is masked by the delete marker and the redundant version, which is in the historical file (hf3), will be omitted by these scans. When the major compaction happens, I expect that the redundant version will be revived and written to a new live file (hf5). Now the redundant version would be visible to regular scans. Please let me know if any of this is incorrect.
>
> Please note this should not happen with newVersionBehavior, as the deleted versions are counted toward the total version count. Should we enable this feature only when newVersionBehavior is enabled?

@kadirozde I just verified with HBase 2.6 (branch-2) that the flush writes only maxVersions versions to the new HFile. Hence, if max versions is 1 and we write 2 versions of the cell, only the latest version is written to the HFile. So we should be good for the first case.

At this point, we can add some tests for newVersionBehavior, and we should be good for the above concern.
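To make the scenario easier to follow, here is a minimal toy model in Python. It is not HBase code: `flush` and `visible_versions` are hypothetical helpers that only mimic the behaviour described in this thread (flush keeping at most maxVersions versions, and deleted versions counting toward the version limit under newVersionBehavior). Timestamps stand in for cell versions.

```python
def flush(memstore_timestamps, max_versions):
    """Toy flush: keep only the newest `max_versions` versions of one cell."""
    return sorted(memstore_timestamps, reverse=True)[:max_versions]


def visible_versions(timestamps, deleted, max_versions, new_version_behavior):
    """Toy read path for a single cell.

    timestamps: put timestamps present in the files being read.
    deleted: timestamps masked by delete-version markers.
    """
    newest_first = sorted(timestamps, reverse=True)
    if new_version_behavior:
        # Deleted versions still count toward max_versions, so an older
        # version cannot slide into the window.
        window = newest_first[:max_versions]
        return [ts for ts in window if ts not in deleted]
    # Legacy behaviour: masked versions are dropped first, so an older
    # version can surface if it is still present.
    survivors = [ts for ts in newest_first if ts not in deleted]
    return survivors[:max_versions]


# Two versions (t=1 and t=2) of one cell, max versions = 1.
flushed = flush([1, 2], max_versions=1)

# Hypothetical case from the quoted concern: both versions reached HFiles,
# then a delete version marker is written for t=2.
legacy = visible_versions([1, 2], {2}, 1, new_version_behavior=False)
new = visible_versions([1, 2], {2}, 1, new_version_behavior=True)

# With flush keeping only maxVersions versions, t=1 never reaches an HFile,
# so nothing is left to revive once t=2 is deleted.
after_flush = visible_versions(flushed, {2}, 1, new_version_behavior=False)
```

In this sketch the older version only resurfaces in the legacy model when it was actually persisted. Once the flush drops it, both models agree.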
