[
https://issues.apache.org/jira/browse/HBASE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Max Lapan updated HBASE-10118:
------------------------------
Description:
Hello!
During migration from HBase 0.90.6 to 0.94.6 we found changed behaviour in how
major compaction handles delete markers with timestamps in the future. Before
HBASE-4721, major compaction purged deletes regardless of their timestamp. Newer
versions keep them in the HFile until their timestamp is reached.
I guess this is caused by the new check in ScanQueryMatcher:
{{(EnvironmentEdgeManager.currentTimeMillis() - timestamp) <=
timeToPurgeDeletes}}.
This could be worked around by specifying a large negative value for the
{{hbase.hstore.time.to.purge.deletes}} option, but, unfortunately, negative
values are pulled up to zero by Math.max in HStore.java.
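To illustrate the arithmetic, here is a minimal stand-alone sketch of the check
and the clamp described above. It is not the actual HBase code: the class and
method names are hypothetical, it assumes a marker is kept whenever the quoted
expression is true, and it uses the Put timestamp from the reproduction below as
the current time.

```java
// Hypothetical stand-alone sketch; it does not use any HBase classes.
class PurgeCheckSketch {
    // Assumed clamp, as described for HStore.java: negative settings become zero.
    static long effectiveTimeToPurge(long configured) {
        return Math.max(configured, 0L);
    }

    // Assumption: a delete marker is kept whenever the quoted expression is true.
    static boolean keepsDeleteMarker(long nowMillis, long markerTimestamp, long timeToPurgeDeletes) {
        return (nowMillis - markerTimestamp) <= timeToPurgeDeletes;
    }

    public static void main(String[] args) {
        long now = 1384161431061L;             // Put timestamp from the reproduction, used as "now"
        long futureDelete = 1394161431061L;    // delete timestamp from the reproduction
        long configured = -1_000_000_000_000L; // illustrative large negative setting

        long clamped = effectiveTimeToPurge(configured); // 0 after the clamp

        // Future marker: now - futureDelete is negative, so it is <= 0 and the marker is kept.
        System.out.println(keepsDeleteMarker(now, futureDelete, clamped)); // true

        // Marker one second in the past: 1000 <= 0 is false, so it can be purged.
        System.out.println(keepsDeleteMarker(now, now - 1000L, clamped)); // false
    }
}
```

With the clamp in place, no negative setting can make the expression false for
a marker whose timestamp is still in the future.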
Maybe we are doing something weird by specifying a delete timestamp in the
future, but HBASE-4721 definitely breaks the old behaviour we rely on.
Steps to reproduce this:
{code}
put 'test', 'delmeRow', 'delme:something', 'hello'
flush 'test'
delete 'test', 'delmeRow', 'delme:something', 1394161431061
flush 'test'
major_compact 'test'
{code}
Before major_compact we have two hfiles with the following:
{code}
first:
K: delmeRow/delme:something/1384161431061/Put/vlen=5/ts=0
second:
K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
{code}
After major compact we get the following:
{code}
K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
{code}
In our installation, we resolved this by removing Math.max and setting
hbase.hstore.time.to.purge.deletes to Integer.MIN_VALUE, which purges delete
markers.
was:
Hello!
During migration from HBase 0.90.6 to 0.94.6 we found changed behaviour in how
major compaction handles delete markers with timestamps in the future. Before
HBASE-4721, major compaction purged deletes regardless of their timestamp. Newer
versions keep them in the HFile until their timestamp is reached.
I guess this is caused by the new check in ScanQueryMatcher:
{{(EnvironmentEdgeManager.currentTimeMillis() - timestamp) <=
timeToPurgeDeletes}}.
This could be worked around by specifying a large negative value for the
{{hbase.hstore.time.to.purge.deletes}} option, but, unfortunately, negative
values are pulled up to zero by Math.max in HStore.java.
It is very possible that we are doing something weird by specifying a
delete timestamp in the future, but HBASE-4721 definitely breaks the old
behaviour we rely on.
Steps to reproduce this:
{code}
put 'test', 'delmeRow', 'delme:something', 'hello'
flush 'test'
delete 'test', 'delmeRow', 'delme:something', 1394161431061
flush 'test'
major_compact 'test'
{code}
Before major_compact we have two hfiles with the following:
{code}
first:
K: delmeRow/delme:something/1384161431061/Put/vlen=5/ts=0
second:
K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
{code}
After major compact we get the following:
{code}
K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
{code}
In our installation, we resolved this by removing Math.max and setting
hbase.hstore.time.to.purge.deletes to Integer.MIN_VALUE, which purges delete
markers.
> Major compact keeps deletes with future timestamps
> --------------------------------------------------
>
> Key: HBASE-10118
> URL: https://issues.apache.org/jira/browse/HBASE-10118
> Project: HBase
> Issue Type: Bug
> Components: Compaction, Deletes, regionserver
> Reporter: Max Lapan
> Priority: Minor
>