[ 
https://issues.apache.org/jira/browse/HBASE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Lapan updated HBASE-10118:
------------------------------

    Description: 
Hello!

During migration from HBase 0.90.6 to 0.94.6 we found that major compaction 
now handles delete markers with timestamps in the future differently. Before 
HBASE-4721, major compaction purged delete markers regardless of their 
timestamp. Newer versions keep them in the HFile until that timestamp has passed.

I guess this is caused by the new check in ScanQueryMatcher: 
{{(EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= 
timeToPurgeDeletes}}.

This could be worked around by setting the 
{{hbase.hstore.time.to.purge.deletes}} option to a large negative value but, 
unfortunately, negative values are clamped to zero by Math.max in HStore.java.
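
To make the interaction between the two pieces concrete, here is a minimal standalone sketch of the arithmetic. It is not the HBase source: the class {{PurgeCheckSketch}} and both method names are hypothetical, and only the comparison and the Math.max clamp are taken from the description above.

```java
// Hypothetical illustration only; class and method names are assumptions,
// not HBase APIs. It mirrors the comparison and the clamp described above.
public class PurgeCheckSketch {

    // Models the clamp attributed to HStore.java: negative settings become 0.
    static long effectiveTimeToPurge(long configuredMillis) {
        return Math.max(configuredMillis, 0L);
    }

    // Models the check attributed to ScanQueryMatcher: the delete marker is
    // kept while (now - timestamp) <= timeToPurgeDeletes.
    static boolean keepsDeleteMarker(long nowMillis, long deleteTimestamp,
                                     long timeToPurgeDeletes) {
        return (nowMillis - deleteTimestamp) <= timeToPurgeDeletes;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneHour = 60L * 60L * 1000L;
        long futureDelete = now + oneHour;   // marker one hour in the future

        // A negative setting is clamped to 0, so the future marker is kept.
        long clamped = effectiveTimeToPurge(-2 * oneHour);
        System.out.println("clamped threshold: " + clamped);
        System.out.println("kept with clamp:    "
                + keepsDeleteMarker(now, futureDelete, clamped));

        // Without the clamp, a sufficiently negative threshold purges it.
        System.out.println("kept without clamp: "
                + keepsDeleteMarker(now, futureDelete, -2 * oneHour));
    }
}
```

For a marker in the future the difference {{now - timestamp}} is negative, so it is always less than or equal to a threshold of zero. In this sketch, an unclamped negative threshold purges the marker only if the threshold is more negative than that difference.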

Maybe we are doing something weird by specifying a delete timestamp in the 
future, but HBASE-4721 definitely breaks the old behaviour we rely on.

Steps to reproduce this:
{code}
put 'test', 'delmeRow', 'delme:something', 'hello'
flush 'test'
delete 'test', 'delmeRow', 'delme:something', 1394161431061
flush 'test'
major_compact 'test'
{code}

Before major_compact we have two hfiles with the following:
{code}
first:
K: delmeRow/delme:something/1384161431061/Put/vlen=5/ts=0

second:
K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
{code}

After major compact we get the following:
{code}
K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
{code}

In our installation, we resolved this by removing Math.max and setting 
hbase.hstore.time.to.purge.deletes to Integer.MIN_VALUE, which purges delete 
markers, and it looks like a solution. But maybe there is a better approach.


> Major compact keeps deletes with future timestamps
> --------------------------------------------------
>
>                 Key: HBASE-10118
>                 URL: https://issues.apache.org/jira/browse/HBASE-10118
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction, Deletes, regionserver
>            Reporter: Max Lapan
>            Priority: Minor
>


