[ https://issues.apache.org/jira/browse/HBASE-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HBASE-1880:
--------------------------------

    Attachment: 1880-v3.patch

The last patch broke the case where we have two deletes with LATEST timestamp. In 
that case we need the first delete to get into the store, so that it removes the 
first (newest) version; the second delete then picks up the timestamp of the 
second version to remove.
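
For reference, the client-side shape of that case looks roughly like the sketch 
below (a minimal sketch only: the table, family, and column names are made up, 
and the table is assumed to already exist with that family).

{code}
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class TwoDeletesWithLatestTs {
  public static void main(String[] args) throws IOException {
    // Assumes a table "testtable" with family "fam" already exists.
    HTable table = new HTable(new HBaseConfiguration(), "testtable");
    byte[] row = Bytes.toBytes("row1");
    byte[] fam = Bytes.toBytes("fam");
    byte[] qual = Bytes.toBytes("qual");

    // Write two versions of the same column.
    Put p1 = new Put(row);
    p1.add(fam, qual, Bytes.toBytes("v1"));
    table.put(p1);
    Put p2 = new Put(row);
    p2.add(fam, qual, Bytes.toBytes("v2"));
    table.put(p2);

    // Neither delete specifies a timestamp, so both carry LATEST_TIMESTAMP.
    // The first must reach the store so its ts resolves to the newest
    // version's timestamp and removes it; the second then picks up the
    // timestamp of the remaining version.
    Delete d1 = new Delete(row);
    d1.deleteColumn(fam, qual);
    table.delete(d1);

    Delete d2 = new Delete(row);
    d2.deleteColumn(fam, qual);
    table.delete(d2);
  }
}
{code}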

Good thing we have unit tests, as the change seemed harmless. TestClient now 
passes. I'll run the full suite again...

This patch also drops the delete KVs that get discarded because there is nothing 
for them to delete, so they don't end up in the WAL.
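
Just to illustrate the intent (a hypothetical sketch, not the actual patch; 
hadSomethingToDelete() is a made-up stand-in for whatever check the memstore 
performs):

{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;

public class NoOpDeleteFilter {
  /** Hypothetical stand-in for whatever check the memstore performs. */
  static boolean hadSomethingToDelete(KeyValue kv) {
    return false; // placeholder
  }

  /** Keep only edits worth logging: no-op delete KVs never reach the WAL. */
  static List<KeyValue> filterNoOpDeletes(List<KeyValue> edits) {
    List<KeyValue> toLog = new ArrayList<KeyValue>();
    for (KeyValue kv : edits) {
      if (kv.isDeleteType() && !hadSomethingToDelete(kv)) {
        continue; // nothing to delete, so don't log it
      }
      toLog.add(kv);
    }
    return toLog;
  }
}
{code}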

@Stack
Yeah, writing to memstore directly. Currently it's wired up to flush the memstore 
during log recovery, but this would be a good idea.

"these new edits are not going into a WAL at all?" Yeah, that's right: we talk to 
the memstore directly rather than go through HRegion, which does the WALing. This 
is OK because those edits were coming from a WAL anyway. The key is that we need 
to flush after the recovery, so that we no longer need the WAL we just read from 
in case of a subsequent crash.
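
Roughly, that recovery path looks like this (a sketch only; the Store/HRegion 
calls approximate the 0.20 internals rather than quote the patch, and the code 
assumes it lives somewhere Store.add() is visible):

{code}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.regionserver.Store;

public class RecoveryReplaySketch {
  static void replayRecoveredEdits(HRegion region, List<KeyValue> recoveredEdits)
      throws IOException {
    for (KeyValue kv : recoveredEdits) {
      // Straight into the memstore: no new WAL entry is written, since these
      // edits came out of a WAL in the first place.
      Store store = region.getStore(kv.getFamily());
      store.add(kv);
    }
    // Flush so the recovered edits reach HFiles; after this the old log is no
    // longer needed to survive a subsequent crash.
    region.flushcache();
  }
}
{code}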



> DeleteColumns are not recovered properly from the write-ahead-log
> -----------------------------------------------------------------
>
>                 Key: HBASE-1880
>                 URL: https://issues.apache.org/jira/browse/HBASE-1880
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.20.0, 0.20.1, 0.21.0
>            Reporter: Clint Morgan
>            Priority: Critical
>         Attachments: 1880-v2.patch, 1880-v3.patch, 1880.patch
>
>
> I found a couple of issues:
>  - The timestamp is being set to now after it has been written to the WAL. So 
> if the WAL was flushed on that write, it gets in with a ts of LATEST_TIMESTAMP 
> (Long.MAX_VALUE) and is effectively lost.
>  - Even after that fix, I had issues getting the delete to apply properly. In 
> my case, the WAL had a put to a column, then a DeleteColumn for the same 
> column. The DeleteColumn KV had a later timestamp, but it was still lost on 
> recovery. I traced around a bit, and it looks like the current approach of 
> just using an HFile.writer to write the set of KVs read from the log will not 
> work. There is special logic in MemStore for deletes that needs to happen 
> before writing. I got around this by just adding to memstore in the log 
> recovery process. Not sure if there are other implications of this.
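
On the first issue in the description above, the obvious direction is to resolve 
LATEST_TIMESTAMP to "now" before the edits are handed to the WAL rather than 
after. A rough sketch of that direction (not the patch itself), assuming 
KeyValue's isLatestTimestamp()/updateLatestStamp() helpers:

{code}
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Bytes;

public class StampBeforeWal {
  /** Resolve LATEST_TIMESTAMP to "now" before the edits go to the WAL. */
  static void stampEdits(List<KeyValue> edits, long now) {
    byte[] nowBytes = Bytes.toBytes(now);
    for (KeyValue kv : edits) {
      if (kv.isLatestTimestamp()) {
        kv.updateLatestStamp(nowBytes);
      }
    }
    // ...only then hand the edits to the HLog, so a WAL flush on this write
    // carries the real timestamp instead of LATEST_TIMESTAMP.
  }
}
{code}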

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
