[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070548#comment-13070548
 ] 

ramkrishna.s.vasudevan commented on HBASE-3845:
-----------------------------------------------

Thank you very much for the comments, Prakash.
One query:
{noformat}
The bigger problem here is that completeCacheFlush() is not called with 
updatedLock acquired. Therefore there might still be correctness issues with 
the latest patch.
{noformat}

As per the current code, completeCacheFlush() is called with updatesLock 
held. Only the sync() call and the finally block are outside the lock. So 
could you please elaborate on the correctness issue?
I can implement the other two comments.





> data loss because lastSeqWritten can miss memstore edits
> --------------------------------------------------------
>
>                 Key: HBASE-3845
>                 URL: https://issues.apache.org/jira/browse/HBASE-3845
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.3
>            Reporter: Prakash Khemani
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.90.5
>
>         Attachments: HBASE-3845_1.patch, HBASE-3845_2.patch, 
> HBASE-3845_4.patch, HBASE-3845_5.patch, HBASE-3845__trunk.patch, 
> HBASE-3845_trunk_2.patch
>
>
> (I don't have a test case to prove this yet but I have run it by Dhruba and 
> Kannan internally and wanted to put this up for some feedback.)
> In this discussion let us assume that the region has only one column family. 
> That way I can use region/memstore interchangeably.
> After a memstore flush it is possible for lastSeqWritten to have a 
> log-sequence-id for a region that is not the earliest log-sequence-id for 
> that region's memstore.
> HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure 
> that we only keep track of the earliest log-sequence-number that is present 
> in the memstore.
> Every time the memstore is flushed we remove the region's entry in 
> lastSeqWritten and wait for the next append to populate this entry 
> again. This is where the problem happens.
> Step 1:
> flusher.prepare() snapshots the memstore under 
> HRegion.updatesLock.writeLock().
> Step 2:
> As soon as updatesLock.writeLock() is released, new entries are added 
> to the memstore.
> Step 3:
> wal.completeCacheFlush() is called. This method removes the region's entry 
> from lastSeqWritten.
> Step 4:
> The next append creates a new entry for the region in lastSeqWritten, 
> but with the log-sequence-id of that append. All the edits that 
> were added in step 2 are not covered by this entry.
> ==
> As a temporary measure, instead of removing the region's entry in step 3, I 
> will replace it with the log-seq-id of the region-flush-event.
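
The four steps in the description can be replayed deterministically in a small, self-contained sketch. This is not HBase code: the class LastSeqWrittenSketch, the helpers append(), completeFlushByRemoving(), completeFlushByReplacing() and replay(), and the region name "r1" are all hypothetical stand-ins, and a plain ConcurrentHashMap plays the role of lastSeqWritten. It also assumes the flush sequence id is obtained while the snapshot is taken in step 1.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the bookkeeping described in the issue, not HBase's HLog.
class LastSeqWrittenSketch {
    // region name -> earliest log sequence id believed to be in the memstore
    static final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
    static final AtomicLong logSeqNum = new AtomicLong(0);

    // Append path: putIfAbsent records only the first edit seen since the entry was cleared.
    static long append(String region) {
        long seq = logSeqNum.incrementAndGet();
        lastSeqWritten.putIfAbsent(region, seq);
        return seq;
    }

    // Step 3 as described: the region's entry is removed on flush completion.
    static void completeFlushByRemoving(String region) {
        lastSeqWritten.remove(region);
    }

    // Temporary measure from the description: keep an entry, set to the flush event's id.
    static void completeFlushByReplacing(String region, long flushSeqId) {
        lastSeqWritten.put(region, flushSeqId);
    }

    // Replays steps 1-4 in order; returns {seq id of the step-2 edit, tracked seq id}.
    static long[] replay(boolean replaceOnFlush) {
        lastSeqWritten.clear();
        logSeqNum.set(0);
        String region = "r1";
        append(region);                                 // seq 1, will be flushed
        append(region);                                 // seq 2, will be flushed
        long flushSeqId = logSeqNum.incrementAndGet();  // seq 3: step 1, snapshot taken
        long unflushed = append(region);                // seq 4: step 2, lands in the new memstore
        if (replaceOnFlush) {
            completeFlushByReplacing(region, flushSeqId);
        } else {
            completeFlushByRemoving(region);            // step 3
        }
        append(region);                                 // seq 5: step 4
        return new long[] { unflushed, lastSeqWritten.get(region) };
    }

    public static void main(String[] args) {
        long[] removed = replay(false);
        System.out.println("remove:  unflushed=" + removed[0] + " tracked=" + removed[1]);
        long[] replaced = replay(true);
        System.out.println("replace: unflushed=" + replaced[0] + " tracked=" + replaced[1]);
    }
}
```

With removal, the tracked id (5) ends up greater than the id of the unflushed step-2 edit (4), so anything that relies on lastSeqWritten to decide which log entries are still needed would overlook that edit. With replacement, the tracked id (3) stays at or below every unflushed edit in this replay.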

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
