[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-13811:
--------------------------

    Attachment: 13811.v3.branch-1.txt

Thanks [~Apache9] That helped.

Thinking on it, I was a little confused about what is needed here. Rather than add a new method that does what the old getEarliestMemstoreSeqNum did, I changed getEarliestMemstoreSeqNum back to how the old version worked. My new version was incorrect in taking into consideration the sequenceids of ongoing flushes, now that we are doing per-column-family flushes.

getEarliestMemstoreSeqNum(regionname) asks for the earliest 'region' sequenceid. It is called from two places: at flush time and at close. At flush time, no sequenceid is returned UNLESS we are flushing a subset of column families. In that case, we do not want to use the region flush sequence id but what comes out of getEarliestMemstoreSeqNum for the region (minus one); the region may have edits older than those being flushed in the current family.

getEarliestMemstoreSeqNum(regionname, familyName), on the other hand, is scoped to the column family, so it needs to work at the column family scale, without regard for the oldest edit in the region.

I also did some trivial fixup to fix the checkstyle warning.

> Splitting WALs, we are filtering out too many edits -> DATALOSS
> ---------------------------------------------------------------
>
>                 Key: HBASE-13811
>                 URL: https://issues.apache.org/jira/browse/HBASE-13811
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 2.0.0, 1.2.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 2.0.0, 1.2.0
>
>         Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt,
> 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, HBASE-13811-v1.testcase.patch,
> HBASE-13811.testcase.patch
>
>
> I've been running ITBLLs against branch-1 around HBASE-13616 (move of
> ServerShutdownHandler to pv2). I have come across an instance of dataloss. My
> patch for HBASE-13616 was in place so I can only think it is the cause (but
> cannot see how). When we split the logs, we are skipping legit edits. Digging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
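
To make the region-scoped versus family-scoped lookups in the comment concrete, here is a minimal Java sketch. It is not the code in the attached patch. The class SeqIdAccounting and the methods append, startFlush and flushedSeqId are hypothetical names invented for illustration; only the two getEarliestMemstoreSeqNum scopes come from the comment. The sketch assumes that sequence ids are appended in increasing order and that families being flushed are dropped from the unflushed bookkeeping when the flush starts.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the WAL's sequence id bookkeeping; not HBase code. */
class SeqIdAccounting {
  static final long NO_SEQNUM = -1L;

  // region -> (family -> lowest sequence id still unflushed in the memstore)
  private final Map<String, Map<String, Long>> lowestUnflushed = new HashMap<>();

  /** Record an edit. Assumes ids arrive in increasing order, so the first one per family is the lowest. */
  void append(String region, String family, long seqId) {
    lowestUnflushed.computeIfAbsent(region, r -> new HashMap<>()).putIfAbsent(family, seqId);
  }

  /** At flush start, the families being flushed stop counting as unflushed. */
  void startFlush(String region, Collection<String> families) {
    Map<String, Long> byFamily = lowestUnflushed.get(region);
    if (byFamily != null) {
      byFamily.keySet().removeAll(families);
    }
  }

  /** Region scope: earliest unflushed id across all families, or NO_SEQNUM if none. */
  long getEarliestMemstoreSeqNum(String region) {
    Map<String, Long> byFamily = lowestUnflushed.get(region);
    if (byFamily == null || byFamily.isEmpty()) {
      return NO_SEQNUM;
    }
    long min = Long.MAX_VALUE;
    for (long seqId : byFamily.values()) {
      min = Math.min(min, seqId);
    }
    return min;
  }

  /** Family scope: earliest unflushed id in this family only, ignoring the rest of the region. */
  long getEarliestMemstoreSeqNum(String region, String family) {
    Map<String, Long> byFamily = lowestUnflushed.get(region);
    Long seqId = byFamily == null ? null : byFamily.get(family);
    return seqId == null ? NO_SEQNUM : seqId;
  }

  /**
   * Sequence id to report for a flush. On a full flush nothing is left unflushed,
   * so the region flush id is used; on a partial flush it is the earliest
   * remaining unflushed id minus one.
   */
  long flushedSeqId(String region, long regionFlushSeqId) {
    long earliest = getEarliestMemstoreSeqNum(region);
    return earliest == NO_SEQNUM ? regionFlushSeqId : earliest - 1;
  }
}
```

With edits at sequence ids 5 and 9 in cf1 and 8 in cf2, a flush of cf2 alone reports 4 rather than the region flush id. If the larger id were reported instead, a WAL split that skips everything at or below the reported id would presumably drop the still-unflushed cf1 edits, which is the kind of loss the issue title describes.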