[
https://issues.apache.org/jira/browse/HBASE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149982#comment-15149982
]
Duo Zhang commented on HBASE-14949:
-----------------------------------
{quote}
We used to have an isCreate flag. We don't have it anymore. Was it always true?
(It looks like it was, going by your patch.)
{quote}
Yes, it is always true and only called from WALSplitter. I think we could
change it to private.
{quote}
Should you change formatRecoveredEditsFileName to take the original file name?
It looks like it is called from one other place at least.
{quote}
No, I just append the file name after the result of
formatRecoveredEditsFileName...
{code}
String fileName = formatRecoveredEditsFileName(logEntry.getKey().getSequenceId());
fileName = getTmpRecoveredEditsFileName(fileName + "-" + fileBeingSplit.getPath().getName());
return new Path(dir, fileName);
{code}
{quote}
So, we write the name of the WAL into the split file name. Where do we read
it back? (I'm asking you because you probably have your finger on it.) I want
to see if we handle the case of a bare sequence id as well as this new format.
In fact, should we have a test that demonstrates this?
{quote}
Sorry, I do not get the point... We only change the intermediate tmp file name,
and that file is renamed when the split ends. As for a conflict on the final
recovered edits file name, the old logic just deletes the old file, whereas
with the new logic we need to delete the one with fewer entries...
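A minimal sketch of the conflict handling described above, not the code in the
patch. The class, tmpName and finish are hypothetical names, the zero-padded
sequence id and the ".temp" suffix are assumptions for illustration, and the
entry counts are assumed to be known to the caller:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; none of these names come from the patch.
final class RecoveredEditsConflictSketch {

  // Build a tmp name from the sequence id plus the name of the WAL being
  // split, so two WAL files with the same sequence id do not collide.
  // The padding width and the ".temp" suffix are assumed here.
  static String tmpName(long seqId, String walName) {
    return String.format("%019d", seqId) + "-" + walName + ".temp";
  }

  // Rename tmp to dst when the split ends. If dst already exists, keep
  // whichever file holds more entries and delete the other one.
  // Returns true if tmp became the final file.
  static boolean finish(Path tmp, long tmpEntries, Path dst, long dstEntries)
      throws IOException {
    if (Files.exists(dst)) {
      if (dstEntries >= tmpEntries) {
        Files.delete(tmp); // the existing file covers at least as many entries
        return false;
      }
      Files.delete(dst); // the new file has more entries, drop the old one
    }
    Files.move(tmp, dst);
    return true;
  }
}
```

Since WAL entries are idempotent, keeping the file with more entries should not
lose edits as long as the smaller file is a subset of the larger one; that
subset relation is assumed in this sketch and not checked.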
Thanks.
> Resolve name conflict when splitting if there are duplicated WAL entries
> ------------------------------------------------------------------------
>
> Key: HBASE-14949
> URL: https://issues.apache.org/jira/browse/HBASE-14949
> Project: HBase
> Issue Type: Sub-task
> Reporter: Heng Chen
> Assignee: Duo Zhang
> Attachments: HBASE-14949-v3.patch, HBASE-14949-v4.patch,
> HBASE-14949.patch, HBASE-14949_v1.patch, HBASE-14949_v2.patch
>
>
> The AsyncFSHLog introduced in HBASE-14790 may write the same WAL entries to
> different WAL files. A WAL entry itself is idempotent, so replay is not a
> problem, but the intermediate file name and the final name when splitting are
> constructed using the lowest or highest sequence id of the WAL entries
> written, so it is possible that different WAL files will have the same
> intermediate or final file name when splitting. In the current
> implementation, this will cause the split to fail or cause data loss. We need
> to solve this.