[ 
https://issues.apache.org/jira/browse/HBASE-26658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474223#comment-17474223
 ] 

Duo Zhang commented on HBASE-26658:
-----------------------------------

{quote}
I do not understand why there would be data loss. Would you mind explaining 
more? In appendAndAsync, we do clear toWriteAppends after we successfully send 
the entries, but we also put them in unackedAppends at the same time, so if we 
hit an HDFS failure again, unackedAppends is transferred back to toWriteAppends 
and the entries are sent again. There seems to be no data loss.
{quote}

OK, I missed that part too. So no data loss, but there could be other problems.

We need to track all the entries which have been sent out but not yet acked, 
as we do not know the state of these entries. If we clear them from 
unackedAppends right after transferring them back to toWriteAppends, and then 
there is a shutdown, we may report to the upper layer that these entries have 
not been written out. That may not be true: they may already have been 
successfully persisted to HDFS, with only the ack failing to reach us due to 
a network issue between the region server and the data node.

> AsyncFSWAL.unackedAppends should clear after  transfered to  
> AsyncFSWAL.toWriteAppends
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-26658
>                 URL: https://issues.apache.org/jira/browse/HBASE-26658
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 3.0.0-alpha-2, 2.4.9
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Major
>
> When {{AsyncFSWAL}} fails to sync to HDFS, the entries in 
> {{AsyncFSWAL.unackedAppends}} are transferred to 
> {{AsyncFSWAL.toWriteAppends}} to avoid data loss, but 
> {{AsyncFSWAL.unackedAppends}} itself is not cleared. I think there is no need 
> to keep retaining them in {{AsyncFSWAL.unackedAppends}}, because we would 
> open a new HDFS pipeline to resend them. 
> BTW: it would also simplify the logic for fixing HBASE-25905; the current fix 
> for HBASE-25905 is somewhat hard to understand. I think the root cause of 
> HBASE-25905 is that {{AsyncFSWAL.unackedAppends}} does not exactly reflect 
> the *unacked* entries for the current HDFS pipeline. If we clear 
> {{AsyncFSWAL.unackedAppends}} after transferring them to 
> {{AsyncFSWAL.toWriteAppends}}, HBASE-25905 could also be avoided.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)