[ 
https://issues.apache.org/jira/browse/HBASE-26658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474286#comment-17474286
 ] 

chenglei edited comment on HBASE-26658 at 1/12/22, 6:55 AM:
------------------------------------------------------------

[~zhangduo], thank you for the feedback.
bq. We need to track all the entries which have already been sent out without 
ack, as we do not know the state of these entries. If we just clear them after 
transferring them back to toWriteAppends, and then there is a shutdown, we may 
report to the upper layer that these entries have not been written out
     I do not quite understand how unackedAppends is used to report to the 
upper layer when there is a shutdown. I cannot find existing code in master 
that uses unackedAppends to report "not written out" or "no response 
received" to the upper layer. I think that for a distributed system it is not 
very meaningful to report this unacked intermediate state to the upper layer 
because, as you said, we cannot be sure whether the write actually succeeded 
or failed. Whether we report "not written out" or "no response received", the 
upper layer should resend. So it seems that reporting only success or failure 
to the upper layer is suitable, just as AsyncFSWAL does now.
     Well, it is not a bug now. I just find that clearing unackedAppends 
after transferring its entries to toWriteAppends is, from my view, a simpler 
and easier to understand way to fix HBASE-25905. Because the data we want to 
resend is already in toWriteAppends, clearing seems natural and harmless (it 
causes no data loss or logic error).
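
To make the proposal concrete, here is a minimal sketch of the idea, under loud assumptions: it is not the real AsyncFSWAL code, the class {{WalQueuesSketch}} and its methods are hypothetical, and WAL entries are reduced to plain transaction ids. It only shows that once the unacked entries are moved back to the front of toWriteAppends in their original order, clearing unackedAppends loses nothing.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Hypothetical, simplified model of the two queues discussed above.
// This is NOT the real AsyncFSWAL; entries are just transaction ids.
class WalQueuesSketch {
  // Entries waiting to be written to the current pipeline, oldest first.
  final Deque<Long> toWriteAppends = new ArrayDeque<>();
  // Entries sent to the current pipeline but not yet acknowledged.
  final Deque<Long> unackedAppends = new ArrayDeque<>();

  void append(long txid) {
    toWriteAppends.addLast(txid);
  }

  // Send everything pending; each entry stays unacked until the pipeline confirms it.
  void sendAll() {
    while (!toWriteAppends.isEmpty()) {
      unackedAppends.addLast(toWriteAppends.pollFirst());
    }
  }

  // The pipeline acknowledged every entry up to and including txid.
  void ack(long txid) {
    while (!unackedAppends.isEmpty() && unackedAppends.peekFirst() <= txid) {
      unackedAppends.pollFirst();
    }
  }

  // Proposed handling of a failed sync: move the unacked entries back to the
  // front of toWriteAppends (walking backwards preserves their order), then
  // clear unackedAppends, since nothing is in flight on the new pipeline yet.
  void syncFailed() {
    Iterator<Long> it = unackedAppends.descendingIterator();
    while (it.hasNext()) {
      toWriteAppends.addFirst(it.next());
    }
    unackedAppends.clear();
  }

  public static void main(String[] args) {
    WalQueuesSketch wal = new WalQueuesSketch();
    wal.append(1);
    wal.append(2);
    wal.append(3);
    wal.sendAll();    // 1, 2, 3 are in flight
    wal.ack(1);       // only 1 is confirmed
    wal.append(4);    // arrives before the failure is noticed
    wal.syncFailed(); // old pipeline is broken
    System.out.println("toWriteAppends = " + wal.toWriteAppends);
    System.out.println("unackedAppends = " + wal.unackedAppends);
  }
}
```

In this model, everything that was sent but not acknowledged is queued again ahead of the newer entry, so a new pipeline would resend it in order, and unackedAppends again describes only what is in flight on the current pipeline.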


> AsyncFSWAL.unackedAppends should be cleared after being transferred to 
> AsyncFSWAL.toWriteAppends
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-26658
>                 URL: https://issues.apache.org/jira/browse/HBASE-26658
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 3.0.0-alpha-2, 2.4.9
>            Reporter: chenglei
>            Assignee: chenglei
>            Priority: Major
>
> When {{AsyncFSWAL}} fails to sync to HDFS, the entries in 
> {{AsyncFSWAL.unackedAppends}} are transferred to 
> {{AsyncFSWAL.toWriteAppends}} to avoid data loss, but 
> {{AsyncFSWAL.unackedAppends}} itself is not cleared. I think there is no need 
> to keep retaining them in {{AsyncFSWAL.unackedAppends}}, because we would 
> open a new HDFS pipeline to resend them. 
> BTW: It would also simplify the logic for fixing HBASE-25905; the current fix 
> for HBASE-25905 is somewhat hard to understand. I think the cause of 
> HBASE-25905 is that {{AsyncFSWAL.unackedAppends}} does not exactly reflect 
> the *unacked* entries for the current HDFS pipeline. If we clear 
> {{AsyncFSWAL.unackedAppends}} after transferring them to 
> {{AsyncFSWAL.toWriteAppends}}, HBASE-25905 could also be avoided.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
