[
https://issues.apache.org/jira/browse/HBASE-26658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17474286#comment-17474286
]
chenglei edited comment on HBASE-26658 at 1/12/22, 7:56 AM:
------------------------------------------------------------
[~zhangduo], thank you for the feedback.
bq.We need to track all the entries which have already been sent out without
ack, as we do not know the state of these entries. If we just clear them after
transferring them back to toWriteAppends, and then there is a shutdown, we may
report to the upper layer that these entries have not been written out
I do not quite understand: how do we use unackedAppends to report to the
upper layer when there is a shutdown? I cannot find any existing code in
master that uses unackedAppends to report entries as not written out, or as
not acknowledged, to the upper layer. I think that for a distributed system it
is not very meaningful to report this unacked intermediate state to the upper
layer: just as you said, we cannot be sure whether the write actually
succeeded or failed (moreover, even if an entry is in unackedAppends, we
cannot say it has been written out, because it may still be sitting in
FanOutOneBlockAsyncDFSOutput.buf).
Whether an entry is reported as not written out or as not acknowledged, the
upper layer should resend it. So it seems that reporting only success or
failure to the upper layer is appropriate, just as AsyncFSWAL does now.
Well, for now it is not a bug. I just find that clearing unackedAppends
after transferring its entries to toWriteAppends seems a simpler and easier to
understand way to fix HBASE-25905, from my point of view; this is just for
your reference. Because the data we want to resend is already in
toWriteAppends, clearing seems natural and harmless (it does not cause data
loss or a logic error).
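To make the proposal concrete, here is a minimal sketch of the transfer-then-clear idea. It is not the actual AsyncFSWAL code: the class WalRetrySketch, its methods, and the use of String entries are all hypothetical simplifications, and only the two queue names come from the discussion above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Hypothetical, simplified model of the two queues; NOT the real AsyncFSWAL.
final class WalRetrySketch {
  // Entries queued but not yet handed to the current output pipeline.
  final Deque<String> toWriteAppends = new ArrayDeque<>();
  // Entries sent on the current pipeline whose ack has not arrived yet.
  final Deque<String> unackedAppends = new ArrayDeque<>();

  // Queue a new entry for writing.
  void append(String entry) {
    toWriteAppends.addLast(entry);
  }

  // Pretend to send every queued entry on the current pipeline.
  void flush() {
    String entry;
    while ((entry = toWriteAppends.pollFirst()) != null) {
      unackedAppends.addLast(entry);
    }
  }

  // Pretend the pipeline acknowledged the oldest n in-flight entries.
  void ack(int n) {
    for (int i = 0; i < n && !unackedAppends.isEmpty(); i++) {
      unackedAppends.removeFirst();
    }
  }

  // Sync failure: put the in-flight entries back in front of newer ones,
  // preserving their order, then clear unackedAppends (the proposed change),
  // so it only ever describes the pipeline that is currently open.
  void onSyncFailed() {
    for (Iterator<String> it = unackedAppends.descendingIterator(); it.hasNext();) {
      toWriteAppends.addFirst(it.next());
    }
    unackedAppends.clear();
  }

  public static void main(String[] args) {
    WalRetrySketch wal = new WalRetrySketch();
    wal.append("e1");
    wal.append("e2");
    wal.append("e3");
    wal.flush();        // e1..e3 are in flight
    wal.ack(1);         // e1 acknowledged
    wal.append("e4");   // arrives while e2, e3 are still unacked
    wal.onSyncFailed(); // pipeline broken: requeue e2, e3 ahead of e4
    System.out.println("toWriteAppends=" + wal.toWriteAppends);
    System.out.println("unackedAppends=" + wal.unackedAppends);
  }
}
```

In this model nothing is lost by clearing, because every entry that was in unackedAppends is already back in toWriteAppends and will re-enter unackedAppends when it is sent on the new pipeline.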
> AsyncFSWAL.unackedAppends should clear after transfered to
> AsyncFSWAL.toWriteAppends
> --------------------------------------------------------------------------------------
>
> Key: HBASE-26658
> URL: https://issues.apache.org/jira/browse/HBASE-26658
> Project: HBase
> Issue Type: Bug
> Components: wal
> Affects Versions: 3.0.0-alpha-2, 2.4.9
> Reporter: chenglei
> Assignee: chenglei
> Priority: Major
>
> When {{AsyncFSWAL}} fails to sync to HDFS, the entries in
> {{AsyncFSWAL.unackedAppends}} are transferred to
> {{AsyncFSWAL.toWriteAppends}} to avoid data loss, but
> {{AsyncFSWAL.unackedAppends}} itself is not cleared. I think there is no need
> to keep retaining them in {{AsyncFSWAL.unackedAppends}}, because we will
> open a new HDFS pipeline to resend those entries.
> BTW: It would also simplify the logic for fixing HBASE-25905; the current fix
> for HBASE-25905 is somewhat hard to understand. I think the root cause of
> HBASE-25905 is that {{AsyncFSWAL.unackedAppends}} does not exactly reflect
> the *unacked* entries for the current HDFS pipeline. If we clear
> {{AsyncFSWAL.unackedAppends}} after transferring its entries to
> {{AsyncFSWAL.toWriteAppends}}, HBASE-25905 could also be avoided.