[
https://issues.apache.org/jira/browse/HDFS-16493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531016#comment-17531016
]
袁枫 edited comment on HDFS-16493 at 5/3/22 3:17 AM:
---------------------------------------------------
I am confused by:
org/apache/hadoop/hdfs/qjournal/client/SegmentRecoveryComparator.java:86
{code:java}
return ComparisonChain.start()
    .compare(r1SeenEpoch, r2SeenEpoch)
    .compare(r1.getSegmentState().getEndTxId(),
        r2.getSegmentState().getEndTxId())
    .result();
{code}
Why does recovery pick the longest log? If it works this way, will JN3's log be picked as the one to sync from?
Why not use the quorum length instead? Do you know the reason? [~liutongwei]
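To make the question concrete, here is a minimal sketch of what the two comparison keys in the quoted snippet would select. It is not the Hadoop implementation: the `RecoveryChoiceSketch` class, its `Response` type, and the `pick` helper are hypothetical names, and it assumes that the recovery source is simply the maximum under "seen epoch first, then end txid", ignoring any other checks the real comparator may perform.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class RecoveryChoiceSketch {
    // Hypothetical stand-in for one JournalNode's recovery response.
    static final class Response {
        final String jn;
        final long seenEpoch;
        final long endTxId;

        Response(String jn, long seenEpoch, long endTxId) {
            this.jn = jn;
            this.seenEpoch = seenEpoch;
            this.endTxId = endTxId;
        }
    }

    // Same two keys, in the same order, as the quoted ComparisonChain:
    // higher seen epoch wins, and ties are broken by the larger end txid.
    static final Comparator<Response> BY_EPOCH_THEN_END_TXID =
        Comparator.<Response>comparingLong(r -> r.seenEpoch)
                  .thenComparingLong(r -> r.endTxId);

    // Assumption: the recovery source is the maximum under that ordering.
    static Response pick(List<Response> responses) {
        return Collections.max(responses, BY_EPOCH_THEN_END_TXID);
    }

    public static void main(String[] args) {
        Response jn1 = new Response("JN1", 1, 3);
        Response jn2 = new Response("JN2", 1, 3);
        Response jn3 = new Response("JN3", 1, 4);

        // All three JNs reachable: the longest log at the same epoch is chosen.
        System.out.println(pick(Arrays.asList(jn1, jn2, jn3)).jn); // JN3

        // JN3 unreachable (the case in the issue): only logs ending at txid 3 remain.
        System.out.println(pick(Arrays.asList(jn1, jn2)).endTxId); // 3
    }
}
```

Under these assumptions, JN3's log is chosen only when JN3 answers during recovery. In the scenario from the issue description it is partitioned away, so the recovered segment ends at txid 3.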
was (Author: feng yuan):
I am confused by:
org/apache/hadoop/hdfs/qjournal/client/SegmentRecoveryComparator.java:86
{code:java}
return ComparisonChain.start()
    .compare(r1SeenEpoch, r2SeenEpoch)
    .compare(r1.getSegmentState().getEndTxId(),
        r2.getSegmentState().getEndTxId())
    .result();
{code}
Why does round 1 pick the longest log? If it works this way, will JN3's log be picked as the one to sync from?
Why not use the quorum length instead? Do you know the reason? [~liutongwei]
> [SBN Read]When fast path tail enabled, standby or observer namenode may read
> uncommitted data
> ---------------------------------------------------------------------------------------------
>
> Key: HDFS-16493
> URL: https://issues.apache.org/jira/browse/HDFS-16493
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: journal-node, namanode
> Reporter: liutongwei
> Priority: Critical
> Attachments: exapmle.v1.patch
>
>
> Although fast path tailing uses a quorum read to pull the edit log, it seems
> that it can read uncommitted data in some corner cases.
> Here is an example. Suppose we have three JNs whose initial state is:
>
> {code:java}
> epoch 1
> JN1 [1-3](in-progress)
> JN2 [1-3](in-progress)
> JN3 [1-4](in-progress)
> Note that in epoch 1, txids 1-3 were committed and txid 4 was not.
> {code}
> Suppose a failover occurs and the new writer cannot contact JN3 because of a
> network partition. It finishes the recovery stage and writes a new txid 4 in
> epoch 2, whose value differs from JN3's.
>
> {code:java}
> epoch 2
> JN1 [1-3](finalized) [4-4](in-progress)
> JN2 [1-3](finalized) [4-4](in-progress)
> JN3 [1-4](in-progress)
> Note that the value of txid 4 on JN3 differs from the value on the other JNs.
> {code}
>
> Now a reading namenode pulls edits. It contacts JN3 and JN2 and gets a
> majority response, but the two logs have the same length and different
> content, and there is no further information to tell which log is correct.
> If we choose JN3's log, we get metadata corruption.
> There is a test example patch [^example.patch] for running and debugging.
> To fix it, I think we should add the finalized state to
> {{{}GetJournaledEditsResponseProto{}}}, so that we can discard the faulty log.
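The corner case described above can be reproduced in a toy model. The sketch below is not Hadoop code: `QuorumReadSketch`, `Reply`, `majorityHas`, and `contentDiverges` are hypothetical names, the payload strings are placeholders, and the length-only majority check is a simplifying assumption about how a quorum read decides that a txid is readable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class QuorumReadSketch {
    // Hypothetical model of one JournalNode's answer to a fast-path read.
    static final class Reply {
        final String jn;
        final long lastTxId;     // highest txid contained in the reply
        final String txid4Value; // placeholder for the content of txid 4

        Reply(String jn, long lastTxId, String txid4Value) {
            this.jn = jn;
            this.lastTxId = lastTxId;
            this.txid4Value = txid4Value;
        }
    }

    // Assumed length-only rule: a txid is readable once a majority of all JNs report it.
    static boolean majorityHas(List<Reply> replies, long txId, int totalJns) {
        int count = 0;
        for (Reply r : replies) {
            if (r.lastTxId >= txId) {
                count++;
            }
        }
        return count > totalJns / 2;
    }

    // True when the replies that contain txid 4 disagree about its content.
    static boolean contentDiverges(List<Reply> replies) {
        Set<String> values = new HashSet<>();
        for (Reply r : replies) {
            if (r.lastTxId >= 4) {
                values.add(r.txid4Value);
            }
        }
        return values.size() > 1;
    }

    public static void main(String[] args) {
        // Epoch 2 state from the description, as seen by a reader that reaches JN2 and JN3.
        List<Reply> replies = Arrays.asList(
            new Reply("JN2", 4, "written-in-epoch-2"),
            new Reply("JN3", 4, "written-in-epoch-1"));

        System.out.println(majorityHas(replies, 4, 3)); // true: 2 of 3 JNs report txid 4
        System.out.println(contentDiverges(replies));   // true: same length, different content
    }
}
```

In this model the length check passes even though the two replies disagree, which is why the description asks for extra state in the response: length alone cannot tell the reader which copy of txid 4 to trust.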