[
https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352565#comment-17352565
]
Daryn Sharp commented on HDFS-15915:
------------------------------------
I'm very nervous about this patch and need to thoroughly reacquaint myself with
the code. Skimming the patch, I'm initially very worried about the added
synchronization and the potential for deadlock, particularly during an edit log
roll. We're in the midst of an upgrade cycle, so I likely won't have time to
review till early next but in the meantime we will internally revert due to
risk...
> Race condition with async edits logging due to updating txId outside of the
> namesystem log
> ------------------------------------------------------------------------------------------
>
> Key: HDFS-15915
> URL: https://issues.apache.org/jira/browse/HDFS-15915
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Priority: Major
> Fix For: 3.4.0, 3.1.5, 2.10.2, 3.2.3, 3.3.2
>
> Attachments: HDFS-15915-01.patch, HDFS-15915-02.patch,
> HDFS-15915-03.patch, HDFS-15915-04.patch, HDFS-15915-05.patch,
> testMkdirsRace.patch
>
>
> {{FSEditLogAsync}} creates an {{FSEditLogOp}} and populates its fields inside
> {{FSNamesystem.writeLock}}. But one essential field, the transaction id of the
> edits op, remains unset until the time when the operation is scheduled for
> synching. At that time {{beginTransaction()}} will set the
> {{FSEditLogOp.txid}} and increment the global transaction count. On a busy
> NameNode this event can fall outside the write lock.
> This causes problems for Observer reads. It can also potentially reshuffle
> transactions, so that the Standby applies them in the wrong order.
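
The reordering described above can be pictured with a small, self-contained Java sketch. Everything in it is a hypothetical stand-in: TxIdOrderingSketch, Op, logEdit and run are invented for illustration and are not HDFS classes or the code from the attached patches. The unlucky interleaving is forced by hand rather than produced by real threads, so the result is deterministic. Assigning the id inside the lock is shown only as a contrast and is not a claim about how the patch resolves the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of the race; hypothetical names, not real HDFS code. */
final class TxIdOrderingSketch {

    /** Toy edit op: txid stays unset until beginTransaction() runs. */
    static final class Op {
        final String name;
        long txid = -1;
        Op(String name) { this.name = name; }
    }

    // Stand-in for the namesystem write lock.
    private final ReentrantLock writeLock = new ReentrantLock();
    // Stand-in for the global transaction counter.
    private long lastTxId = 0;

    /** Mimics beginTransaction(): stamps the op and bumps the counter. */
    private void beginTransaction(Op op) {
        op.txid = ++lastTxId;
    }

    /** Creates the op under the lock; optionally assigns its txid there too. */
    Op logEdit(String name, boolean assignInsideLock) {
        writeLock.lock();
        try {
            Op op = new Op(name);
            if (assignInsideLock) {
                beginTransaction(op);
            }
            return op;
        } finally {
            writeLock.unlock();
        }
    }

    /** Returns the txids of two ops, listed in namespace-modification order. */
    static List<Long> run(boolean assignInsideLock) {
        TxIdOrderingSketch log = new TxIdOrderingSketch();
        Op first = log.logEdit("mkdir /a", assignInsideLock);
        Op second = log.logEdit("mkdir /a/b", assignInsideLock);
        if (!assignInsideLock) {
            // Assumed unlucky scheduling: the later op is stamped first,
            // after both callers have already released the lock.
            log.beginTransaction(second);
            log.beginTransaction(first);
        }
        List<Long> ids = new ArrayList<>();
        ids.add(first.txid);
        ids.add(second.txid);
        return ids;
    }

    public static void main(String[] args) {
        System.out.println("txid set outside lock: " + run(false));
        System.out.println("txid set inside lock:  " + run(true));
    }
}
```

In the first case the op that changed the namespace first ends up with the larger transaction id, which is the kind of reshuffling a Standby replaying edits in txid order would observe.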