[
https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347336#comment-17347336
]
Konstantin Shvachko commented on HDFS-15915:
--------------------------------------------
Updated the patch per [~virajith]'s suggestions. Thanks.
# The default implementation of {{EditLogOutputStream.getLastJournalledTxId()}}
returns {{INVALID_TXID}} rather than {{0}}.
# Changed the return type of {{beginTransaction()}} to void.
??This change forces the txid to be assigned when the operation takes place
under the FSN lock.??
Exactly right. The advantage of this in the non-Observer case is verifiability
and proper enforcement.
When you merely rely on placing operations into the queue in the right order,
you cannot verify that ordering, for example by writing unit tests or adding
asserts. And if there is a bug in this heavily multi-threaded code, it is hard
to detect.
With the patch the txId is generated when the operation is queued, so I could
add asserts to ensure operations are queued and synced in the order they were
applied on the active NN.
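
To illustrate the idea only (this is not code from the patch), here is a
minimal Java sketch. The class {{TxIdOrderingSketch}}, its methods, and the
{{INVALID_TXID}} value used here are hypothetical stand-ins, and a plain
{{ReentrantLock}} stands in for the FSN write lock. The txid is assigned at
enqueue time under the lock, so the sync side can check that txids arrive
consecutively.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch; names and structure are assumptions, not HDFS code. */
final class TxIdOrderingSketch {
    // Placeholder sentinel for "no txid assigned yet" (value is an assumption).
    static final long INVALID_TXID = -1L;

    /** Stand-in for an edit log operation. */
    static final class Op {
        final String name;
        long txid = INVALID_TXID;
        Op(String name) { this.name = name; }
    }

    // Stands in for the namesystem (FSN) write lock.
    private final ReentrantLock fsnLock = new ReentrantLock();
    private final Queue<Op> pending = new ArrayDeque<>();
    private long lastAssignedTxId = 0; // guarded by fsnLock
    private long lastSyncedTxId = 0;   // guarded by the monitor of 'pending'

    /** Assigns the txid and enqueues the op; the caller must hold the lock. */
    void logEdit(Op op) {
        if (!fsnLock.isHeldByCurrentThread()) {
            throw new IllegalStateException("txid must be assigned under the lock");
        }
        op.txid = ++lastAssignedTxId;
        synchronized (pending) {
            pending.add(op);
        }
    }

    /** Applies an operation and logs it, all under the lock. */
    void applyAndLog(String name) {
        fsnLock.lock();
        try {
            logEdit(new Op(name));
        } finally {
            fsnLock.unlock();
        }
    }

    /** Drains the queue, checking that txids are consecutive. */
    long sync() {
        synchronized (pending) {
            Op op;
            while ((op = pending.poll()) != null) {
                if (op.txid != lastSyncedTxId + 1) {
                    throw new IllegalStateException("out of order txid " + op.txid
                        + " after " + lastSyncedTxId);
                }
                lastSyncedTxId = op.txid;
            }
            return lastSyncedTxId;
        }
    }

    public static void main(String[] args) {
        TxIdOrderingSketch log = new TxIdOrderingSketch();
        log.applyAndLog("mkdir /a");
        log.applyAndLog("mkdir /b");
        System.out.println("last synced txid: " + log.sync());
    }
}
```

Because the txid is fixed before the op enters the queue, a check like the one
in {{sync()}} becomes possible at all. If the txid were assigned only at sync
time, there would be nothing to compare against.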
> Race condition with async edits logging due to updating txId outside of the
> namesystem log
> ------------------------------------------------------------------------------------------
>
> Key: HDFS-15915
> URL: https://issues.apache.org/jira/browse/HDFS-15915
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Priority: Major
> Attachments: HDFS-15915-01.patch, HDFS-15915-02.patch,
> HDFS-15915-03.patch, HDFS-15915-04.patch, testMkdirsRace.patch
>
>
> {{FSEditLogAsync}} creates an {{FSEditLogOp}} and populates its fields inside
> {{FSNamesystem.writeLock}}. But one essential field, the transaction id of the
> edits op, remains unset until the time when the operation is scheduled for
> syncing. At that time {{beginTransaction()}} will set the
> {{FSEditLogOp.txid}} and increment the global transaction count. On a busy
> NameNode this event can fall outside the write lock.
> This causes problems for Observer reads. It can also potentially reshuffle
> transactions, and the Standby will then apply them in the wrong order.