[ 
https://issues.apache.org/jira/browse/HDFS-15915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346591#comment-17346591
 ] 

Virajith Jalaparti edited comment on HDFS-15915 at 5/18/21, 5:54 AM:
---------------------------------------------------------------------

Thanks for finding this and providing a fix [~shv]. A few questions:
# Nit: Should the default implementation of 
{{EditLogOutputStream#getLastJournalledTxId}} return a value of -1 instead of 0 
as 0 can be a valid txid?
# Nit: In the current implementation, the return value of {{beginTransaction}} 
is used to get the start time in one place but ignored in other places. Should 
we just make it return void and force the caller to track the start time?
# Without this change, the previous implementation seems to have relied on the 
ordering within the queue ({{FSEditLogAsync#editPendingQ}}, whose elements are 
added under the FSN lock) to ensure that edits are assigned txids in the same 
order in which they are processed. Why is that not sufficient when Observer is 
not used?
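
To make the first nit concrete, here is a minimal sketch of a default that cannot be confused with a real txid. The names {{JournalStreamSketch}}, {{TrackingStream}} and {{NO_TXID}} are hypothetical and are not taken from the patch; only the method name {{getLastJournalledTxId}} comes from the discussion above.

```java
/** Hypothetical stand-in for an edit log output stream (not the HDFS class). */
abstract class JournalStreamSketch {
    /** Assumed sentinel: negative, so it can never collide with a valid txid. */
    static final long NO_TXID = -1L;

    /** Default for streams that do not track what they have journalled. */
    long getLastJournalledTxId() {
        return NO_TXID;
    }
}

/** Hypothetical stream that does track the last txid written to it. */
class TrackingStream extends JournalStreamSketch {
    private long lastJournalledTxId = NO_TXID;

    void write(long txid) {
        lastJournalledTxId = txid;
    }

    @Override
    long getLastJournalledTxId() {
        return lastJournalledTxId;
    }
}
```

With a default of 0, a caller could not tell "nothing journalled yet" apart from "txid 0 was journalled"; with a negative sentinel the two cases stay distinct.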









> Race condition with async edits logging due to updating txId outside of the 
> namesystem log
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15915
>                 URL: https://issues.apache.org/jira/browse/HDFS-15915
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Major
>         Attachments: HDFS-15915-01.patch, HDFS-15915-02.patch, 
> HDFS-15915-03.patch, testMkdirsRace.patch
>
>
> {{FSEditLogAsync}} creates an {{FSEditLogOp}} and populates its fields inside 
> {{FSNamesystem.writeLock}}. But one essential field, the transaction id of the 
> edits op, remains unset until the operation is scheduled for syncing. At that 
> time {{beginTransaction()}} sets {{FSEditLogOp.txid}} and increments the 
> global transaction count. On a busy NameNode this event can fall outside the 
> write lock. 
> This causes problems for Observer reads. It can also potentially reshuffle 
> transactions, so that the Standby applies them in the wrong order.
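
The timing problem described above can be illustrated with a deliberately single-threaded sketch. It is not the HDFS code: {{TxIdAssignmentSketch}}, {{Op}}, {{logEdit}}, {{drain}} and {{pendingQ}} are hypothetical stand-ins, and the "sync thread" is simulated by an explicit {{drain()}} call. It only shows that, when the txid is assigned at dequeue time, the op and the global counter are still stale after the lock has been released.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of deferred vs. under-lock txid assignment. */
class TxIdAssignmentSketch {
    /** Stand-in for an edit op; only the txid field matters here. */
    static final class Op {
        long txid = -1; // unset until a transaction id is assigned
    }

    private final ReentrantLock namesystemLock = new ReentrantLock();
    private final Queue<Op> pendingQ = new ArrayDeque<>();
    private final boolean assignUnderLock;
    private long lastTxId = 0;

    TxIdAssignmentSketch(boolean assignUnderLock) {
        this.assignUnderLock = assignUnderLock;
    }

    /** Creates and enqueues an op while holding the lock. */
    Op logEdit() {
        namesystemLock.lock();
        try {
            Op op = new Op();
            if (assignUnderLock) {
                op.txid = ++lastTxId; // txid fixed before the lock is released
            }
            pendingQ.add(op);
            return op;
        } finally {
            namesystemLock.unlock();
        }
    }

    /** Simulates the sync step, which runs without the namesystem lock. */
    void drain() {
        Op op;
        while ((op = pendingQ.poll()) != null) {
            if (op.txid < 0) {
                op.txid = ++lastTxId; // deferred assignment, outside the lock
            }
        }
    }

    long getLastTxId() {
        return lastTxId;
    }

    public static void main(String[] args) {
        TxIdAssignmentSketch deferred = new TxIdAssignmentSketch(false);
        Op op = deferred.logEdit();
        // Lock already released, yet neither the op nor the counter has moved.
        System.out.println(op.txid + " " + deferred.getLastTxId());
        deferred.drain();
        System.out.println(op.txid + " " + deferred.getLastTxId());
    }
}
```

In the deferred mode, anything that reads the last txid between {{logEdit()}} and {{drain()}} sees a value that does not yet cover the edit just made, which is the kind of gap that would matter for Observer reads. Constructing the sketch with {{assignUnderLock = true}} closes that gap in this model.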


