[ 
https://issues.apache.org/jira/browse/HDFS-16490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liutongwei updated HDFS-16490:
------------------------------
    Description: 
When an observer NameNode serves a CoordinatedCall, it requeues the call while 
its own stateId is smaller than the client's stateId. In a read-heavy cluster 
with few writes, the client may hold a stateId that has not yet been replicated 
to the JournalNodes.

For example, when a client calls {{FSNamesystem.getBlockLocations}} and the call 
needs {{updateAccessTime}}, the active NameNode calls 
{{getEditLog().logTimes()}} but not {{logSync()}}. The client then receives a 
stateId that has not been replicated to the JournalNodes, so the observer 
NameNode keeps requeuing the client's call until something triggers a 
{{logSync()}}. In a cluster with few writes, this delay can range from seconds 
to minutes.

To fix this, we could add a config option for a requeue call timeout, or return 
to the client a stateId based on the active NameNode's committed txid.
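
The stall described above can be illustrated with a small toy model. This is a 
sketch, not Hadoop code: the classes {{Active}} and {{Observer}} and all of 
their fields and methods are hypothetical names, and the model assumes that the 
stateId handed to the client is the last written (not yet synced) transaction 
id, as the description implies.

```java
/** Toy model of the requeue stall; not Hadoop code. All names are hypothetical. */
public class StateIdStallSketch {

    /** Active NameNode: an edit is written first and becomes durable only on sync. */
    static final class Active {
        long lastWrittenTxId = 0; // advanced by every logged edit
        long lastSyncedTxId = 0;  // advanced only by sync(), i.e. what JournalNodes hold

        /** Models a read that updates the access time: logs an edit without syncing. */
        long readWithAccessTimeUpdate() {
            lastWrittenTxId++;
            return lastWrittenTxId; // assumed to be the stateId returned to the client
        }

        /** Models logSync(): everything written so far reaches the JournalNodes. */
        void sync() {
            lastSyncedTxId = lastWrittenTxId;
        }
    }

    /** Observer NameNode: can only apply edits that reached the JournalNodes. */
    static final class Observer {
        long appliedTxId = 0;

        /** Catch up to whatever the active has synced so far. */
        void tail(Active active) {
            appliedTxId = active.lastSyncedTxId;
        }

        /** A coordinated call is requeued while the observer is behind the client. */
        boolean mustRequeue(long clientStateId) {
            return appliedTxId < clientStateId;
        }
    }

    public static void main(String[] args) {
        Active active = new Active();
        Observer observer = new Observer();

        // The client reads through the active and gets an unsynced stateId.
        long clientStateId = active.readWithAccessTimeUpdate();

        // The observer tails the journal but sees nothing new.
        observer.tail(active);
        System.out.println("before sync, requeue = " + observer.mustRequeue(clientStateId));

        // Only after some later operation triggers a sync can the observer catch up.
        active.sync();
        observer.tail(active);
        System.out.println("after sync, requeue = " + observer.mustRequeue(clientStateId));
    }
}
```

In this model the call stays requeued for as long as no operation on the active 
side triggers a sync, which matches the seconds-to-minutes delay reported for 
clusters with few writes.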
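
A minimal sketch of the two options, under stated assumptions. Nothing here is 
an existing Hadoop API: the enum {{Decision}}, the methods, and the parameter 
{{requeueTimeoutMillis}} are hypothetical, and what should happen once the 
timeout expires (shown as {{GIVE_UP}}) is not specified in this issue.

```java
/** Toy sketch of the two proposed mitigations; not Hadoop code. Names are hypothetical. */
public class RequeueMitigationSketch {

    enum Decision { SERVE, REQUEUE, GIVE_UP }

    /** Option 1: bound how long a coordinated call may keep being requeued. */
    static Decision decideWithTimeout(long observerStateId, long clientStateId,
                                      long waitedMillis, long requeueTimeoutMillis) {
        if (observerStateId >= clientStateId) {
            return Decision.SERVE; // observer has caught up with the client
        }
        if (waitedMillis >= requeueTimeoutMillis) {
            // Assumption: the call is failed so the client can retry elsewhere.
            return Decision.GIVE_UP;
        }
        return Decision.REQUEUE;
    }

    /** Option 2: the active reports only a transaction id it has already synced. */
    static long stateIdForResponse(long lastWrittenTxId, long lastSyncedTxId) {
        return Math.min(lastWrittenTxId, lastSyncedTxId);
    }

    public static void main(String[] args) {
        // Observer at txid 5, client at txid 7, timeout of 1000 ms.
        System.out.println(decideWithTimeout(5, 7, 100, 1000));  // REQUEUE
        System.out.println(decideWithTimeout(5, 7, 1500, 1000)); // GIVE_UP
        System.out.println(decideWithTimeout(7, 7, 0, 1000));    // SERVE

        // Active has written txid 7 but synced only up to txid 5.
        System.out.println(stateIdForResponse(7, 5));            // 5
    }
}
```

The first option leaves the stateId semantics unchanged and only limits the 
waiting time; the second avoids the wait by never handing the client a stateId 
that the JournalNodes have not seen.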


> CoordinatedCall to observer namenode will requeue until the active namenode 
> logSync succeeds
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-16490
>                 URL: https://issues.apache.org/jira/browse/HDFS-16490
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namanode
>            Reporter: liutongwei
>            Priority: Minor
>
> When an observer NameNode serves a CoordinatedCall, it requeues the call 
> while its own stateId is smaller than the client's stateId. In a read-heavy 
> cluster with few writes, the client may hold a stateId that has not yet been 
> replicated to the JournalNodes.
> For example, when a client calls {{FSNamesystem.getBlockLocations}} and the 
> call needs {{updateAccessTime}}, the active NameNode calls 
> {{getEditLog().logTimes()}} but not {{logSync()}}. The client then receives a 
> stateId that has not been replicated to the JournalNodes, so the observer 
> NameNode keeps requeuing the client's call until something triggers a 
> {{logSync()}}. In a cluster with few writes, this delay can range from 
> seconds to minutes.
> To fix this, we could add a config option for a requeue call timeout, or 
> return to the client a stateId based on the active NameNode's committed txid.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
