[
https://issues.apache.org/jira/browse/HDFS-16490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
liutongwei updated HDFS-16490:
------------------------------
Description:
When an observer NameNode serves a CoordinatedCall, it requeues the call if the
server stateId is smaller than the client stateId. In a read-heavy cluster with
few writes, the client may receive a stateId that has not yet been replicated
to the JournalNodes.
For example, when a client calls {{FSNamesystem.getBlockLocations}} and
{{updateAccessTime}} is needed, the active NameNode calls
{{getEditLog().logTimes()}} but does not call {{logSync()}}. The client then
gets a stateId that has not been replicated to the JournalNodes, so the
observer NameNode keeps requeuing the client call until something triggers a
{{logSync()}}. In a cluster with few writes, this delay can range from seconds
to minutes.
To fix this, we could add a timeout config for requeued calls, or return to the
client a stateId based on the active NameNode's committed txid.
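The stall and the two proposed fixes can be illustrated with a minimal sketch. This is a simplified model, not Hadoop code: the class {{ObserverRequeueSketch}}, the {{ActiveEditLog}} type, the {{handle}} method and the timeout parameter are all hypothetical names, and the sketch assumes the observer can only catch up to what the active NameNode has already synced.

```java
/** Hypothetical, simplified model of the behaviour described above; not Hadoop's classes. */
final class ObserverRequeueSketch {

    /** Models the active NameNode's edit log: written versus synced transaction ids. */
    static final class ActiveEditLog {
        private long lastWrittenTxId;
        private long lastSyncedTxId;

        /** Like logTimes() without logSync(): the edit is buffered but not replicated. */
        void logWithoutSync() {
            lastWrittenTxId++;
        }

        /** Like logSync(): everything written so far becomes visible to the observer. */
        void logSync() {
            lastSyncedTxId = lastWrittenTxId;
        }

        long lastWrittenTxId() {
            return lastWrittenTxId;
        }

        long lastSyncedTxId() {
            return lastSyncedTxId;
        }
    }

    enum Outcome { SERVED, REQUEUED, TIMED_OUT }

    /**
     * Observer-side decision for one coordinated call.
     * timeoutMs stands in for the proposed (hypothetical) requeue timeout config.
     */
    static Outcome handle(long clientStateId, long observerStateId,
                          long waitedMs, long timeoutMs) {
        if (observerStateId >= clientStateId) {
            return Outcome.SERVED;      // observer has caught up with the client
        }
        if (waitedMs >= timeoutMs) {
            return Outcome.TIMED_OUT;   // fix 1: stop requeuing after the timeout
        }
        return Outcome.REQUEUED;        // reported behaviour: wait for the next logSync()
    }

    public static void main(String[] args) {
        ActiveEditLog log = new ActiveEditLog();
        log.logWithoutSync();                         // access-time update, no sync

        // Assumption: the observer can only reach the last synced transaction.
        long observerStateId = log.lastSyncedTxId();

        // Reported problem: the client is handed the unsynced txid and gets requeued.
        System.out.println(handle(log.lastWrittenTxId(), observerStateId, 0, 1000));

        // Fix 1: the requeued call gives up once the timeout has elapsed.
        System.out.println(handle(log.lastWrittenTxId(), observerStateId, 1000, 1000));

        // Fix 2: the client is handed the synced txid and is served at once.
        System.out.println(handle(log.lastSyncedTxId(), observerStateId, 0, 1000));
    }
}
```

In this model the first call is requeued indefinitely unless a later write triggers a sync, which matches the delay reported above. What a timed-out call should do next (for example, fail over to the active NameNode) is left open here.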
> CoordinatedCall to observer namenode will requeue until the active namenode
> logSync succeeds
> -----------------------------------------------------------------------------------------
>
> Key: HDFS-16490
> URL: https://issues.apache.org/jira/browse/HDFS-16490
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: liutongwei
> Priority: Minor
>
> When an observer NameNode serves a CoordinatedCall, it requeues the call if
> the server stateId is smaller than the client stateId. In a read-heavy
> cluster with few writes, the client may receive a stateId that has not yet
> been replicated to the JournalNodes.
> For example, when a client calls {{FSNamesystem.getBlockLocations}} and
> {{updateAccessTime}} is needed, the active NameNode calls
> {{getEditLog().logTimes()}} but does not call {{logSync()}}. The client then
> gets a stateId that has not been replicated to the JournalNodes, so the
> observer NameNode keeps requeuing the client call until something triggers a
> {{logSync()}}. In a cluster with few writes, this delay can range from
> seconds to minutes.
> To fix this, we could add a timeout config for requeued calls, or return to
> the client a stateId based on the active NameNode's committed txid.