[ https://issues.apache.org/jira/browse/HDFS-16490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liutongwei updated HDFS-16490:
------------------------------
    Description: 
When an observer NameNode serves a CoordinatedCall, it requeues the call if the server stateId is smaller than the client stateId. In a read-heavy cluster with rare writes, the client may receive a stateId that has not yet been replicated to the JournalNodes.

For example, when a client calls {{FSNamesystem.getBlockLocations}} and the call needs {{updateAccessTime}}, the active NameNode calls {{getEditLog().logTimes()}} but not {{logSync()}}. The client then gets a stateId that has not been replicated to the JournalNodes, so the observer NameNode keeps requeuing the client call as long as nothing triggers a {{logSync()}}. In a cluster with rare writes, this delay can range from seconds to minutes.

To fix this, we could add a timeout config for requeued calls, or return to the client a stateId based on the active NameNode's committed txid.
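The requeue condition and the two proposed fixes can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class {{ObserverRequeueSketch}}, the {{ActiveModel}} counters, the {{decide}} method, and the timeout parameter are all hypothetical names made up for illustration, not Hadoop APIs or existing config keys. It also assumes the observer can only catch up to edits that the active NameNode has already synced.

```java
/** Toy model of the situation described above; none of these names are Hadoop APIs. */
final class ObserverRequeueSketch {

    /** What the observer does with a coordinated call. */
    enum Decision { SERVE, REQUEUE, FAIL_TIMED_OUT }

    /** Hypothetical stand-in for the active NameNode's edit log counters. */
    static final class ActiveModel {
        long lastWrittenTxId; // advanced by an edit that is logged but not synced
        long lastSyncedTxId;  // advanced only when a sync flushes edits to the journal

        ActiveModel(long txId) {
            lastWrittenTxId = txId;
            lastSyncedTxId = txId;
        }

        /** Models an access-time update that is logged without a sync. */
        void logEditWithoutSync() { lastWrittenTxId++; }

        /** Models a sync: everything written so far becomes committed. */
        void logSync() { lastSyncedTxId = lastWrittenTxId; }
    }

    /**
     * Serve when the observer has caught up with the client's stateId.
     * Otherwise requeue, unless a positive timeout (hypothetical setting)
     * has already elapsed. A timeout of 0 means "requeue indefinitely".
     */
    static Decision decide(long clientStateId, long observerStateId,
                           long waitedMillis, long timeoutMillis) {
        if (observerStateId >= clientStateId) {
            return Decision.SERVE;
        }
        if (timeoutMillis > 0 && waitedMillis >= timeoutMillis) {
            return Decision.FAIL_TIMED_OUT;
        }
        return Decision.REQUEUE;
    }

    public static void main(String[] args) {
        ActiveModel active = new ActiveModel(100);
        active.logEditWithoutSync(); // written = 101, synced = 100

        // Assumption: the observer can only apply edits that were synced.
        long observerStateId = active.lastSyncedTxId;

        // Reported behavior: client holds the unsynced txid, no timeout.
        System.out.println(decide(active.lastWrittenTxId, observerStateId, 0, 0));
        // Proposal 1: same call, but a 5 s requeue timeout has elapsed.
        System.out.println(decide(active.lastWrittenTxId, observerStateId, 6000, 5000));
        // Proposal 2: client is handed the committed txid instead.
        System.out.println(decide(active.lastSyncedTxId, observerStateId, 0, 0));

        // Once some write triggers a sync, the original call can be served.
        active.logSync();
        System.out.println(decide(active.lastWrittenTxId, active.lastSyncedTxId, 0, 0));
    }
}
```

In this model the first call is requeued for as long as no sync happens, which matches the seconds-to-minutes stall reported above. The timeout variant bounds the wait at the cost of failing the call, while the committed-txid variant avoids the wait entirely because the client never holds a stateId the observer cannot reach.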
> CoordinatedCall to observer namenode will requeue until the active namenode logSync succeeds
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-16490
>                 URL: https://issues.apache.org/jira/browse/HDFS-16490
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namanode
>            Reporter: liutongwei
>            Priority: Minor