- Observer HDFS Namenode. IIUC this was introduced in Hadoop 2.10, it would be nice if we could offer it via puppet for the docker provisioner (if we don't already do it, I didn't find it). Having a separate Namenode to handle read requests could be interesting for busy clusters. Has anybody already deployed it?
Since it does not require additional services, we should be able to enable it based on the existing manifest for NameNode HA. You need to update some configuration properties for the NameNode, JournalNode and client; then you can transition some of the standby NameNodes to observer with `hdfs haadmin -transitionToObserver`.

https://hadoop.apache.org/docs/r3.3.0/hadoop-project-dist/hadoop-hdfs/ObserverNameNode.html

In practice you need both Standby and Observer NameNodes, since an Observer cannot be promoted to active when the current active NameNode is down. I'm not sure whether the current Puppet manifest allows multiple standby NameNodes, but it should be a small fix if necessary.

Masatake Iwasaki

On 2021/06/11 15:52, Luca Toscano wrote:
Hi everybody,

https://engineering.linkedin.com/blog/2021/the-exabyte-club--linkedin-s-journey-of-scaling-the-hadoop-distr is a nice blog post to read. There are some interesting follow-ups in my opinion:

- Fair vs Non-Fair locking for the HDFS Namenode. IIUC this seems to be a code change rather than a JVM tunable, but I am wondering if others have experience with different locking mechanisms in production for HDFS.

- Observer HDFS Namenode. IIUC this was introduced in Hadoop 2.10, it would be nice if we could offer it via puppet for the docker provisioner (if we don't already do it, I didn't find it). Having a separate Namenode to handle read requests could be interesting for busy clusters. Has anybody already deployed it?

Thanks in advance,

Luca
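
As a rough illustration of the configuration changes mentioned in the reply above, the sketch below renders an hdfs-site.xml fragment with the properties I believe the linked Observer NameNode documentation describes. The nameservice name `mycluster` and all values are assumptions chosen for illustration, not settings taken from this thread or from the Puppet manifests; please check them against the documentation for your Hadoop version.

```python
import xml.etree.ElementTree as ET

# Hypothetical nameservice ID; replace with the one used by your HA setup.
NAMESERVICE = "mycluster"

# Server-side properties (NameNode / JournalNode). Values are illustrative
# assumptions and should be verified against the ObserverNameNode docs.
SERVER_PROPS = {
    "dfs.ha.tail-edits.in-progress": "true",
    "dfs.ha.tail-edits.period": "0ms",
    "dfs.journalnode.edit-cache-size.bytes": "1048576",
    "dfs.namenode.state.context.enabled": "true",
}

# Client-side property: route reads to Observer NameNodes.
CLIENT_PROPS = {
    "dfs.client.failover.proxy.provider." + NAMESERVICE:
        "org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider",
}


def render_hdfs_site(props):
    """Return a <configuration> XML string for the given name/value pairs."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_hdfs_site({**SERVER_PROPS, **CLIENT_PROPS}))
```

Once such settings are deployed and the NameNodes restarted, a standby would be moved to observer state with `hdfs haadmin -transitionToObserver` followed by the service ID of that NameNode, as described in the reply.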
