Hi!

TL;DR: The Analytics team added automatic failover for the HDFS Namenode. A
new daemon is running on the analytics1001/1002 hosts called
hadoop-hdfs-zkfc (port 8019) responsible to talk with Zookeeper and execute
periodical health checks. Monitoring and Ferm rules has been added.

More info:

1) https://phabricator.wikimedia.org/T129838
2)
https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Automatic_Failover
3) Monitoring and Ferm rules added:
https://gerrit.wikimedia.org/r/#/c/279408/

Let me know if you have any questions!

Luca
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics

Reply via email to