We are planning to run the next generation of Hadoop ecosystem components
in production in a few months. We plan to use HDFS 2.0 for the HA NameNode
work. The platform will also include YARN, but its use will be experimental,
so we'll be running something equivalent to the CDH MR1 package to support
production workloads for, I'd guess, about a year.

We have heard a rumor that there is a version of the MR1 JobTracker that
persists state to ZooKeeper, so that failover to a new instance is fast and
doesn't lose job state. I'd like to be aspirational and aim for an HA MR1
JobTracker to complement the HA NameNode. Even if no such code is
available, we might adapt existing classes in the MR1 JobTracker to act as
models/proxies of state held in ZooKeeper. For clusters of our size (in the
100s of nodes range) this could be workable. Also, the MR client could
possibly use ZK for failover, like the HDFS client.
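
To make the idea a bit more concrete, here is a rough sketch of the shape I
have in mind. Everything in it is hypothetical: StateStore,
InMemoryStateStore, JobTrackerSketch and the /jobtracker/... paths are
made-up names, not existing MR1 or ZooKeeper classes, and the in-memory map
only stands in for ZooKeeper so the sketch runs on its own. A real version
would presumably use an ephemeral znode for the leader claim.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over the coordination service. A real
// implementation would be backed by ZooKeeper znodes.
interface StateStore {
    void put(String path, String value);
    Optional<String> get(String path);
    // Atomically claims the leader slot; true if the caller holds it.
    boolean tryAcquireLeader(String candidateId);
    // Releases the slot only if the caller currently holds it.
    void releaseLeader(String candidateId);
}

// In-memory stand-in so the sketch runs without a ZooKeeper ensemble.
class InMemoryStateStore implements StateStore {
    private static final String LEADER = "/jobtracker/leader";
    private final Map<String, String> data = new ConcurrentHashMap<>();

    public void put(String path, String value) {
        data.put(path, value);
    }

    public Optional<String> get(String path) {
        return Optional.ofNullable(data.get(path));
    }

    public boolean tryAcquireLeader(String candidateId) {
        String previous = data.putIfAbsent(LEADER, candidateId);
        return previous == null || previous.equals(candidateId);
    }

    public void releaseLeader(String candidateId) {
        data.remove(LEADER, candidateId);
    }
}

// Hypothetical JobTracker whose job state lives in the shared store rather
// than in process memory, so a standby can pick it up after failover.
class JobTrackerSketch {
    private final String id;
    private final StateStore store;

    JobTrackerSketch(String id, StateStore store) {
        this.id = id;
        this.store = store;
    }

    boolean becomeActive() {
        return store.tryAcquireLeader(id);
    }

    void submitJob(String jobId) {
        if (!becomeActive()) {
            throw new IllegalStateException(id + " is not the active JobTracker");
        }
        store.put("/jobtracker/jobs/" + jobId, "RUNNING");
    }

    Optional<String> jobStatus(String jobId) {
        return store.get("/jobtracker/jobs/" + jobId);
    }

    // Simulates the active instance dying and its leader claim going away.
    void crash() {
        store.releaseLeader(id);
    }
}
```

In this sketch, a second JobTrackerSketch sharing the same store can call
becomeActive() after the first one crashes and still see the jobs the first
one submitted, which is the property we are after.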

First, I'm trying to find out whether such code is available, if anyone
knows. Otherwise, we may try building this ourselves, so I'd also like to
get a sense of any interest in using it or collaborating on development.

Best regards,

    - Andy




-- 
Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
