Well, we are something of a poster child for this kind of reliability calculus. We opted for Mogile for real-time serving because we could see how to split the master into shards and how to do HA on it. For batch-oriented processes where a good processing model is important, we use Hadoop.
I would have been happier pushing for a pure Hadoop solution, but I just don't think that it would fit that well, and I would rather have a heavy hammer and a sharp chisel than be forced to compromise on either a sharp hammer or a really heavy chisel.

On 12/20/07 7:25 AM, "Pat Ferrel" <[EMAIL PROTECTED]> wrote:

>>>> 2. If hadoop is configured in a multinode cluster (with one machine as
>>>> namenode and jobtracker and the other machines as slaves; the namenode
>>>> acts as a slave node also), how do we handle namenode failovers?
>>
>> There are backup mechanisms that you can use to allow you to rebuild the
>> name node. There is no official solution for the high availability
>> problem. Most hadoop systems work on batch problems where an hour or two
>> of downtime every few years is not a problem.
>
> Actually, we were thinking of the product of many mapreduce tasks as
> needing high availability. In other words, you can handle downtime in
> creating the database but not so much in serving it up. If hbase is the
> source from which we build pages, then downtime is more of a problem. If
> anyone is thinking about an unofficial solution, we'd be interested.
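For what it's worth, the simplest form of the backup mechanism mentioned above is to point the namenode at more than one metadata directory, one of which is a remote (e.g. NFS) mount, so the image and edit log can be recovered if the namenode box dies. A minimal sketch of the relevant `hadoop-site.xml` entries (the paths here are illustrative, not prescriptive):

```xml
<configuration>
  <!-- dfs.name.dir accepts a comma-separated list of directories;
       the namenode writes its fsimage and edits to every one of them.
       Putting one copy on an NFS mount gives you an off-box backup. -->
  <property>
    <name>dfs.name.dir</name>
    <value>/local/hadoop/name,/mnt/nfs/hadoop/name</value>
  </property>

  <!-- Where the secondary namenode stores its periodic checkpoints,
       which can also be used to rebuild a lost namenode. -->
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/local/hadoop/namesecondary</value>
  </property>
</configuration>
```

This is still manual recovery, not HA: if the namenode host fails, you bring up a replacement pointed at the surviving copy (or the secondary's checkpoint). That restart window is exactly the downtime the batch-versus-serving distinction above is about.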