On Fri, Oct 07, 2011 at 10:17AM, Steve Loughran wrote:
> On 06/10/2011 17:49, [email protected] wrote:
>> Steve,
>>
>>> Summary: I'm not sure that HDFS is the right FS in this world, as it
>>> contains a lot of assumptions about system stability and HDD persistence
>>> that aren't valid any more. With the ability to plug in new placers you
>>> could do tricks like ensure 1 replica lives in a persistent blockstore
>>> (and rely on it always being there), and add other replicas in transient
>>> storage if the data is about to be needed in jobs.
>>
>> Can you please shed more light on the statement "... as it
>> contains a lot of assumptions about system stability and HDD persistence
>> that aren't valid any more..."?
>>
>> I know that you were doing some analysis of disk failure modes some time
>> ago. Is this the result of that research? I am very interested.
>
> no, it's unrelated - experience in hosting virtual hadoop
> infrastructures, which is how my short-lived clusters exist today:
>
> -you don't know the hostname of the master nodes until allocated, so you
> need to allocate them and dynamically push out configs to the workers

This of course is a big win for non-autodiscoverable architecture ;)
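To make the "dynamically push out configs" step concrete, here is a rough sketch of the kind of glue one ends up writing once the masters are allocated. The hostnames, ports and output file name below are illustrative placeholders, not anything from Steve's actual setup:

    import java.io.FileOutputStream;
    import java.io.OutputStream;

    import org.apache.hadoop.conf.Configuration;

    // Sketch: once the master VMs are allocated, bake their hostnames into
    // a generated client config and let the provisioning layer ship it to
    // the workers (scp, puppet, whatever).
    public class PushConfig {
        public static void main(String[] args) throws Exception {
            String namenodeHost = args[0];    // known only after allocation
            String jobtrackerHost = args[1];

            Configuration conf = new Configuration(false);
            conf.set("fs.default.name", "hdfs://" + namenodeHost + ":8020");
            conf.set("mapred.job.tracker", jobtrackerHost + ":8021");

            // Old-style single-file config, in the spirit of hadoop-site.xml.
            OutputStream out = new FileOutputStream("hadoop-site-generated.xml");
            try {
                conf.writeXml(out);
            } finally {
                out.close();
            }
        }
    }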
> -the Datanodes spin when the namenode goes down, forever, rather than
> checking somewhere to see if it's changed. HDFS HA may fix that.
..
> -again, the TaskTrackers spin when the JT goes down, rather than look to
> see if it's moved.
..
> -Blacklisting isn't the right way to deal with task tracker failures:
> termination of VM is.

See my above comment. Auto-discovery would solve a lot of these issues, and
many others such as shared distributed memory suitable for config
management, etc.

Cos
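P.S. A minimal sketch of what I mean by auto-discovery, assuming the active namenode address were published under a ZooKeeper znode - the znode path and quorum string are made up for illustration. A worker would look the address up (and could watch it) instead of spinning against a hostname that no longer answers:

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    // Sketch: ask ZooKeeper where the active namenode currently lives,
    // instead of retrying a fixed, possibly dead hostname forever.
    public class NamenodeLocator {
        private static final String ZK_QUORUM = "zk1:2181,zk2:2181,zk3:2181";
        private static final String NN_ZNODE  = "/hadoop/namenode-address";

        private final ZooKeeper zk;

        public NamenodeLocator() throws Exception {
            zk = new ZooKeeper(ZK_QUORUM, 30000, new Watcher() {
                public void process(WatchedEvent event) {
                    // session/connection events; a real client handles expiry here
                }
            });
        }

        // Returns "host:port" of whichever namenode is registered right now.
        public String currentNamenode() throws Exception {
            byte[] data = zk.getData(NN_ZNODE, false, null);
            return new String(data, "UTF-8");
        }
    }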
