+1

I was initially concerned about the overhead of having to install separate packages for each component, but in some ways it will make things clearer. Folks on the user list often ask how to use HDFS by itself, for instance, or even whether it is possible. Splitting the project up would make it clear that HDFS and MapReduce can each be used without the other (although of course they are best used together). I can also see some benefit in having separate configuration for HDFS and MapReduce, since it would make the configuration files smaller and more manageable (something like hdfs-(default|site).xml and mapreduce-(default|site).xml).
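Just to illustrate the shape this might take (the exact property split is undecided, so treat these file names and values as hypothetical):

  <!-- hdfs-site.xml: HDFS-only settings -->
  <configuration>
    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>
  </configuration>

  <!-- mapreduce-site.xml: MapReduce-only settings -->
  <configuration>
    <property>
      <name>mapred.reduce.tasks</name>
      <value>10</value>
    </property>
  </configuration>

Each daemon would then only need to read the file for its own subproject, rather than wading through one monolithic hadoop-site.xml.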
It's not totally clear to me how Core fits into this. It's just a jar file and doesn't have daemons, so it should be bundled with the MapReduce and HDFS releases, shouldn't it?

Nigel Daley <[EMAIL PROTECTED]> wrote:
> How will unit tests be divided? For instance, will all three have to have
> MiniDFSCluster and other shared test infrastructure?

Today the tests for core, hdfs and mapred are under one source tree because they are so tightly intertwined. I think the goal should be to have independent unit tests for each module, as well as integration tests that check that MapReduce works with HDFS (see the sketch below). We should do this even if we don't split the projects.
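As a rough sketch of the kind of integration test I mean, using the MiniDFSCluster and MiniMRCluster helpers from the current test tree (treat the package and constructor details as illustrative, not a settled API):

  import junit.framework.TestCase;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.hdfs.MiniDFSCluster;
  import org.apache.hadoop.mapred.MiniMRCluster;

  public class TestMapReduceOnHDFS extends TestCase {
    public void testJobRunsAgainstDFS() throws Exception {
      Configuration conf = new Configuration();
      // Bring up a 2-datanode HDFS cluster, formatting a fresh filesystem
      MiniDFSCluster dfs = new MiniDFSCluster(conf, 2, true, null);
      MiniMRCluster mr = null;
      try {
        FileSystem fs = dfs.getFileSystem();
        // Point a 2-tasktracker MapReduce cluster at the mini DFS
        mr = new MiniMRCluster(2, fs.getUri().toString(), 1);
        // ... write input into fs, submit a job with JobClient.runJob(...),
        //     and assert on the output files ...
      } finally {
        if (mr != null) mr.shutdown();
        dfs.shutdown();
      }
    }
  }

A test like this would live in the integration suite, leaving the per-module unit tests free of cross-project dependencies.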
Tom