For streaming data to HBase 0.20.6, Hadoop 0.20 is required. If you are using sequence files and MapReduce, then Hadoop 0.18+ should work fine. I would like to move forward with MapReduce on tables instead of MapReduce on files to reduce data processing latency. Hence, I am making HBase a required component of trunk for a full end-to-end deployment. The setup may sound difficult, but it is trivial to set up. In the long run, demux does not need to be a MapReduce job, and demux will only get optimized for incremental updates once developers start to adopt the table approach. Consider this migration as planning for a better future. :) For now, the existing Chukwa 0.4 approach still works in trunk, but I am not enhancing the batch model.
Regards,
Eric

On 10/11/10 1:11 PM, "Bill Graham" <[email protected]> wrote:

> Hi Eric,
>
> I read this over and have a few comments:
>
> - Prerequisites say that Hadoop 0.20+ is required, which I think isn't
> entirely true. Agents, collectors and the data processor processes all
> can run with 0.18.3. I thought HICC was the only thing that required
> 0.20, no? Either way, we should clarify which components require what
> version.
>
> - You talk about a minimal Chukwa install including HBase, which also
> seems misleading. Will HBase be required to use Chukwa going forward?
> If not, no need to make the barrier for entry sound higher than it
> needs to be IMO.
>
> thanks,
> Bill
>
> On Sat, Oct 9, 2010 at 6:47 PM, Eric Yang <[email protected]> wrote:
>> Hi,
>>
>> I have written some instructions on how to deploy Chukwa 0.5
>> pre-release system with HBase. Instructions are available here:
>>
>> http://wiki.apache.org/hadoop/Chukwa_Quick_Start
>>
>> Suggestions and feedback are welcome.
>>
>> regards,
>> Eric
