Hi all, I have a small HBase cluster that I recently filled with about 500M records (some of them quite large). One thing I notice when I run different types of MapReduce jobs over my table is that the network becomes a bottleneck. Currently I am running single gigabit Ethernet on this cluster, but it has 4 network ports.
My question is this: is it possible to set up Hadoop/HBase to take advantage of multiple networks connecting the computers? Could I specify multiple network connections in the config file? Would it make sense to put the region servers on a different network than the data nodes? Or would it be more efficient to bond multiple channels at the OS level?

Thanks for any suggestions,
Dave
