Hi,

I am a newbie to Hadoop, HBase and Hive. I installed Hadoop, HBase and Hive
in pseudo-distributed mode and everything works fine. Now I am planning to
set up a simple Hadoop cluster (5 nodes) with Hive, HBase and ZooKeeper.
I've read several guides and instructions, but I could not
find a good explanation for my question: I'm not sure where to run all the
daemons. This is my plan:

*Node_1* (Master)

   - NameNode
   - JobTracker
   - HBase Master
   - ZooKeeper (standalone node; managed by HBase)



*Node_2* (Backup_Master)

   - SecondaryNameNode



*Node_3* (Slave1)

   - DataNode1
   - TaskTracker1
   - RegionServer1



*Node_4* (Slave2)

   - DataNode2
   - TaskTracker2
   - RegionServer2



*Node_5* (Slave3)

   - DataNode3
   - TaskTracker3
   - RegionServer3
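
In terms of configuration files, I assume this layout would look roughly like the sketch below. The hostnames node1 to node5 and the `<port>` placeholders are made up for illustration, and the XML properties are abbreviated to `name = value`, so this is only my understanding, not a tested setup:

```
# Hadoop conf/core-site.xml (NameNode on Node_1)
fs.default.name = hdfs://node1:<port>

# Hadoop conf/mapred-site.xml (JobTracker on Node_1)
mapred.job.tracker = node1:<port>

# Hadoop conf/masters (hosts that run the SecondaryNameNode)
node2

# Hadoop conf/slaves (hosts that run DataNode + TaskTracker)
node3
node4
node5

# HBase conf/regionservers (hosts that run a RegionServer)
node3
node4
node5

# HBase conf/hbase-env.sh (let HBase start and stop ZooKeeper itself)
export HBASE_MANAGES_ZK=true

# HBase conf/hbase-site.xml (relevant properties only)
hbase.cluster.distributed = true
hbase.zookeeper.quorum    = node1
```

With this, the HBase Master and the single ZooKeeper instance would both end up on node1, started from there by HBase's start script.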


I know that in production it is recommended to run a ZooKeeper ensemble on an odd
number of nodes (as a separate cluster). But for a simple cluster, is it OK to
set up a standalone ZooKeeper node that runs on the master node?
My other question is about Hive: I know that Hive is a Hadoop client.
Should I install Hive on the master node as well? Does that make sense?

Thanks for all tips and comments!

Hakan

Note: I have just 5 machines to simulate a cluster.
