Mohammad,

The NN is low write and has pretty static memory usage. You will see the NN memory usage go up as you add blocks/files, but since HBase has memory limitations (GC's fault) and should have ~1 file per store, you will not have a lot of memory pressure on the NN. The JT is the same way: it scales up usage based on the number of MR jobs, and in a sane HBase environment you are not going to be running 1000s of MR jobs against HBase. ZK also has pretty minimal requirements - 1 GB of memory, a dedicated CPU core, and a place to write to with low I/O wait. I have always found the NN, SNN, and JT to be the next best place to put ZK if dedicated hardware is not available. I have seen some strange behavior when ZK runs on DN/TT/RS nodes, from unexplained timeouts to corrupt znodes causing failures (that one was real nasty).
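If it helps, here is roughly what a zoo.cfg for that layout could look like (a sketch only - the hostnames and the dataDir path below are made up; the point is that dataDir sits on a disk nothing else writes to):

  # zoo.cfg, one copy on each of the three quorum members
  tickTime=2000
  initLimit=10
  syncLimit=5
  # dataDir on a dedicated disk so ZK never waits on other I/O
  dataDir=/zk/data
  clientPort=2181
  # ensemble co-located with the NN, SNN and JT (example hostnames)
  server.1=nn.example.com:2888:3888
  server.2=snn.example.com:2888:3888
  server.3=jt.example.com:2888:3888

Each node also needs a myid file under dataDir with its server number, and hbase.zookeeper.quorum in hbase-site.xml should list all three hosts (with HBASE_MANAGES_ZK=false so HBase does not try to start its own ZK).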
On Sat, Jun 22, 2013 at 7:21 PM, Mohammad Tariq <[email protected]> wrote:

> Hello Iain,
>
> You would put a lot of pressure on the RAM if you do that. The NN
> already has a high memory requirement, and having JT+ZK on the same
> machine as well would be too heavy, IMHO.
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Sun, Jun 23, 2013 at 4:07 AM, iain wright <[email protected]> wrote:
>
> > Hi Mohammad,
> >
> > I am curious why you chose not to put the third ZK on the NN+JT? I was
> > planning on doing that on a new cluster and want to confirm it would be
> > okay.
> >
> > --
> > Iain Wright
> > Cell: (562) 852-5916
> > <http://www.labctsi.org/>
> >
> >
> > On Sat, Jun 22, 2013 at 10:05 AM, Mohammad Tariq <[email protected]> wrote:
> >
> > > Yeah, I forgot to mention that the no. of ZKs should be odd. Perhaps
> > > those parentheses made that statement look optional. Just to clarify,
> > > it was mandatory.
> > >
> > > Warm Regards,
> > > Tariq
> > > cloudfront.blogspot.com
> > >
> > >
> > > On Sat, Jun 22, 2013 at 9:45 PM, Kevin O'dell <[email protected]> wrote:
> > >
> > > > If you run ZK with a DN/TT/RS, please make sure to dedicate a hard
> > > > drive and a core to the ZK process. I have seen many strange
> > > > occurrences.
> > > >
> > > > On Jun 22, 2013 12:10 PM, "Jean-Marc Spaggiari" <[email protected]> wrote:
> > > >
> > > > > You HAVE TO run a ZK3, or else you don't need to have ZK2 and any
> > > > > ZK failure will be an issue. You need to have an odd number of ZK
> > > > > servers...
> > > > >
> > > > > Also, if you don't run MR jobs, you don't need the TT and JT...
> > > > > Otherwise, everything below is correct. But there are many other
> > > > > options; it all depends on your needs and the hardware you have ;)
> > > > >
> > > > > JM
> > > > >
> > > > > 2013/6/22 Mohammad Tariq <[email protected]>:
> > > > > > With 8 machines you can do something like this:
> > > > > >
> > > > > > Machine 1 - NN+JT
> > > > > > Machine 2 - SNN+ZK1
> > > > > > Machine 3 - HM+ZK2
> > > > > > Machines 4-8 - DN+TT+RS
> > > > > > (You can run ZK3 on a slave node with some additional memory.)
> > > > > >
> > > > > > DN and RS run on the same machine. Although RSs are said to hold
> > > > > > the data, the data is actually stored in DNs. Replication is
> > > > > > managed at the HDFS level. You don't have to worry about that.
> > > > > >
> > > > > > You can visit this link
> > > > > > <http://hbase.apache.org/book/perf.writing.html> to see how to
> > > > > > write efficiently into HBase. With a small field there should not
> > > > > > be any problem except storage and increased metadata, as you'll
> > > > > > have many small cells. If possible, club several small fields
> > > > > > into one and put them together in one cell.
> > > > > >
> > > > > > HTH
> > > > > >
> > > > > > Warm Regards,
> > > > > > Tariq
> > > > > > cloudfront.blogspot.com
> > > > > >
> > > > > >
> > > > > > On Sat, Jun 22, 2013 at 8:31 PM, myhbase <[email protected]> wrote:
> > > > > >
> > > > > > > Thanks for your response.
> > > > > > >
> > > > > > > Now if 5 servers are enough, how can I install and configure
> > > > > > > my nodes? If I need 3 replicas in case of data loss, I should
> > > > > > > have at least 3 datanodes; we still have the namenode,
> > > > > > > regionserver, HMaster and zookeeper nodes, so some of them
> > > > > > > must be installed on the same machine. The datanode seems to
> > > > > > > be the disk-I/O-sensitive node while the region server is
> > > > > > > memory-sensitive; can I install them on the same machine? Any
> > > > > > > suggestion on the deployment plan?
> > > > > > >
> > > > > > > My business requirement is that writes far outnumber reads
> > > > > > > (7:3), and I have another concern: one field will be 8~15 KB
> > > > > > > in size, and I am not sure whether that will cause any problem
> > > > > > > in HBase when it runs compactions and splits regions.
> > > > > > >
> > > > > > > > Oh, you already have heavyweight's input :).
> > > > > > > >
> > > > > > > > Thanks JM.
> > > > > > > >
> > > > > > > > Warm Regards,
> > > > > > > > Tariq
> > > > > > > > cloudfront.blogspot.com
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sat, Jun 22, 2013 at 8:05 PM, Mohammad Tariq <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hello there,
> > > > > > > > >
> > > > > > > > > IMHO, 5-8 servers are sufficient to start with. But it's
> > > > > > > > > all relative to the data you have and the intensity of
> > > > > > > > > your reads/writes. You should have different strategies,
> > > > > > > > > though, based on whether it's 'read' or 'write'. You
> > > > > > > > > actually can't define 'big' in absolute terms. My cluster
> > > > > > > > > might be big for me, but for someone else it might still
> > > > > > > > > not be big enough, or for someone it might be very big.
> > > > > > > > > Long story short, it depends on your needs. If you are
> > > > > > > > > able to achieve your goal with 5-8 RSs, then having more
> > > > > > > > > machines would be a waste, I think.
> > > > > > > > >
> > > > > > > > > But you should always keep in mind that HBase is kinda
> > > > > > > > > greedy when it comes to memory. For a decent load 4G is
> > > > > > > > > sufficient, IMHO. But it again depends on the operations
> > > > > > > > > you are gonna perform. If you have large clusters where
> > > > > > > > > you are planning to run MR jobs frequently, you are better
> > > > > > > > > off with an additional 2G.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Warm Regards,
> > > > > > > > > Tariq
> > > > > > > > > cloudfront.blogspot.com
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sat, Jun 22, 2013 at 7:51 PM, myhbase <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hello All,
> > > > > > > > > >
> > > > > > > > > > I have learned HBase almost entirely from papers and
> > > > > > > > > > books. According to my understanding, HBase is the kind
> > > > > > > > > > of architecture which is more applicable to a big
> > > > > > > > > > cluster: we should have many HDFS nodes and many
> > > > > > > > > > HBase (region server) nodes.
> > > > > > > > > > If we only have several servers (5-8), it seems HBase
> > > > > > > > > > is not a good choice; please correct me if I am wrong.
> > > > > > > > > > In addition, at how many nodes can we usually start to
> > > > > > > > > > consider an HBase solution, and what should the physical
> > > > > > > > > > memory size and other hardware resources of each node
> > > > > > > > > > be? Any reference documents or cases? Thanks.
> > > > > > > > > >
> > > > > > > > > > --Ning


--
Kevin O'Dell
Systems Engineer, Cloudera
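P.S. Regarding the "club several small fields into one cell" advice quoted above, a rough sketch with the 0.94-era Java client could look like the following (the table, family, qualifier and values are all made up for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PackedCellExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");   // hypothetical table name
        try {
            Put put = new Put(Bytes.toBytes("row-0001"));
            // Instead of one tiny cell per field, e.g.
            //   put.add(Bytes.toBytes("d"), Bytes.toBytes("city"), Bytes.toBytes("Tokyo"));
            //   put.add(Bytes.toBytes("d"), Bytes.toBytes("zip"),  Bytes.toBytes("100-0001"));
            // pack the small fields into a single value and write one cell, so the
            // per-cell key (row key + family + qualifier + timestamp) is stored once.
            String packed = "Tokyo|100-0001|JP-13";
            put.add(Bytes.toBytes("d"), Bytes.toBytes("addr"), Bytes.toBytes(packed));
            table.put(put);
        } finally {
            table.close();
        }
    }
}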
