If you run ZK with a DN/TT/RS, please make sure to dedicate a hard drive and a core to the ZK process. I have seen many strange occurrences otherwise.

On Jun 22, 2013 12:10 PM, "Jean-Marc Spaggiari" <jean-m...@spaggiari.org> wrote:
> You HAVE TO run a ZK3, or else you don't need to have ZK2 and any ZK
> failure will be an issue. You need to have an odd number of ZK
> servers...
>
> Also, if you don't run MR jobs, you don't need the TT and JT... Else,
> everything below is correct. But there are many other options; it all
> depends on your needs and the hardware you have ;)
>
> JM
>
> 2013/6/22 Mohammad Tariq <donta...@gmail.com>:
> > With 8 machines you can do something like this :
> >
> > Machine 1 - NN+JT
> > Machine 2 - SNN+ZK1
> > Machine 3 - HM+ZK2
> > Machine 4-8 - DN+TT+RS
> > (You can run ZK3 on a slave node with some additional memory).
> >
> > DN and RS run on the same machine. Although RSs are said to hold the
> > data, the data is actually stored in DNs. Replication is managed at HDFS
> > level. You don't have to worry about that.
> >
> > You can visit this link <http://hbase.apache.org/book/perf.writing.html>
> > to see how to write efficiently into HBase. With a small field there
> > should not be any problem except storage and increased metadata, as
> > you'll have many small cells. If possible club several small fields into
> > one and put them together in one cell.
> >
> > HTH
> >
> > Warm Regards,
> > Tariq
> > cloudfront.blogspot.com
> >
> >
> > On Sat, Jun 22, 2013 at 8:31 PM, myhbase <myhb...@126.com> wrote:
> >
> >> Thanks for your response.
> >>
> >> Now if 5 servers are enough, how can I install and configure my nodes?
> >> If I need 3 replicas to guard against data loss, I should have at least
> >> 3 datanodes. We still have the namenode, regionserver, HMaster and
> >> zookeeper nodes, so some of them must be installed on the same machine.
> >> The datanode seems to be the disk-IO-sensitive node while the region
> >> server is the memory-sensitive one; can I install them on the same
> >> machine? Any suggestion on the deployment plan?
> >>
> >> My business requirement is that writes far outnumber reads (7:3), and
> >> I have another concern: one field will be 8~15KB in size, and I am not
> >> sure whether that will cause any problem in HBase when it runs
> >> compactions and region splits.
> >>
> >>> Oh, you already have heavyweight's input :).
> >>>
> >>> Thanks JM.
> >>>
> >>> Warm Regards,
> >>> Tariq
> >>> cloudfront.blogspot.com
> >>>
> >>>
> >>> On Sat, Jun 22, 2013 at 8:05 PM, Mohammad Tariq <donta...@gmail.com>
> >>> wrote:
> >>>
> >>>> Hello there,
> >>>>
> >>>> IMHO, 5-8 servers are sufficient to start with. But it's all relative
> >>>> to the data you have and the intensity of your reads/writes. You
> >>>> should have different strategies though, based on whether it's 'read'
> >>>> or 'write'. You actually can't define 'big' in absolute terms. My
> >>>> cluster might be big for me, but for someone else it might still be
> >>>> not big enough, or for someone it might be very big. Long story short,
> >>>> it depends on your needs. If you are able to achieve your goal with
> >>>> 5-8 RSs, then having more machines will be a waste, I think.
> >>>>
> >>>> But you should always keep in mind that HBase is kinda greedy when it
> >>>> comes to memory. For a decent load 4G is sufficient, IMHO. But it
> >>>> again depends on the operations you are gonna perform. If you have
> >>>> large clusters where you are planning to run MR jobs frequently, you
> >>>> are better off with an additional 2G.
> >>>>
> >>>>
> >>>> Warm Regards,
> >>>> Tariq
> >>>> cloudfront.blogspot.com
> >>>>
> >>>>
> >>>> On Sat, Jun 22, 2013 at 7:51 PM, myhbase <myhb...@126.com> wrote:
> >>>>
> >>>>> Hello All,
> >>>>>
> >>>>> I have learned HBase mostly from papers and books. According to my
> >>>>> understanding, HBase is the kind of architecture that is more
> >>>>> applicable to a big cluster. We should have many HDFS nodes, and many
> >>>>> HBase (region server) nodes.
> >>>>> If we only have several servers (5-8), it seems HBase is
> >>>>> not a good choice; please correct me if I am wrong. In addition, at
> >>>>> how many nodes can we usually start to consider the HBase solution,
> >>>>> and what about the physical memory size and other hardware resources
> >>>>> in each node? Any reference documents or cases? Thanks.
> >>>>>
> >>>>> --Ning
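
A footnote on the "odd number of ZK servers" advice quoted above: a ZooKeeper ensemble stays available only while a strict majority of its servers is up, which is why running ZK2 without ZK3 buys nothing. Below is a minimal Python sketch of that arithmetic. The function names are hypothetical and made up for illustration; they are not part of any ZooKeeper or HBase API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with `ensemble_size` servers."""
    if ensemble_size < 1:
        raise ValueError("ensemble must have at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a strict majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare ensemble sizes 1 through 5 (hypothetical sizes, for illustration).
    for n in range(1, 6):
        print(f"{n} ZK server(s): quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

With two servers the quorum is two, so losing either one stops ZK, exactly as with a single server. The third server is what buys tolerance of one failure, and a fourth adds none beyond that.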