Hello Iain,
You would put a lot of pressure on the RAM if you do that. The NN already
has high memory requirements, and then having the JT+ZK on the same
machine would be too heavy, IMHO.
Warm Regards,
Tariq
cloudfront.blogspot.com
On Sun, Jun 23, 2013 at 4:07 AM, iain wright <[email protected]> wrote:
> Hi Mohammad,
>
> I am curious why you chose not to put the third ZK on the NN+JT? I was
> planning on doing that on a new cluster and want to confirm it would be
> okay.
>
>
> --
> Iain Wright
> Cell: (562) 852-5916
>
> <http://www.labctsi.org/>
>
>
> On Sat, Jun 22, 2013 at 10:05 AM, Mohammad Tariq <[email protected]> wrote:
>
> > Yeah, I forgot to mention that the no. of ZKs should be odd. Perhaps the
> > parentheses made that statement look optional. Just to clarify, it is
> > mandatory.
> >
> > Warm Regards,
> > Tariq
> > cloudfront.blogspot.com
> >
> >
> > On Sat, Jun 22, 2013 at 9:45 PM, Kevin O'dell <[email protected]> wrote:
> >
> > > If you run ZK with a DN/TT/RS, please make sure to dedicate a hard drive
> > > and a core to the ZK process. I have seen many strange occurrences.
> > > On Jun 22, 2013 12:10 PM, "Jean-Marc Spaggiari" <[email protected]>
> > > wrote:
> > >
> > > > You HAVE TO run a ZK3; otherwise you don't need to have ZK2, and any ZK
> > > > failure will be an issue. You need to have an odd number of ZK
> > > > servers...
> > > >
> > > > Also, if you don't run MR jobs, you don't need the TT and JT... Else,
> > > > everything below is correct. But there are many other options; it all
> > > > depends on your needs and the hardware you have ;)
> > > >
> > > > JM
> > > >
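JM's odd-number rule follows from ZooKeeper's majority quorum: an ensemble of n servers stays available only while a strict majority survives, so n servers tolerate only (n - quorum) failures. A minimal sketch of that arithmetic (plain Python, no ZooKeeper required):

```python
# Majority-quorum arithmetic behind the "odd number of ZK servers" rule.

def quorum_size(n):
    """Smallest strict majority of an ensemble of n servers."""
    return n // 2 + 1

def tolerated_failures(n):
    """How many servers can fail while a majority still survives."""
    return n - quorum_size(n)

for n in range(1, 6):
    print(n, "servers -> tolerates", tolerated_failures(n), "failure(s)")
```

Note that 2 servers tolerate zero failures, the same as 1, which is why running ZK2 without a ZK3 buys you nothing.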
> > > > 2013/6/22 Mohammad Tariq <[email protected]>:
> > > > > With 8 machines you can do something like this:
> > > > >
> > > > > Machine 1 - NN+JT
> > > > > Machine 2 - SNN+ZK1
> > > > > Machine 3 - HM+ZK2
> > > > > Machine 4-8 - DN+TT+RS
> > > > > (You can run ZK3 on a slave node with some additional memory.)
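With ZK1, ZK2, and ZK3 placed as above, the ensemble is declared identically in every ZooKeeper server's zoo.cfg. A sketch under stated assumptions: the hostnames below are invented placeholders for machines 2, 3, and the chosen slave, and the timing values are common defaults, not recommendations from this thread:

```
# zoo.cfg -- identical on all three ZooKeeper hosts (hostnames are examples)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=snn-host:2888:3888
server.2=hm-host:2888:3888
server.3=slave5-host:2888:3888
```

Each host additionally needs a `myid` file in `dataDir` containing just its own server number (1, 2, or 3).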
> > > > >
> > > > > DN and RS run on the same machine. Although RSs are said to hold the
> > > > > data, the data is actually stored in DNs. Replication is managed at
> > > > > the HDFS level. You don't have to worry about that.
> > > > >
> > > > > You can visit this link <http://hbase.apache.org/book/perf.writing.html>
> > > > > to see how to write efficiently into HBase. With a small field there
> > > > > should not be any problem except storage and increased metadata, as
> > > > > you'll have many small cells. If possible, club several small fields
> > > > > into one and put them together in one cell.
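Clubbing several small fields into one cell can be done client-side by serializing them into a single value before the Put. A sketch in plain Python; the length-prefixed encoding here is just one illustrative choice, not an HBase API:

```python
import struct

# Pack several small string fields into one value destined for a single
# HBase cell, using a 4-byte length prefix per field so the blob can be
# split apart again on read.

def pack_fields(fields):
    out = bytearray()
    for f in fields:
        raw = f.encode("utf-8")
        out += struct.pack(">I", len(raw))  # big-endian length prefix
        out += raw
    return bytes(out)

def unpack_fields(blob):
    fields, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        fields.append(blob[pos:pos + n].decode("utf-8"))
        pos += n
    return fields

cell_value = pack_fields(["user42", "2013-06-22", "click"])
assert unpack_fields(cell_value) == ["user42", "2013-06-22", "click"]
```

One cell then carries one KeyValue's worth of key overhead instead of three, at the cost of rewriting the whole blob when any field changes.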
> > > > >
> > > > > HTH
> > > > >
> > > > > Warm Regards,
> > > > > Tariq
> > > > > cloudfront.blogspot.com
> > > > >
> > > > >
> > > > > On Sat, Jun 22, 2013 at 8:31 PM, myhbase <[email protected]> wrote:
> > > > >
> > > > >> Thanks for your response.
> > > > >>
> > > > >> Now, if 5 servers are enough, how should I install and configure my
> > > > >> nodes? If I need 3 replicas in case of data loss, I should have at
> > > > >> least 3 datanodes; we still have the namenode, regionserver, HMaster,
> > > > >> and zookeeper nodes, so some of them must be installed on the same
> > > > >> machine. The datanode seems to be the disk-IO-sensitive node while
> > > > >> the region server is the memory-sensitive one; can I install them on
> > > > >> the same machine? Any suggestion on the deployment plan?
> > > > >>
> > > > >> My business requirement is that writes far outnumber reads (7:3),
> > > > >> and I have another concern: I have a field which will be 8~15KB in
> > > > >> size. I am not sure whether there will be any problem in HBase when
> > > > >> it runs compactions and splits regions.
> > > > >>
> > > > >> Oh, you already have a heavyweight's input :).
> > > > >>>
> > > > >>> Thanks JM.
> > > > >>>
> > > > >>> Warm Regards,
> > > > >>> Tariq
> > > > >>> cloudfront.blogspot.com
> > > > >>>
> > > > >>>
> > > > >>> On Sat, Jun 22, 2013 at 8:05 PM, Mohammad Tariq <[email protected]>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> Hello there,
> > > > >>>>
> > > > >>>> IMHO, 5-8 servers are sufficient to start with. But it's all
> > > > >>>> relative to the data you have and the intensity of your
> > > > >>>> reads/writes. You should have different strategies though, based
> > > > >>>> on whether it's 'read' or 'write'. You actually can't define 'big'
> > > > >>>> in absolute terms. My cluster might be big for me, but for someone
> > > > >>>> else it might still not be big enough, or for someone it might be
> > > > >>>> very big. Long story short, it depends on your needs. If you are
> > > > >>>> able to achieve your goal with 5-8 RSs, then having more machines
> > > > >>>> would be wasteful, I think.
> > > > >>>>
> > > > >>>> But you should always keep in mind that HBase is kinda greedy
> > > > >>>> when it comes to memory. For a decent load 4G is sufficient, IMHO.
> > > > >>>> But it again depends on the operations you are gonna perform. If
> > > > >>>> you have large clusters where you are planning to run MR jobs
> > > > >>>> frequently, you are better off with an additional 2G.
> > > > >>>>
> > > > >>>>
> > > > >>>> Warm Regards,
> > > > >>>> Tariq
> > > > >>>> cloudfront.blogspot.com
> > > > >>>>
> > > > >>>>
> > > > >>>> On Sat, Jun 22, 2013 at 7:51 PM, myhbase <[email protected]> wrote:
> > > > >>>>
> > > > >>>>> Hello All,
> > > > >>>>>
> > > > >>>>> I have learned HBase almost entirely from papers and books.
> > > > >>>>> According to my understanding, HBase is the kind of architecture
> > > > >>>>> that is more applicable to a big cluster. We should have many HDFS
> > > > >>>>> nodes and many HBase (region server) nodes. If we only have
> > > > >>>>> several servers (5-8), it seems HBase is not a good choice;
> > > > >>>>> please correct me if I am wrong. In addition, at how many nodes
> > > > >>>>> can we usually start to consider the HBase solution, and how
> > > > >>>>> about the physical memory size and other hardware resources on
> > > > >>>>> each node? Any reference documents or cases? Thanks.
> > > > >>>>>
> > > > >>>>> --Ning
> > > > >>>>>