RE: Hbase / Hadoop Tuning

Jim Kellerman (POWERSET) Thu, 02 Oct 2008 12:30:25 -0700

You are storing only about 140,000,000 bytes (100,000 rows of
1,400 bytes each), so having multiple region servers will not
help: a single region is served by only one region server. By
default, a region splits when it reaches 256MB, so until the
region splits, all traffic goes to a single region server. You
might try encouraging region splitting by reducing the maximum
file size, changing the value of hbase.hregion.max.filesize to
64MB (the property is specified in bytes, i.e. 67108864).
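
The arithmetic behind this can be checked with a minimal sketch. It
assumes the row count and row size quoted below (100,000 rows of
1,400 bytes) and ignores the overhead of row keys, column names and
timestamps, so it is only a rough estimate. The class and method
names are made up for illustration and are not part of HBase.

```java
public class RegionEstimate {
    static final long MB = 1024L * 1024L;

    // Raw payload size: rows times bytes per row (no key/timestamp overhead).
    static long totalBytes(long rows, long bytesPerRow) {
        return rows * bytesPerRow;
    }

    // True if the payload alone is larger than the configured maximum
    // file size, i.e. large enough that a split could be triggered.
    static boolean exceedsSplitThreshold(long payloadBytes, long maxFileSizeBytes) {
        return payloadBytes > maxFileSizeBytes;
    }

    public static void main(String[] args) {
        long payload = totalBytes(100_000L, 1_400L);
        System.out.println("payload bytes: " + payload);
        System.out.println("exceeds 256MB threshold: "
                + exceedsSplitThreshold(payload, 256 * MB));
        System.out.println("exceeds 64MB threshold: "
                + exceedsSplitThreshold(payload, 64 * MB));
    }
}
```

With the default threshold the whole data set fits in one region;
with 64MB it does not, which is why lowering the value should lead
to a split during the load.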


Using a single client will also limit write performance.
Even if the client is multi-threaded, there is a big giant lock
in the RPC mechanism which prevents concurrent requests (this
is something we plan to fix in the future).

Multiple clients do not block one another the way the threads of
a multi-threaded client currently do. So another way to increase
write performance would be to run multiple (HBase, not web) clients,
either by running multiple processes directly or by using
a Map/Reduce job to do the writes.
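
One way to split the work among separate client processes is to give
each process its own contiguous range of rows. The sketch below shows
only that partitioning step; it assumes rows can be numbered
0..totalRows-1, and the class name, method name and the choice of 6
clients are hypothetical. The actual HBase write calls are left out.

```java
public class ClientPartition {
    // Returns {startInclusive, endExclusive} for one client process.
    // The first (totalRows % clientCount) clients get one extra row.
    static long[] rangeFor(int clientIndex, int clientCount, long totalRows) {
        if (clientCount <= 0 || clientIndex < 0 || clientIndex >= clientCount) {
            throw new IllegalArgumentException("bad client index or count");
        }
        long base = totalRows / clientCount;
        long extra = totalRows % clientCount;
        long start = clientIndex * base + Math.min(clientIndex, extra);
        long size = base + (clientIndex < extra ? 1 : 0);
        return new long[] { start, start + size };
    }

    public static void main(String[] args) {
        // Usage: java ClientPartition <clientIndex> <clientCount>
        int clientIndex = args.length > 0 ? Integer.parseInt(args[0]) : 0;
        int clientCount = args.length > 1 ? Integer.parseInt(args[1]) : 6;
        long[] range = rangeFor(clientIndex, clientCount, 100_000L);
        System.out.println("client " + clientIndex + " writes rows ["
                + range[0] + ", " + range[1] + ")");
        // The HBase writes for rows range[0] .. range[1]-1 would go here.
    }
}
```

Each process would be started with a different clientIndex, so no two
processes write the same row and no process shares an RPC connection
with another.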

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)


> -----Original Message-----
> From: Slava Gorelik [mailto:[EMAIL PROTECTED]
> Sent: Thursday, October 02, 2008 12:07 PM
> To: [email protected]
> Subject: Re: Hbase / Hadoop Tuning
>
> Hi. Thank you for the quick response.
> We are using 7 machines (6 running Red Hat 5 and 1 running SUSE
> Enterprise 10).
> Each machine has 4 CPUs, 4 GB of RAM and a 200 GB HD, and is connected
> with a 1 Gb network interface.
> All machines are in the same rack. On one machine (the master) we are
> running Tomcat with one webapp that is adding 100000 rows. Nothing else
> is running. When the webapp is not running, the CPU load is less than 1%.
>
> We are using HBase 0.18.0 and Hadoop 0.18.0.
> The HBase cluster is one master and 6 region servers.
>
> Rows are added with BatchUpdate and committed into a single column family.
> The data is a simple byte array (1400 bytes per row).
>
>
> Thank You and Best Regards.
>
>
>
>
> On Thu, Oct 2, 2008 at 9:39 PM, stack <[EMAIL PROTECTED]> wrote:
>
> > Tell us more, Slava. HBase versions and how many regions you have in
> > your cluster?
> >
> > If small rows, your best boost will likely come when we support batching
> > of updates: HBASE-748.
> >
> > St.Ack
> >
> >
> >
> > Slava Gorelik wrote:
> >
> >> Hi All.
> >> Our environment: 8 datanodes (1 is also the namenode); 7 of them are
> >> also region servers and 1 is the master. Default replication is 3.
> >> We have an application that writes heavily, with relatively small rows
> >> of about 10 KB. Current performance is 100000 rows in 580000
> >> milliseconds, i.e. 5.8 milliseconds per row.
> >> Is there any way to improve this performance by tuning or tweaking
> >> HBase or Hadoop?
> >>
> >> Thank You and Best Regards.
> >>
> >>
> >>
> >
> >
