On Nov 30, 2007 9:46 AM, Mike Perkowitz <[EMAIL PROTECTED]> wrote:

>
> Hello! We have a web site currently built on linux/apache/mysql/php. Most
> pages do some mysql queries and then stuff the results into php/html
> templates. We've been hitting the limits of what our database can handle,
> and what we can do in realtime for the site. Our plan is to move our data
> over to Hbase, precomputing as much as we can (some queries we currently
> do with joins in mysql, for example). Our pages would then be pulling
> rows from Hbase to stuff into templates.
>
> We're still working on getting Hbase working with the amount of data we
> want to be able to handle, so haven't yet been able to test it for
> performance. Is anyone else using Hbase in this way, and what has been
> your experience with realtime performance? I haven't really seen examples
> of people using Hbase this way - another approach would be for us to use
> Hadoop/Hbase/mapreduce for computation then put results back into mysql
> or whatever for realtime access. Any experience or suggestions would be
> appreciated!


In my setup, loading roughly 313 million rows into hbase, each with about 38
bytes of column data (not including timestamps) and a variable-size row key,
takes around 6 hours to complete using a map-reduce job to do the import.  I
was able to sustain an average write rate of 15220.9 commits/s with a peak
commit rate of 44300/s.  You can take a look at the link below, which is an
attempt to graph the commit rate of my job.  I had quite a few data points,
so I took the average commits/s over a 60-second period and plotted that.

http://yogurt.org/hadoop_import.jpg
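
For what it's worth, the write path inside each task boils down to something
like the sketch below.  It's written against the Put-based HTable client API
rather than the exact commit-style calls behind the numbers above, and the
table name, column family, and row-key scheme are placeholders, so treat it
as an illustration rather than the code I actually ran:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  // Rough shape of the per-task write loop in the import job.
  // "imports", "data:col", and the row-key scheme are made-up names.
  public class ImportSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "imports");
      table.setAutoFlush(false);                  // buffer puts client-side
      table.setWriteBufferSize(12 * 1024 * 1024); // flush roughly every 12 MB
      for (long i = 0; i < 1000000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));   // variable-size row key
        put.add(Bytes.toBytes("data"), Bytes.toBytes("col"),
                Bytes.toBytes("...~38 bytes of column data..."));
        table.put(put);
      }
      table.flushCommits();  // push any buffered puts before exiting
      table.close();
    }
  }

Turning off auto-flush so puts are batched client-side, instead of doing a
round trip per row, is the main knob for write throughput here.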

Once the data was loaded, reading the same row key 1000 times took 454
seconds, and reading 1000 sequential keys using HTable.get() took 542
seconds.  There's clearly room for improvement on the reading side.
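
In case it helps, those read timings came from loops that look roughly like
this (same caveats as above: Get-based client API, placeholder table and key
names):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.util.Bytes;

  // Times two read patterns: the same key repeatedly, and 1000 sequential keys.
  public class ReadTimingSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "imports");   // placeholder table name

      long start = System.currentTimeMillis();
      for (int i = 0; i < 1000; i++) {
        Result r = table.get(new Get(Bytes.toBytes("row-42")));   // same key every time
      }
      System.out.println("same key x1000: "
          + (System.currentTimeMillis() - start) + " ms");

      start = System.currentTimeMillis();
      for (int i = 0; i < 1000; i++) {
        Result r = table.get(new Get(Bytes.toBytes("row-" + i))); // sequential keys
      }
      System.out.println("sequential keys x1000: "
          + (System.currentTimeMillis() - start) + " ms");

      table.close();
    }
  }

For sequential keys a Scan would normally be a better fit than repeated
HTable.get() calls, which is part of why there's room to improve on the read
side.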

Hope that answers some of your questions.
