Re: Hbase scans taking a lot of time

Luke Lu Fri, 25 Jan 2013 09:32:00 -0800

You'll have this problem if a large number of column families are being
scanned/populated at the same time. Make sure the data you scan/populate
frequently is in the same column family (you can have many columns in a
column family). Unlike BigTable/Hypertable, which have the concept of
locality/access groups, HBase always stores column families in separate
files, essentially making a column family not only a logical grouping
mechanism but also a physical locality group.
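
To make the point concrete, here is a toy sketch in plain Python. It is not
the HBase client API. ToyRegion, its methods, and the family/column names
are all made up for illustration, and it only assumes what is stated above:
each column family lives in its own store, so a scan has to visit one store
per family it reads.

```python
from collections import defaultdict


class ToyRegion:
    """Hypothetical toy model of a region; NOT the HBase API."""

    def __init__(self, families):
        # One independent store per column family, mirroring the fact
        # that each family is kept in separate files.
        self.stores = {family: defaultdict(dict) for family in families}

    def put(self, row, family, qualifier, value):
        # Write one cell into the store that belongs to its family.
        self.stores[family][row][qualifier] = value

    def scan(self, families):
        """Return (rows, stores_visited) for a scan over `families`."""
        rows = defaultdict(dict)
        stores_visited = 0
        for family in families:
            stores_visited += 1  # every requested family is a separate store
            for row, columns in self.stores[family].items():
                for qualifier, value in columns.items():
                    rows[row][family + ":" + qualifier] = value
        return dict(rows), stores_visited


# Layout A: three frequently scanned columns spread over three families.
wide = ToyRegion(["f1", "f2", "f3"])
wide.put("row1", "f1", "name", "a")
wide.put("row1", "f2", "age", "b")
wide.put("row1", "f3", "city", "c")
rows_wide, visited_wide = wide.scan(["f1", "f2", "f3"])

# Layout B: the same three columns kept in a single family.
narrow = ToyRegion(["f"])
narrow.put("row1", "f", "name", "a")
narrow.put("row1", "f", "age", "b")
narrow.put("row1", "f", "city", "c")
rows_narrow, visited_narrow = narrow.scan(["f"])

print(visited_wide, visited_narrow)
```

Both layouts return the same three cells for row1, but the first one has to
visit three stores to do it and the second only one. In a real cluster the
cost is in the extra files read and merged per row, which this toy does not
model.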



On Fri, Jan 25, 2013 at 1:10 AM, Vibhav Mundra <[email protected]> wrote:

> I am facing a very strange problem with HBase.
>
> This is what I did:
> a) Created a table using pre-partitioned splits.
> b) The column families are compressed with LZO.
> c) Using the above configuration I am able to populate 2 million rows per
> min in HBase.
> d) I have created a table with 300 million odd rows, which roughly took me 3
> hours to populate, and the data size is 25GB.
>
> e) But when I query for data the performance I am getting is very bad.
>    Basically this is what I am seeing: high CPU, no disk I/O, and network
> I/O is happening at the rate of 6~7 MB/sec.
>
>
> Because of this, if I scan the entries of the table using Hive it is taking
> ages.
> Basically it is taking around 24 hours to scan the table. Any idea how
> to debug this?
>
>
> -Vibhav
>
