I have 13 column families, which I guess is okay?

-Vibhav

On Fri, Jan 25, 2013 at 11:01 PM, Luke Lu <[email protected]> wrote:

> You'll have this problem if you have a large number of column families
> being scanned/populated at the same time. Make sure the data you
> scan/populate frequently are in the same column family (you can have many
> columns in a column family). Unlike BigTable/Hypertable, which have the
> concept of locality/access groups, HBase always stores column families in
> separate files, essentially making a column family not only a logical
> grouping mechanism but also a physical locality group.
>
> On Fri, Jan 25, 2013 at 1:10 AM, Vibhav Mundra <[email protected]> wrote:
>
> > I am facing a very strange problem with HBase.
> >
> > This is what I did:
> > a) Created a table using pre-partitioned splits.
> > b) The column families are compressed with LZO.
> > c) With the above configuration I am able to populate 2 million rows per
> > minute into HBase.
> > d) I have created a table with 300 million odd rows, which took me
> > roughly 3 hours to populate; the data size is 25GB.
> >
> > e) But when I query the data, the performance I am getting is very bad.
> > Basically this is what I am seeing: high CPU, no disk I/O, and network
> > I/O at a rate of 6~7 MB/s.
> >
> > Because of this, scanning the entries of the table using Hive is taking
> > ages. Basically it takes around 24 hours to scan the table. Any idea how
> > to debug this?
> >
> > -Vibhav
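
For what it's worth, the figures quoted above can be sanity-checked with a little arithmetic. This is only a rough sketch: it assumes the 25GB is the on-disk size of the table and that the 24-hour Hive scan reads every row exactly once. The constant and function names are made up for illustration and are not part of HBase or Hive.

```python
# Back-of-the-envelope throughput from the figures quoted in the thread.
ROWS = 300_000_000            # "300 million odd rows"
DATA_BYTES = 25 * 1024**3     # "25GB" (assumed: on-disk size of the table)
LOAD_ROWS_PER_MIN = 2_000_000 # "2 million rows per minute" on write
SCAN_HOURS = 24               # "around 24 hours to scan the table"


def load_minutes(rows, rows_per_min):
    """Minutes needed to load `rows` at a steady `rows_per_min`."""
    return rows / rows_per_min


def scan_rate(rows, data_bytes, hours):
    """Return (rows per second, MiB per second) for a full scan."""
    seconds = hours * 3600
    return rows / seconds, data_bytes / seconds / 1024**2


load_min = load_minutes(ROWS, LOAD_ROWS_PER_MIN)
load_rows_per_sec = LOAD_ROWS_PER_MIN / 60
scan_rows_per_sec, scan_mib_per_sec = scan_rate(ROWS, DATA_BYTES, SCAN_HOURS)

print(f"load: {load_min:.0f} min at {load_rows_per_sec:.0f} rows/s")
print(f"scan: {scan_rows_per_sec:.0f} rows/s, {scan_mib_per_sec:.2f} MiB/s")
print(f"write/read ratio: {load_rows_per_sec / scan_rows_per_sec:.1f}x")
```

Under those assumptions the load works out to 150 minutes, in line with the "roughly 3 hours" reported, while the scan manages only a few thousand rows per second, about a tenth of the write rate.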
