Re: Many storefiles on the same region

Jean-Daniel Cryans Wed, 29 Apr 2009 04:38:50 -0700

The issue you're hitting is exactly this one:
https://issues.apache.org/jira/browse/HBASE-1058. Currently it is
committed in trunk but it is not stable. You should use the current
0.19 branch and either apply the HBASE-1058 v4 patch (I'm 90% sure it
will apply without manual modifications) or wait until Andrew commits
it.

The underlying problem is that the rows are inserted sequentially.
That means nearly every write hits a single region, and it is always
the most recent one. The reason it took a while to split is that, with
this many store files to compact, your server probably began swapping.

Another thing to try would be to randomize your insertions. One way of
doing this is to use UUIDs as row keys, but then you lose the ability
to do meaningful range scans.
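
To make the difference concrete, here is a rough sketch in plain Python
(no HBase client involved). It models a table as a sorted list of split
points and counts which region each new row key would land in. The
helper names (region_index, sequential_key, uuid_key, load_per_region)
and the split points are made up for illustration; real region
boundaries depend on your data.

```python
import bisect
import uuid
from collections import Counter


def region_index(row_key, split_points):
    """Return the index of the region whose key range contains row_key.

    split_points is a sorted list of region start keys (hypothetical
    values); a key equal to a split point belongs to the region that
    starts there.
    """
    return bisect.bisect_right(split_points, row_key)


def sequential_key(n):
    """Zero-padded counter, so byte order matches numeric order."""
    return f"{n:012d}".encode()


def uuid_key():
    """Random UUID rendered as 32 hex characters."""
    return uuid.uuid4().hex.encode()


def load_per_region(keys, split_points):
    """Count how many of the given keys fall into each region."""
    return Counter(region_index(k, split_points) for k in keys)


# Assumed scenario: rows 0..999 already exist and the table has split
# at 250, 500 and 750. New sequential rows all sort after the last
# split point, so they all go to the last region.
seq_splits = [sequential_key(n) for n in (250, 500, 750)]
seq_load = load_per_region(
    (sequential_key(n) for n in range(1000, 2000)), seq_splits
)

# Same number of writes with UUID keys, against four regions split on
# the first hex digit. The writes spread over all regions.
hex_splits = [b"4", b"8", b"c"]
uuid_load = load_per_region((uuid_key() for _ in range(1000)), hex_splits)

print("sequential:", dict(seq_load))
print("uuid:      ", dict(sorted(uuid_load.items())))
```

With sequential keys every write goes to the last region; with UUID
keys the writes spread roughly evenly. The cost is that neighbouring
UUIDs have nothing to do with each other, so a range scan no longer
returns related rows together.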

J-D

On Wed, Apr 29, 2009 at 2:30 AM, 11 Nov. <[email protected]> wrote:
> hi all,
>    We are inserting data into hbase, and the target table has only one
> column family with one qualifier, which is about 20 bytes in length. There
> is another table carrying 5TB of data in the same hbase deployment.
>    Our problem is that when we insert data into the target table, the
> first region of the table does not split for a very long time, so we can
> see the storefile count increase to about 400 or more.
>    Is there anything we can do to fix it? The performance is too poor
> when inserting into only one region.
>
