Re: Table with 80 regions having nearly no data in it

stack Wed, 17 Dec 2008 12:49:39 -0800

Thibaut_ wrote:

As a side note, it would be helpful if it were possible not only to
insert or change rows with BatchUpdate, but also to delete rows (So I can
delete more rows at the end when I'm executing the other batched requests as
well).

HBASE-880.  Maybe add a vote?

But something I have noticed is that I have a table (at least one)

tobeprocessed   {NAME => 'tobeprocessed', IS_ROOT => 'false', IS_META =>
'false', FAMILIES => [{NAME => 'data', BLOOMFILTER => 'false', COMPRESSION
=> 'NONE', VERSIONS => '1', LENGTH => '2147483647', TTL => '-1', IN_MEMORY
=> 'false', BLOCKCACHE => 'false'}], INDEXES => []}

which spans over 70 regions, but only has about 117 rows in it (just a few
MByte).

Did it only ever have 117 rows in it? Or was it once many more than this and other rows were deleted?

These entries are all in the last region (as I used a timestamp as
key and I just checked with a mapreduce job). On the webstatus page, there
are also 2 regions with an empty end key which seems very strange.

Not 'strange' but 'broken'. Where do you see that exactly? Can you scan this table successfully?

Scan your .META. and paste in the info:regioninfo cell for each of these regions so we can take a look.
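
As a rough illustration of what to look for in those cells, here is a minimal sketch in Java. It assumes the start and end keys of each region have been copied out of info:regioninfo by hand and sorted by start key. It relies on the usual layout of a table: the first region has an empty start key, only the last region has an empty end key, and each region's end key equals the next region's start key. RegionChainCheck, Region and findProblems are hypothetical names used for illustration only; none of this is HBase API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: checks a hand-copied list of
// region boundaries for the problems discussed in this thread.
final class RegionChainCheck {

    /** Start and end key of one region, as read from info:regioninfo. */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /**
     * Returns one message per problem found. The list must already be
     * sorted by start key, with the empty start key first.
     */
    static List<String> findProblems(List<Region> regions) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < regions.size(); i++) {
            Region r = regions.get(i);
            boolean last = (i == regions.size() - 1);

            if (i == 0 && !r.startKey.isEmpty()) {
                problems.add("first region does not have an empty start key");
            }
            if (r.endKey.isEmpty() && !last) {
                // The case reported above: an empty end key in the middle.
                problems.add("region " + i + " has an empty end key but is not the last region");
            }
            if (last && !r.endKey.isEmpty()) {
                problems.add("last region does not have an empty end key");
            }
            if (!last && !r.endKey.isEmpty()
                    && !r.endKey.equals(regions.get(i + 1).startKey)) {
                problems.add("end key of region " + i
                        + " does not match start key of region " + (i + 1));
            }
        }
        return problems;
    }
}
```

An empty result means the boundaries form an unbroken chain. A table with two empty end keys, as described above, would produce at least one "has an empty end key but is not the last region" message.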

One at
the end and one near to the middle. When I ran a mapreduce job over this
table, the region split startkey is, however, set to the startkey of the next
region. (for the first region with an empty end key in the web interface)
(As a side note, stopping hbase took very long sometimes so I manually
killed the processes a few times before, which could have led to this...)

OK. It might have damaged it, though 0.19.0 should be more resilient than past versions.

Shouldn't the regions be deleted when no data is present? (as I have set
versions to 1 and deleted the keys through the HTable.deleteAll() function).

Not currently. Once made, a region remains even if, after deletes, it has no data.

You could merge up all of these empty regions, but you'd have to shut down hbase and run the merge tool (we should add it to the new UI as an option under the new manual split/force-compaction feature).

Also the startup phase is a lot longer than in hadoop 0.18.1. I have about
1500 regions over 7 servers, and it can take up to 5 minutes until all
regions are loaded. (Hbase doesn't even start to load regions, only when I
make a first request to it). But this could also be related to corrupt
regions?

This is being looked into JBA. Hopefully we can improve here before the release. Study your master log with DEBUG enabled. What's it up to? There is a new 'safe mode' in hbase. Maybe this goes on too long. Is it assigning regions? Are they taking a long time to open?


Hbase settings:
  <property>
    <name>hbase.master.lease.period</name>
    <value>720000</value>
    <description>HMaster server lease period in milliseconds. Default is
    120 seconds.  Region servers must report in within this period else
    they are considered dead.  On loaded cluster, may need to up this
    period.</description>
  </property>


Did you find that you needed it to be this long?
St.Ack
