Thibaut_ wrote:
Hello St.Ack,
thanks for your answer.
I will vote on HBASE-880, that's exactly what I needed :)
Except for the little glitch in the web interface regarding the end key,
that table seems to be ok then. I thought regions would get deleted. I'm
adding new data (Timestamp of current time is the key), and when I have
processed that data, I'm deleting the data. That's the reason why I get more
and more regions... (because as soon as I add data that spans over two
regions, the first region won't be removed when I have deleted that data).
We'd need a cluster cleaner process that would notice empty regions and
merge them into adjacent regions. That seems reasonable to me (sounds
like HBASE-420).
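To make the cleaner idea above concrete, here is a rough Python sketch of the planning step only. It is not what HBASE-420 implements and it does not call any HBase API: the `RegionInfo` type, the `row_count` field, and the `plan_merges` function are all hypothetical names invented for illustration, and the sketch assumes the regions are supplied in key order.

```python
from typing import List, NamedTuple, Tuple


class RegionInfo(NamedTuple):
    # Hypothetical record: a region name plus however many rows it still holds.
    name: str
    row_count: int


def plan_merges(regions: List[RegionInfo]) -> List[Tuple[str, str]]:
    """Pair each empty region with an adjacent region it could be merged into.

    `regions` must be in key order. Each region takes part in at most one
    merge per pass, so a long run of empty regions needs several passes.
    """
    plan = []
    used = set()  # indexes already assigned to a merge in this pass
    for i, region in enumerate(regions):
        if region.row_count != 0 or i in used:
            continue
        # Prefer the following neighbour, fall back to the preceding one.
        for j in (i + 1, i - 1):
            if 0 <= j < len(regions) and j not in used:
                plan.append((region.name, regions[j].name))
                used.update((i, j))
                break
    return plan
```

A real cleaner would also have to decide how to detect emptiness cheaply and how to take the regions offline safely; the sketch leaves both out.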
The below is a bit hard to read, but from what I can make out, it's not
right. The ENDKEY of a region should be the STARTKEY of the next. The
ENDKEY of '', should be last in table. Neither seems to be going on
here. Perhaps make an issue and paste a clean output from .META. and
I'll take a look.
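The two rules above (each ENDKEY equals the next STARTKEY, and the empty ENDKEY '' appears only on the last region) can be checked mechanically. The following is a minimal Python sketch under stated assumptions: the (STARTKEY, ENDKEY) pairs have already been extracted from a .META. scan by some other means and are in .META. row order. The function name `find_chain_problems` is made up for this example and is not part of HBase.

```python
from typing import List, Tuple

# (STARTKEY, ENDKEY) as they appear in info:regioninfo
Region = Tuple[str, str]


def find_chain_problems(regions: List[Region]) -> List[str]:
    """Describe every break in the region chain of a single table.

    Assumes `regions` is in .META. row order for that table.
    """
    problems: List[str] = []
    if not regions:
        return problems
    # Compare each region with the one that follows it.
    for (_, end), (next_start, _) in zip(regions, regions[1:]):
        if end == "":
            problems.append(
                f"ENDKEY '' found before the last region "
                f"(next STARTKEY is {next_start!r})"
            )
        elif end != next_start:
            problems.append(
                f"ENDKEY {end!r} does not match next STARTKEY {next_start!r}"
            )
    # Only the last region should have the empty ENDKEY.
    if regions[-1][1] != "":
        problems.append(f"last region ends at {regions[-1][1]!r}, expected ''")
    return problems
```

An empty result means the chain is contiguous; any returned message points at a hole or overlap worth pasting into the issue.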
Here is the meta info of that region (NAME =>
'tobeprocessed,12293840182411045696639,1229385024829') and the regions
around that region
.........
tobeprocessed,1229383104274796785789,1229384269280
    column=info:regioninfo, timestamp=1229384271404,
    value=REGION => {NAME => 'tobeprocessed,1229383104274796785789,1229384269280',
    STARTKEY => '1229383104274796785789',
    ENDKEY => '12293837601871695303679',
    ENCODED => 1839260151,
    TABLE => {{NAME => 'tobeprocessed', IS_ROOT => 'false', IS_META => 'false',
    FAMILIES => [{NAME => 'data', BLOOMFILTER => 'false', COMPRESSION => 'NONE',
    VERSIONS => '1', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false',
    BLOCKCACHE => 'false'}], INDEXES => []}}
tobeprocessed,1229383760187 column=info:regioninfo,
....
But I have another table (rsssources), which had 400 000 entries when I
scanned it yesterday (count 'rsssources'), and today only has 180 000 entries
(after killing hbase with kill -9, because it was unresponsive).
Well, the above listing would seem to have missing regions. Let's figure
out what's going on. Make an issue 'missing regions' and paste in your scan
'.META.' output. Do you think you can reproduce this?
I'm also logging the DEBUG entries now to see what's happening at that point.
Good. Can you see anything about the regions that go missing?
When I execute a mapreduce job, a few regions don't seem to have any data in
them. However, I never deleted any data in that table (I only replaced rows).
I did increase the timeout values, because I read somewhere else that it
would help in some cases. But I will reset the values to their original
values.
What's the best way to stop hbase when the hbase-stop script doesn't work?
Sometimes it just runs for hours (probably a deadlock somewhere).
Tail the master log with DEBUG enabled. It should point at what is taking
so long to go down, and may indicate a particular regionserver. Thread
dump it ("kill -QUIT PID") and stick that in the issue too.
I'm waiting now for hbase to shut down, and will try to run the merge script
on those two tables.
Good stuff,
St.Ack