[ 
https://issues.apache.org/jira/browse/HBASE-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836390#action_12836390
 ] 

Kannan Muthukkaruppan commented on HBASE-2244:
----------------------------------------------

Stack: The splitB daughter of test1,1204765,1266581233447, namely 
"test1,1226169,1266609171581", was probably there, but I'm not entirely sure. 
I should have copied more of the .META. scan. 

I just wanted to add that during this state, the client was receiving errors of 
the form:

{code}
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 10.129.68.212:60020 for region test1,1204765,1266581233447, row '1232762', but failed after 10 attempts.
{code}

Note: key "1232762" would fall in the "splitB" range for the original 
test1,1204765,1266581233447.
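
To make the range claim concrete, here is a minimal sketch in Python (not 
HBase code). It assumes only that row keys are compared as raw bytes in 
lexicographic order; the helper name in_range is made up for illustration, 
and the boundaries are copied from the .META. rows in this issue.

```python
# Region boundaries copied from the .META. rows quoted in this issue.
# Keys are compared as bytes (lexicographic order), not as integers.
PARENT = (b"1204765", b"1290703")   # test1,1204765,1266581233447
SPLIT_A = (b"1204765", b"1226169")  # test1,1204765,1266609171581
SPLIT_B = (b"1226169", b"1290703")  # test1,1226169,1266609171581


def in_range(row, start, end):
    """Return True if start <= row < end in lexicographic byte order.

    Hypothetical helper for illustration. An empty end key is treated
    as "no upper bound".
    """
    return start <= row and (end == b"" or row < end)


row = b"1232762"
print(in_range(row, *PARENT))   # the original, now-split region
print(in_range(row, *SPLIT_A))  # lower daughter
print(in_range(row, *SPLIT_B))  # upper daughter
```

The same comparison also explains how a later split key such as '12790' 
can sit between '1226169' and '1290703': the ordering is byte-wise, so 
'1232762' sorts before '12790'.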

(BTW, are the JIRA time zones always in GMT? Is there a way to configure it to 
local timezones?)

When I look at the same cluster's .META. now, the state of the related rows 
is as follows:

{code}

 test1,1204765,1266616432091 column=info:regioninfo, timestamp=1266616432230, 
value=REGION => {NAME => 'test1,
                             1204765,1266616432091', STARTKEY => '1204765', 
ENDKEY => '1215466', ENCODED => 41
                             6976676, TABLE => {{NAME => 'test1', FAMILIES => 
[{NAME => 'actions', VERSIONS =>
                              '3', COMPRESSION => 'NONE', TTL => '2147483647', 
BLOCKSIZE => '65536', IN_MEMORY
                              => 'false', BLOCKCACHE => 'true'}]}}
 test1,1204765,1266616432091 column=info:server, timestamp=1266616433943, 
value=10.129.68.213:60020
 test1,1204765,1266616432091 column=info:serverstartcode, 
timestamp=1266616433943, value=1266562597511
 test1,1215466,1266616432091 column=info:regioninfo, timestamp=1266616432232, 
value=REGION => {NAME => 'test1,
                             1215466,1266616432091', STARTKEY => '1215466', 
ENDKEY => '1226169', ENCODED => 40
                             3995950, TABLE => {{NAME => 'test1', FAMILIES => 
[{NAME => 'actions', VERSIONS =>
                              '3', COMPRESSION => 'NONE', TTL => '2147483647', 
BLOCKSIZE => '65536', IN_MEMORY
                              => 'false', BLOCKCACHE => 'true'}]}}
 test1,1215466,1266616432091 column=info:server, timestamp=1266616434963, 
value=10.129.68.213:60020
 test1,1215466,1266616432091 column=info:serverstartcode, 
timestamp=1266616434963, value=1266562597511
 test1,1226169,1266609171581 column=info:regioninfo, timestamp=1266621116341, 
value=REGION => {NAME => 'test1,
                             1226169,1266609171581', STARTKEY => '1226169', 
ENDKEY => '1290703', ENCODED => 45
                             9318323, OFFLINE => true, SPLIT => true, TABLE => 
{{NAME => 'test1', FAMILIES =>
                             [{NAME => 'actions', VERSIONS => '3', COMPRESSION 
=> 'NONE', TTL => '2147483647',
                              BLOCKSIZE => '65536', IN_MEMORY => 'false', 
BLOCKCACHE => 'true'}]}}
 test1,1226169,1266609171581 column=info:server, timestamp=1266613546335, 
value=10.129.68.214:60020
 test1,1226169,1266609171581 column=info:serverstartcode, 
timestamp=1266613546335, value=1266562596451
 test1,1226169,1266609171581 column=info:splitA, timestamp=1266621116341, 
value=\x00\x0512790\x00\x00\x00\x01\
                             
x26\xE8\x80l\xCD\x1Btest1,1226169,1266621115597\x00\x071226169\x00\x00\x00\x05\x0
                             
5test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x00\x0
                             
0\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00\x00\
                             
x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSION\x0
                             
0\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TTL\x00
                             
\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00\x00\
                             
x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04trueb\
                             xD0\x21\xBC
 test1,1226169,1266609171581 column=info:splitB, timestamp=1266621116341, 
value=\x00\x071290703\x00\x00\x00\x0
                             
1\x26\xE8\x80l\xCD\x19test1,12790,1266621115597\x00\x0512790\x00\x00\x00\x05\x05t
                             
est1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x00\x00\
                             
x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00\x00\x0
                             
7\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSION\x00\
                             
x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TTL\x00\x
                             
00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00\x00\x0
                             
9IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04true\x5B
                             \x7D\x1A1
 test1,1226169,1266621115597 column=info:regioninfo, timestamp=1266621116358, 
value=REGION => {NAME => 'test1,
                             1226169,1266621115597', STARTKEY => '1226169', 
ENDKEY => '12790', ENCODED => 1988
                             459280, TABLE => {{NAME => 'test1', FAMILIES => 
[{NAME => 'actions', VERSIONS =>
                             '3', COMPRESSION => 'NONE', TTL => '2147483647', 
BLOCKSIZE => '65536', IN_MEMORY
                             => 'false', BLOCKCACHE => 'true'}]}}
 test1,1226169,1266621115597 column=info:server, timestamp=1266768596714, 
value=10.129.68.212:60020
 test1,1226169,1266621115597 column=info:serverstartcode, 
timestamp=1266768596714, value=1266768408023
 test1,12790,1266621115597   column=info:regioninfo, timestamp=1266621116361, 
value=REGION => {NAME => 'test1,
                             12790,1266621115597', STARTKEY => '12790', ENDKEY 
=> '1290703', ENCODED => 179081
                             1553, TABLE => {{NAME => 'test1', FAMILIES => 
[{NAME => 'actions', VERSIONS => '3
                             ', COMPRESSION => 'NONE', TTL => '2147483647', 
BLOCKSIZE => '65536', IN_MEMORY =>
                              'false', BLOCKCACHE => 'true'}]}}
 test1,12790,1266621115597   column=info:server, timestamp=1266768555610, 
value=10.129.68.214:60020
 test1,12790,1266621115597   column=info:serverstartcode, 
timestamp=1266768555610, value=1266768408015
{code}


Note that both daughters of the parent region test1,1226169,1266609171581 
have been installed, but the offlined parent row itself (for 
test1,1226169,1266609171581) is still present. I'm not sure whether it is in 
these situations that the client starts seeing errors, but I'm curious why the 
offline parent row hasn't been reaped from .META. yet.
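
As a rough illustration of the state described above, the following Python 
sketch lists split parents that are still present even though both daughters 
have rows with a server assigned. The dictionary layout and the function name 
lingering_split_parents are assumptions made for this example, hand-built 
from the scan output; this is not how HBase stores or scans .META., and it 
does not decide whether a parent is actually due for reaping.

```python
# Hypothetical, hand-built summary of the .META. rows shown above.
# Field names are invented for this example.
meta = {
    "test1,1226169,1266609171581": {
        "offline": True,
        "split": True,
        "splitA": "test1,1226169,1266621115597",
        "splitB": "test1,12790,1266621115597",
        "server": "10.129.68.214:60020",
    },
    "test1,1226169,1266621115597": {"server": "10.129.68.212:60020"},
    "test1,12790,1266621115597": {"server": "10.129.68.214:60020"},
}


def lingering_split_parents(rows):
    """Return names of offline, split parents whose two daughters both
    have a row with a server assigned."""
    found = []
    for name, info in rows.items():
        if not (info.get("offline") and info.get("split")):
            continue  # not a split parent
        daughters = [info.get("splitA"), info.get("splitB")]
        if all(d in rows and rows[d].get("server") for d in daughters):
            found.append(name)
    return found


print(lingering_split_parents(meta))
```

With the rows above, the only parent reported is 
test1,1226169,1266609171581.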

I repro'ed a similar problem yesterday. Might have more detailed logs and 
errors. Will share them shortly.




> META gets inconsistent in a number of crash scenarios
> -----------------------------------------------------
>
>                 Key: HBASE-2244
>                 URL: https://issues.apache.org/jira/browse/HBASE-2244
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.20.4
>
>
> (Forking this issue off from HBASE-2235).
> During load testing, in a number of failure scenarios (unexpected region 
> server deaths) etc., we notice that META can get inconsistent. This primarily 
> happens for regions which are in the process of being split. Manually running 
> add_table.rb seems to fix the tables meta data just fine. 
> But it would be good to do automatic cleansing (as part of META scanners 
> work) and/or avoid these inconsistent states altogether.
> For example, for a particular startkey, I see all these entries:
> {code}
> test1,1204765,1266569946560 column=info:regioninfo, timestamp=1266581302018, 
> value=REGION => {NAME => 'test1,
>                              1204765,1266569946560', STARTKEY => '1204765', 
> ENDKEY => '1441091', ENCODED => 18
>                              19368969, OFFLINE => true, SPLIT => true, TABLE 
> => {{NAME => 'test1', FAMILIES =>
>                               [{NAME => 'actions', VERSIONS => '3', 
> COMPRESSION => 'NONE', TTL => '2147483647'
>                              , BLOCKSIZE => '65536', IN_MEMORY => 'false', 
> BLOCKCACHE => 'true'}]}}
>  test1,1204765,1266569946560 column=info:server, timestamp=1266570029133, 
> value=10.129.68.212:60020
>  test1,1204765,1266569946560 column=info:serverstartcode, 
> timestamp=1266570029133, value=1266562597546
>  test1,1204765,1266569946560 column=info:splitB, timestamp=1266581302018, 
> value=\x00\x071441091\x00\x00\x00\x0
>                              
> 1\x26\xE6\x1F\xDF\x27\x1Btest1,1290703,1266581233447\x00\x071290703\x00\x00\x00\x
>                              
> 05\x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x
>                              
> 00\x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00
>                              
> \x00\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSI
>                              
> ON\x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TT
>                              
> L\x00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00
>                              
> \x00\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04t
>                              rueh\x0FQ\xCF
>  test1,1204765,1266581233447 column=info:regioninfo, timestamp=1266609172177, 
> value=REGION => {NAME => 'test1,
>                              1204765,1266581233447', STARTKEY => '1204765', 
> ENDKEY => '1290703', ENCODED => 13
>                              73493090, OFFLINE => true, SPLIT => true, TABLE 
> => {{NAME => 'test1', FAMILIES =>
>                               [{NAME => 'actions', VERSIONS => '3', 
> COMPRESSION => 'NONE', TTL => '2147483647'
>                              , BLOCKSIZE => '65536', IN_MEMORY => 'false', 
> BLOCKCACHE => 'true'}]}}
>  test1,1204765,1266581233447 column=info:server, timestamp=1266604768670, 
> value=10.129.68.213:60020
>  test1,1204765,1266581233447 column=info:serverstartcode, 
> timestamp=1266604768670, value=1266562597511
>  test1,1204765,1266581233447 column=info:splitA, timestamp=1266609172177, 
> value=\x00\x071226169\x00\x00\x00\x0
>                              
> 1\x26\xE7\xCA,\x7D\x1Btest1,1204765,1266609171581\x00\x071204765\x00\x00\x00\x05\
>                              
> x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x00\
>                              
> x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00\x0
>                              
> 0\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSION\
>                              
> x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TTL\x
>                              
> 00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00\x0
>                              
> 0\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04true
>                              \xB9\xBD\xFEO
>  test1,1204765,1266581233447 column=info:splitB, timestamp=1266609172177, 
> value=\x00\x071290703\x00\x00\x00\x0
>                              
> 1\x26\xE7\xCA,\x7D\x1Btest1,1226169,1266609171581\x00\x071226169\x00\x00\x00\x05\
>                              
> x05test1\x00\x00\x00\x00\x00\x02\x00\x00\x00\x07IS_ROOT\x00\x00\x00\x05false\x00\
>                              
> x00\x00\x07IS_META\x00\x00\x00\x05false\x00\x00\x00\x01\x07\x07actions\x00\x00\x0
>                              
> 0\x07\x00\x00\x00\x0BBLOOMFILTER\x00\x00\x00\x05false\x00\x00\x00\x0BCOMPRESSION\
>                              
> x00\x00\x00\x04NONE\x00\x00\x00\x08VERSIONS\x00\x00\x00\x013\x00\x00\x00\x03TTL\x
>                              
> 00\x00\x00\x0A2147483647\x00\x00\x00\x09BLOCKSIZE\x00\x00\x00\x0565536\x00\x00\x0
>                              
> 0\x09IN_MEMORY\x00\x00\x00\x05false\x00\x00\x00\x0ABLOCKCACHE\x00\x00\x00\x04true
>                              \xE1\xDF\xF8p
>  test1,1204765,1266609171581 column=info:regioninfo, timestamp=1266609172212, 
> value=REGION => {NAME => 'test1,
>                              1204765,1266609171581', STARTKEY => '1204765', 
> ENDKEY => '1226169', ENCODED => 21
>                              34878372, TABLE => {{NAME => 'test1', FAMILIES 
> => [{NAME => 'actions', VERSIONS =
>                              > '3', COMPRESSION => 'NONE', TTL => 
> '2147483647', BLOCKSIZE => '65536', IN_MEMOR
>                              Y => 'false', BLOCKCACHE => 'true'}]}}
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
