[ 
https://issues.apache.org/jira/browse/HBASE-870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-870.
-------------------------

    Resolution: Cannot Reproduce

> locked up master
> ----------------
>
>                 Key: HBASE-870
>                 URL: https://issues.apache.org/jira/browse/HBASE-870
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>
> This morning, the pset master got into a locked up state.  Was stuck for tens 
> of minutes doing the below logging.  Cluster was unusable during this time:
> {code}
> ....
> 2008-09-04 18:06:45,893 DEBUG org.apache.hadoop.hbase.master.BaseScanner: 
> RegionManager.metaScanner REGION => {NAME => 
> 'enwiki,YSWgGYWLLur87bjpYfj5--==,1220070599758', STARTKEY => 
> 'YSWgGYWLLur87bjpYfj5--==', ENDKEY => 'Y_dJHSiXE_hJ8jmGEgg1Dk==', ENCODED => 
> 1060266767, TABLE => {{NAME => 'enwiki', IS_ROOT => 'false'
> , IS_META => 'false', FAMILIES => [{NAME => 'alternate_title', BLOOMFILTER => 
> 'false', VERSIONS => '2147483647', COMPRESSION => 'NONE', LENGTH => 
> '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, 
> {NAME => 'anchor', BLOOMFILTER => 'false', VERSIONS => '2147483647', 
> COMPRESSION => 'NONE', LENGT
> H => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, 
> {NAME => 'alternate_url', BLOOMFILTER => 'false', VERSIONS => '2147483647', 
> COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 
> 'false', BLOCKCACHE => 'false'}, {NAME => 'page', BLOOMFILTER => 'false', 
> VERSIONS => '21
> 47483647', COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1', 
> IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'misc', BLOOMFILTER => 
> 'false', VERSIONS => '2147483647', COMPRESSION => 'NONE', LENGTH => 
> '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, 
> SERVER => 'XX.XX.XX.1
> 83:60020', STARTCODE => 1219794634959
> 2008-09-04 18:06:45,895 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 1 
> time(s).
> 2008-09-04 18:06:46,908 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 2 
> time(s).
> 2008-09-04 18:06:47,918 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 3 
> time(s).
> 2008-09-04 18:06:48,928 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 4 
> time(s).
> 2008-09-04 18:06:49,938 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 5 
> time(s).
> 2008-09-04 18:06:50,958 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 6 
> time(s).
> 2008-09-04 18:06:51,978 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 7 
> time(s).
> 2008-09-04 18:06:52,398 INFO org.apache.hadoop.hbase.master.BaseScanner: 
> RegionManager.rootScanner scanning meta region {regionname: -ROOT-,,0, 
> startKey: <>, server: 208.76.44.68:60020}
> 2008-09-04 18:06:52,988 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 8 
> time(s).
> 2008-09-04 18:06:53,998 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 9 
> time(s).
> 2008-09-04 18:06:55,008 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 10 
> time(s).
> 2008-09-04 18:06:56,018 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: Connection refused
> 2008-09-04 18:06:56,045 DEBUG org.apache.hadoop.hbase.master.BaseScanner: 
> RegionManager.rootScanner REGION => {NAME => '.META.,,1', STARTKEY => '', 
> ENDKEY => '', ENCODED => 1028785192, TABLE => {{NAME => '.META.', IS_ROOT => 
> 'false', IS_META => 'true', FAMILIES => [{NAME => 'info', BLOOMFILTER => 
> 'false', COMPRESSI
> ON => 'NONE', VERSIONS => '2147483647', LENGTH => '2147483647', TTL => '-1', 
> IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'historian', 
> BLOOMFILTER => 'false', VERSIONS => '2147483647', COMPRESSION => 'NONE', 
> LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 
> 'false'}]}}}, SERVER => '
> 208.76.44.48:60020', STARTCODE => 1219794657357
> 2008-09-04 18:07:01,084 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Total Load: 1917, Num Servers: 69, Avg Load: 28.0
> 2008-09-04 18:07:01,084 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit in 
> table locations for row <> and tableName .META.: location server 
> XX.XX.XX.253:60020, location region name .META.,,1
> 2008-09-04 18:07:01,087 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 1 
> time(s).
> 2008-09-04 18:07:02,088 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 2 
> time(s).
> 2008-09-04 18:07:03,108 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 3 
> time(s).
> 2008-09-04 18:07:04,118 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 4 
> time(s).
> 2008-09-04 18:07:05,138 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 5 
> time(s).
> 2008-09-04 18:07:06,038 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Removed 
> .META.,,1 from cache because of 
> shroomz_062108,a8XltuT_bfDh6RCYKWEf9-==,1215120916399
> 2008-09-04 18:07:06,158 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 6 
> time(s).
> 2008-09-04 18:07:07,168 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 7 
> time(s).
> 2008-09-04 18:07:08,178 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 8 
> time(s).
> 2008-09-04 18:07:09,188 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 9 
> time(s).
> 2008-09-04 18:07:10,208 INFO org.apache.hadoop.ipc.Client: Retrying connect 
> to server: aa0-004-19.u.powerset.com/XX.XX.XX.253:60020. Already tried 10 
> time(s).
> 2008-09-04 18:07:11,218 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: Connection refused
> 2008-09-04 18:08:09,779 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Total Load: 1917, Num Servers: 69, Avg Load: 28.0
> 2008-09-04 18:08:11,238 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:09:11,248 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:09:43,328 INFO org.apache.hadoop.hbase.master.ServerManager: 
> XX.XX.XX.185:60020 lease expired
> 2008-09-04 18:09:43,348 INFO org.apache.hadoop.hbase.master.ServerManager: 
> XX.XX.XX.183:60020 lease expired
> 2008-09-04 18:09:43,368 INFO org.apache.hadoop.hbase.master.ServerManager: 
> XX.XX.XX.231:60020 lease expired
> 2008-09-04 18:09:43,438 INFO org.apache.hadoop.hbase.master.ServerManager: 
> XX.XX.XX.127:60020 lease expired
> 2008-09-04 18:09:43,528 INFO org.apache.hadoop.hbase.master.ServerManager: 
> XX.XX.XX.68:60020 lease expired
> 2008-09-04 18:10:11,258 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:11:11,278 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:12:11,298 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:13:11,318 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:14:11,338 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:15:11,358 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:16:11,368 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:17:11,378 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:18:11,388 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:19:11,398 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:20:11,408 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:21:11,418 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:22:11,428 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:23:11,438 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> 2008-09-04 18:23:28,291 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Total Load: 1805, Num Servers: 65, Avg Load: 28.0
> 2008-09-04 18:24:11,458 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers: reloading 
> table servers because: timed out waiting for rpc response
> ....
> {code}
> Attached are thread dumps taking around this time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to