[
https://issues.apache.org/jira/browse/HBASE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183925#comment-13183925
]
Shrijeet Paliwal commented on HBASE-3638:
-----------------------------------------
Here is the relevant portion of the log.
Even if you restart all the HBase services across the cluster, the master
always gets stuck in this state.
{noformat}
2012-01-10 21:28:03,382 WARN org.apache.hadoop.hbase.master.AssignmentManager:
Region in transition 1028785192 references a server no longer up
txa-18.rfiserve.net,60020,1326125886539; letting RIT timeout so will be
assigned elsewhere
2012-01-10 21:28:06,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
2012-01-10 21:28:06,788 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPENING for too long, reassigning region=.META.,,1.1028785192
2012-01-10 21:28:16,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
2012-01-10 21:28:16,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPENING for too long, reassigning region=.META.,,1.1028785192
2012-01-10 21:28:26,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
2012-01-10 21:28:26,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPENING for too long, reassigning region=.META.,,1.1028785192
2012-01-10 21:28:36,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
2012-01-10 21:28:36,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPENING for too long, reassigning region=.META.,,1.1028785192
2012-01-10 21:28:46,788 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
2012-01-10 21:28:46,788 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPENING for too long, reassigning region=.META.,,1.1028785192
2012-01-10 21:28:56,788 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
{noformat}
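To confirm the loop without reading the log by eye, a small script can count how often the same region times out with an unchanged ts. This is only a sketch: the function names, the min_repeats threshold, and the regular expressions are my own, written against the message format shown above, and they may not match other HBase versions.

```python
import re
from collections import Counter

# A record starts with a log4j-style timestamp; any other line is treated
# as a continuation of the previous record (the mail client wrapped them).
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")

# Pattern assumed from the log excerpt above, not taken from HBase source.
TIMED_OUT = re.compile(
    r"Regions in transition timed out:\s+(?P<region>\S+)\s+"
    r"state=(?P<state>\w+),\s+ts=(?P<ts>\d+)"
)


def join_wrapped(lines):
    """Re-join log records that were wrapped across several lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def stuck_regions(lines, min_repeats=3):
    """Return {(region, state, ts): count} for timeouts that repeat
    at least min_repeats times with the same state and ts."""
    counts = Counter()
    for record in join_wrapped(lines):
        m = TIMED_OUT.search(record)
        if m:
            counts[(m["region"], m["state"], m["ts"])] += 1
    return {key: n for key, n in counts.items() if n >= min_repeats}


# Three of the timeout records from the log above, wrapped as in the mail.
SAMPLE = """\
2012-01-10 21:28:06,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
2012-01-10 21:28:06,788 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPENING for too long, reassigning region=.META.,,1.1028785192
2012-01-10 21:28:16,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
2012-01-10 21:28:26,787 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Regions in transition timed out: .META.,,1.1028785192 state=OPENING,
ts=1326241230066
"""

if __name__ == "__main__":
    for (region, state, ts), n in stuck_regions(SAMPLE.splitlines()).items():
        print(f"{region} stuck in {state} (ts={ts}), timed out {n} times")
```

In the excerpt above, the ts value never changes between repeats, which is consistent with the region's state not advancing from one timeout to the next.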
bq. What do you think Stack, can master pick a stale ZK state which is not a
leftover from previous HBase install, in other words a stale state created by
itself?
By this I was referring to the comment Todd made in the related JIRA, where he
said:
bq. Notably, it wasn't clearing ZK between runs. So some leftover RIT data from
a previous HBase incarnation may be confusing this one's master.
He floated one possibility: leftover RIT data from a previous incarnation. I am
wondering what other possibilities there are.
> If a FS bootstrap, need to also ensure ZK is cleaned
> ----------------------------------------------------
>
> Key: HBASE-3638
> URL: https://issues.apache.org/jira/browse/HBASE-3638
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Minor
>
> In a test environment where a cycle of start, operation, kill hbase (repeat),
> noticed that we were doing a bootstrap on startup but then we were picking up
> the previous cycles zk state. It made for a mess in the test.
> Last thing seen on previous cycle was:
> {code}
> 2011-03-11 06:33:36,708 DEBUG
> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> transition=RS_ZK_REGION_OPENING, server=X.X.X.60020,1299853933073,
> region=1028785192/.META.
> {code}
> Then, in the messed up cycle I saw:
> {code}
> 2011-03-11 06:42:48,530 INFO org.apache.hadoop.hbase.master.MasterFileSystem:
> BOOTSTRAP: creating ROOT and first META regions
> .....
> {code}
> Then after setting watcher on .META., we get a
> {code}
> 2011-03-11 06:42:58,301 INFO
> org.apache.hadoop.hbase.master.AssignmentManager: Processing region
> .META.,,1.1028785192 in state RS_ZK_REGION_OPENED
> 2011-03-11 06:42:58,302 WARN
> org.apache.hadoop.hbase.master.AssignmentManager: Region in transition
> 1028785192 references a server no longer up X.X.X; letting RIT timeout so
> will be assigned elsewhere
> {code}
> We're all confused.
> Should at least clear our zk if a bootstrap happened.