[
https://issues.apache.org/jira/browse/HBASE-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208018#comment-13208018
]
Phabricator commented on HBASE-5344:
------------------------------------
stack has commented on the revision "[jira] [HBASE-5344] [89-fb] Scan
unassigned region directory on master failover".
What's the state of this patch, Mikhail? Are you going to apply it to 0.89fb? If it
goes into 0.89fb, I'd then like to forward port it. It looks like it could
take care of some trunk issues we see.
Is it possible that querying the regionservers would return state that is
different to what is up in .META.? (I suppose if it does, we have bigger
issues?)
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:56
Should get via Configuration?
src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:68
This does not do retries (and it looks like, further down in the code, you are
not retrying the Callable either). In TRUNK we use an HTable instance -- i.e. a
Callable w/ retries -- so we get retrying (that's a big change in trunk -- doing
retries rather than one-time HConnection calls).
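To illustrate the "Callable w/ retries" pattern mentioned above, here is a minimal sketch. It is not the trunk HTable code path: RetryingCaller, callWithRetries, maxAttempts and pauseMillis are hypothetical names invented for this example, and the linear backoff is an assumption rather than what HBase actually does.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not an HBase class: retries a Callable a bounded number of times. */
final class RetryingCaller {
  private RetryingCaller() {}

  /**
   * Invokes the callable until it succeeds or maxAttempts is exhausted.
   * Sleeps pauseMillis * attempt between failed attempts (assumed linear backoff).
   */
  static <T> T callWithRetries(Callable<T> callable, int maxAttempts, long pauseMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return callable.call();
      } catch (InterruptedException e) {
        // Do not retry on interruption; restore the flag and give up.
        Thread.currentThread().interrupt();
        throw e;
      } catch (Exception e) {
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(pauseMillis * attempt);
        }
      }
    }
    // Every attempt failed; surface the most recent failure.
    throw last;
  }
}
```

A one-shot regionserver call could be wrapped as callWithRetries(() -> someCall(), 3, 1000L), where someCall stands in for whatever RPC the scanner issues.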
src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:51
FYI, in trunk, hbck needs what this class does over in
HBaseFSCK#processRegionServers. It could use this class one day. Currently it
asks master for this cluster status (which wouldn't work where this is needed
on master failover)
src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:96
What is this? It seems fb-particular. If there are no regionservers in zk, then
it's a cluster startup, which means what? Does it mean the cluster is starting?
What if there was a regionserver up and running already but it had not yet been
assigned any regions? Wouldn't this be a clean cluster startup too?
src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:107
Yeah, this stuff does not retry, which may be ok on startup here.
src/main/java/org/apache/hadoop/hbase/master/DirectRegionServerScanner.java:235
Nice utility
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:160
Misspelled
src/main/java/org/apache/hadoop/hbase/master/ZKUnassignedWatcher.java:50
We don't have this class in TRUNK. Was it added to 0.89fb?
src/main/java/org/apache/hadoop/hbase/master/ZKUnassignedWatcher.java:88
Why delete it? In case it has unassigned znodes? I suppose this is legit if
isClusterStartup means no regionservers are up on the cluster.
src/main/java/org/apache/hadoop/hbase/master/ZKUnassignedWatcher.java:128
ZKUtil.joinZNode does this.
So we are going through each of the unassigned znodes and we are going to
update .META.? I see that in the loop, if we trip over .META., then we'll just
return. What's that about? Is it that .META. is not assigned? Are .META. and
-ROOT- assigned before this method is called?
REVISION DETAIL
https://reviews.facebook.net/D1605
> [89-fb] Scan unassigned region directory on master failover
> -----------------------------------------------------------
>
> Key: HBASE-5344
> URL: https://issues.apache.org/jira/browse/HBASE-5344
> Project: HBase
> Issue Type: Bug
> Reporter: Mikhail Bautin
> Assignee: Mikhail Bautin
> Attachments: D1605.1.patch
>
>
> In case the master dies after a regionserver writes region state as OPENED or
> CLOSED in ZK but before the update is received by master and written to meta,
> the new master that comes up has to pick up the region state from ZK and
> write it to meta. Otherwise we can get multiply-assigned regions.
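The recovery step described in the issue can be sketched with an in-memory model. This is only an illustration, not the patch itself: FailoverReconciler, RegionState and reconcile are hypothetical names, the two maps stand in for the ZK unassigned directory and for meta, and the simplified state enum is an assumption.

```java
import java.util.Map;

/** Illustrative model only; none of these types are HBase classes. */
final class FailoverReconciler {
  /** Simplified, assumed set of region states. */
  enum RegionState { OFFLINE, OPENING, OPENED, CLOSING, CLOSED }

  private FailoverReconciler() {}

  /**
   * Copies completed transitions (OPENED or CLOSED) from the unassigned
   * directory into meta when meta does not already reflect them.
   *
   * @param zkUnassigned region name to the state a regionserver last wrote to ZK
   * @param meta region name to the state recorded in meta; updated in place
   * @return the number of meta entries written
   */
  static int reconcile(Map<String, RegionState> zkUnassigned, Map<String, RegionState> meta) {
    int updated = 0;
    for (Map.Entry<String, RegionState> entry : zkUnassigned.entrySet()) {
      RegionState zkState = entry.getValue();
      boolean completed = zkState == RegionState.OPENED || zkState == RegionState.CLOSED;
      // In-flight transitions (e.g. OPENING) are left alone in this sketch.
      if (completed && meta.get(entry.getKey()) != zkState) {
        meta.put(entry.getKey(), zkState);
        updated++;
      }
    }
    return updated;
  }
}
```

In this model, a region that ZK reports as OPENED but meta still lists as OFFLINE is exactly the case the new master has to repair before assigning regions, since otherwise it could assign the region a second time.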