[
https://issues.apache.org/jira/browse/HBASE-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871404#action_12871404
]
HBase Review Board commented on HBASE-2599:
-------------------------------------------
Message from: [email protected]
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/88/
-----------------------------------------------------------
(Updated 2010-05-25 15:55:50.626407)
Review request for hbase.
Changes
-------
Changed summary
Summary (updated)
-------
Version of hbase-2599 for 0.20 branch for 0.20.5.
It removes DNS lookups from HServerInfo and uses the regionserver's own name
everywhere, instead of the IP in some places and a looked-up hostname in others.
It also puts hostname:port into .META. rather than the IP.
This addresses bug hbase-2599.
Diffs
-----
branches/0.20/src/java/org/apache/hadoop/hbase/ClusterStatus.java 948218
branches/0.20/src/java/org/apache/hadoop/hbase/HServerInfo.java 948218
branches/0.20/src/java/org/apache/hadoop/hbase/master/BaseScanner.java 948218
branches/0.20/src/java/org/apache/hadoop/hbase/master/HMaster.java 948218
branches/0.20/src/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 948218
branches/0.20/src/java/org/apache/hadoop/hbase/master/ServerManager.java 948218
branches/0.20/src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 948218
branches/0.20/src/test/org/apache/hadoop/hbase/TestServerInfo.java PRE-CREATION
branches/0.20/src/webapps/master/table.jsp 948218
Diff: http://review.hbase.org/r/88/diff
Testing
-------
Doing now.
Thanks,
stack
> BaseScanner says "Current assignment of X is not valid" over and over for
> same region
> -------------------------------------------------------------------------------------
>
> Key: HBASE-2599
> URL: https://issues.apache.org/jira/browse/HBASE-2599
> Project: HBase
> Issue Type: Bug
> Reporter: stack
>
> From IRC today
> {code}
> 12:41 < cmorgan> hey guys. I'm having a recent issue with a single node
> cluster running 0.20.4. After stopping for a backup I now get region
> assignment churn. Seems master keeps thinking that region
> assignment is not valid even when it is. Following is a log
> snippet:
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443246 [ HMaster] DEBUG
> ter.RegionServerOperationQueue - Processing todo: PendingOpenOperation from
> localhost.,7802,1274425405680
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443246 [ HMaster] INFO
> e.master.RegionServerOperation -
> net_troove_coin_account_AccountCredentials,,1234913258116 open on
> 127.0.0.1:7802
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443246 [ HMaster] INFO
> e.master.RegionServerOperation - Updated row
> net_troove_coin_account_AccountCredentials,,1234913258116 in region .META.,,1
> with
> startcode=1274425405680, server=127.0.0.1:7802
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443246 [ HMaster] DEBUG
> ter.RegionServerOperationQueue - Processing todo: PendingOpenOperation from
> localhost.,7802,1274425405680
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443246 [ HMaster] INFO
> e.master.RegionServerOperation -
> net_troove_application_request_TemporaryRequest,,1234913268355 open on
> 127.0.0.1:7802
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443247 [ HMaster] INFO
> e.master.RegionServerOperation - Updated row
> net_troove_application_request_TemporaryRequest,,1234913268355 in region
> .META.,,1 with
> startcode=1274425405680, server=127.0.0.1:7802
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443247 [ger.metaScanner] DEBUG
> adoop.hbase.master.BaseScanner - Current assignment of
> net_troove_coin_account_AccountEntry,,1271448856984 is not valid;
> serverAddress=127.0.0.1:7802, startCode=1274425405680
> unknown.
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443248 [ger.metaScanner] DEBUG
> adoop.hbase.master.BaseScanner - Current assignment of
> net_troove_coin_account_AccountEntry-Base_EntryDay_DESCENDING,,1273266418876
> is not valid; serverAddress=127.0.0.1:7802,
> startCode=1274425405680 unknown.
> 12:41 < cmorgan> [21/05/10 00:59:42] 3443251 [ger.metaScanner] DEBUG
> adoop.hbase.master.BaseScanner - Current assignment of
> net_troove_coin_bank_BankStatement,,1266433980935 is not valid;
> serverAddress=127.0.0.1:7802, startCode=1274425405680
> unknown.
> 12:58 < cmorgan> stack: I'd been running with 0.20.4 for a week or so
> starting/stopping every night. Now this happens...
> 14:11 < cmorgan> stack: some more info: On our mini production server the
> regionserver is getting "My address is localhost.:7802" (notice the dot after
> localhost). But the master is also sometimes
> referring to it as 127.0.0.1. I just used the same data and
> config on my laptop, and its binding to my external LAN ip ("My address is
> 10.0.1.4:7802"). Under this setup hbase comes up
> stable (no region assignment churn).
> {code}
> Looking at this, I think the issue is that when we register a server we use
> the getServerName of an HServerInfo provided by the regionserver (though we
> are on the master side), but BaseScanner uses a getServerName built from a
> DNS lookup on the IP that it finds in the server column of .META. My sense
> is that the regionserver's hostname and what the master gets back from its
> DNS lookup can disagree, fatally.
> This issue has come up repeatedly over the last few weeks. It was reported at
> least once more on a standalone instance and also on krispykola's 15-node EC2
> cluster (he went back to 0.20.3 and then it went away?). It made for what
> looked like double-assignment in his case (our attempt at caching DNS names
> may be amiss; I think that is the main diff between 0.20.3 and 0.20.4 in this
> area).
> My thought is to purge DNS from the HServerInfo passed by the RS to the Master
> on startup and on heartbeat, and to use IPs only (and even then, the IP that
> the master tells the RS to use: its remote address as seen by the master). We
> might have to do this fix for 0.20.5 since it seems to happen more in 0.20.4.
> I'm looking into this. Opinions welcome.
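The mismatch described above can be sketched roughly as follows. This is only an illustration, not the HBase code: the class ServerNameMismatch and its two helpers are hypothetical, and the only things taken from the log are the "host,port,startcode" name shape (localhost.,7802,1274425405680) and the 127.0.0.1:7802 address stored in .META. The real BaseScanner would do a DNS lookup on that address; here the lookup result is simply assumed to be something other than "localhost.".

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names are HBase APIs.
public class ServerNameMismatch {

    // Builds a "host,port,startcode" name, the shape seen in the log
    // (e.g. localhost.,7802,1274425405680).
    static String serverName(String host, int port, long startCode) {
        return host + "," + port + "," + startCode;
    }

    // Stand-in for the master's check: an assignment is considered valid
    // only if the rebuilt name matches a name registered at startup.
    static boolean isAssignmentValid(Set<String> registered,
                                     String host, int port, long startCode) {
        return registered.contains(serverName(host, port, startCode));
    }

    public static void main(String[] args) {
        Set<String> registered = new HashSet<>();

        // Registration uses the name the regionserver reports about itself.
        registered.add(serverName("localhost.", 7802, 1274425405680L));

        // The scanner rebuilds the name from what is stored in .META.
        // (assumed here to resolve to "127.0.0.1"), so the lookup misses.
        System.out.println(
            isAssignmentValid(registered, "127.0.0.1", 7802, 1274425405680L)); // false

        // With the regionserver's own name used on both sides, it matches.
        System.out.println(
            isAssignmentValid(registered, "localhost.", 7802, 1274425405680L)); // true
    }
}
```

The port and startcode agree in both cases; only the host part differs, which would be enough to make every scan report the assignment as not valid.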