[
https://issues.apache.org/jira/browse/HBASE-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-495:
------------------------
Attachment: 495-0.1.patch
Here is a patch against 0.1. Will make others if this passes muster.
My thought on this issue is that the cluster is so messy, w/ millions of log
lines, that it's hard to debug. Suggest that we commit this patch against this issue
and open another when we see duplicate regions next time.
What seems to be happening is that regions are failing to open out on the
regionservers because DFS is corrupt. I was thinking we could shut down on an
IOException (IOE) out of HDFS, but looking at where the exception comes up, we
already do a filesystem check there and it must be succeeding. Also, a failed
compaction may not always be worth shutting down the regionserver for -- in
this case, on region startup, it probably is, but later, as part of normal
operation, it probably is not. Judging DFS health seems to be a tad more involved.
HBASE-495 No server address listed in .META.
M src/java/org/apache/hadoop/hbase/HMaster.java
(regionServerStartup): Refactor. Create lease BEFORE scheduling shutdown
process. We used to do things the other way round, which meant that we'd
schedule a shutdown process for every report the regionserver made. Could be
many if an old lease is hanging around.
(registerRegionServer): Added. This is the body of what used to be in
regionServerStartup, moved here so it is easy to have a finally in the calling
method (there should never be an exception out of this method, so the finally
should never have to run).
Removed some useless DEBUG level logs. With thousands of rows in .META.,
that was at least one DEBUG line per row, multiplied by the number of
shutdown processes queued.
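A rough sketch of why the ordering in regionServerStartup matters. This is
not the patch itself. It assumes that creating a lease fails while an old
lease for the same address is still held. LeaseOrderingSketch, createLease,
expireLease, shutdownQueue and the two startup* methods are hypothetical
stand-ins. Only the name registerRegionServer comes from the patch, and its
body here is invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the master's bookkeeping; not the real HMaster code.
class LeaseOrderingSketch {
    private final Set<String> leases = new HashSet<>();       // addresses holding a lease
    private final Set<String> knownServers = new HashSet<>(); // addresses seen before
    final List<String> shutdownQueue = new ArrayList<>();     // queued shutdown processes

    // Assumption: lease creation fails while an old lease for the address is held.
    private void createLease(String address) {
        if (!leases.add(address)) {
            throw new IllegalStateException("lease still held for " + address);
        }
    }

    // Simulates the old lease timing out.
    void expireLease(String address) {
        leases.remove(address);
    }

    // New order: lease first, so a repeat report fails before anything is queued.
    void startupLeaseFirst(String address) {
        createLease(address);
        registerRegionServer(address);
    }

    // Old order: a shutdown is queued on every report, even if the lease then fails.
    void startupShutdownFirst(String address) {
        registerRegionServer(address);
        createLease(address);
    }

    // Invented body: if the address is already known, queue cleanup of the old instance.
    private void registerRegionServer(String address) {
        if (!knownServers.add(address)) {
            shutdownQueue.add(address);
        }
    }
}
```

With the lease created first, repeated reports from a restarted regionserver
queue nothing until the old lease is gone, and then exactly one shutdown
process. With the old order, every repeated report adds another entry to the
queue.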
> No server address listed in .META.
> ----------------------------------
>
> Key: HBASE-495
> URL: https://issues.apache.org/jira/browse/HBASE-495
> Project: Hadoop HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.16.0
> Reporter: stack
> Fix For: 0.1.0, 0.2.0
>
> Attachments: 495-0.1.patch
>
>
> Michael Bieniosek manufactured the following in a 0.16.0 install:
> {code}
> 08/03/06 17:52:02 DEBUG hbase.HTable: Advancing internal scanner to startKey
> g80Fi5WZHlzLqGzErrAd7V==
> 08/03/06 17:52:02 DEBUG hbase.HConnectionManager$TableServers: reloading
> table servers because: No server address listed in .META. for region
> enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421
> 08/03/06 17:52:12 DEBUG hbase.HConnectionManager$TableServers: reloading
> table servers because: No server address listed in .META. for region
> enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421
> 08/03/06 17:52:22 DEBUG hbase.HConnectionManager$TableServers: reloading
> table servers because: No server address listed in .META. for region
> enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421
> org.apache.hadoop.hbase.NoServerForRegionException: No server address listed
> in .META. for region enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421
> at
> org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:449)
> at
> org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:346)
> at
> org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:309)
> at org.apache.hadoop.hbase.HTable.getRegionLocation(HTable.java:103)
> at
> org.apache.hadoop.hbase.HTable$ClientScanner.nextScanner(HTable.java:854)
> at org.apache.hadoop.hbase.HTable$ClientScanner.next(HTable.java:915)
> at
> org.apache.hadoop.hbase.hql.SelectCommand.scanPrint(SelectCommand.java:233)
> at
> org.apache.hadoop.hbase.hql.SelectCommand.execute(SelectCommand.java:100)
> at
> org.apache.hadoop.hbase.hql.HQLClient.executeQuery(HQLClient.java:50)
> at org.apache.hadoop.hbase.Shell.main(Shell.java:114)
> {code}
> When I look in the .META., I see that the above region range has multiple
> mentions... : one offlined, two that have startcodes and servers associated
> and about 5 others that are just HRIs. Table is broke. At least need the
> merge of overlapping regions tool to fix. Digging more....