[
https://issues.apache.org/jira/browse/HBASE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660255#action_12660255
]
Andrew Purtell commented on HBASE-1104:
---------------------------------------
Got this on the JP cluster today:
>From master log:
2009-01-02 00:05:09,671 DEBUG org.apache.hadoop.hbase.master.RegionManager:
Going to close region content,b1d86dfc01925c3b891285b12d3bf452,1230853204947
2009-01-02 00:05:12,765 INFO org.apache.hadoop.hbase.master.RegionManager:
assigning region content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 to
server 10.3.134.207:60020
2009-01-02 00:05:15,778 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_PROCESS_OPEN:
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 from 10.3.134.207:60020
2009-01-02 00:05:15,779 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 open on
10.3.134.207:60020
2009-01-02 00:05:27,699 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_CLOSE:
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 from 10.3.134.221:60020
2009-01-02 00:05:28,688 INFO org.apache.hadoop.hbase.master.RegionManager:
assigning region content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 to
server 10.3.134.222:60020
2009-01-02 00:05:31,697 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_OPEN:
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 from 10.3.134.222:60020
2009-01-02 00:05:31,697 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 open on
10.3.134.222:60020
2009-01-02 00:05:31,698 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947 in region .META.,,1 with
startcode 1230848433306 and server 10.3.134.222:60020
There are three HRS involved here.
This leads to file system trouble:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
region server 10.3.134.225:60020 for region
content,b1d86dfc01925c3b891285b12d3bf452,1230853204947, row
'b1d86dfc01925c3b891285b12d3bf452', but failed after 10 attempts.
Exceptions:
java.io.IOException: java.io.IOException: HStoreScanner failed construction
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.(StoreFileScanner.java:70)
at
org.apache.hadoop.hbase.regionserver.HStoreScanner.(HStoreScanner.java:84)
at
org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2119)
at
org.apache.hadoop.hbase.regionserver.HRegion$HScanner.(HRegion.java:1878)
at
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1162)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1673)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:894)
Caused by: java.io.FileNotFoundException: File does not exist:
hdfs://jdc2-atr-dc-1.atr.trendmicro.com:50000/data/hbase/content/485774435/info/mapfiles/1743031531971685847/data
at
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:394)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:679)
at
org.apache.hadoop.hbase.io.SequenceFile$Reader.(SequenceFile.java:1431)
at
org.apache.hadoop.hbase.io.SequenceFile$Reader.(SequenceFile.java:1426)
at
org.apache.hadoop.hbase.io.MapFile$Reader.createDataFileReader(MapFile.java:310)
at
org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.createDataFileReader(HBaseMapFile.java:96)
at org.apache.hadoop.hbase.io.MapFile$Reader.open(MapFile.java:292)
at
org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.(HBaseMapFile.java:79)
at
org.apache.hadoop.hbase.io.BloomFilterMapFile$Reader.(BloomFilterMapFile.java:65)
at
org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:443)
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:96)
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.(StoreFileScanner.java:67)
... 10 more
> Doubly-assigned regions redux
> -----------------------------
>
> Key: HBASE-1104
> URL: https://issues.apache.org/jira/browse/HBASE-1104
> Project: Hadoop HBase
> Issue Type: Bug
> Environment: pset cluster with TRUNK.
> Reporter: stack
> Fix For: 0.19.0
>
>
> Testing, I see doubly assigned regions. Below is from master log for
> TestTable,0000135598,1230761605500.
> {code}
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT:
> TestTable,0000116170,1230761152219: TestTable,0000116170,1230761152219 split;
> daughters: TestTable,0000116170,1230761605500,
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.142:60020
> 2008-12-31 22:13:38,561 [IPC Server handler 6 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759988953 and server XX.XX.XX.142:60020
> 2008-12-31 22:13:44,640 [IPC Server handler 4 on 60000] DEBUG
> org.apache.hadoop.hbase.master.RegionManager: Going to close region
> TestTable,0000135598,1230761605500
> 2008-12-31 22:13:50,441 [IPC Server handler 9 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,457 [IPC Server handler 5 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from
> XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [IPC Server handler 5 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759988788 and server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,688 [IPC Server handler 6 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE:
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:53,688 [HMaster] DEBUG
> org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessRegionClose
> of TestTable,0000135598,1230761605500, false
> 2008-12-31 22:13:54,263 [IPC Server handler 7 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.141:60020
> 2008-12-31 22:13:57,273 [IPC Server handler 9 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from
> XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [IPC Server handler 0 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.141:60020
> 2008-12-31 22:14:03,918 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759989031 and server XX.XX.XX.141:60020
> 2008-12-31 22:14:29,350 [RegionManager.metaScanner] DEBUG
> org.apache.hadoop.hbase.master.BaseScanner:
> TestTable,0000135598,1230761605500 no longer has references to
> TestTable,0000116170,1230761152219
> {code}
> See how we choose to assign before we get the close back from the
> regionserver.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.