[
https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628828#action_12628828
]
Jonathan Gray commented on HBASE-868:
-------------------------------------
Using this patch, I don't see any of the added debug output anywhere.
Here is where it breaks. During the first memcache flush of META, Flusher says
Replay of hlog required and then DroppedSnapshot/Key out of order exceptions:
2008-09-05 19:09:12,202 INFO org.apache.hadoop.hbase.regionserver.LogRoller:
Rolling hlog. Number of entries: 30001
2008-09-05 19:09:12,215 DEBUG org.apache.hadoop.hbase.regionserver.HLog:
Closing current log writer
/hbase/log_72.34.249.110_1220665441919_60020/hlog.dat.1220666931124
2008-09-05 19:09:12,218 INFO org.apache.hadoop.hbase.regionserver.HLog: New log
writer created at
/hbase/log_72.34.249.110_1220665441919_60020/hlog.dat.1220666952215
2008-09-05 19:14:02,219 INFO org.apache.hadoop.hbase.regionserver.LogRoller:
Rolling hlog. Number of entries: 14234
2008-09-05 19:14:02,230 DEBUG org.apache.hadoop.hbase.regionserver.HLog:
Closing current log writer
/hbase/log_72.34.249.110_1220665441919_60020/hlog.dat.1220666941412
2008-09-05 19:14:02,233 INFO org.apache.hadoop.hbase.regionserver.HLog: New log
writer created at
/hbase/log_72.34.249.110_1220665441919_60020/hlog.dat.1220667242230
2008-09-05 19:14:22,342 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Started memcache flush for region .META.,,1. Current region memcache size 23.6k
2008-09-05 19:14:22,362 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded
the native-hadoop library
2008-09-05 19:14:22,363 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory:
Successfully loaded & initialized native-zlib library
2008-09-05 19:14:22,364 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new compressor
2008-09-05 19:14:22,392 FATAL org.apache.hadoop.hbase.regionserver.Flusher:
Replay of hlog required. Forcing server restart
org.apache.hadoop.hbase.DroppedSnapshotException: region: .META.,,1
at
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1081)
at
org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:979)
at
org.apache.hadoop.hbase.regionserver.Flusher.flushRegion(Flusher.java:174)
at org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:91)
Caused by: java.io.IOException: key out of order:
items,1220665885921/historian:assignment/1220665888554 after
items,,1220665885921/historian:split/1220665886459
at org.apache.hadoop.io.MapFile$Writer.checkKey(MapFile.java:205)
at org.apache.hadoop.io.MapFile$Writer.append(MapFile.java:191)
at
org.apache.hadoop.hbase.regionserver.HStoreFile$BloomFilterMapFile$Writer.append(HStoreFile.java:846)
at
org.apache.hadoop.hbase.regionserver.HStore.internalFlushCache(HStore.java:625)
at
org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:578)
at
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1068)
... 3 more
2008-09-05 19:14:22,393 INFO org.apache.hadoop.hbase.regionserver.Flusher:
regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher exiting
2008-09-05 19:14:24,583 DEBUG org.apache.hadoop.hbase.RegionHistorian: Offlined
2008-09-05 19:14:24,583 INFO org.apache.hadoop.ipc.Server: Stopping server on
60020
2008-09-05 19:14:24,583 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0
on 60020: exiting
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2
on 60020: exiting
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server
listener on 60020
2008-09-05 19:14:24,584 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2008-09-05 19:14:24,584 INFO org.mortbay.util.ThreadedServer: Stopping Acceptor
ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1
on 60020: exiting
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6
on 60020: exiting
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3
on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4
on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5
on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8
on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9
on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server
Responder
2008-09-05 19:14:24,587 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7
on 60020: exiting
2008-09-05 19:14:24,587 INFO org.mortbay.http.SocketListener: Stopped
SocketListener on 0.0.0.0:60030
2008-09-05 19:14:25,254 INFO org.mortbay.util.Container: Stopped
HttpContext[/static,/static]
2008-09-05 19:14:25,762 INFO org.mortbay.util.Container: Stopped
HttpContext[/logs,/logs]
2008-09-05 19:14:25,763 INFO org.mortbay.util.Container: Stopped [EMAIL
PROTECTED]
2008-09-05 19:14:26,418 INFO org.mortbay.util.Container: Stopped
WebApplicationContext[/,/]
2008-09-05 19:14:26,418 INFO org.mortbay.util.Container: Stopped [EMAIL
PROTECTED]
2008-09-05 19:14:26,419 DEBUG org.apache.hadoop.hbase.regionserver.HLog:
closing log writer in
hdfs://web1:9000/hbase/log_72.34.249.110_1220665441919_60020
2008-09-05 19:14:26,419 INFO org.apache.hadoop.hbase.regionserver.LogRoller:
LogRoller exiting.
2008-09-05 19:14:26,419 INFO
org.apache.hadoop.hbase.regionserver.CompactSplitThread:
regionserver/0:0:0:0:0:0:0:0:60020.compactor exiting
2008-09-05 19:14:26,432 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog
2008-09-05 19:14:26,433 DEBUG
org.apache.hadoop.hbase.regionserver.HRegionServer: closing region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Compactions and cache flushes disabled for region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Scanners disabled for region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No
more active scanners for region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
Updates disabled for region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No
more row locks outstanding on region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
closed 1028785192/historian
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
closed 1028785192/info
2008-09-05 19:14:26,433 INFO org.apache.hadoop.hbase.regionserver.HRegion:
closed .META.,,1
...
2008-09-05 19:14:26,435 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
72.34.249.110:60020
2008-09-05 19:14:30,719 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2008-09-05 19:14:30,719 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer:
regionserver/0:0:0:0:0:0:0:0:60020 exiting
2008-09-05 19:14:32,341 INFO org.apache.hadoop.hbase.Leases:
regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker closing leases
2008-09-05 19:14:32,341 INFO org.apache.hadoop.hbase.Leases:
regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker closed leases
2008-09-05 19:14:32,342 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown thread.
2008-09-05 19:14:32,342 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
> Incrementing binary rows cause strange behavior once table splits
> -----------------------------------------------------------------
>
> Key: HBASE-868
> URL: https://issues.apache.org/jira/browse/HBASE-868
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.2.1
> Environment: 4 nodes: 1 master/namenode, 3 regionserver/datanodes
> Reporter: Jonathan Gray
> Assignee: Jonathan Gray
> Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: hbase-868-branch-0.2.patch,
> hbase-hbase-regionserver-web3.streamy.com.log, master_regionsinsources.png,
> regionserver_regionlist.png
>
>
> We're now using incrementing binary row keys and to this point had only been
> doing small tests with them, never having actually had a table split.
> I'm still working through the logs but it seems that there's a problem
> somewhere with startKey and endKeys.
> Binary in general is not well supported (inconsistent in display in the logs,
> very odd rendering in the web ui, hard to interpret in the shell, etc..)
> Once we figure out these serious bugs we will spend some time trying to clean
> that up. But right now this makes things even harder to debug.
> The actual symptoms are that my import eventually started to throw (in the
> client and on the region server):
> org.apache.hadoop.hbase.regionserver.WrongRegionException:
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out
> of range for HRegion sources,,1220546297947, startKey='', getEndKey()='
> �', row='c�'
> at
> org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:1775)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1831)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1387)
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1145)
> at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
> There are 3 regionservers, but this error only happens on one of them (while
> the other two always continue normally, allowing updates to this same table).
> The regionserver that this happens on is special for two reasons, one it is
> hosting the META table. And secondly it also hosts the first region in the
> table, startkey = ''. I'm unsure which is the cause, I have a clue leading
> to both.
> After about 15 minutes, the regionserver sees:
> 2008-09-04 09:52:57,948 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Started memcache flush for region .META.,,1. Current region memcache size
> 24.5k
> 2008-09-04 09:52:58,003 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Added /hbase/.META./1028785192/historian/mapfiles/8699673838203663799 with
> 106 entries, sequence id 25341510, data size 8.9k, file size 10.6k
> 2008-09-04 09:52:58,050 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Added /hbase/.META./1028785192/info/mapfiles/1791564557665476834 with 96
> entries, sequence id 25341510, data size 14.2k, file size 15.8k
> 2008-09-04 09:52:58,050 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Finished memcache flush for region .META.,,1 in 102ms, sequence id=25341510,
> compaction requested=true
> 2008-09-04 09:52:58,050 DEBUG
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested
> for region: .META.,,1
> 2008-09-04 09:52:58,051 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> starting compaction on region .META.,,1
> 2008-09-04 09:52:58,055 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Compaction size of 1028785192/historian: 41.9k; Skipped 1 files , size: 21957
> 2008-09-04 09:52:58,088 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> started compaction of 2 files into
> /hbase/.META./compaction.dir/1028785192/historian/mapfiles/6948796056606699674
> 2008-09-04 09:52:58,128 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> moving
> /hbase/.META./compaction.dir/1028785192/historian/mapfiles/6948796056606699674
> to /hbase/.META./1028785192/historian/mapfiles/75733875840914142
> 2008-09-04 09:52:58,175 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Completed compaction of 1028785192/historian store size is 41.1k; time since
> last major compaction: 5426 seconds
> 2008-09-04 09:52:58,179 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Compaction size of 1028785192/info: 61.9k; Skipped 0 files , size: 0
> 2008-09-04 09:52:58,192 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> started compaction of 3 files into
> /hbase/.META./compaction.dir/1028785192/info/mapfiles/7781013568996125923
> 2008-09-04 09:52:58,260 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> moving
> /hbase/.META./compaction.dir/1028785192/info/mapfiles/7781013568996125923 to
> /hbase/.META./1028785192/info/mapfiles/2187291308709057119
> 2008-09-04 09:52:58,290 DEBUG org.apache.hadoop.hbase.regionserver.HStore:
> Completed compaction of 1028785192/info store size is 61.0k; time since last
> major compaction: 32534 seconds
> 2008-09-04 09:52:58,296 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> compaction completed on region .META.,,1 in 0sec
> 2008-09-04 09:53:09,620 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
> -2085474968086468199 lease expired
> 2008-09-04 09:54:35,449 INFO org.apache.hadoop.hbase.regionserver.LogRoller:
> Rolling hlog. Number of entries: 30009
> Following this, insertion continues normally. This leads me to believe
> there's an issue with the META table memcache, but oddly the other regions of
> this table on other regionservers continue on fine.
> As for the hosting the first region of the table on this region server, it
> seems to be consistent that I get the row out of range errors when looking
> for a region with startkey = '', although there are 5 other regions on this
> RS.
> Will attach full logs from master and three RS. Also a couple screenshots
> showing weird behavior in listing the regions in the table.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.