Andrew Purtell created HBASE-17069:
--------------------------------------
Summary: RegionServer writes invalid META entries for split
daughters in some circumstances
Key: HBASE-17069
URL: https://issues.apache.org/jira/browse/HBASE-17069
Project: HBase
Issue Type: Bug
Affects Versions: 1.2.4
Reporter: Andrew Purtell
Priority: Critical
I have been seeing frequent ITBLL failures testing various versions of 1.2.x.
Over the lifetime of 1.2.x the following issues have been fixed:
- HBASE-15315 (Remove always set super user call as high priority)
- HBASE-16093 (Fix splits failed before creating daughter regions leave meta
inconsistent)
And this one is pending:
- HBASE-17044 (Fix merge failed before creating merged region leaves meta
inconsistent)
I can apply all of the above to branch-1.2 and still see this failure:
*The life of stillborn region d55ef81c2f8299abbddfce0445067830*
*Master sees SPLITTING_NEW*
{noformat}
2016-11-08 04:23:21,186 INFO [AM.ZK.Worker-pool2-t82] master.RegionStates:
Transition null to {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW,
ts=1478579001186, server=node-3.cluster,16020,1478578389506}
{noformat}
*The RegionServer creates it*
{noformat}
2016-11-08 04:23:26,035 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1]
hfile.CacheConfig: Created cacheConfig for GomnU:
blockCache=LruBlockCache{blockCount=34, currentSize=14996112,
freeSize=12823716208, maxSize=12838712320, heapSize=14996112,
minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5,
singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true,
cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false,
cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
2016-11-08 04:23:26,038 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1]
hfile.CacheConfig: Created cacheConfig for big:
blockCache=LruBlockCache{blockCount=34, currentSize=14996112,
freeSize=12823716208, maxSize=12838712320, heapSize=14996112,
minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5,
singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true,
cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false,
cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
2016-11-08 04:23:26,442 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1]
hfile.CacheConfig: Created cacheConfig for meta:
blockCache=LruBlockCache{blockCount=63, currentSize=17187656,
freeSize=12821524664, maxSize=12838712320, heapSize=17187656,
minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5,
singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true,
cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false,
cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
2016-11-08 04:23:26,713 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1]
hfile.CacheConfig: Created cacheConfig for nwmrW:
blockCache=LruBlockCache{blockCount=96, currentSize=19178440,
freeSize=12819533880, maxSize=12838712320, heapSize=19178440,
minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5,
singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true,
cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false,
cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
2016-11-08 04:23:26,715 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1]
hfile.CacheConfig: Created cacheConfig for piwbr:
blockCache=LruBlockCache{blockCount=96, currentSize=19178440,
freeSize=12819533880, maxSize=12838712320, heapSize=19178440,
minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5,
singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true,
cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false,
cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
2016-11-08 04:23:26,717 INFO [StoreOpener-d55ef81c2f8299abbddfce0445067830-1]
hfile.CacheConfig: Created cacheConfig for tiny:
blockCache=LruBlockCache{blockCount=96, currentSize=19178440,
freeSize=12819533880, maxSize=12838712320, heapSize=19178440,
minSize=12196776960, minFactor=0.95, multiSize=6098388480, multiFactor=0.5,
singleSize=3049194240, singleFactor=0.25}, cacheDataOnRead=true,
cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false,
cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
{noformat}
*The RegionServer onlines it*
{noformat}
2016-11-08 04:23:27,015 INFO
[node-3.cluster,16020,1478578389506-daughterOpener=d55ef81c2f8299abbddfce0445067830]
regionserver.HRegion: Onlined d55ef81c2f8299abbddfce0445067830; next
sequenceid=19184
2016-11-08 04:23:27,029 INFO
[regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099]
regionserver.HRegionServer: Post open deploy tasks for
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
2016-11-08 04:23:27,047 INFO
[regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099]
hbase.MetaTableAccessor: Updated row
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
with server=node-3.cluster,16020,1478578389506
{noformat}
*The Master transitions state from SPLITTING_NEW to OPEN*
{noformat}
2016-11-08 04:23:27,058 INFO [AM.ZK.Worker-pool2-t84] master.RegionStates:
Transition {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW,
ts=1478579007057, server=node-3.cluster,16020,1478578389506} to
{d55ef81c2f8299abbddfce0445067830 state=OPEN, ts=1478579007058,
server=node-3.cluster,16020,1478578389506}
2016-11-08 04:23:27,059 INFO [AM.ZK.Worker-pool2-t84]
master.AssignmentManager: Handled SPLIT event;
parent=IntegrationTestBigLinkedList,,1478577020916.393d2bfd8b1c52ce08540306659624f2.,
daughter
a=IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.,
daughter
b=IntegrationTestBigLinkedList,/\xFB\x14,1478579001155.08629d59564726da2497f70451aafcdb.,
on node-3.cluster,16020,1478578389506
{noformat}
*RegionServer updates META - BUT APPARENTLY NOT CORRECTLY*
{noformat}
2016-11-08 04:23:27,165 INFO
[regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099]
regionserver.SplitRequest: Region split, hbase:meta updated, and report to
master.
Parent=IntegrationTestBigLinkedList,,1478577020916.393d2bfd8b1c52ce08540306659624f2.,
new regions:
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.,
IntegrationTestBigLinkedList,/\xFB\x14,1478579001155.08629d59564726da2497f70451aafcdb..
Split took 6sec
{noformat}
*RegionServer delays flush*
(Is this important?)
{noformat}
2016-11-08 04:24:14,639 WARN [MemStoreFlusher.0] regionserver.MemStoreFlusher:
Region
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
has too many store files; delaying flush up to 90000ms
{noformat}
*Immediate warnings about No serialized HRegionInfo*
{noformat}
2016-11-08 04:24:44,691 WARN
[B.defaultRpcServer.handler=26,queue=2,port=16000] hbase.MetaTableAccessor: No
serialized HRegionInfo in
keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478579007029/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478579007029/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478579007029/Put/vlen=8/seqid=0}
{noformat}
*Master is not happy either*
{noformat}
2016-11-08 04:24:51,148 WARN [MASTER_TABLE_OPERATIONS-node-1:16000-0]
hbase.MetaTableAccessor: No serialized HRegionInfo in
keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478579007029/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478579007029/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478579007029/Put/vlen=8/seqid=0}
{noformat}
*TestRunner MetaScanner complains about invalid entries in META missing
HRegionInfo*
{noformat}
(standard input):9086:2016-11-08 05:04:17,230 WARN
[B.defaultRpcServer.handler=4,queue=1,port=16000] hbase.MetaTableAccessor: No
serialized HRegionInfo in
keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
{noformat}
*ITBLL MapReduce tasks fail because part of the keyspace cannot be located:*
{noformat}
java.io.IOException: HRegionInfo was null in IntegrationTestBigLinkedList,
row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
at
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1293)
at
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1185)
at
org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:410)
at
org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:359)
at
org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:238)
at
org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:154)
at
org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:121)
at
org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.persist(IntegrationTestBigLinkedList.java:486)
at
org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:431)
at
org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:375)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1719)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
{noformat}
{noformat}
./application_1478574724776_0002/container_1478574724776_0002_01_000008/syslog:920:java.io.IOException:
HRegionInfo was null in IntegrationTestBigLinkedList,
row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
{noformat}
{noformat}
./application_1478574724776_0002/container_1478574724776_0002_01_000010/syslog:920:java.io.IOException:
HRegionInfo was null in IntegrationTestBigLinkedList,
row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
{noformat}
{noformat}
./application_1478574724776_0002/container_1478574724776_0002_01_000011/syslog:909:java.io.IOException:
HRegionInfo was null in IntegrationTestBigLinkedList,
row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
{noformat}
{noformat}
./application_1478574724776_0002/container_1478574724776_0002_01_000030/syslog:48:java.io.IOException:
HRegionInfo was null in IntegrationTestBigLinkedList,
row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
{noformat}
{noformat}
./application_1478574724776_0002/container_1478574724776_0002_01_000048/syslog:48:java.io.IOException:
HRegionInfo was null in IntegrationTestBigLinkedList,
row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0,
IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)