[
https://issues.apache.org/jira/browse/HBASE-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878739#comment-15878739
]
Abhishek Singh Chouhan commented on HBASE-17069:
------------------------------------------------
bq. What are these: "The first request with daughter info is a multi (with
hregion info) while the second one is a put." These are writes of the split
info? What does the second put do? Just trying to follow along.
Yep these are writes of the split info. The second put is when the region
actually opens on the rs, during postopen it sends a put that updates the
location, this doesn't have hregioninfo.
bq. Why are mutations empty, if (!walEdit.isEmpty()) {, such that we go the
route of '6. Append no sync'? (An append with no data shouldn't go in to
memstore so I suppose false is correct in this case?)
The mutations are not empty, however we still were going the route of appending
with inMemstore as false.
Thanks [~stack] :)
> RegionServer writes invalid META entries for split daughters in some
> circumstances
> ----------------------------------------------------------------------------------
>
> Key: HBASE-17069
> URL: https://issues.apache.org/jira/browse/HBASE-17069
> Project: HBase
> Issue Type: Bug
> Components: wal
> Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.2.4
> Reporter: Andrew Purtell
> Assignee: Abhishek Singh Chouhan
> Priority: Blocker
> Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.5
>
> Attachments: daughter_1_d55ef81c2f8299abbddfce0445067830.log,
> daughter_2_08629d59564726da2497f70451aafcdb.log,
> HBASE-17069.branch-1.3.001.patch, HBASE-17069.branch-1.3.002.patch,
> HBASE-17069.master.001.patch, logs.tar.gz,
> parent-393d2bfd8b1c52ce08540306659624f2.log
>
>
> I have been seeing frequent ITBLL failures testing various versions of 1.2.x.
> Over the lifetime of 1.2.x the following issues have been fixed:
> - HBASE-15315 (Remove always set super user call as high priority)
> - HBASE-16093 (Fix splits failed before creating daughter regions leave meta
> inconsistent)
> And this one is pending:
> - HBASE-17044 (Fix merge failed before creating merged region leaves meta
> inconsistent)
> I can apply all of the above to branch-1.2 and still see this failure:
> *The life of stillborn region d55ef81c2f8299abbddfce0445067830*
> *Master sees SPLITTING_NEW*
> {noformat}
> 2016-11-08 04:23:21,186 INFO [AM.ZK.Worker-pool2-t82] master.RegionStates:
> Transition null to {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW,
> ts=1478579001186, server=node-3.cluster,16020,1478578389506}
> {noformat}
> *The RegionServer creates it*
> {noformat}
> 2016-11-08 04:23:26,035 INFO
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created
> cacheConfig for GomnU: blockCache=LruBlockCache{blockCount=34,
> currentSize=14996112, freeSize=12823716208, maxSize=12838712320,
> heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480,
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25},
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false,
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false,
> prefetchOnOpen=false
> 2016-11-08 04:23:26,038 INFO
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created
> cacheConfig for big: blockCache=LruBlockCache{blockCount=34,
> currentSize=14996112, freeSize=12823716208, maxSize=12838712320,
> heapSize=14996112, minSize=12196776960, minFactor=0.95, multiSize=6098388480,
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25},
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false,
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false,
> prefetchOnOpen=false
> 2016-11-08 04:23:26,442 INFO
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created
> cacheConfig for meta: blockCache=LruBlockCache{blockCount=63,
> currentSize=17187656, freeSize=12821524664, maxSize=12838712320,
> heapSize=17187656, minSize=12196776960, minFactor=0.95, multiSize=6098388480,
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25},
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false,
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false,
> prefetchOnOpen=false
> 2016-11-08 04:23:26,713 INFO
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created
> cacheConfig for nwmrW: blockCache=LruBlockCache{blockCount=96,
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320,
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480,
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25},
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false,
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false,
> prefetchOnOpen=false
> 2016-11-08 04:23:26,715 INFO
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created
> cacheConfig for piwbr: blockCache=LruBlockCache{blockCount=96,
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320,
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480,
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25},
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false,
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false,
> prefetchOnOpen=false
> 2016-11-08 04:23:26,717 INFO
> [StoreOpener-d55ef81c2f8299abbddfce0445067830-1] hfile.CacheConfig: Created
> cacheConfig for tiny: blockCache=LruBlockCache{blockCount=96,
> currentSize=19178440, freeSize=12819533880, maxSize=12838712320,
> heapSize=19178440, minSize=12196776960, minFactor=0.95, multiSize=6098388480,
> multiFactor=0.5, singleSize=3049194240, singleFactor=0.25},
> cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false,
> cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false,
> prefetchOnOpen=false
> {noformat}
> *The RegionServer onlines it*
> {noformat}
> 2016-11-08 04:23:27,015 INFO
> [node-3.cluster,16020,1478578389506-daughterOpener=d55ef81c2f8299abbddfce0445067830]
> regionserver.HRegion: Onlined d55ef81c2f8299abbddfce0445067830; next
> sequenceid=19184
> 2016-11-08 04:23:27,029 INFO
> [regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099]
> regionserver.HRegionServer: Post open deploy tasks for
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
> 2016-11-08 04:23:27,047 INFO
> [regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099]
> hbase.MetaTableAccessor: Updated row
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
> with server=node-3.cluster,16020,1478578389506
> {noformat}
> *The Master transitions state from SPLITTING_NEW to OPEN*
> {noformat}
> 2016-11-08 04:23:27,058 INFO [AM.ZK.Worker-pool2-t84] master.RegionStates:
> Transition {d55ef81c2f8299abbddfce0445067830 state=SPLITTING_NEW,
> ts=1478579007057, server=node-3.cluster,16020,1478578389506} to
> {d55ef81c2f8299abbddfce0445067830 state=OPEN, ts=1478579007058,
> server=node-3.cluster,16020,1478578389506}
> 2016-11-08 04:23:27,059 INFO [AM.ZK.Worker-pool2-t84]
> master.AssignmentManager: Handled SPLIT event;
> parent=IntegrationTestBigLinkedList,,1478577020916.393d2bfd8b1c52ce08540306659624f2.,
> daughter
> a=IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.,
> daughter
> b=IntegrationTestBigLinkedList,/\xFB\x14,1478579001155.08629d59564726da2497f70451aafcdb.,
> on node-3.cluster,16020,1478578389506
> {noformat}
> *RegionServer updates META - BUT APPARENTLY NOT CORRECTLY*
> {noformat}
> 2016-11-08 04:23:27,165 INFO
> [regionserver/node-3.cluster/192.168.124.4:16020-splits-1478579001099]
> regionserver.SplitRequest: Region split, hbase:meta updated, and report to
> master.
> Parent=IntegrationTestBigLinkedList,,1478577020916.393d2bfd8b1c52ce08540306659624f2.,
> new regions:
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.,
>
> IntegrationTestBigLinkedList,/\xFB\x14,1478579001155.08629d59564726da2497f70451aafcdb..
> Split took 6sec
> {noformat}
> *RegionServer delays flush*
> (Is this important?)
> {noformat}
> 2016-11-08 04:24:14,639 WARN [MemStoreFlusher.0]
> regionserver.MemStoreFlusher: Region
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830.
> has too many store files; delaying flush up to 90000ms
> {noformat}
> *Immediate warnings about No serialized HRegionInfo*
> {noformat}
> 2016-11-08 04:24:44,691 WARN
> [B.defaultRpcServer.handler=26,queue=2,port=16000] hbase.MetaTableAccessor:
> No serialized HRegionInfo in
> keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478579007029/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478579007029/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478579007029/Put/vlen=8/seqid=0}
> {noformat}
> *Master is not happy either*
> {noformat}
> 2016-11-08 04:24:51,148 WARN [MASTER_TABLE_OPERATIONS-node-1:16000-0]
> hbase.MetaTableAccessor: No serialized HRegionInfo in
> keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478579007029/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478579007029/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478579007029/Put/vlen=8/seqid=0}
> {noformat}
> *TestRunner MetaScanner complains about invalid entries in META missing
> HRegionInfo*
> {noformat}
> (standard input):9086:2016-11-08 05:04:17,230 WARN
> [B.defaultRpcServer.handler=4,queue=1,port=16000] hbase.MetaTableAccessor: No
> serialized HRegionInfo in
> keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
> {noformat}
> *ITBLL MapReduce tasks fail because part of the keyspace cannot be located:*
> {noformat}
> java.io.IOException: HRegionInfo was null in IntegrationTestBigLinkedList,
> row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1293)
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1185)
> at
> org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:410)
> at
> org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:359)
> at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:238)
> at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:154)
> at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:121)
> at
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.persist(IntegrationTestBigLinkedList.java:486)
> at
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:431)
> at
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator$GeneratorMapper.map(IntegrationTestBigLinkedList.java:375)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1719)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000008/syslog:920:java.io.IOException:
> HRegionInfo was null in IntegrationTestBigLinkedList,
> row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000010/syslog:920:java.io.IOException:
> HRegionInfo was null in IntegrationTestBigLinkedList,
> row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000011/syslog:909:java.io.IOException:
> HRegionInfo was null in IntegrationTestBigLinkedList,
> row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478580288482/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478580288482/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478580288482/Put/vlen=8/seqid=0}
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000030/syslog:48:java.io.IOException:
> HRegionInfo was null in IntegrationTestBigLinkedList,
> row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
> {noformat}
> {noformat}
> ./application_1478574724776_0002/container_1478574724776_0002_01_000048/syslog:48:java.io.IOException:
> HRegionInfo was null in IntegrationTestBigLinkedList,
> row=keyvalues={IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:seqnumDuringOpen/1478581041080/Put/vlen=8/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:server/1478581041080/Put/vlen=20/seqid=0,
>
> IntegrationTestBigLinkedList,,1478579001155.d55ef81c2f8299abbddfce0445067830./info:serverstartcode/1478581041080/Put/vlen=8/seqid=0}
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)