[
https://issues.apache.org/jira/browse/HBASE-20792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526749#comment-16526749
]
Josh Elser commented on HBASE-20792:
------------------------------------
Same cycling on the RS:
{noformat}
2018-06-27 17:33:40,100 INFO
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16020]
regionserver.RSRpcServices: submitting region close type: REGION_NAME
value: "table_izljd,,1530098578429.523007f77f96474d01d74ed3d048e173."
2018-06-27 17:33:40,100 INFO
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16020]
regionserver.RSRpcServices: Close 523007f77f96474d01d74ed3d048e173, moving to
ctr-e138-1518143905142-381863-01-000002.hwx.site,16020,1530113920688. Requested
from /172.27.18.195
2018-06-27 17:33:40,101 INFO
[RS_CLOSE_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-1]
regionserver.HRegion: Closed
table_izljd,,1530098578429.523007f77f96474d01d74ed3d048e173.
2018-06-27 17:33:40,101 WARN
[RS_CLOSE_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-1]
regionserver.HRegionServer: Not adding moved region record:
523007f77f96474d01d74ed3d048e173 to self.
2018-06-27 17:33:40,662 INFO
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16020]
regionserver.RSRpcServices: Open
table_izljd,,1530098578429.523007f77f96474d01d74ed3d048e173.
2018-06-27 17:33:40,666 INFO
[RS_OPEN_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-3]
coprocessor.CoprocessorHost: System coprocessor
org.apache.hadoop.hbase.security.token.TokenProvider loaded, priority=536870911.
2018-06-27 17:33:40,666 WARN
[RS_OPEN_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-3]
access.SecureBulkLoadEndpoint: SecureBulkLoadEndpoint is deprecated. It will be
removed in future releases.
2018-06-27 17:33:40,666 WARN
[RS_OPEN_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-3]
access.SecureBulkLoadEndpoint: Secure bulk load has been integrated into HBase
core.
2018-06-27 17:33:40,666 INFO
[RS_OPEN_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-3]
coprocessor.CoprocessorHost: System coprocessor
org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint loaded,
priority=536870912.
2018-06-27 17:33:40,673 INFO [StoreOpener-523007f77f96474d01d74ed3d048e173-1]
hfile.CacheConfig: Created cacheConfig for col_fam_yipdf:
blockCache=LruBlockCache{blockCount=3, currentSize=627.95 KB, freeSize=818.59
MB, maxSize=819.20 MB, heapSize=627.95 KB, minSize=778.24 MB, minFactor=0.95,
multiSize=389.12 MB, multiFactor=0.5, singleSize=194.56 MB, singleFactor=0.25},
cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false,
cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false,
prefetchOnOpen=false
2018-06-27 17:33:40,673 INFO [StoreOpener-523007f77f96474d01d74ed3d048e173-1]
compactions.CompactionConfiguration: size [128 MB, 8.00 EB, 8.00 EB); files [3,
10); ratio 1.200000; off-peak ratio 5.000000; throttle point 2684354560; major
period 604800000, major jitter 0.500000, min locality to compact 0.000000;
tiered compaction: max_age 9223372036854775807, incoming window min 6,
compaction policy for tiered window
org.apache.hadoop.hbase.regionserver.compactions.ExploringCompactionPolicy,
single output for minor true, compaction window factory
org.apache.hadoop.hbase.regionserver.compactions.ExponentialCompactionWindowFactory
2018-06-27 17:33:40,674 INFO [StoreOpener-523007f77f96474d01d74ed3d048e173-1]
regionserver.HStore: Store=col_fam_yipdf, memstore type=DefaultMemStore,
storagePolicy=HOT, verifyBulkLoads=false
2018-06-27 17:33:40,676 INFO
[RS_OPEN_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-3]
regionserver.HRegion: Opened 523007f77f96474d01d74ed3d048e173; next sequenceid=2
2018-06-27 17:33:40,676 INFO
[PostOpenDeployTasks:523007f77f96474d01d74ed3d048e173]
regionserver.HRegionServer: Post open deploy tasks for
table_izljd,,1530098578429.523007f77f96474d01d74ed3d048e173.
{noformat}
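To check for this close/open cycling in a longer RS log, the per-region events can be tallied. The following is a rough sketch only: the function name {{count_region_events}} and the regular expression are assumptions derived from the {{regionserver.HRegion: Closed}} and {{regionserver.HRegion: Opened}} lines in the excerpt above, not an HBase tool.

```python
import re
from collections import Counter

# Assumed pattern: matches the "HRegion: Closed <region name>" and
# "HRegion: Opened <encoded name>" messages seen in the excerpt above.
# \s+ tolerates the line wrapping introduced by the mail formatting.
EVENT = re.compile(
    r"regionserver\.HRegion:\s+(Closed|Opened)\s+\S*?([0-9a-f]{32})\b"
)


def count_region_events(log_text):
    """Count Closed/Opened events per encoded region name."""
    counts = Counter()
    for action, encoded in EVENT.findall(log_text):
        counts[(encoded, action)] += 1
    return counts


# Two entries copied from the log excerpt above.
sample = """2018-06-27 17:33:40,101 INFO
[RS_CLOSE_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-1]
regionserver.HRegion: Closed
table_izljd,,1530098578429.523007f77f96474d01d74ed3d048e173.
2018-06-27 17:33:40,676 INFO
[RS_OPEN_REGION-regionserver/ctr-e138-1518143905142-381863-01-000002:16020-3]
regionserver.HRegion: Opened 523007f77f96474d01d74ed3d048e173; next sequenceid=2
"""

for (encoded, action), n in sorted(count_region_events(sample).items()):
    print(encoded, action, n)
```

A region whose Closed and Opened counts keep growing together on the same server would be a candidate for the cycling described here.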
> info:servername and info:sn inconsistent for OPEN region
> --------------------------------------------------------
>
> Key: HBASE-20792
> URL: https://issues.apache.org/jira/browse/HBASE-20792
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Blocker
> Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
> Attachments: HBASE-20792.patch, TestRegionMoveAndAbandon.java,
> hbase-hbase-master-ctr-e138-1518143905142-380753-01-000004.hwx.site.log
>
>
> Next problem we've run into after HBASE-20752 and HBASE-20708.
> After a rolling restart of a cluster, we'll see situations where a collection
> of regions will simply not be assigned out to the RS. I was able to reproduce
> this by mimicking the restart patterns our tests use internally (ignore whether
> this is the best way to restart nodes for now :)). The general pattern is
> this:
> {code:java}
> for rs in regionservers:
>   stop(server, rs, RS)
> for master in masters:
>   stop(server, master, MASTER)
> sleep(15)
> for master in masters:
>   start(server, master, MASTER)
> for rs in regionservers:
>   start(server, rs, RS){code}
> Looking at meta, we can see why the Master is ignoring some regions:
> {noformat}
> test
> column=table:state, timestamp=1529871718998, value=\x08\x00
> test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:regioninfo, timestamp=1529967103390, value={ENCODED =>
> 0297f680df6dc0166a44f9536346268e, NAME =>
> 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY
> => '', ENDKEY =>
> ''}
> test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:seqnumDuringOpen, timestamp=1529967103390,
> value=\x00\x00\x00\x00\x00\x00\x00*
> test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:server, timestamp=1529967103390,
> value=ctr-e138-1518143905142-378097-02-000012.hwx.site:16020
> test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
> test,,1529871718122.0297f680df6dc0166a44f9536346268e. column=info:sn,
> timestamp=1529967096482,
> value=ctr-e138-1518143905142-378097-02-000006.hwx.site,16020,1529966755170
> test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:state, timestamp=1529967103390, value=OPEN{noformat}
> The region is marked as {{OPEN}}. The master doesn't know any better.
> However, the interesting bit is that {{info:server}} and {{info:sn}} are
> inconsistent (which, according to the javadoc, should not be possible for an
> {{OPEN}} region).
> This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd
> attempt, so I'm hopeful it's not a bear to repro.
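
The inconsistency described above can be checked mechanically. The following is a minimal sketch, assuming the meta columns are available as the strings rendered in the scan output ({{info:server}} as {{host:port}}, {{info:sn}} as {{host,port,startcode}}). The helper names {{server_name_from_location}} and {{is_consistent}} are hypothetical and not part of HBase.

```python
def server_name_from_location(server, startcode):
    """Build 'host,port,startcode' from info:server ('host:port')
    and info:serverstartcode."""
    host, _, port = server.rpartition(":")
    return f"{host},{port},{startcode}"


def is_consistent(row):
    """True when info:sn names the same server instance as
    info:server combined with info:serverstartcode."""
    expected = server_name_from_location(
        row["info:server"], row["info:serverstartcode"]
    )
    return row["info:sn"] == expected


# Values copied from the meta scan in the description.
row = {
    "info:server": "ctr-e138-1518143905142-378097-02-000012.hwx.site:16020",
    "info:serverstartcode": "1529966776248",
    "info:sn": "ctr-e138-1518143905142-378097-02-000006.hwx.site,16020,1529966755170",
    "info:state": "OPEN",
}

print(is_consistent(row))  # False: host and startcode both differ
```

For the row in the scan, {{info:server}} points at host {{...-000012}} while {{info:sn}} points at host {{...-000006}} with a different startcode, so the check fails even though {{info:state}} is {{OPEN}}.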
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)