[ 
https://issues.apache.org/jira/browse/HBASE-20792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525780#comment-16525780
 ] 

Duo Zhang commented on HBASE-20792:
-----------------------------------

We're not removing it. We're just adding a comment to tell developers to keep 
in mind that lastHost may not be in sync with the data in meta before using 
it. For retaining assignment this is fine: the target server is just a hint, 
and if we fail to assign the region there we will try to assign it elsewhere. 
Updating meta is another story. Because lastHost may not be in sync with the 
data in meta, we should not use it as a guard when updating meta. There is 
also no big performance issue: we always need to update the row in meta 
anyway, so writing one more small qualifier is fine.
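
To make the point about the guard concrete, here is a minimal sketch. It is not the actual HBase code: the class, the method names, and the use of a plain map as a stand-in for a meta row are all hypothetical, and the server name is reduced to a single string for both qualifiers. It only shows how a stale lastHost can leave info:sn untouched when it is used as a guard, while an unconditional write cannot.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MetaGuardSketch {
    // Hypothetical guarded update: info:sn is only written when the cached
    // lastHost differs from the new location.
    static void updateGuarded(Map<String, String> row, String lastHost, String newLocation) {
        row.put("info:server", newLocation);
        if (!newLocation.equals(lastHost)) {
            row.put("info:sn", newLocation);
        }
    }

    // Hypothetical unconditional update: info:sn is always written.
    static void updateUnconditional(Map<String, String> row, String newLocation) {
        row.put("info:server", newLocation);
        row.put("info:sn", newLocation);
    }

    // Simplified consistency check for this sketch only.
    static boolean consistent(Map<String, String> row) {
        return Objects.equals(row.get("info:server"), row.get("info:sn"));
    }

    public static void main(String[] args) {
        // Meta still holds the old server in info:sn, but the in-memory
        // lastHost already points at the new one (out of sync with meta).
        Map<String, String> guarded = new HashMap<>();
        guarded.put("info:sn", "rs-old");
        updateGuarded(guarded, "rs-new", "rs-new");
        System.out.println("guarded consistent: " + consistent(guarded));

        Map<String, String> unconditional = new HashMap<>();
        unconditional.put("info:sn", "rs-old");
        updateUnconditional(unconditional, "rs-new");
        System.out.println("unconditional consistent: " + consistent(unconditional));
    }
}
```

In the guarded case the row ends up with info:server pointing at the new server and info:sn still pointing at the old one, which is the shape of the inconsistency reported in this issue.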

> info:servername and info:sn inconsistent for OPEN region
> --------------------------------------------------------
>
>                 Key: HBASE-20792
>                 URL: https://issues.apache.org/jira/browse/HBASE-20792
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0
>
>         Attachments: HBASE-20792.patch, TestRegionMoveAndAbandon.java, 
> hbase-hbase-master-ctr-e138-1518143905142-380753-01-000004.hwx.site.log
>
>
> Next problem we've run into after HBASE-20752 and HBASE-20708.
> After a rolling restart of a cluster, we'll see situations where a collection 
> of regions will simply not be assigned out to the RS. I was able to reproduce 
> this by mimicking the restart patterns our tests do internally (ignore whether 
> this is the best way to restart nodes for now :)). The general pattern is 
> this:
> {code:java}
> for rs in regionservers:
>   stop(server, rs, RS)
> for master in masters:
>   stop(server, master, MASTER)
> sleep(15)
> for master in masters:
>   start(server, master, MASTER)
> for rs in regionservers:
>   start(server, rs, RS){code}
> Looking at meta, we can see why the Master is ignoring some regions:
> {noformat}
>  test
> column=table:state, timestamp=1529871718998, value=\x08\x00
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:regioninfo, timestamp=1529967103390, value={ENCODED => 0297f680df6dc0166a44f9536346268e, NAME => 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY => '', ENDKEY => ''}
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:seqnumDuringOpen, timestamp=1529967103390, value=\x00\x00\x00\x00\x00\x00\x00*
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:server, timestamp=1529967103390, value=ctr-e138-1518143905142-378097-02-000012.hwx.site:16020
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:sn, timestamp=1529967096482, value=ctr-e138-1518143905142-378097-02-000006.hwx.site,16020,1529966755170
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.
> column=info:state, timestamp=1529967103390, value=OPEN{noformat}
> The region is marked as {{OPEN}}. The master doesn't know any better. 
> However, the interesting bit is that {{info:server}} and {{info:sn}} are 
> inconsistent (which, according to the javadoc, should not be possible for an 
> {{OPEN}} region).
> This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd 
> attempt, so I'm hopeful it's not a bear to repro.
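
The inconsistency in the scan output above can be checked mechanically. The following is only a sketch under stated assumptions: the class and method are hypothetical and not part of HBase, and the check assumes that info:sn has the form host,port,startcode while info:server has the form host:port, as the values in the dump suggest.

```java
public class ServerNameCheckSketch {
    // Hypothetical helper: rebuilds "host,port,startcode" from info:server
    // ("host:port") and info:serverstartcode, then compares it with info:sn.
    static boolean isConsistent(String server, String startCode, String sn) {
        int colon = server.lastIndexOf(':');
        if (colon < 0) {
            return false; // not in the assumed host:port form
        }
        String expected = server.substring(0, colon) + ","
            + server.substring(colon + 1) + "," + startCode;
        return expected.equals(sn);
    }

    public static void main(String[] args) {
        // Values copied from the meta scan in the issue description.
        String server = "ctr-e138-1518143905142-378097-02-000012.hwx.site:16020";
        String startCode = "1529966776248";
        String sn = "ctr-e138-1518143905142-378097-02-000006.hwx.site,16020,1529966755170";
        System.out.println(isConsistent(server, startCode, sn)); // prints "false"
    }
}
```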



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
