[ 
https://issues.apache.org/jira/browse/HBASE-20792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524513#comment-16524513
 ] 

Duo Zhang commented on HBASE-20792:
-----------------------------------

{code}
else if (regionLocation != null && !regionLocation.equals(lastHost)) {
      // Ideally, if no regionLocation, write null to the hbase:meta but this will confuse clients
      // currently; they want a server to hit. TODO: Make clients wait if no location.
      put.add(CellBuilderFactory.create(CellBuilderType.SHALLOW_COPY)
          .setRow(put.getRow())
          .setFamily(HConstants.CATALOG_FAMILY)
          .setQualifier(getServerNameColumn(replicaId))
          .setTimestamp(put.getTimestamp())
          .setType(Cell.Type.Put)
          .setValue(Bytes.toBytes(regionLocation.getServerName()))
          .build());
      info.append(", regionLocation=").append(regionLocation);
    }
{code}

So if we do not log the regionLocation when the state is OPENING, it means 
that the regionLocation is null or that it is the same as the previous 
location (lastHost). Since the regionLocation can never be null when opening a 
region, the new region location must be the same as the previous one.

In your case, I guess that is not true, right? Is the procedure with pid 17 an 
MRP or an SCP? Either way, I do not think the region location should be the 
same as the previous one. Maybe you can add more logging in 
updateUserRegionLocation to see what is going on with regionLocation and 
lastHost?
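
To make the reasoning above concrete, here is a minimal, self-contained sketch of the quoted condition. It is not the HBase code: plain strings stand in for ServerName, and the class name, the describe helper, and the sample server names are hypothetical. It only shows the three outcomes that extra logging in updateUserRegionLocation would need to tell apart. Because the quoted snippet starts with "else if", the real method has an earlier branch that this sketch does not model.

```java
// Hypothetical stand-in for the quoted branch; Strings replace ServerName.
class RegionLocationBranchSketch {

    // Mirrors the quoted condition: the location cell is added to the Put
    // (and appended to the log line) only when regionLocation is non-null
    // and differs from lastHost.
    static boolean wouldWriteLocation(String regionLocation, String lastHost) {
        return regionLocation != null && !regionLocation.equals(lastHost);
    }

    // Hypothetical diagnostic text that names the reason the branch was
    // taken or skipped, so the two "skipped" cases can be distinguished.
    static String describe(String regionLocation, String lastHost) {
        if (regionLocation == null) {
            return "skipped: regionLocation is null, lastHost=" + lastHost;
        }
        if (regionLocation.equals(lastHost)) {
            return "skipped: regionLocation equals lastHost=" + lastHost;
        }
        return "written: regionLocation=" + regionLocation + ", lastHost=" + lastHost;
    }

    public static void main(String[] args) {
        // Made-up server names in host,port,startcode form.
        String oldServer = "rs-a.example.org,16020,1000";
        String newServer = "rs-b.example.org,16020,2000";

        System.out.println(describe(null, oldServer));
        System.out.println(describe(oldServer, oldServer));
        System.out.println(describe(newServer, oldServer));
    }
}
```

If the real log shows the "equals lastHost" case for a region that should have moved, that would point at lastHost being updated earlier than expected rather than at the condition itself.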

> info:servername and info:sn inconsistent for OPEN region
> --------------------------------------------------------
>
>                 Key: HBASE-20792
>                 URL: https://issues.apache.org/jira/browse/HBASE-20792
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 2.0.2
>
>
> Next problem we've run into after HBASE-20752 and HBASE-20708.
> After a rolling restart of a cluster, we'll see situations where a collection 
> of regions will simply not be assigned out to the RS. I was able to reproduce 
> this by mimicking the restart patterns our tests do internally (ignore whether 
> this is the best way to restart nodes for now :)). The general pattern is 
> this:
> {code:java}
> for rs in regionservers:
>   stop(server, rs, RS)
> for master in masters:
>   stop(server, master, MASTER)
> sleep(15)
> for master in masters:
>   start(server, master, MASTER)
> for rs in regionservers:
>   start(server, rs, RS){code}
> Looking at meta, we can see why the Master is ignoring some regions:
> {noformat}
>  test                                                   column=table:state, timestamp=1529871718998, value=\x08\x00
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.  column=info:regioninfo, timestamp=1529967103390, value={ENCODED => 0297f680df6dc0166a44f9536346268e, NAME => 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY => '', ENDKEY => ''}
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.  column=info:seqnumDuringOpen, timestamp=1529967103390, value=\x00\x00\x00\x00\x00\x00\x00*
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.  column=info:server, timestamp=1529967103390, value=ctr-e138-1518143905142-378097-02-000012.hwx.site:16020
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.  column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.  column=info:sn, timestamp=1529967096482, value=ctr-e138-1518143905142-378097-02-000006.hwx.site,16020,1529966755170
>  test,,1529871718122.0297f680df6dc0166a44f9536346268e.  column=info:state, timestamp=1529967103390, value=OPEN{noformat}
> The region is marked as {{OPEN}}. The master doesn't know any better. 
> However, the interesting bit is that {{info:server}} and {{info:sn}} are 
> inconsistent (which, according to the javadoc, should not be possible for an 
> {{OPEN}} region).
> This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd 
> attempt, so I'm hopeful it's not a bear to repro.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
