Jonathan Hurley created AMBARI-18590:

             Summary: RegionServer Registration Checks Fail During Upgrade If 
rDNS is Not Enabled
                 Key: AMBARI-18590
             Project: Ambari
          Issue Type: Bug
          Components: ambari-agent
    Affects Versions: 2.2.0
            Reporter: Jonathan Hurley
            Assignee: Jonathan Hurley
            Priority: Blocker
             Fix For: 2.5.0

During a rolling upgrade, the upgrade orchestration must wait for each 
RegionServer to register with the HBase master before moving on to the next 
RegionServer restart. Registration is asynchronous and may complete several 
minutes after the daemon has actually started. 
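
A minimal sketch of the kind of wait loop this implies, not the actual Ambari 
code. The function name {{wait_until}}, the 10-minute timeout and the 
10-second polling interval are all hypothetical values chosen for illustration:

```python
import time


def wait_until(check, timeout_s=600.0, interval_s=10.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or timeout_s seconds elapse.

    Returns True if check() succeeded, False if the deadline passed first.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            # Registration did not show up in time; the caller decides
            # whether to fail the upgrade step or retry.
            return False
        sleep(interval_s)
```

The orchestration would pass in a {{check}} callable that reports whether the 
restarted RegionServer is visible to the master, and only move on to the next 
restart when this returns True.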

The current check runs {{status 'simple'}} in {{hbase shell}} and looks for 
the host's name in the output to determine whether the host has registered. 
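
One way such a check could obtain the status text is sketched below. It 
assumes the {{hbase}} launcher is on the PATH and that the shell reads commands 
from standard input; the function name {{run_status_simple}} and the timeout 
are hypothetical, and this is not necessarily how the Ambari scripts invoke 
the shell:

```python
import subprocess


def run_status_simple(shell_cmd=("hbase", "shell"), timeout_s=120):
    """Feed status 'simple' to the shell on stdin and return its stdout.

    shell_cmd is a parameter so a stand-in command can be used in tests.
    Raises subprocess.CalledProcessError if the shell exits non-zero and
    subprocess.TimeoutExpired if it does not finish within timeout_s.
    """
    result = subprocess.run(
        list(shell_cmd),
        input="status 'simple'\n",  # the command discussed in this issue
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,
    )
    return result.stdout
```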

However, if reverse DNS is not enabled, the master may list RegionServers by 
IP address instead of by hostname. As a result, the check would always fail 
during upgrades.


The HBase status command we use is {{status 'simple'}}, which returns output 
like the following:

active master: 1475801031124
2 backup masters 1475801061290 1475801046018
2 live servers 1475798271407
        requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=159, 
maxHeapMB=7840, numberOfStores=3, numberOfStorefiles=1, 
storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, 
storefileIndexSizeMB=0, readRequestsCount=14, writeRequestsCount=1, 
rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, 
totalCompactingKVs=14, currentCompactedKVs=14, compactionProgressPct=1.0, 
coprocessors=[MultiRowMutationEndpoint, SecureBulkLoadEndpoint] 1475872741297
        requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=1002, 
maxHeapMB=7840, numberOfStores=1, numberOfStorefiles=1, 
storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, 
storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, 
rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, 
totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, 
0 dead servers
Aggregate load: 0, regions: 3

If this lookup fails for the hostname, we should also try by IP address.
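
The proposed fallback could look roughly like the sketch below. It is an 
illustration, not the committed fix: the function names are hypothetical, and 
it assumes each live server appears in the status text as a name or address 
followed by a separator such as {{:}} and a port. The lookup matches whole 
names only, so {{rs1}} does not match {{rs10}} and {{10.0.0.5}} does not match 
{{10.0.0.55}}.

```python
import re
import socket


def _mentions(status_output, name):
    """True if name appears as a whole host token in status_output."""
    # Reject matches that are embedded in a longer hostname or IP address.
    pattern = r"(?<![\w.-])" + re.escape(name) + r"(?![\w.-])"
    return re.search(pattern, status_output, re.IGNORECASE) is not None


def is_registered(status_output, hostname, resolve=socket.gethostbyname):
    """Look for hostname in the status text, then fall back to its IP address.

    resolve is injectable so the fallback can be tested without real DNS.
    """
    if _mentions(status_output, hostname):
        return True
    try:
        ip_address = resolve(hostname)
    except OSError:
        # The hostname could not be resolved, so there is no IP to try.
        return False
    return _mentions(status_output, ip_address)
```

For example, {{is_registered(output, "rs1.example.com")}} would first search 
for {{rs1.example.com}} and, only if that is absent, search for the address 
that the name resolves to on the local host.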
