> On Oct. 13, 2016, 1:02 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/package/scripts/upgrade.py,
> >  line 64
> > <https://reviews.apache.org/r/52833/diff/1/?file=1534837#file1534837line64>
> >
> >     Let's decrease the sleep time to 10 secs.

Is there a reason? I am hesitant to change values like this; we've been 
burned before when customer environments took much longer than we expected. 
HBase especially, as it rebuilds regions.


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52833/#review152531
-----------------------------------------------------------


On Oct. 13, 2016, 11:19 a.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52833/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2016, 11:19 a.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez and Nate Cole.
> 
> 
> Bugs: AMBARI-18590
>     https://issues.apache.org/jira/browse/AMBARI-18590
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> During a rolling upgrade, the upgrade orchestration must wait for each 
> RegionServer to register with the HBase master before moving on to the next RS 
> restart. Registration is asynchronous and may occur several minutes after the 
> daemon has actually started. 
> 
> We have a check now which uses {{hbase shell}} along with {{status 'simple'}} 
> to determine if the host has registered by looking for the hostname. 
> 
> However, if reverse DNS is not enabled, the servers may be listed by IP 
> address instead of hostname. As a result, the check would always fail during 
> upgrades.
> 
> The HBase status command we use is {{status 'simple'}}, which returns output 
> like this:
> 
> ```
> active master:  10.0.0.8:16000 1475801031124
> 2 backup masters
>     10.0.0.10:16000 1475801061290
>     10.0.0.13:16000 1475801046018
> 2 live servers
>     10.0.0.5:16020 1475798271407
>         requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=159, 
> maxHeapMB=7840, numberOfStores=3, numberOfStorefiles=1, 
> storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, 
> storefileIndexSizeMB=0, readRequestsCount=14, writeRequestsCount=1, 
> rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, 
> totalCompactingKVs=14, currentCompactedKVs=14, compactionProgressPct=1.0, 
> coprocessors=[MultiRowMutationEndpoint, SecureBulkLoadEndpoint]
>     10.0.0.7:16020 1475872741297
>         requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=1002, 
> maxHeapMB=7840, numberOfStores=1, numberOfStorefiles=1, 
> storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, 
> storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, 
> rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, 
> totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN, 
> coprocessors=[SecureBulkLoadEndpoint]
> 0 dead servers
> Aggregate load: 0, regions: 3
> ```
> 
> If this lookup fails for the hostname, we should also try by IP address.
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/package/scripts/upgrade.py f1fa80c 
> 
> Diff: https://reviews.apache.org/r/52833/diff/
> 
> 
> Testing
> -------
> 
> Total run:1133
> Total errors:0
> Total failures:0
> OK
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>
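For readers following the description quoted above, the hostname-then-IP lookup can be sketched roughly as below. This is not the code in the patch: the function names `live_server_addresses` and `is_registered`, the `resolve` parameter, and the trimmed `SAMPLE` text are all hypothetical, and the parsing assumes the `status 'simple'` layout shown in the description (a "N live servers" header followed by `host:port startcode` lines). It also only handles IPv4-style `host:port` entries.

```python
import re
import socket


def live_server_addresses(status_output):
    """Collect the host part of each 'host:port startcode' line that
    appears between the 'N live servers' and 'N dead servers' headers."""
    hosts = []
    in_live_section = False
    for line in status_output.splitlines():
        stripped = line.strip()
        if re.match(r"^\d+ live servers$", stripped):
            in_live_section = True
            continue
        if re.match(r"^\d+ dead servers$", stripped):
            in_live_section = False
            continue
        if in_live_section:
            # Metric lines (requestsPerSecond=...) do not match this pattern.
            match = re.match(r"^(\S+):(\d+) (\d+)$", stripped)
            if match:
                hosts.append(match.group(1).lower())
    return hosts


def is_registered(status_output, hostname, resolve=socket.gethostbyname):
    """Return True if the host is listed as a live server, checking the
    hostname first and falling back to its resolved IP address.
    'resolve' is injectable (an assumption of this sketch) so the lookup
    can be exercised without real DNS."""
    live = live_server_addresses(status_output)
    if hostname.lower() in live:
        return True
    try:
        ip_address = resolve(hostname)
    except socket.error:
        # Name resolution failed; there is nothing else to compare against.
        return False
    return ip_address in live


# Trimmed version of the output quoted in the description.
SAMPLE = """active master:  10.0.0.8:16000 1475801031124
2 backup masters
    10.0.0.10:16000 1475801061290
    10.0.0.13:16000 1475801046018
2 live servers
    10.0.0.5:16020 1475798271407
        requestsPerSecond=0.0, numberOfOnlineRegions=2, usedHeapMB=159
    10.0.0.7:16020 1475872741297
        requestsPerSecond=0.0, numberOfOnlineRegions=1, usedHeapMB=1002
0 dead servers
Aggregate load: 0, regions: 3
"""
```

In an orchestration loop, a check like this would be retried with a sleep between attempts until it returns True or a timeout expires; the interval discussed in the review comment is the sleep between those retries. Restricting the match to the live-servers section avoids a false positive when the host also appears as a master or backup master.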
