> On July 20, 2017, 10:14 a.m., Attila Doroszlai wrote:
> > Normally the property is added during Ambari upgrade: initially with 
> > default value of "1024", then updated to "1024m" by `UpgradeCatalog222`.  
> > (Try upgrading from Apache Ambari 2.2.1 to 2.5.2.)
> > 
> > The root cause of the problem is that `zk_server_heapsize` is referenced in 
> > `zookeeper-env` (the `content`) in BigInsights 4.2, but the property itself 
> > is missing.  It is then added during stack upgrade with its raw default 
> > value.
> > 
> > I think the proper fix is to add the missing property in the BI 4.2 stack 
> > definition.  The current patch would be a nice workaround if there already 
> > were clusters with the broken value.
> 
> Jonathan Hurley wrote:
>     I think that there are clusters with the broken value today.

Ah, I see what you're saying. So, if we added it to the BI stack, then it would 
be taken care of automatically during the Ambari Server upgrade. We should do that.


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60986/#review181042
-----------------------------------------------------------


On July 19, 2017, 8:13 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60986/
> -----------------------------------------------------------
> 
> (Updated July 19, 2017, 8:13 p.m.)
> 
> 
> Review request for Ambari, Di Li, Jonathan Hurley, Sumit Mohanty, Sid Wagle, 
> and Tim Thorpe.
> 
> 
> Bugs: AMBARI-21528
>     https://issues.apache.org/jira/browse/AMBARI-21528
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Repro Steps:
> 
> * Installed BI 4.2.0 cluster on IBM Ambari 2.2.2 with Zookeeper
> * Upgraded Ambari to 2.5.2.0-146
> * Registered HDP 2.6.2.0 repo, installed packages
> * Ran service checks
> * Started Express Upgrade
> 
> Result: _Service Check ZooKeeper_ step failed with {{KeeperErrorCode = 
> ConnectionLoss for /zk_smoketest}}
> 
> This was caused by Zookeeper dying immediately during restart:
> ```
> Error occurred during initialization of VM
> Too small initial heap
> ```
> 
> Before EU
> ```
> export JAVA_HOME=/usr/jdk64/java-1.8.0-openjdk-1.8.0.77-0.b03.el7_2.x86_64
> export ZOOKEEPER_HOME=/usr/iop/current/zookeeper-server
> export ZOO_LOG_DIR=/var/log/zookeeper
> export ZOOPIDFILE=/var/run/zookeeper/zookeeper_server.pid
> export SERVER_JVMFLAGS=-Xmx1024m
> export JAVA=$JAVA_HOME/bin/java
> export CLASSPATH=$CLASSPATH:/usr/share/zookeeper/*
> ```
> 
> After EU
> ```
> export JAVA_HOME=/usr/jdk64/java-1.8.0-openjdk-1.8.0.77-0.b03.el7_2.x86_64
> export ZOOKEEPER_HOME=/usr/hdp/current/zookeeper-client
> export ZOO_LOG_DIR=/var/log/zookeeper
> export ZOOPIDFILE=/var/run/zookeeper/zookeeper_server.pid
> export SERVER_JVMFLAGS=-Xmx1024
> export JAVA=$JAVA_HOME/bin/java
> ```
> 
> Note missing "m" in memory setting.
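
For illustration, here is a rough sketch of why the missing suffix matters. It assumes the JVM reads a bare `-Xmx` number as bytes and accepts k/m/g suffixes as powers of 1024. The helper name `xmx_to_bytes` is hypothetical and is not part of Ambari or the JVM.

```python
# Hypothetical helper (not Ambari or JVM code): convert an -Xmx style size
# string to bytes, assuming a bare number is interpreted as bytes.
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def xmx_to_bytes(value):
    """Return the size in bytes for strings such as "1024", "1024m" or "2g"."""
    value = value.strip().lower()
    if value and value[-1] in _UNITS:
        # Suffix present: scale the numeric part by the unit.
        return int(value[:-1]) * _UNITS[value[-1]]
    # No suffix: the number is taken as a byte count.
    return int(value)


print(xmx_to_bytes("1024"))   # 1024 (bytes, far too small for a heap)
print(xmx_to_bytes("1024m"))  # 1073741824 (1 GiB)
```

Under that assumption, `-Xmx1024` requests a one-kilobyte heap, which is consistent with the "Too small initial heap" error above.
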
> 
> zookeeper-env template contains,
> ```
> export SERVER_JVMFLAGS={{zk_server_heapsize}}
> ```
> 
> In this cluster, zookeeper-env contains,
> zk_server_heapsize: "1024"
> 
> The params_linux.py file, meanwhile, is inconsistent about appending the 
> letter "m": only the fallback default carries the suffix, while a configured 
> value is used as-is.
> ```
> zk_server_heapsize_value = str(default('configurations/zookeeper-env/zk_server_heapsize', "1024m"))
> zk_server_heapsize = format("-Xmx{zk_server_heapsize_value}")
> ```
> 
> Instead, it should be,
> ```
> zk_server_heapsize_value = str(default('configurations/zookeeper-env/zk_server_heapsize', "1024"))
> zk_server_heapsize_value = zk_server_heapsize_value.strip()
> # Append "m" only when the value is a bare number (ends in a digit).
> if len(zk_server_heapsize_value) > 0 and zk_server_heapsize_value[-1].isdigit():
>   zk_server_heapsize_value = zk_server_heapsize_value + "m"
> zk_server_heapsize = format("-Xmx{zk_server_heapsize_value}")
> ```
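
The suffix handling can also be tried outside Ambari with a standalone sketch like the one below. It assumes the intent is to append "m" only when the value ends in a digit. The function name `normalize_heapsize` and the fallback to the default for an empty value are illustrative choices, not taken from the patch.

```python
# Standalone sketch, independent of Ambari's default() and format() helpers.
def normalize_heapsize(raw, default_value="1024"):
    """Build a -Xmx flag, appending "m" when the value is a bare number."""
    value = str(raw if raw is not None else default_value).strip()
    if not value:
        # Assumption: treat an empty setting like a missing one.
        value = default_value
    if value[-1].isdigit():
        # Bare number such as "1024": interpret it as megabytes.
        value += "m"
    return "-Xmx" + value


print(normalize_heapsize("1024"))   # -Xmx1024m
print(normalize_heapsize("1024m"))  # -Xmx1024m
print(normalize_heapsize("2g"))     # -Xmx2g
```

Values that already carry a unit pass through unchanged, so both the raw default "1024" and an upgraded "1024m" yield the same flag.
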
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5/package/scripts/params_linux.py 0780d2e 
> 
> 
> Diff: https://reviews.apache.org/r/60986/diff/2/
> 
> 
> Testing
> -------
> 
> Python unit tests passed,
> 
> ----------------------------------------------------------------------
> Total run:1161
> Total errors:0
> Total failures:0
> OK
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>
