[ https://issues.apache.org/jira/browse/AMBARI-16914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327943#comment-15327943 ]
Hudson commented on AMBARI-16914:
---------------------------------
FAILURE: Integrated in Ambari-trunk-Commit #5067 (See [https://builds.apache.org/job/Ambari-trunk-Commit/5067/])
AMBARI-16914. Ambari uses too small a window for region server shutdown (aonishuk: [http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=b220d26f7c158aa48338018ec281a3dab34929d5])
* ambari-server/src/test/python/stacks/2.0.6/configs/default.json
* ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/package/scripts/phoenix_service.py
* ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/configuration/hbase-env.xml
* ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/package/scripts/params_linux.py
* ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/scripts/params_linux.py
* ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_service.py
* ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/scripts/hbase_service.py
* ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/configuration/ams-hbase-env.xml
* ambari-server/src/test/python/stacks/2.0.6/configs/secured.json
* ambari-server/src/test/python/stacks/2.0.6/HBASE/test_phoenix_queryserver.py
> Ambari uses too small a window for region server shutdown
> ---------------------------------------------------------
>
> Key: AMBARI-16914
> URL: https://issues.apache.org/jira/browse/AMBARI-16914
> Project: Ambari
> Issue Type: Bug
> Components: ambari-web
> Affects Versions: 2.2.1
> Reporter: Shankar Venkataraman
> Attachments: AMBARI-16914.patch
>
>
> Ambari seems to issue a graceful shutdown to a region server but follows it
> with SIGKILL after only 30 seconds. On a fully loaded HBase system with about
> 200 regions per region server and an active transaction flow, a region server
> cannot stop within 30 seconds. This has caused many issues in production,
> including a memstore corruption. Why not use the shutdown script that comes
> with HBase?
> 2016-05-24 15:36:19,191 -
> Execute['/usr/hdp/current/hbase-regionserver/bin/hbase-daemon.sh --config
> /usr/hdp/current/hbase-regionserver/conf stop regionserver'] {'only_if':
> 'ambari-sudo.sh -H -E test -f /var/run/hbase/hbase-hbase-regionserver.pid &&
> ps -p `ambari-sudo.sh -H -E cat /var/run/hbase/hbase-hbase-regionserver.pid`
> >/dev/null 2>&1', 'on_timeout': '! ( ambari-sudo.sh -H -E test -f
> /var/run/hbase/hbase-hbase-regionserver.pid && ps -p `ambari-sudo.sh -H -E
> cat /var/run/hbase/hbase-hbase-regionserver.pid` >/dev/null 2>&1 ) ||
> ambari-sudo.sh -H -E kill -9 `ambari-sudo.sh -H -E cat
> /var/run/hbase/hbase-hbase-regionserver.pid`', 'timeout': 30, 'user': 'hbase'}
> 2016-05-24 15:36:50,982 - Executing '! ( ambari-sudo.sh -H -E test -f
> /var/run/hbase/hbase-hbase-regionserver.pid && ps -p `ambari-sudo.sh -H -E
> cat /var/run/hbase/hbase-hbase-regionserver.pid` >/dev/null 2>&1 ) ||
> ambari-sudo.sh -H -E kill -9 `ambari-sudo.sh -H -E cat
> /var/run/hbase/hbase-hbase-regionserver.pid`'. Reason: Execution of
> 'ambari-sudo.sh su hbase -l -s /bin/bash -c 'export
> PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/bin:/usr/bin:/var/lib/ambari-agent'"'"'
> ; /usr/hdp/current/hbase-regionserver/bin/hbase-daemon.sh --config
> /usr/hdp/current/hbase-regionserver/conf stop regionserver'' was killed due
> timeout after 30 seconds
> 2016-05-24 15:36:51,053 - File['/var/run/hbase/hbase-hbase-regionserver.pid']
> {'action': ['delete']}
> 2016-05-24 15:36:51,054 - Deleting
> File['/var/run/hbase/hbase-hbase-regionserver.pid'
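
The behaviour in the log above (a stop request, then kill -9 once a fixed 30-second timeout expires) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the code from the attached patch: stop_with_grace_period, FakeRegionServer, simulate and the graceful_timeout parameter are hypothetical names, and the real Ambari scripts run the stop command through their own Execute resource rather than polling like this.

```python
import time


def stop_with_grace_period(request_stop, is_running, force_kill,
                           graceful_timeout=30.0, poll_interval=1.0,
                           clock=time.monotonic, sleep=time.sleep):
    """Ask a process to stop; force-kill only after graceful_timeout seconds.

    Returns "graceful" if the process exited on its own, "forced" otherwise.
    All callables are injected so the logic can be exercised without a real
    process (hypothetical interface, for illustration only).
    """
    request_stop()
    deadline = clock() + graceful_timeout
    while is_running():
        if clock() >= deadline:
            force_kill()  # equivalent of the kill -9 seen in the log
            return "forced"
        sleep(poll_interval)
    return "graceful"


class FakeRegionServer:
    """Stand-in for a process that needs drain_seconds to shut down cleanly."""

    def __init__(self, drain_seconds):
        self.drain_seconds = drain_seconds
        self.now = 0.0                 # simulated clock, in seconds
        self.stop_requested_at = None
        self.killed = False

    def clock(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds            # advance simulated time, no real wait

    def request_stop(self):
        self.stop_requested_at = self.now

    def is_running(self):
        if self.killed:
            return False
        if self.stop_requested_at is None:
            return True
        return self.now - self.stop_requested_at < self.drain_seconds

    def force_kill(self):
        self.killed = True


def simulate(drain_seconds, graceful_timeout):
    """Run one simulated shutdown and report how it ended."""
    rs = FakeRegionServer(drain_seconds)
    return stop_with_grace_period(rs.request_stop, rs.is_running, rs.force_kill,
                                  graceful_timeout=graceful_timeout,
                                  clock=rs.clock, sleep=rs.sleep)
```

In this simulation, a server that needs 90 seconds to drain is force-killed under a 30-second window (simulate(90, 30)) but exits on its own under a 120-second window (simulate(90, 120)). The numbers 90 and 120 are made up for the example; an appropriate window for a real cluster depends on region count and load.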