[
https://issues.apache.org/jira/browse/AMBARI-19768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Lysnichenko updated AMBARI-19768:
----------------------------------------
Description:
{code} CMD = """ps xf | awk -v PID=""" + str(pid) + \
""" ' $1 == PID { P = $1; next } P && /_/ { P = P " " $1;""" + \
"""K=P } P && !/_/ { P="" } END { print "kill -""" \
+ str(signal) + """ "K }' | sh """{code}
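For readers less familiar with awk, the selection logic in the one-liner above can be restated in Python. This is a sketch of the behaviour only, not code from Ambari: the function name `awk_selected_pids` and the `SAMPLE` list are hypothetical, and `SAMPLE` simply copies the `ps xf` listing used in the example in this description.

```python
def awk_selected_pids(ps_lines, pid):
    """Behavioural port of the awk program embedded in CMD.

    Returns the string that the awk END block would print after "kill ".
    """
    p = ""  # awk variable P: PIDs collected so far ("" means not collecting)
    k = ""  # awk variable K: last non-empty value of P
    for line in ps_lines:
        fields = line.split()
        first = fields[0] if fields else ""
        if first == str(pid):        # $1 == PID { P = $1; next }
            p = first
            continue
        if p and "_" in line:        # P && /_/ { P = P " " $1; K = P }
            p = p + " " + first
            k = p
        if p and "_" not in line:    # P && !/_/ { P = "" }
            p = ""
    return k


# Hypothetical sample: the `ps xf` listing from the example in this issue.
SAMPLE = [
    r"3231 ? Ss 0:01 \_ sshd: root@pts/0",
    r"3233 pts/0 Ss 0:00 | \_ -bash",
    r"17984 pts/0 S+ 0:00 | \_ -bash",
    r"17985 pts/0 S+ 0:00 | \_ sleep 3141592",
    r"17986 pts/0 S+ 0:00 | \_ -bash",
    r"17987 pts/0 S+ 0:00 | \_ sleep 3141592",
    r"17988 pts/0 S+ 0:00 | \_ sleep 3141592",
    r"17738 ? Ss 0:00 \_ sshd: root@pts/1",
    r"17740 pts/1 Ss 0:00 \_ -bash",
    r"17989 pts/1 R+ 0:00 \_ ps xf",
]

print(awk_selected_pids(SAMPLE, 17987))  # 17987 17988 17738 17740 17989
```

The port shows the flaw: once the target PID is seen, every following line that contains "_" is collected, regardless of its depth in the tree, so siblings and unrelated sessions end up in the kill list.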
*Example*
{code}((sleep 3141592) & ((sleep 3141592) & (sleep 3141592)))
ps xf
3231 ? Ss 0:01 \_ sshd: root@pts/0
3233 pts/0 Ss 0:00 | \_ -bash
17984 pts/0 S+ 0:00 | \_ -bash
17985 pts/0 S+ 0:00 | \_ sleep 3141592
17986 pts/0 S+ 0:00 | \_ -bash
17987 pts/0 S+ 0:00 | \_ sleep 3141592
17988 pts/0 S+ 0:00 | \_ sleep 3141592
17738 ? Ss 0:00 \_ sshd: root@pts/1
17740 pts/1 Ss 0:00 \_ -bash
17989 pts/1 R+ 0:00 \_ ps xf
ps xf | awk -v PID=17987 ' $1 == PID { P = $1; next } P && /_/ { P = P " "
$1;K=P } P && !/_/ { P="" } END { print "kill "K }'
(PID=17987)
result : "kill 17987 17988 17738 17740 18083 18084"
right : "kill 17987"
(PID=17985)
result : "kill 17985 17986 17987 17988 17738 17740 18697 18698"
right : "kill 17985"
(PID=17986)
result : "kill 17986 17987 17988 17738 17740 18980 18981"
right : "kill 17986 17987 17988"
{code}
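One way to obtain the "right" lists above is to follow parent PIDs instead of the tree drawing. The following is a minimal sketch under that assumption, not the content of the attached patches. It expects input in the form produced by `ps -e -o pid,ppid`; the names `parse_pid_ppid`, `pids_to_kill` and `PS_OUTPUT` are hypothetical, and the PPID values in `PS_OUTPUT` are assumptions reconstructed from the expected results in the example (the PPID of 1 for the sshd rows is a placeholder).

```python
def parse_pid_ppid(ps_output):
    """Parse "PID PPID" rows; the header and malformed rows are skipped."""
    pairs = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].isdigit() and fields[1].isdigit():
            pairs.append((int(fields[0]), int(fields[1])))
    return pairs


def pids_to_kill(pairs, root_pid):
    """Return root_pid followed by all of its descendants, parents first."""
    children = {}
    for pid, ppid in pairs:
        children.setdefault(ppid, []).append(pid)
    result, seen, stack = [], set(), [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen:              # guard against malformed, cyclic input
            continue
        seen.add(pid)
        result.append(pid)
        # Reverse so that children are visited in listing order.
        stack.extend(reversed(children.get(pid, [])))
    return result


# Assumed PPIDs, reconstructed from the expected results in the example.
PS_OUTPUT = """\
  PID  PPID
 3231     1
 3233  3231
17984  3233
17985 17984
17986 17984
17987 17986
17988 17986
17738     1
17740 17738
17989 17740
"""

pairs = parse_pid_ppid(PS_OUTPUT)
print(pids_to_kill(pairs, 17986))  # [17986, 17987, 17988]
print(pids_to_kill(pairs, 17987))  # [17987]
```

Because the walk only follows PID-to-PPID links, processes that merely appear later in the listing (such as the second sshd session) are never selected.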
was:
*Steps*
# Deploy HDP-2.5.0.0 with Ambari 2.4.1.0
# Upgrade Ambari to 2.5.0.0-481 (I did not register the Falcon library, as the jar
was already present at /var/lib/ambari-server/resources/je-5.0.73.jar on the Ambari
server node)
# Register HDP-2.6.0.0-216
# Start package installation
*Result:*
Got the errors below:
{code}
2016-12-16 13:47:10,419|INFO|MainThread|machine.py:145 -
run()|CRITICAL:yum.main:
2016-12-16 13:47:10,419|INFO|MainThread|machine.py:145 - run()|
2016-12-16 13:47:10,419|INFO|MainThread|machine.py:145 - run()|Error: rpmdb
open failed
2016-12-16 13:47:10,420|INFO|MainThread|machine.py:145 - run()|Traceback (most
recent call last):
2016-12-16 13:47:10,420|INFO|MainThread|machine.py:145 - run()|File
"/var/lib/ambari-agent/cache/custom_actions/scripts/install_packages.py", line
166, in actionexecute
2016-12-16 13:47:10,420|INFO|MainThread|machine.py:145 - run()|ret_code =
self.install_packages(package_list)
2016-12-16 13:47:10,420|INFO|MainThread|machine.py:145 - run()|File
"/var/lib/ambari-agent/cache/custom_actions/scripts/install_packages.py", line
400, in install_packages
2016-12-16 13:47:10,420|INFO|MainThread|machine.py:145 - run()|if not
verifyDependencies():
2016-12-16 13:47:10,421|INFO|MainThread|machine.py:145 - run()|File
"/usr/lib/python2.6/site-packages/resource_management/libraries/functions/packages_analyzer.py",
line 311, in verifyDependencies
2016-12-16 13:47:10,421|INFO|MainThread|machine.py:145 - run()|code, out =
rmf_shell.checked_call(cmd, sudo=True)
2016-12-16 13:47:10,421|INFO|MainThread|machine.py:145 - run()|File
"/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72,
in inner
2016-12-16 13:47:10,421|INFO|MainThread|machine.py:145 - run()|result =
function(command, **kwargs)
2016-12-16 13:47:10,421|INFO|MainThread|machine.py:145 - run()|File
"/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102,
in checked_call
2016-12-16 13:47:10,422|INFO|MainThread|machine.py:145 - run()|tries=tries,
try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
2016-12-16 13:47:10,422|INFO|MainThread|machine.py:145 - run()|File
"/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150,
in _call_wrapper
2016-12-16 13:47:10,422|INFO|MainThread|machine.py:145 - run()|result =
_call(command, **kwargs_copy)
2016-12-16 13:47:10,422|INFO|MainThread|machine.py:145 - run()|File
"/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303,
in _call
2016-12-16 13:47:10,423|INFO|MainThread|machine.py:145 - run()|raise
ExecutionFailed(err_msg, code, out, err)
2016-12-16 13:47:10,423|INFO|MainThread|machine.py:145 - run()|ExecutionFailed:
Execution of '/usr/bin/yum -d 0 -e 0 check dependencies' returned 1. error:
rpmdb: BDB0113 Thread/process 16016/139791567193920 failed: BDB1507 Thread died
in Berkeley DB library
2016-12-16 13:47:10,424|INFO|MainThread|machine.py:145 - run()|error: db5
error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run
database recovery
2016-12-16 13:47:10,424|INFO|MainThread|machine.py:145 - run()|error: cannot
open Packages index using db5 - (-30973)
2016-12-16 13:47:10,424|INFO|MainThread|machine.py:145 - run()|error: cannot
open Packages database in /var/lib/rpm
2016-12-16 13:47:10,424|INFO|MainThread|machine.py:145 -
run()|CRITICAL:yum.main:
2016-12-16 13:47:10,424|INFO|MainThread|machine.py:145 - run()|
2016-12-16 13:47:10,425|INFO|MainThread|machine.py:145 - run()|Error: rpmdb
open failed
2016-12-16 13:47:10,425|INFO|MainThread|machine.py:145 - run()|Traceback (most
recent call last):
2016-12-16 13:47:10,425|INFO|MainThread|machine.py:145 - run()|File
"/var/lib/ambari-agent/cache/custom_actions/scripts/install_packages.py", line
469, in <module>
2016-12-16 13:47:10,425|INFO|MainThread|machine.py:145 -
run()|InstallPackages().execute()
2016-12-16 13:47:10,425|INFO|MainThread|machine.py:145 - run()|File
"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 287, in execute
2016-12-16 13:47:10,426|INFO|MainThread|machine.py:145 - run()|method(env)
2016-12-16 13:47:10,426|INFO|MainThread|machine.py:145 - run()|File
"/var/lib/ambari-agent/cache/custom_actions/scripts/install_packages.py", line
179, in actionexecute
2016-12-16 13:47:10,426|INFO|MainThread|machine.py:145 - run()|raise
Fail("Failed to distribute repositories/install packages")
{code}
> Broken kill_process_with_children shell single liner
> -----------------------------------------------------
>
> Key: AMBARI-19768
> URL: https://issues.apache.org/jira/browse/AMBARI-19768
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.5.0
> Reporter: Dmitry Lysnichenko
> Assignee: Dmitry Lysnichenko
> Priority: Critical
> Fix For: 2.5.0
>
> Attachments: AMBARI-19768.1.patch, AMBARI-19768.2.patch,
> AMBARI-19768.patch
>