[Touch-packages] [Bug 1610499] Re: hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group
*** This bug is a duplicate of bug 1637026 ***
    https://bugs.launchpad.net/bugs/1637026

Hello,

I have the same problem with DB2. When I log out of the system as the DB instance owner, all processes are killed.
I guess it depends on the new systemd in 16.04 (see /var/log/syslog: "received SIGRTMIN+24 from PID (kill)").
Maybe some sysctl parameters such as kernel.shmall are involved too.
It would be nice if a maintainer finally fixed this problem.

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to procps in Ubuntu.
https://bugs.launchpad.net/bugs/1610499

Title:
  hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group

Status in procps package in Ubuntu:
  Confirmed

Bug description:
  When I run Hadoop on Ubuntu 16.04, the ssh session exits and all processes belonging to the hadoop user are killed. Through debugging I found that /bin/kill in Ubuntu 16.04 has a bug in killing a process group.

  Ubuntu version:
    Description: Ubuntu 16.04.1 LTS
    Release: 16.04

  (1) How to reproduce
  The bug is easy to reproduce. On Ubuntu 16.04, running

      /bin/kill -15 -12345

  or anything similar, such as "/bin/kill -15 -1", kills all processes.

  (2) Cause analysis
  The code of /bin/kill in Ubuntu 16.04 comes from procps-3.3.10. When I run "/bin/kill -15 -1", it actually sends signal 15 to PID -1, and -1 means that all processes are killed.

  (3) The bug in procps-3.3.10/skill.c
  I think the line "pid = (long)('0' - optopt)" is not right: it derives the PID from a single option character, so "-12345" is also turned into -1.

      static void __attribute__ ((__noreturn__)) kill_main(int argc, char **argv)
      {
          [...]
              case '?':
                  if (!isdigit(optopt)) {
                      xwarnx(_("invalid argument %c"), optopt);
                      kill_usage(stderr);
                  } else {
                      /* Special case for signal digit negative
                       * PIDs */
                      pid = (long)('0' - optopt);
                      if (kill((pid_t)pid, signo) != 0)
                          exitvalue = EXIT_FAILURE;
                      exit(exitvalue);
                  }
                  loop=0;
          [...]
      }

  (4) The cause
  Sometimes, when resources are tight or a Hadoop container loses its connection, the NodeManager kills that container by sending a signal to its JVM process. It is normal behavior for Hadoop to kill a task and then re-execute it. With this kill bug, however, all processes belonging to the hadoop user are killed.

  (5) Workaround
  I copied /bin/kill from Ubuntu 14.04 over /bin/kill in Ubuntu 16.04, and it works this way. I also think it would be better to ask the procps-3.3.10 maintainers to fix their bug, but I don't know how to contact them.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/procps/+bug/1610499/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
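The arithmetic in section (3) of the bug description above can be checked in isolation. The following is a minimal sketch, not code from procps: the names buggy_pid_from_optopt, kill_target and explain are made up for illustration, and it assumes, as the quoted snippet suggests, that getopt reports only the first character after the dash in optopt (for example '1' for "-12345").

```c
#include <stdio.h>

/* The expression quoted from skill.c, isolated for illustration.
 * It sees a single option character, never the full argument. */
long buggy_pid_from_optopt(int optopt_char)
{
    return (long)('0' - optopt_char);
}

/* How kill(2) interprets its pid argument. */
const char *kill_target(long pid)
{
    if (pid > 0)
        return "one process";
    if (pid == 0)
        return "the caller's process group";
    if (pid == -1)
        return "every process the caller may signal";
    return "one process group";
}

/* Hypothetical helper: show what the quoted code would do for an
 * argument such as "-12345", assuming optopt is the character
 * right after the dash. */
void explain(const char *arg)
{
    long pid = buggy_pid_from_optopt(arg[1]);
    printf("%s -> kill(%ld, sig): %s\n", arg, pid, kill_target(pid));
}
```

Under these assumptions, explain("-12345") and explain("-1") both report a call to kill(-1, sig), which targets every process the caller may signal, whereas the intended target of "-12345" is process group 12345.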
[Touch-packages] [Bug 1610499] Re: hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group
I still get this bug on 16.04 LTS after updating my system. I've looked at https://bugs.launchpad.net/ubuntu/+source/procps/+bug/1637026 and confirmed that I have procps - 2:3.3.10-4ubuntu2.2
[Touch-packages] [Bug 1610499] Re: hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group
** This bug has been marked a duplicate of bug 1637026
   kill incorrectly parses negative PIDs
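Bug 1637026 is titled "kill incorrectly parses negative PIDs". As a minimal sketch of what parsing the whole argument could look like, the hypothetical helper below converts a string such as "-12345" with strtol instead of looking at one character. This is an illustration only, not the patch applied in procps or in Ubuntu; the name parse_pid_arg and its return convention are assumptions.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not from procps: parse a complete PID argument
 * such as "-12345". Returns 0 and stores the value in *out on success,
 * or -1 if the argument is empty, out of range, or has trailing junk. */
int parse_pid_arg(const char *arg, long *out)
{
    char *end = NULL;
    long value;

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    value = strtol(arg, &end, 10);
    if (errno == ERANGE || end == arg || *end != '\0')
        return -1;

    *out = value;
    return 0;
}
```

With this helper, "-12345" parses to -12345 (process group 12345 when passed to kill(2)) and "-1" still parses to -1, so the two arguments are no longer confused.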
[Touch-packages] [Bug 1610499] Re: hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group
Wow, this has been driving me nuts. Also new to ubuntu; how would one obtain the 14.04 kill binary? Is there a way short of downloading the full distribution?
[Touch-packages] [Bug 1610499] Re: hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group
Hi shanmuga,

(1) Download the source code:

    sudo apt-get source procps

(2) Install the build dependencies:

    sudo apt-get build-dep procps

(3) Compile procps:

    cd procps-3.3.10
    sudo dpkg-buildpackage

Then you get a kill binary.
[Touch-packages] [Bug 1610499] Re: hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group
Hi @groden,

I am running Hadoop 2.7.3 in pseudo-distributed mode on Ubuntu 16.04 in a virtual machine, and I am facing the same issue: my Ubuntu session logs off whenever I submit a new Hadoop job. I would like to try your workaround. Can you provide a link or explain how to download and override the procps-3.3.10 source code? I am a beginner with Ubuntu. Please help!
[Touch-packages] [Bug 1610499] Re: hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group
** Package changed: alsa-driver (Ubuntu) => procps (Ubuntu)
[Touch-packages] [Bug 1610499] Re: hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group
** Summary changed:

- /bin/kill in ubuntu16.04 has bug in killing process group
+ hadoop crash: /bin/kill in ubuntu16.04 has bug in killing process group

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to alsa-driver in Ubuntu.
https://bugs.launchpad.net/bugs/1610499

Status in alsa-driver package in Ubuntu:
  Confirmed