Hi David. We don't have a repro outside a few customer environments. If
you can tell us which logs and other information to collect when
reproducing, we can gather them in an environment where the problem
surfaces.
Thanks.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1788643

Title:
  zombies pile up, system becomes unresponsive

Status in systemd package in Ubuntu:
  New

Bug description:
  Description:    Ubuntu 16.04.5 LTS
  Release:        16.04

  systemd:
    Installed: 229-4ubuntu21.4
    Candidate: 229-4ubuntu21.4
    Version table:
   *** 229-4ubuntu21.4 500
          500 http://azure.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
          100 /var/lib/dpkg/status
       229-4ubuntu21.1 500
          500 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages
       229-4ubuntu4 500
          500 http://azure.archive.ubuntu.com/ubuntu xenial/main amd64 Packages

  This problem is in Azure. We are seeing these problems on different
  systems. Worker nodes (Ubuntu 16.04) in a hadoop cluster start piling
  up zombies and become unresponsive. The syslog and the kernel logs
  don't provide much information.

  The only errors we could correlate with these symptoms were in the
  audit logs: see the "Connection timed out" and "Cannot create
  session: Already running in a session" messages at the end of this
  description.

  Our first suspect was memory pressure on the machines. We added
  logging and settings to reboot on out of memory, but all of these
  turned out to be red herrings.

  Aug 18 19:11:08 wn2-d3ncsp su[112600]: Successful su for root by root
  Aug 18 19:11:08 wn2-d3ncsp su[112600]: + ??? root:root
  Aug 18 19:11:08 wn2-d3ncsp su[112600]: pam_unix(su:session): session opened for user root by (uid=0)
  Aug 18 19:11:08 wn2-d3ncsp systemd-logind[1486]: New session c8 of user root.
  Aug 18 19:11:26 wn2-d3ncsp sshd[112690]: Did not receive identification string from 10.84.93.35
  Aug 18 19:11:34 wn2-d3ncsp su[112600]: pam_systemd(su:session): Failed to create session: Connection timed out
  Aug 18 19:11:34 wn2-d3ncsp su[112600]: pam_unix(su:session): session closed for user root
  Aug 18 19:11:34 wn2-d3ncsp systemd-logind[1486]: Removed session c8.

   
  Aug 18 19:12:03 wn2-d3ncsp sudo: ehiadmin : TTY=pts/1 ; PWD=/home/ehiadmin ; USER=root ; COMMAND=/bin/su -
  Aug 18 19:12:03 wn2-d3ncsp sudo: pam_unix(sudo:session): session opened for user root by ehiadmin(uid=0)
  Aug 18 19:12:03 wn2-d3ncsp su[113085]: Successful su for root by root
  Aug 18 19:12:03 wn2-d3ncsp su[113085]: + /dev/pts/1 root:root
  Aug 18 19:12:03 wn2-d3ncsp su[113085]: pam_unix(su:session): session opened for user root by ehiadmin(uid=0)
  Aug 18 19:12:03 wn2-d3ncsp su[113085]: pam_systemd(su:session): Cannot create session: Already running in a session
  Aug 18 19:12:42 wn2-d3ncsp sshd[113274]: Did not receive identification string from 10.84.93.42
  Aug 18 19:13:37 wn2-d3ncsp su[113085]: pam_unix(su:session): session closed for user root
  Aug 18 19:13:37 wn2-d3ncsp sudo: pam_unix(sudo:session): session closed for user root
  Aug 18 19:13:37 wn2-d3ncsp sshd[112285]: pam_unix(sshd:session): session closed for user ehiadmin
  Aug 18 19:13:37 wn2-d3ncsp systemd-logind[1486]: Removed session 1291.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1788643/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
