This bug is believed to be fixed in cloud-init in version 19.2. If this
is still a problem for you, please make a comment and set the state back
to New

Thank you.

** Changed in: cloud-init
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1806701

Title:
  cloud-init may hang OS boot process due to grep for the entire ISO
  file when it is attached

Status in cloud-init:
  Fix Released

Bug description:
  We have found in our test for SLES15 with cloud-init installed, if we
  attach a ISO file with the VM before VM is boot, it often takes more
  than 10 minutes to start the SLES OS. Sometimes it failed to start the
  SLES OS at all.

  We've root caused it is due to the  "is_cdrom_ovf()" func of "tools
  /ds-identify".

  In this function, there is the following logic to detect if an ISO contains 
certain string:
  >    local idstr="http://schemas.dmtf.org/ovf/environment/1";
  >    grep --quiet --ignore-case "$idstr" "${PATH_ROOT}$dev"
  ref: https://git.launchpad.net/cloud-init/tree/tools/ds-identify

  It is trying to grep the who ISO file for a certain string, which causes 
intense IO pressure for the system.
  What is worse is that sometimes the ISO file is large (e.g. >5GB for 
installer DVD) and it is mounted over NFS. The "grep" process often consume 99% 
CPU and seems hang. Then the systemd starts more and more "grep" process which 
smoke the CPU and consumes all the IO bandwidth for the ISO file. Then the 
system may hang for a long time and sometimes failed to start.

  To fix this issue, I suggest that we should not grep for the entire
  ISO file. Rather then we should just check if the file/dir exists with
  os.path.exists().

  
  -------------------------debug log snip------------------------
  pek2-gosv-16-dhcp180:~ # ps -ef
  UID        PID  PPID  C STIME TTY          TIME CMD
  root         1     0  0 13:32 ?        00:00:04 /usr/lib/systemd/systemd 
--switched-root --system --deserialize 24
  …
  root       474     1  0 13:34 ?        00:00:00 /bin/sh 
/usr/lib/cloud-init/ds-identify
  root       482   474  2 13:34 ?        00:00:15 grep --quiet --ignore-case 
http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
  root      1020     1  0 13:35 ?        00:00:00 /bin/sh 
/usr/lib/cloud-init/ds-identify
  root      1039  1020  1 13:35 ?        00:00:07 grep --quiet --ignore-case 
http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
  polkitd   1049     1  0 13:37 ?        00:00:00 /usr/lib/polkit-1/polkitd 
--no-debug
  root      1051     1  0 13:37 ?        00:00:00 /usr/sbin/wickedd --systemd 
--foreground
  root      1052     1  0 13:37 ?        00:00:00 
/usr/lib/systemd/systemd-logind
  root      1054     1  0 13:37 ?        00:00:00 /usr/sbin/wickedd-nanny 
--systemd --foreground
  root      1073     1  0 13:37 ?        00:00:00 /usr/bin/vmtoolsd
  root      1097     1  0 13:37 ?        00:00:00 /bin/sh 
/usr/lib/cloud-init/ds-identify
  root      1110  1097  1 13:37 ?        00:00:04 grep --quiet --ignore-case 
http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
  root      1304     1  0 13:38 ?        00:00:00 /bin/sh 
/usr/lib/cloud-init/ds-identify
  root      1312  1304  1 13:38 ?        00:00:03 grep --quiet --ignore-case 
http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
  root      1537     1  0 13:40 ?        00:00:00 /usr/bin/plymouth --wait
  root      1613     1  0 13:40 ?        00:00:00 /bin/sh 
/usr/lib/cloud-init/ds-identify
  root      1645  1613  0 13:40 ?        00:00:02 grep --quiet --ignore-case 
http://schemas.dmtf.org/ovf/environment/1 /dev/sr1
  …

  
  Grep use nearly 100% cpu, system very slow.

  top - 13:46:37 up 26 min,  2 users,  load average: 14.14, 15.03, 10.57
  Tasks: 225 total,   6 running, 219 sleeping,   0 stopped,   0 zombie
  %Cpu(s): 40.1 us, 49.3 sy,  0.0 ni,  0.0 id,  1.4 wa,  0.0 hi,  9.1 si,  0.0 
st
  KiB Mem :  1000916 total,    64600 free,   355880 used,   580436 buff/cache
  KiB Swap:  1288168 total,  1285600 free,     2568 used.   492688 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  4427 root      20   0   40100   3940   3084 R 99.90 0.394   0:27.41 top
  1016 root      20   0  197796   4852   3400 R 99.90 0.485   1:26.44 vmtoolsd
  1723 root      20   0    7256   1860   1556 D 99.90 0.186   0:28.44 grep
    484 root      20   0    7256   1684   1396 D 99.90 0.168   1:51.22 grep
  1278 root      20   0    7256   1856   1556 D 99.90 0.185   0:38.44 grep
  1398 root      20   0    7256   1860   1556 R 99.90 0.186   0:28.53 grep
  1061 root      20   0    7256   1856   1556 D 99.90 0.185   0:56.62 grep
  -------------------------debug log snip------------------------

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1806701/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

Reply via email to