Public bug reported:

Hello,
after update a CentOS 7.3-VM on Azure to CentOS 7.4, you can not connet via ssh 
because cloud-init try to start the waagent and the boot process hang. So sshd 
is stopped.

We install a fresh CentOS 7.4 in the Azure cloud to provide a base image
template for our company and this will also happens in this VM.

#######
# yum info cloud-init:
Name        : cloud-init
Arch        : x86_64
Version     : 0.7.9
Release     : 9.el7.centos.2
Size        : 2.1 M
Repo        : installed
>From repo   : base

In CentOS 7.3 the cloud-init version is 0.7.5-10.el7.centos.1
waagent is Package-Version 2.2.14-1.el7 in both CentOS versions witch is 
internal updated to 2.2.17 from waagent it self.

#######
To debug the failure I had to install rlogin before update:

yum remove firewalld -y
yum install rsh-server -y
systemctl enable rsh.socket
systemctl enable rlogin.socket
systemctl enable rexec.socket
echo "root:123" | chpasswd
echo "+ root" > ~/.rlogin
cat << EOF >> /etc/securetty
rsh
rexec
rlogin
EOF

reboot

yum update -y
reboot

#######
to unblock the process I have connect via rlogin and kill the waagent start:

# ps -ef | grep "waagent\|cloud"
root       993     1  0 14:52 ?        00:00:02 /usr/bin/python 
/usr/bin/cloud-init init
root      1134   993  0 14:52 ?        00:00:00 /bin/systemctl start 
waagent.service
root      1337  1222  0 15:56 pts/2    00:00:00 grep --color=auto waagent\|cloud

# kill 1134

Then cloud-init do magic and on the next reboot sshd start without any
trouble.

#######
To fail the VM again you can clear the config and reboot:
yum remove cloud-init WALinuxAgent -y
rm -f /etc/waagent.con*
rm -fr /etc/cloud/
rm -fr /var/lib/cloud/
rm -fr /var/lib/waagent/
rm -fr /var/log/waagent.lo*
rm -fr /var/log/cloud-init*
yum install cloud-init WALinuxAgent -y

cp -a /etc/waagent.conf /etc/waagent.conf.rpmsave
sed -i -e "s/Provisioning.Enabled.*/Provisioning.Enabled=n/g" /etc/waagent.conf
sed -i -e "s/Provisioning.UseCloudInit.*/Provisioning.UseCloudInit=y/g" 
/etc/waagent.conf
sed -i -e "s/Logs.Verbose.*/Logs.Verbose=y/g" /etc/waagent.conf

cp -a /etc/cloud/cloud.cfg /etc/cloud/cloud.cfg.rpmsave
cat << EOF >> /etc/cloud/cloud.cfg

# From cloud-init docs
datasource:
  Azure:
    agent_command: [service, waagent, start]

debug:
  verbose: True

EOF

diff /etc/waagent.conf.rpmsave /etc/waagent.conf
diff /etc/cloud/cloud.cfg.rpmsave /etc/cloud/cloud.cfg

reboot

#######
I didn't know why the system hang.
Can you please review this.

** Affects: cloud-init
     Importance: Undecided
         Status: New

** Attachment added: "cloud-init_hangs.tar"
   
https://bugs.launchpad.net/bugs/1720160/+attachment/4958120/+files/cloud-init_hangs.tar

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1720160

Title:
  cloud-init wait for waagent on Azure CentOS 7.4 - no sshd start

Status in cloud-init:
  New

Bug description:
  Hello,
  after update a CentOS 7.3-VM on Azure to CentOS 7.4, you can not connet via 
ssh because cloud-init try to start the waagent and the boot process hang. So 
sshd is stopped.

  We install a fresh CentOS 7.4 in the Azure cloud to provide a base
  image template for our company and this will also happens in this VM.

  #######
  # yum info cloud-init:
  Name        : cloud-init
  Arch        : x86_64
  Version     : 0.7.9
  Release     : 9.el7.centos.2
  Size        : 2.1 M
  Repo        : installed
  From repo   : base

  In CentOS 7.3 the cloud-init version is 0.7.5-10.el7.centos.1
  waagent is Package-Version 2.2.14-1.el7 in both CentOS versions witch is 
internal updated to 2.2.17 from waagent it self.

  #######
  To debug the failure I had to install rlogin before update:

  yum remove firewalld -y
  yum install rsh-server -y
  systemctl enable rsh.socket
  systemctl enable rlogin.socket
  systemctl enable rexec.socket
  echo "root:123" | chpasswd
  echo "+ root" > ~/.rlogin
  cat << EOF >> /etc/securetty
  rsh
  rexec
  rlogin
  EOF

  reboot

  yum update -y
  reboot

  #######
  to unblock the process I have connect via rlogin and kill the waagent start:

  # ps -ef | grep "waagent\|cloud"
  root       993     1  0 14:52 ?        00:00:02 /usr/bin/python 
/usr/bin/cloud-init init
  root      1134   993  0 14:52 ?        00:00:00 /bin/systemctl start 
waagent.service
  root      1337  1222  0 15:56 pts/2    00:00:00 grep --color=auto 
waagent\|cloud

  # kill 1134

  Then cloud-init do magic and on the next reboot sshd start without any
  trouble.

  #######
  To fail the VM again you can clear the config and reboot:
  yum remove cloud-init WALinuxAgent -y
  rm -f /etc/waagent.con*
  rm -fr /etc/cloud/
  rm -fr /var/lib/cloud/
  rm -fr /var/lib/waagent/
  rm -fr /var/log/waagent.lo*
  rm -fr /var/log/cloud-init*
  yum install cloud-init WALinuxAgent -y

  cp -a /etc/waagent.conf /etc/waagent.conf.rpmsave
  sed -i -e "s/Provisioning.Enabled.*/Provisioning.Enabled=n/g" 
/etc/waagent.conf
  sed -i -e "s/Provisioning.UseCloudInit.*/Provisioning.UseCloudInit=y/g" 
/etc/waagent.conf
  sed -i -e "s/Logs.Verbose.*/Logs.Verbose=y/g" /etc/waagent.conf

  cp -a /etc/cloud/cloud.cfg /etc/cloud/cloud.cfg.rpmsave
  cat << EOF >> /etc/cloud/cloud.cfg

  # From cloud-init docs
  datasource:
    Azure:
      agent_command: [service, waagent, start]

  debug:
    verbose: True

  EOF

  diff /etc/waagent.conf.rpmsave /etc/waagent.conf
  diff /etc/cloud/cloud.cfg.rpmsave /etc/cloud/cloud.cfg

  reboot

  #######
  I didn't know why the system hang.
  Can you please review this.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1720160/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

Reply via email to