Hi,

My lab ACS server (version 4.11.2.0) has recently started dying a few hours after each restart, with the following error messages in the log:

2019-04-24 10:38:35,237 INFO [c.c.s.ConfigurationServerImpl] (main:null) Processing updateKeyPairs
2019-04-24 10:38:35,237 INFO [c.c.s.ConfigurationServerImpl] (main:null) Keypairs already in database, updating local copy
2019-04-24 10:38:35,241 INFO [c.c.s.ConfigurationServerImpl] (main:null) Going to update systemvm iso with generated keypairs if needed
2019-04-24 10:38:35,241 INFO [c.c.s.ConfigurationServerImpl] (main:null) Trying to inject public and private keys into systemvm iso
2019-04-24 10:38:35,288 INFO [c.c.s.ConfigurationServerImpl] (main:null) Injected public and private keys into systemvm iso with result : mount: could not find any free loop device
2019-04-24 10:38:35,288 WARN [c.c.s.ConfigurationServerImpl] (main:null) Failed to inject generated public key into systemvm iso mount: could not find any free loop device
2019-04-24 10:38:35,290 WARN [o.a.c.s.m.c.ResourceApplicationContext] (main:null) Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'cloudStackLifeCycle'; nested exception is com.cloud.utils.exception.CloudRuntimeException: Failed to inject generated public key into systemvm iso mount: could not find any free loop device
2019-04-24 10:38:35,291 WARN [o.e.j.w.WebAppContext] (main:null) Failed startup of context o.e.j.w.WebAppContext@78a2da20{/client,file:///usr/share/cloudstack-management/webapp/,UNAVAILABLE}{/usr/share/cloudstack-management/webapp}
org.springframework.context.ApplicationContextException: Failed to start bean 'cloudStackLifeCycle'; nested exception is com.cloud.utils.exception.CloudRuntimeException: Failed to inject generated public key into systemvm iso mount: could not find any free loop device

And sure enough, all /dev/loopX devices are in use:

# losetup -a
/dev/loop0: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop1: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop2: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop3: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop4: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop5: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop6: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop7: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)

Recent changes in the lab include adding a VMware cluster, registering new systemvm templates for VMware, and creating our own template for VMware.

It looks like the updateKeyPairs process runs about once an hour and fails to release its loopback device each time. With all eight loop devices (/dev/loop0 through /dev/loop7) used up after about 8 hours, the management server can no longer mount the ISO and dies.

Any suggestions on how I can troubleshoot this further?

Thanks,
Yiping
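
P.S. A minimal sketch of how the leak could be watched between restarts, assuming `losetup -a` keeps printing the backing file in parentheses as in the output above. The function name `count_loops_for` is made up for illustration and is not part of CloudStack.

```shell
# Count loop devices backed by a given file.
# Reads `losetup -a` style output on stdin; $1 is the backing file path.
count_loops_for() {
    # -F: match the path literally, -c: print the number of matching lines.
    # grep exits non-zero when the count is 0, so mask that exit status.
    grep -cF "($1)" || true
}

# Example usage (run as root on the management server):
#   losetup -a | count_loops_for /usr/share/cloudstack-common/vms/systemvm.iso
```

If the count climbs by one roughly every hour, that would support the updateKeyPairs theory. Running `grep loop /proc/mounts` at the same time would show whether the ISO is still mounted somewhere or only the loop device was left attached.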