Hi, Andrija: Thanks for looking into it.
In my case, there are no log entries for the "Failed to unmount old iso" message. The loopback devices are not mounted at all, at least not when I was checking. BTW, what's the mount point for the loopback device for systemvm.iso? I can't tell whether the loopback devices were ever mounted, failed to mount, or something else.

The systemvm.iso has 644 permissions on the file system.

I think the problem is that after some sort of failure, the loopback device should have been deleted rather than left behind. Of course, I still need to figure out what caused the failure in the first place.

Yiping

On 4/24/19, 10:47 AM, "Andrija Panic" <andrija.pa...@shapeblue.com> wrote:

Hi Yiping,

Based on https://github.com/apache/cloudstack/blob/4.11.2.0/scripts/vm/systemvm/injectkeys.sh, I would check what keeps these loop devices mounted, i.e. whether you can unmount them manually. Based on the code above, the unmount is run, but it might fail in your environment for various reasons.

Also check the systemvm.iso permissions. Grep the logs for "Failed to unmount old iso" lines...

Best,
Andrija

andrija.pa...@shapeblue.com
www.shapeblue.com
Amadeus House, Floral Street, London WC2E 9DP UK
@shapeblue

-----Original Message-----
From: Yiping Zhang <yipzh...@adobe.com.INVALID>
Sent: 24 April 2019 18:42
To: users@cloudstack.apache.org
Subject: ACS management server running out of loop back devices

Hi,

My lab ACS server (version 4.11.2.0) recently started dying a few hours after a restart, with the following error messages in the log:

2019-04-24 10:38:35,237 INFO [c.c.s.ConfigurationServerImpl] (main:null) Processing updateKeyPairs
2019-04-24 10:38:35,237 INFO [c.c.s.ConfigurationServerImpl] (main:null) Keypairs already in database, updating local copy
2019-04-24 10:38:35,241 INFO [c.c.s.ConfigurationServerImpl] (main:null) Going to update systemvm iso with generated keypairs if needed
2019-04-24 10:38:35,241 INFO [c.c.s.ConfigurationServerImpl] (main:null) Trying to inject public and private keys into systemvm iso
2019-04-24 10:38:35,288 INFO [c.c.s.ConfigurationServerImpl] (main:null) Injected public and private keys into systemvm iso with result : mount: could not find any free loop device
2019-04-24 10:38:35,288 WARN [c.c.s.ConfigurationServerImpl] (main:null) Failed to inject generated public key into systemvm iso mount: could not find any free loop device
2019-04-24 10:38:35,290 WARN [o.a.c.s.m.c.ResourceApplicationContext] (main:null) Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'cloudStackLifeCycle'; nested exception is com.cloud.utils.exception.CloudRuntimeException: Failed to inject generated public key into systemvm iso mount: could not find any free loop device
2019-04-24 10:38:35,291 WARN [o.e.j.w.WebAppContext] (main:null) Failed startup of context o.e.j.w.WebAppContext@78a2da20{/client,file:///usr/share/cloudstack-management/webapp/,UNAVAILABLE}{/usr/share/cloudstack-management/webapp}
org.springframework.context.ApplicationContextException: Failed to start bean 'cloudStackLifeCycle'; nested exception is com.cloud.utils.exception.CloudRuntimeException: Failed to inject generated public key into systemvm iso mount: could not find any free loop device

And sure enough, all /dev/loopX devices are in use:

# losetup -a
/dev/loop0: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop1: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop2: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop3: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop4: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop5: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop6: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop7: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
#

Recent changes in the lab include adding a VMware cluster, registering new systemvm templates for VMware, and creating our own template for VMware.

It looks like the updateKeyPairs process runs once an hour and fails to clean up the loopback device. So, in about 8 hours, the management server runs out of loopback devices and dies.

Any suggestions on how to troubleshoot this further?

Thanks
Yiping
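
One way to tell whether the loop devices are attached but not mounted is to compare the `losetup -a` listing with the contents of /proc/mounts. The Python below is only a minimal sketch of that comparison, not part of CloudStack. It assumes `losetup -a` prints lines in the format shown in this thread, and that a mounted loop device appears in /proc/mounts under its /dev/loopN name. The function names (`attached_loops`, `mounted_devices`, `stale_loops`, `stale_loops_on_this_host`) are hypothetical helpers introduced here for illustration.

```python
import re
import subprocess

# Matches the `losetup -a` line format quoted in the thread, e.g.
# /dev/loop0: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
# Assumption: other util-linux versions may print extra fields this does not handle.
LOSETUP_LINE = re.compile(r"^(/dev/loop\d+):\s+\[[^\]]*\]:\d+\s+\((.+)\)\s*$")


def attached_loops(losetup_output, backing_file):
    """Return the loop devices whose backing file is exactly backing_file."""
    devices = []
    for line in losetup_output.splitlines():
        match = LOSETUP_LINE.match(line.strip())
        if match and match.group(2) == backing_file:
            devices.append(match.group(1))
    return devices


def mounted_devices(proc_mounts_text):
    """Return the set of source devices (first field) from /proc/mounts text."""
    devices = set()
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if fields:
            devices.add(fields[0])
    return devices


def stale_loops(losetup_output, proc_mounts_text, backing_file):
    """Return loop devices attached to backing_file that are not mounted anywhere."""
    mounted = mounted_devices(proc_mounts_text)
    return [dev for dev in attached_loops(losetup_output, backing_file)
            if dev not in mounted]


def stale_loops_on_this_host(backing_file):
    """Gather live data from the local host; needs losetup installed (usually as root)."""
    losetup_output = subprocess.run(
        ["losetup", "-a"], capture_output=True, text=True, check=True
    ).stdout
    with open("/proc/mounts") as mounts:
        return stale_loops(losetup_output, mounts.read(), backing_file)
```

Calling `stale_loops_on_this_host("/usr/share/cloudstack-common/vms/systemvm.iso")` on the management server would list the leftover devices. After confirming nothing is using them, each one can be detached with `losetup -d /dev/loopN`. This only frees the devices; it does not explain why they were left behind.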