Re: ACS management server running out of loop back devices

Yiping Zhang Wed, 24 Apr 2019 12:07:08 -0700

Hi, Andrija:

Thanks for looking into it.


In my case, there are no "Failed to unmount old iso" entries in the log. 
The loopback devices are not mounted at all, at least when I was checking. BTW, 
what's the mount point for the loopback device for systemvm.iso? I can't tell 
if the loopback device is being mounted at all, failed to mount, or something 
else. The systemvm.iso has 644 permissions on the file system. I think the 
problem is that after some sort of failure, the loopback device should have 
been deleted rather than left behind. Of course, I still need to figure out 
what caused the failure in the first place.
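
One way to confirm that the loop devices are attached but not mounted is to compare the losetup listing against the mount table. The following is only a sketch: the function name stale_loops is made up for illustration, and it assumes that mounted loop devices appear by device name in the first column of /proc/mounts.

```shell
# Print attached loop devices that do not appear in the mount table.
# $1: file holding `losetup -a` output
# $2: mount table file (normally /proc/mounts)
# stale_loops is a hypothetical helper name, not part of CloudStack.
stale_loops() {
    # The device name is the first colon-separated field of each losetup line.
    cut -d: -f1 "$1" | while read -r dev; do
        # Skip blank lines in the listing.
        [ -n "$dev" ] || continue
        # A mounted device shows up as the first field of a mount-table line.
        if ! awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$2"; then
            echo "$dev"
        fi
    done
}

# Usage (as root):
#   losetup -a > /tmp/loops.txt
#   stale_loops /tmp/loops.txt /proc/mounts
# Each device printed can then be detached with: losetup -d /dev/loopN
```

Anything this prints is a leftover attachment rather than a stuck mount, which would match what I am seeing.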

Yiping

On 4/24/19, 10:47 AM, "Andrija Panic" <andrija.pa...@shapeblue.com> wrote:

    Hi Yiping,
    
    Based on 
https://github.com/apache/cloudstack/blob/4.11.2.0/scripts/vm/systemvm/injectkeys.sh
, I would check what keeps these loop devices mounted, i.e. whether you can 
unmount them manually.
    
    Based on the code above, unmount is run, but it might fail in your 
environment for various reasons. Also check the systemvm.iso permissions.
    
    Grep logs for " Failed to unmount old iso" lines...
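
A minimal sketch of that log check follows. The helper name count_unmount_failures is hypothetical, and the log path in the usage comment is an assumption about a default install, so adjust it to wherever the management server writes its log.

```shell
# Count log lines that report a failed unmount of the old ISO.
# $1: log file to search
# count_unmount_failures is a hypothetical helper name.
count_unmount_failures() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so the exit status is ignored and only the count is reported.
    grep -c -F 'Failed to unmount old iso' "$1" || true
}

# Usage (assumed default log location):
#   count_unmount_failures /var/log/cloudstack/management/management-server.log
```

A count of 0 suggests the unmount step is not the part that fails.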
    
    Best,
    Andrija
    
    andrija.pa...@shapeblue.com 
    
    www.shapeblue.com
    Amadeus House, Floral Street, London WC2E 9DP, UK
    @shapeblue
      
     
    
    
    -----Original Message-----
    From: Yiping Zhang <yipzh...@adobe.com.INVALID> 
    Sent: 24 April 2019 18:42
    To: users@cloudstack.apache.org
    Subject: ACS management server running out of loop back devices
    
    Hi,
    
    My lab ACS server (version 4.11.2.0) recently started dying a few hours 
after a restart, with the following error messages in the log:
    
    
    2019-04-24 10:38:35,237 INFO  [c.c.s.ConfigurationServerImpl] (main:null) 
Processing updateKeyPairs
    
    2019-04-24 10:38:35,237 INFO  [c.c.s.ConfigurationServerImpl] (main:null) 
Keypairs already in database, updating local copy
    
    2019-04-24 10:38:35,241 INFO  [c.c.s.ConfigurationServerImpl] (main:null) 
Going to update systemvm iso with generated keypairs if needed
    
    2019-04-24 10:38:35,241 INFO  [c.c.s.ConfigurationServerImpl] (main:null) 
Trying to inject public and private keys into systemvm iso
    
    2019-04-24 10:38:35,288 INFO  [c.c.s.ConfigurationServerImpl] (main:null) 
Injected public and private keys into systemvm iso with result : mount: could 
not find any free loop device
    
    2019-04-24 10:38:35,288 WARN  [c.c.s.ConfigurationServerImpl] (main:null) 
Failed to inject generated public key into systemvm iso mount: could not find 
any free loop device
    
    2019-04-24 10:38:35,290 WARN  [o.a.c.s.m.c.ResourceApplicationContext] 
(main:null) Exception encountered during context initialization - cancelling 
refresh attempt: org.springframework.context.ApplicationContextException: 
Failed to start bean 'cloudStackLifeCycle'; nested exception is 
com.cloud.utils.exception.CloudRuntimeException: Failed to inject generated 
public key into systemvm iso mount: could not find any free loop device
    
    2019-04-24 10:38:35,291 WARN  [o.e.j.w.WebAppContext] (main:null) Failed 
startup of context 
o.e.j.w.WebAppContext@78a2da20{/client,file:///usr/share/cloudstack-management/webapp/,UNAVAILABLE}{/usr/share/cloudstack-management/webapp}
    
    org.springframework.context.ApplicationContextException: Failed to start 
bean 'cloudStackLifeCycle'; nested exception is 
com.cloud.utils.exception.CloudRuntimeException: Failed to inject generated 
public key into systemvm iso mount: could not find any free loop device
    
    
    And sure enough, all /dev/loopX are in use:
    
    
    # losetup -a
    
    /dev/loop0: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
    
    /dev/loop1: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
    
    /dev/loop2: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
    
    /dev/loop3: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
    
    /dev/loop4: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
    
    /dev/loop5: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
    
    /dev/loop6: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
    
    /dev/loop7: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
    #
    
    Recent changes in the lab include adding a VMware cluster, registering new 
systemvm templates for VMware, and creating our own template for VMware.
    
    It looks like the updateKeyPairs process runs once an hour and fails 
to clean up the loopback device each time. So, in about 8 hours, the management 
server runs out of loopback devices and dies.
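
The 8-hour estimate can be checked with a rough sketch like the one below. It assumes 8 loop devices in total (loop0 through loop7 in the listing) and one device leaked per hourly run. Both function names are made up for illustration.

```shell
# Count loop devices attached to a given backing file.
# $1: file holding `losetup -a` output
# $2: backing file path, e.g. the systemvm.iso path from the listing
attached_count() {
    # losetup -a prints the backing file in parentheses at the end of each line.
    grep -c -F "($2)" "$1" || true
}

# Whole runs left before no free loop device remains.
# $1: total loop devices, $2: devices already attached,
# $3: devices leaked per run (must be a positive integer)
runs_until_exhausted() {
    echo $(( ($1 - $2) / $3 ))
}

# Usage:
#   losetup -a > /tmp/loops.txt
#   used=$(attached_count /tmp/loops.txt /usr/share/cloudstack-common/vms/systemvm.iso)
#   runs_until_exhausted 8 "$used" 1
```

With nothing attached after a restart, 8 devices and one leak per run give 8 runs, so roughly 8 hours at one run per hour.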
    
    Any suggestions on how to troubleshoot this further?
    
    Thanks
    
    Yiping
