[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2018-01-17 Thread Romain Kubany (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328610#comment-16328610
 ] 

Romain Kubany commented on CLOUDSTACK-7857:
---

The issue still exists on CloudStack 4.10. How can we help?

> CitrixResourceBase wrongly calculates total memory on hosts with a lot of 
> memory and large Dom0
> ---
>
> Key: CLOUDSTACK-7857
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7857
> Project: CloudStack
>  Issue Type: Bug
>  Security Level: Public(Anyone can view this level - this is the 
> default.) 
>Affects Versions: 4.3.0, 4.4.0, 4.5.0, 4.3.1, 4.4.1, 4.6.0
>Reporter: Joris van Lieshout
>Priority: Major
>
> We have hosts with 256GB memory and a 4GB dom0. During startup ACS calculates 
> available memory using this formula:
> CitrixResourceBase.java
>   protected void fillHostInfo
>   ram = (long) ((ram - dom0Ram - _xs_memory_used) * _xs_virtualization_factor);
> In our situation:
>   ram = 274841497600
>   dom0Ram = 4269801472
>   _xs_memory_used = 128 * 1024 * 1024L = 134217728
>   _xs_virtualization_factor = 63.0/64.0 = 0.984375
>   (274841497600 - 4269801472 - 134217728) * 0.984375 = 266211892800
> This is in fact not the actual amount of memory available for instances. The 
> difference in our situation is a little less than 1GB. On this particular 
> hypervisor Dom0+Xen uses about 9GB.
> As the comment above the definition of XsMemoryUsed already states, it's time 
> to review this logic:
> "//Hypervisor specific params with generic value, may need to be overridden 
> for specific versions"
> The effect of this bug is that when you put a hypervisor in maintenance, it 
> might try to move instances (usually small instances (<1GB)) to a host that 
> in fact does not have enough free memory.
> This exception is thrown:
> ERROR [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-09aca6e9 
> work-8981) Terminating HAWork[8981-Migration-4482-Running-Migrating]
> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate due to 
> Catch Exception com.cloud.utils.exception.CloudRuntimeException: Migration 
> failed due to com.cloud.utils.exception.CloudRuntim
> eException: Unable to migrate VM(r-4482-VM) from 
> host(6805d06c-4d5b-4438-a245-7915e93041d9) due to Task failed! Task record:   
>   uuid: 645b63c8-1426-b412-7b6a-13d61ee7ab2e
>nameLabel: Async.VM.pool_migrate
>  nameDescription: 
>allowedOperations: []
>currentOperations: {}
>  created: Thu Nov 06 13:44:14 CET 2014
> finished: Thu Nov 06 13:44:14 CET 2014
>   status: failure
>   residentOn: com.xensource.xenapi.Host@b42882c6
> progress: 1.0
> type: 
>   result: 
>errorInfo: [HOST_NOT_ENOUGH_FREE_MEMORY, 272629760, 263131136]
>  otherConfig: {}
>subtaskOf: com.xensource.xenapi.Task@aaf13f6f
> subtasks: []
> at 
> com.cloud.vm.VirtualMachineManagerImpl.migrate(VirtualMachineManagerImpl.java:1840)
> at 
> com.cloud.vm.VirtualMachineManagerImpl.migrateAway(VirtualMachineManagerImpl.java:2214)
> at 
> com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:610)
> at 
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.runWithContext(HighAvailabilityManagerImpl.java:865)
> at 
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.access$000(HighAvailabilityManagerImpl.java:822)
> at 
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread$1.run(HighAvailabilityManagerImpl.java:834)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at 
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:831)
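
For reference, the quoted calculation can be reproduced in a few lines of standalone Java (an illustration, not CloudStack code); the ~9GB Dom0+Xen figure it compares against is the value observed in the report above:

====
// Standalone illustration of the fillHostInfo calculation, using the numbers
// reported above. Not CloudStack code; the ~9GB overhead is the reporter's observation.
public class XsMemoryCalcExample {
    public static void main(String[] args) {
        long ram = 274841497600L;                      // host total memory
        long dom0Ram = 4269801472L;                    // dom0 memory
        long xsMemoryUsed = 128 * 1024 * 1024L;        // _xs_memory_used
        double xsVirtualizationFactor = 63.0 / 64.0;   // _xs_virtualization_factor

        long advertised = (long) ((ram - dom0Ram - xsMemoryUsed) * xsVirtualizationFactor);
        long impliedOverhead = ram - advertised;                 // what ACS assumes Dom0+Xen need
        long observedOverhead = 9L * 1024 * 1024 * 1024;         // ~9GB actually used on this host

        System.out.println("Advertised to CloudStack: " + advertised);        // 266211892800
        System.out.println("Implied Dom0+Xen overhead: " + impliedOverhead);  // roughly 8GB
        System.out.println("Underestimated by: " + (observedOverhead - impliedOverhead) + " bytes");
    }
}
====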





[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2016-02-08 Thread Martin Emrich (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136932#comment-15136932
 ] 

Martin Emrich commented on CLOUDSTACK-7857:
---

CLOUDSTACK-3809 is a potential duplicate.
I just hit this again today on CloudStack 4.7.1.



[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2015-01-02 Thread Daan Hoogland (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262796#comment-14262796
 ] 

Daan Hoogland commented on CLOUDSTACK-7857:
---

Marking it as critical instead of blocker. Let's discuss on the list whether it needs to 
be a blocker after all.



[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2014-11-18 Thread Joris van Lieshout (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216355#comment-14216355
 ] 

Joris van Lieshout commented on CLOUDSTACK-7857:


I'm not too familiar with memory overhead on other hypervisors; you would think the 
formula would be somewhat the same. I understand that ACS has to be as flexible as 
possible, but what if the logic for calculating free memory is moved to the hypervisor 
plugin, so that the calculation can be hypervisor-specific while the outcome is 
consumed by the generic processes in the same way? I'm not a developer, so my apologies 
if my comment does not make any sense.
In the end any hypervisor should be able to provide some information about available 
memory, either by calculation or with a direct metric. Perhaps this will always be 
something hypervisor-specific...?



[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2014-11-17 Thread Joris van Lieshout (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214498#comment-14214498
 ] 

Joris van Lieshout commented on CLOUDSTACK-7857:


Hi Anthony,

I agree that there is no reliable way to do this beforehand, so isn't it better to do 
it whenever an instance is started on or migrated to a host, or to recalculate the 
free memory metric every couple of minutes (for instance as part of the stats 
collection cycle)? The formula that is used by XenCenter for this seems pretty easy 
and spot on.

This would also reduce the number of times a retry mechanism has to kick in for other 
actions as well. On that note, the retry mechanism you are referring to does not seem 
to apply to HA workers created by the process that puts a host into maintenance. It 
also feels to me that this is more of a workaround than a clean solution, mostly 
because host free memory can be recalculated quickly and easily when needed.

And concerning the allocation threshold: if I'm not mistaken, this does not apply to 
HA workers, which are used whenever you put a host into maintenance. Additionally, the 
instance being migrated is already in the cluster, so this threshold is not hit during 
PrepareForMaintenance.


[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2014-11-17 Thread Anthony Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214983#comment-14214983
 ] 

Anthony Xu commented on CLOUDSTACK-7857:


> The formula that is used by XenCenter for this seems pretty easy and spot on.

This is too hypervisor-specific; we don't want to couple CloudStack too tightly with 
the hypervisor. But if the hypervisor provides the memory overhead through its API, we 
can use it.

> recalculate the free memory metric every couple of minutes (for instance as part 
> of the stats collection cycle)?

We have been discussing this for a while. I like the idea, but it is a big change:
1. Right now, memory capacity is based on the memory size in the service offering, not 
real memory. If we use a real memory metric, we need to surface it somewhere; the UI 
needs to show both allocated memory and real used memory.
2. The VM deployment planner needs to consider both.
3. We need to decide how to handle memory thin provisioning.
4. Other hypervisors may not be able to provide an accurate memory metric. On KVM, for 
example, memory used by the host OS page cache can still be reclaimed for VM 
deployment, but the free memory reported by the host OS doesn't include that cache 
(see the sketch below).
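
To illustrate point 4: on a Linux/KVM host, MemFree in /proc/meminfo excludes reclaimable page cache, while MemAvailable estimates what could actually be handed to new workloads. A minimal Java sketch (illustration only, not CloudStack code) that reads both values:

====
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Illustration only: compares MemFree (excludes reclaimable cache) with
// MemAvailable (the kernel's estimate of memory usable by new workloads).
public class MeminfoCheck {
    public static void main(String[] args) throws IOException {
        long memFreeKb = 0, memAvailableKb = 0;
        for (String line : Files.readAllLines(Paths.get("/proc/meminfo"))) {
            String[] parts = line.split("\\s+");
            if (line.startsWith("MemFree:"))      memFreeKb = Long.parseLong(parts[1]);
            if (line.startsWith("MemAvailable:")) memAvailableKb = Long.parseLong(parts[1]);
        }
        System.out.println("MemFree:      " + memFreeKb + " kB");
        System.out.println("MemAvailable: " + memAvailableKb + " kB");
        System.out.println("Reclaimable gap (mostly cache): " + (memAvailableKb - memFreeKb) + " kB");
    }
}
====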


I think we can start with XS. Since it is a big change, it is better to consider it as 
a new feature: use both allocated and real memory in host capacity.

Anthony 








[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2014-11-14 Thread Anthony Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213016#comment-14213016
 ] 

Anthony Xu commented on CLOUDSTACK-7857:


_xs_memory_used is used as the memory virtualization overhead on this XS host, but the 
memory overhead varies a lot depending on the total host free memory, VM density, VM 
memory size, VM guest OS type, and so on.

To me, there seems to be no way to know the precise memory virtualization overhead 
before you actually use the host to run VMs.


CloudStack provides two ways to mitigate this:

1. Retry mechanism:
   CloudStack uses retries in many places, such as deployVM, startVM, and migrateVM.
2. Threshold:
   cluster.memory.allocated.capacity.disablethreshold
   You can use this per-cluster configuration to limit how much of the memory 
   CloudStack will allocate (a sketch follows below).
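
For illustration, a hypothetical sketch of how such a disable threshold gates new allocations (the method name, the 85% threshold and the allocation numbers below are made-up examples, not CloudStack's actual CapacityManager logic):

====
// Hypothetical illustration of threshold-based capacity gating, in the spirit of
// cluster.memory.allocated.capacity.disablethreshold. Not CloudStack's actual code;
// the threshold value and allocation numbers are made up for the example.
public class MemoryThresholdExample {

    // True if allocating requestedBytes would push allocated memory past the threshold.
    static boolean exceedsDisableThreshold(long allocatedBytes, long requestedBytes,
                                           long totalMemoryBytes, double disableThreshold) {
        double projected = (double) (allocatedBytes + requestedBytes) / totalMemoryBytes;
        return projected > disableThreshold;
    }

    public static void main(String[] args) {
        long total = 266211892800L;                  // memory advertised by the host (from the report)
        long allocated = 230L * 1024 * 1024 * 1024;  // example: 230GB already allocated
        long requested = 1L * 1024 * 1024 * 1024;    // a 1GB instance
        double threshold = 0.85;                     // example threshold

        System.out.println("Further allocation disabled: "
                + exceedsDisableThreshold(allocated, requested, total, threshold));  // true
    }
}
====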

If you have other thoughts on this, please share them with us.

Anthony











[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2014-11-13 Thread Joris van Lieshout (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209545#comment-14209545
 ] 

Joris van Lieshout commented on CLOUDSTACK-7857:


Hi Rohit, I did some digging around in the XenCenter code and found a possible solution 
there. But there is a challenge, I think: the overhead is dynamic, based on the 
instances running on the host, and, at the moment, ACS calculates this overhead only at 
host thread startup.

This is what I found in the XenCenter code:
https://github.com/xenserver/xenadmin/blob/a0d31920c5ac62eda9713228043a834ba7829986/XenModel/XenAPI-Extensions/Host.cs#L1071
==
public long xen_memory_calc
{
    get
    {
        if (!Helpers.MidnightRideOrGreater(Connection))
        {
            Host_metrics host_metrics = Connection.Resolve(this.metrics);
            if (host_metrics == null)
                return 0;
            long totalused = 0;
            foreach (VM vm in Connection.ResolveAll(resident_VMs))
            {
                VM_metrics vmMetrics = vm.Connection.Resolve(vm.metrics);
                if (vmMetrics != null)
                    totalused += vmMetrics.memory_actual;
            }
            return host_metrics.memory_total - totalused - host_metrics.memory_free;
        }
        long xen_mem = memory_overhead;
        foreach (VM vm in Connection.ResolveAll(resident_VMs))
        {
            xen_mem += vm.memory_overhead;
            if (vm.is_control_domain)
            {
                VM_metrics vmMetrics = vm.Connection.Resolve(vm.metrics);
                if (vmMetrics != null)
                    xen_mem += vmMetrics.memory_actual;
            }
        }
        return xen_mem;
    }
}
==
We can skip the first part because, if I'm not mistaken, ACS only supports XS 5.6 and 
up (XS 5.6 = MidnightRide).
In short, the formula is something like this: xen_mem = host_memory_overhead + 
residentVMs_memory_overhead + dom0_memory_actual

Here is a list of xe commands that will get you the correct numbers to add up:
host_mem_overhead:
  xe host-list name-label=$HOSTNAME params=memory-overhead --minimal
residentVMs_memory_overhead:
  xe vm-list resident-on=$(xe host-list name-label=$HOSTNAME --minimal) params=memory-overhead --minimal
dom0_memory_actual:
  xe vm-list resident-on=$(xe host-list name-label=$HOSTNAME --minimal) is-control-domain=true params=memory-actual --minimal
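
The same three numbers could be summed in Java against the XenAPI bindings CloudStack already uses (com.xensource.xenapi). The sketch below is an untested assumption about how that might look, mirroring the fields queried by the xe commands above; it is not a proposed patch:

====
import com.xensource.xenapi.Connection;
import com.xensource.xenapi.Host;
import com.xensource.xenapi.VM;
import com.xensource.xenapi.VMMetrics;

// Untested sketch: xen_mem = host memory-overhead + sum of resident VM memory-overheads
//                            + dom0 memory-actual, via the XenAPI Java bindings.
// Method names mirror the XenAPI fields used by the xe commands above (an assumption,
// not verified against a particular binding version).
public class XenMemoryOverheadSketch {
    static long xenMemoryCalc(Connection conn, Host host) throws Exception {
        long xenMem = host.getMemoryOverhead(conn);           // host memory-overhead
        for (VM vm : host.getResidentVMs(conn)) {
            xenMem += vm.getMemoryOverhead(conn);             // per-VM memory-overhead
            if (vm.getIsControlDomain(conn)) {
                VMMetrics metrics = vm.getMetrics(conn);
                xenMem += metrics.getMemoryActual(conn);      // dom0 memory-actual
            }
        }
        return xenMem;                                        // memory unavailable to instances
    }
}
====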


[jira] [Commented] (CLOUDSTACK-7857) CitrixResourceBase wrongly calculates total memory on hosts with a lot of memory and large Dom0

2014-11-12 Thread Rohit Yadav (JIRA)

[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208649#comment-14208649
 ] 

Rohit Yadav commented on CLOUDSTACK-7857:
-

Do you propose we reduce the value by 1024 MB, so that the corner case of a 1GB VM 
migration won't fail, or is there a better way to calculate the RAM value?
