[
https://issues.apache.org/jira/browse/CLOUDSTACK-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214983#comment-14214983
]
Anthony Xu commented on CLOUDSTACK-7857:
----------------------------------------
> The formula that is used by XenCenter for this seems pretty easy and spot on.
This is too hypervisor-specific; we don't want to couple CloudStack too tightly
with the hypervisor. But if the hypervisor exposes its memory overhead through
an API, we can use it.
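For illustration, a minimal sketch of what consuming such an API could look
like with the com.xensource.xenapi bindings CloudStack already uses for
XenServer; Host.getMemoryOverhead and HostMetrics.getMemoryFree are assumed to
be available in the XenAPI version in use, and this is not existing CloudStack
code:

    import com.xensource.xenapi.Connection;
    import com.xensource.xenapi.Host;
    import com.xensource.xenapi.HostMetrics;

    public class HostMemoryProbe {
        // Capacity estimate built from values the hypervisor reports itself,
        // instead of the static dom0Ram / _xs_memory_used constants.
        public static long hostMemoryCapacity(Connection conn, Host host) throws Exception {
            HostMetrics metrics = host.getMetrics(conn);
            long total = metrics.getMemoryTotal(conn);     // physical RAM managed by Xen
            long overhead = host.getMemoryOverhead(conn);  // Xen's per-host overhead, per the API
            // dom0's own memory still has to be subtracted (e.g. from the
            // control domain's VM record); that part is omitted here.
            return total - overhead;
        }

        // Real-time free memory as the hypervisor sees it, which is what a
        // periodic stats-collection pass could record.
        public static long hostFreeMemory(Connection conn, Host host) throws Exception {
            return host.getMetrics(conn).getMemoryFree(conn);
        }
    }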
> recalculate the free memory metric every couple minutes (for instance as part
> of the stats collection cycle)?
We have been discussing this for a while. I like the idea, but it is a big change:
1. right now the memory capacity is based on the memory size in the service
offering, not real memory; if we switch to a real memory metric, it has to be
surfaced somewhere (e.g. the UI), which then needs to show both allocated
memory and real used memory.
2. the VM deployment planner needs to consider both.
3. we need to decide how to handle memory thin provisioning.
4. other hypervisors may not be able to provide an accurate memory metric; on
KVM, for example, memory used by the host OS page cache can still be reclaimed
for VM deployment, but the free memory reported by the host OS does not count
that cache as free (see the sketch right after this list).
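A minimal sketch of that KVM correction, assuming free memory is read from
/proc/meminfo on the host; the class and method names are illustrative only:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class KvmFreeMemory {
        // On a KVM host the kernel's MemFree excludes the page cache, which is
        // largely reclaimable, so "really available" memory is better
        // approximated by MemAvailable (or MemFree + Buffers + Cached on
        // kernels older than 3.14).
        public static long availableKiB() throws IOException {
            long memFree = 0, buffers = 0, cached = 0, memAvailable = -1;
            try (BufferedReader r = new BufferedReader(new FileReader("/proc/meminfo"))) {
                String line;
                while ((line = r.readLine()) != null) {
                    String[] parts = line.split("\\s+");
                    long kib = Long.parseLong(parts[1]);
                    if (line.startsWith("MemAvailable:")) memAvailable = kib;
                    else if (line.startsWith("MemFree:")) memFree = kib;
                    else if (line.startsWith("Buffers:")) buffers = kib;
                    else if (line.startsWith("Cached:")) cached = kib;
                }
            }
            // Prefer the kernel's own estimate when it is exposed.
            return memAvailable >= 0 ? memAvailable : memFree + buffers + cached;
        }
    }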
I think we can start with XS. Since this is a big change, it is better to treat
it as a new feature: "use both allocated and real memory in host capacity".
Anthony
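A rough sketch of the dual check such a feature implies for the deployment
planner; all names here, including the overprovisioning handling, are
illustrative assumptions rather than existing CloudStack code:

    public class DualMemoryCapacityCheck {
        // A host qualifies for a new VM only if there is room according to the
        // allocated (service-offering) accounting AND according to the
        // hypervisor-reported real free memory.
        public static boolean hostHasCapacity(long hostTotalRam,
                                              long allocatedToVms,   // sum of service-offering memory on the host
                                              long realFreeReported, // free memory reported by the hypervisor
                                              long requestedRam,
                                              double overprovisionFactor) {
            boolean allocatedOk =
                allocatedToVms + requestedRam <= (long) (hostTotalRam * overprovisionFactor);
            boolean realOk = requestedRam <= realFreeReported;
            return allocatedOk && realOk;
        }
    }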
> CitrixResourceBase wrongly calculates total memory on hosts with a lot of
> memory and large Dom0
> -----------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-7857
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7857
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Affects Versions: Future, 4.3.0, 4.4.0, 4.5.0, 4.3.1, 4.4.1, 4.6.0
> Reporter: Joris van Lieshout
> Priority: Blocker
>
> We have hosts with 256GB memory and 4GB dom0. During startup ACS calculates
> available memory using this formula:
> CitrixResourceBase.java
> protected void fillHostInfo
> ram = (long) ((ram - dom0Ram - _xs_memory_used) * _xs_virtualization_factor);
> In our situation:
> ram = 274841497600
> dom0Ram = 4269801472
> _xs_memory_used = 128 * 1024 * 1024L = 134217728
> _xs_virtualization_factor = 63.0/64.0 = 0.984375
> (274841497600 - 4269801472 - 134217728) * 0.984375 = 266211892800
> This is in fact not the actual amount of memory available for instances. The
> difference in our situation is a little less than 1GB. On this particular
> hypervisor Dom0+Xen uses about 9GB.
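As a standalone check of the arithmetic above (illustrative code only, not part
of CitrixResourceBase), using the numbers from this report:

    public class MemoryCalcCheck {
        public static void main(String[] args) {
            long ram = 274841497600L;                     // physical memory reported for the host
            long dom0Ram = 4269801472L;                   // dom0 memory as ACS sees it
            long xsMemoryUsed = 128 * 1024 * 1024L;       // static estimate of the Xen overhead
            double xsVirtualizationFactor = 63.0 / 64.0;  // 0.984375

            long acsEstimate = (long) ((ram - dom0Ram - xsMemoryUsed) * xsVirtualizationFactor);
            System.out.println(acsEstimate);              // 266211892800, as stated above

            // On this host dom0 + Xen actually consume about 9GB, so the real
            // capacity is roughly 1GB lower than what ACS calculated.
            long actualOverhead = 9L * 1024 * 1024 * 1024;             // ~9GB, per the report
            System.out.println(acsEstimate - (ram - actualOverhead));  // just under 1GB overestimate
        }
    }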
> As the comment above the definition of XsMemoryUsed already states, it's time
> to review this logic:
> "//Hypervisor specific params with generic value, may need to be overridden
> for specific versions"
> The effect of this bug is that when you put a hypervisor in maintenance it
> might try to move instances (usually small instances (<1GB)) to a host that
> in fact does not have enough free memory.
> This exception is thrown:
> ERROR [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-09aca6e9
> work-8981) Terminating HAWork[8981-Migration-4482-Running-Migrating]
> com.cloud.utils.exception.CloudRuntimeException: Unable to migrate due to
> Catch Exception com.cloud.utils.exception.CloudRuntimeException: Migration
> failed due to com.cloud.utils.exception.CloudRuntim
> eException: Unable to migrate VM(r-4482-VM) from
> host(6805d06c-4d5b-4438-a245-7915e93041d9) due to Task failed! Task record:
> uuid: 645b63c8-1426-b412-7b6a-13d61ee7ab2e
> nameLabel: Async.VM.pool_migrate
> nameDescription:
> allowedOperations: []
> currentOperations: {}
> created: Thu Nov 06 13:44:14 CET 2014
> finished: Thu Nov 06 13:44:14 CET 2014
> status: failure
> residentOn: com.xensource.xenapi.Host@b42882c6
> progress: 1.0
> type: <none/>
> result:
> errorInfo: [HOST_NOT_ENOUGH_FREE_MEMORY, 272629760, 263131136]
> otherConfig: {}
> subtaskOf: com.xensource.xenapi.Task@aaf13f6f
> subtasks: []
> at
> com.cloud.vm.VirtualMachineManagerImpl.migrate(VirtualMachineManagerImpl.java:1840)
> at
> com.cloud.vm.VirtualMachineManagerImpl.migrateAway(VirtualMachineManagerImpl.java:2214)
> at
> com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:610)
> at
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.runWithContext(HighAvailabilityManagerImpl.java:865)
> at
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.access$000(HighAvailabilityManagerImpl.java:822)
> at
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread$1.run(HighAvailabilityManagerImpl.java:834)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:831)