Hi Luigi,

I've updated the ticket; I will be implementing this for the next release.
Regards,

-Tino

--
Constantino Vázquez Blanco, MSc
OpenNebula Major Contributor / Cloud Researcher
www.OpenNebula.org | @tinova79

On Wed, Feb 9, 2011 at 3:14 PM, Luigi Fortunati <[email protected]> wrote:
> Thanks Tino,
> That is probably more of a problem with libvirt, since the VMware IM
> driver uses it to access information about the hosts.
> To get information about the hosts, OpenNebula launches a virsh command
> and parses the output.
> The script that does this work is located in $ONE_LOCATION/lib/remotes/im,
> and the output of the virsh command is:
>
> oneadmin@custom2:~/lib/remotes/im$ virsh -c esx://custom6.sns.it/?no_verify=1 nodeinfo
> Enter username for custom6.sns.it [root]:
> Enter root's password for custom6.sns.it:
> CPU model:           AMD Opteron(tm) Processor 246
> CPU(s):              2
> CPU frequency:       1992 MHz
> CPU socket(s):       2
> Core(s) per socket:  1
> Thread(s) per core:  1
> NUMA cell(s):        2
> Memory size:         2096460 kB
>
> I always get the same output, no matter how many VMs are running on the
> cluster node.
> That is why OpenNebula returns output like this:
>
> oneadmin@custom2:~/var/96$ onehost show 1
> HOST 1 INFORMATION
> ID                    : 1
> NAME                  : custom6.sns.it
> CLUSTER               : default
> STATE                 : MONITORING
> IM_MAD                : im_vmware
> VM_MAD                : vmm_vmware
> TM_MAD                : tm_vmware
>
> HOST SHARES
> MAX MEM               : 2096460
> USED MEM (REAL)       : 0
> USED MEM (ALLOCATED)  : 0
> MAX CPU               : 200
> USED CPU (REAL)       : 0
> USED CPU (ALLOCATED)  : 0
> RUNNING VMS           : 1
>
> MONITORING INFORMATION
> CPUSPEED=1992
> HYPERVISOR=vmware
> TOTALCPU=200
> TOTALMEMORY=2096460
>
> OpenNebula polls the cluster nodes periodically and gets only the
> hypervisor type, CPU frequency, total CPU, and total memory size.
> The limitation here is caused by libvirt (virsh), which is unable to
> return more information about the actual usage of resources.
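For illustration, here is a minimal Ruby sketch of a probe along the lines
described above. This is hypothetical code, not the actual im_vmware driver:
it shows how the virsh nodeinfo fields map onto the KEY=VALUE attributes
under MONITORING INFORMATION, and why no usage figure can come out of them
(credential prompting is omitted):

    # Hypothetical IM-probe sketch (not the real im_vmware driver):
    # query an ESX host through virsh and emit OpenNebula-style attributes.
    uri    = 'esx://custom6.sns.it/?no_verify=1'
    output = `virsh -c '#{uri}' nodeinfo`

    info = {}
    output.each_line do |line|
      key, value = line.split(':', 2)
      info[key.strip] = value.strip if value
    end

    puts "HYPERVISOR=vmware"
    puts "CPUSPEED=#{info['CPU frequency'].to_i}"   # "1992 MHz"   -> 1992
    puts "TOTALCPU=#{info['CPU(s)'].to_i * 100}"    # 2 CPUs       -> 200
    puts "TOTALMEMORY=#{info['Memory size'].to_i}"  # "2096460 kB" -> 2096460
    # nodeinfo carries totals only: there is nothing here from which to
    # derive FREEMEMORY or USEDMEMORY, hence the zeroes in onehost show.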
> The integration of OpenNebula with Xen can rely on SSH access to the
> cluster nodes.
> The IM driver for Xen hypervisors launches xentop on every cluster node
> to get information about the VMs, and then parses the output.
> As an example, here is the output of the xentop and xm commands (some
> info is trimmed):
>
> custom9:/ # xentop -bi2
> NAME      STATE  CPU(sec) CPU(%) MEM(k)  MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k)
> Domain-0  -----r      102    0.0 1930260   93.7  no limit       n/a     2    0        0        0
> NAME      STATE  CPU(sec) CPU(%) MEM(k)  MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k)
> Domain-0  -----r      102    0.3 1930260   93.7  no limit       n/a     2    0        0        0
>
> custom9:/ # xm info
> host                 : custom9
> release              : 2.6.34.7-0.5-xen
> version              : #1 SMP 2010-10-25 08:40:12 +0200
> machine              : x86_64
> nr_cpus              : 2
> nr_nodes             : 2
> cores_per_socket     : 1
> threads_per_core     : 1
> cpu_mhz              : 1991
> [...]
> total_memory         : 2011
> free_memory          : 135
> free_cpus            : 0
> max_free_memory      : 1508
> max_para_memory      : 1504
> max_hvm_memory       : 1492
> [...]
>
> The script $ONE_LOCATION/lib/remotes/im/xen.d/xen.rb parses those two
> outputs and retrieves data about memory, CPU, and network usage.
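By contrast, the xentop side can be summarized in a few lines. Again, this
is an illustrative sketch rather than the actual xen.rb: it takes the second
xentop sample (the first one always reports 0.0% CPU) and totals per-domain
CPU and memory usage:

    # Hypothetical sketch of an xentop-based probe (not the real xen.rb):
    # parse the second iteration of `xentop -bi2` and sum usage per domain.
    lines  = `xentop -bi2`.lines.map(&:strip)
    header = lines.rindex { |l| l.start_with?('NAME') }  # 2nd sample header
    abort 'unexpected xentop output' unless header

    used_cpu = used_mem = 0.0
    lines[(header + 1)..-1].each do |l|
      fields = l.split
      next if fields.empty?
      used_cpu += fields[3].to_f  # CPU(%) column
      used_mem += fields[4].to_f  # MEM(k) column
    end

    puts "USEDCPU=#{used_cpu}"
    puts "USEDMEMORY=#{used_mem.to_i}"
    # Totals (TOTALMEMORY, FREEMEMORY, ...) would come from `xm info`
    # fields such as total_memory and free_memory, parsed the same way.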
> I think the VMware drivers are of limited use if they can't provide the
> degree of information that can be achieved with Xen hypervisors and
> OpenNebula; I have seen the effects of this issue in my tests.
>
> On Tue, Feb 8, 2011 at 6:34 PM, Tino Vazquez <[email protected]> wrote:
>>
>> Hi Luigi,
>>
>> There is a bug in the IM driver for VMware: it is not reporting the
>> free memory at all. I've opened a ticket to keep track of the issue
>> [1]; it will be solved in the next release.
>>
>> Regards,
>>
>> -Tino
>>
>> [1] http://dev.opennebula.org/issues/481
>>
>> --
>> Constantino Vázquez Blanco, MSc
>> OpenNebula Major Contributor / Cloud Researcher
>> www.OpenNebula.org | @tinova79
>>
>> On Tue, Feb 8, 2011 at 12:56 PM, Luigi Fortunati <[email protected]> wrote:
>> > OK, I ran some tests today.
>> > The hardware/software environment consists of 2 cluster nodes
>> > (ESXi 4.1), each with 2 GB of RAM and 2 AMD Opteron 246 processors
>> > (2 GHz), on trial licenses. The OpenNebula installation is
>> > self-contained.
>> > 800 MB of memory is used by the hypervisor itself (that info comes
>> > from vSphere Client), so only 1.2 GB is free, but OpenNebula seems
>> > unaware of that :-(
>> >
>> > oneadmin@custom2:/srv/cloud/templates/vm$ onehost list
>> >   ID NAME            CLUSTER  RVM  TCPU  FCPU  ACPU  TMEM  FMEM STAT
>> >    2 custom7.sns.it  default    0   200   200   200    2G    0K   on
>> >    1 custom6.sns.it  default    0   200   200   200    2G    0K   on
>> > oneadmin@custom2:/srv/cloud/templates/vm$ onehost show 1
>> > HOST 1 INFORMATION
>> > ID                    : 1
>> > NAME                  : custom6.sns.it
>> > CLUSTER               : default
>> > STATE                 : MONITORED
>> > IM_MAD                : im_vmware
>> > VM_MAD                : vmm_vmware
>> > TM_MAD                : tm_vmware
>> >
>> > HOST SHARES
>> > MAX MEM               : 2096460
>> > USED MEM (REAL)       : 0
>> > USED MEM (ALLOCATED)  : 0
>> > MAX CPU               : 200
>> > USED CPU (REAL)       : 0
>> > USED CPU (ALLOCATED)  : 0
>> >
>> > In each test I tried to start 3 VMs using a non-persistent image.
>> > The requirements of all three VMs cannot be satisfied by a single
>> > cluster node.
>> >
>> > FIRST TEST:
>> > The VM template for the first test is:
>> >
>> > NAME = "Debian Server"
>> > CPU = 1
>> > MEMORY = 1024
>> > OS = [ ARCH = "i686" ]
>> > DISK = [ IMAGE = "Debian Server" ]
>> >
>> > Only CPU and MEMORY info. Here is the result:
>> >
>> > oneadmin@custom2:/srv/cloud/templates/vm$ onevm list
>> >   ID USER     NAME     STAT CPU  MEM HOSTNAME        TIME
>> >   66 oneadmin Debian S pend   0   0K           00 00:07:47
>> >   67 oneadmin Debian S pend   0   0K           00 00:07:45
>> >   68 oneadmin Debian S pend   0   0K           00 00:07:18
>> >
>> > Forever in the "pending" state... the VMs don't get scheduled.
>> > oned.log doesn't report anything but informational resource-polling
>> > messages. sched.log repeats this sequence:
>> >
>> > Tue Feb 8 10:02:06 2011 [HOST][D]: Discovered Hosts (enabled): 1 2
>> > Tue Feb 8 10:02:06 2011 [VM][D]: Pending virtual machines : 66 67 68
>> > Tue Feb 8 10:02:06 2011 [RANK][W]: No rank defined for VM
>> > Tue Feb 8 10:02:06 2011 [RANK][W]: No rank defined for VM
>> > Tue Feb 8 10:02:06 2011 [RANK][W]: No rank defined for VM
>> > Tue Feb 8 10:02:06 2011 [SCHED][I]: Select hosts
>> >         PRI     HID
>> >         -------------------
>> > Virtual Machine: 66
>> > Virtual Machine: 67
>> > Virtual Machine: 68
>> >
>> > SECOND TEST:
>> > VM template:
>> >
>> > NAME = "Debian Server"
>> > VCPU = 1
>> > MEMORY = 1024
>> > OS = [ ARCH = "i686" ]
>> > DISK = [ IMAGE = "Debian Server" ]
>> >
>> > Only VCPU and MEMORY info. Results:
>> >
>> > oneadmin@custom2:/srv/cloud/templates/vm$ onevm list
>> >   ID USER     NAME     STAT CPU  MEM HOSTNAME        TIME
>> >   76 oneadmin Debian S runn   0   0K custom7.sns.it  00 00:07:40
>> >   77 oneadmin Debian S runn   0   0K custom6.sns.it  00 00:07:38
>> >   78 oneadmin Debian S runn   0   0K custom7.sns.it  00 00:05:58
>> >
>> > Everything seems fine, but it's not: as I said previously, each host
>> > has only 1.2 GB of memory free, so there should be no room for two
>> > VMs on the same host.
>> >
>> > oneadmin@custom2:/srv/cloud/templates/vm$ onehost list
>> >   ID NAME            CLUSTER  RVM  TCPU  FCPU  ACPU  TMEM  FMEM STAT
>> >    2 custom7.sns.it  default    2   200   200   200    2G    0K   on
>> >    1 custom6.sns.it  default    1   200   200   200    2G    0K   on
>> >
>> > Neither the hosts nor the VMs report any useful info on resource
>> > usage. Logging in to the console of each VM and running "free -m",
>> > I checked that every VM has 1 GB of total memory allocated. So I
>> > decided to exercise that GB of memory on both VMs at the same time
>> > with the "memtester" utility, which allocates a given amount of
>> > memory using malloc and tests it. The results reported memory
>> > access problems.
>> > I then went on to check whether OpenNebula and VMware ESXi fail to
>> > allocate VMs exceeding the resource capacity of the hosts, by
>> > starting two more VMs (requiring 1 VCPU and 1 GB of memory each).
>> > Results:
>> >
>> > oneadmin@custom2:~/var/79$ onevm list
>> >   ID USER     NAME     STAT CPU  MEM HOSTNAME        TIME
>> >   76 oneadmin Debian S runn   0   0K custom7.sns.it  00 00:54:47
>> >   77 oneadmin Debian S runn   0   0K custom6.sns.it  00 00:54:45
>> >   78 oneadmin Debian S runn   0   0K custom7.sns.it  00 00:53:05
>> >   79 oneadmin Debian S boot   0   0K custom7.sns.it  00 00:10:22
>> >   80 oneadmin Debian S boot   0   0K custom7.sns.it  00 00:09:47
>> >
>> > The new VMs are allocated on the custom7 machine (why???) but remain
>> > frozen in the "boot" state.
>> > That is a problem, because those two new VMs should not be allocated
>> > to any cluster node at all.
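The oversubscription above is exactly what correct monitoring data would
prevent. As a toy model -- emphatically not OpenNebula's actual scheduler
code -- here is the kind of capacity check a scheduler can only perform when
the probe reports a real free-memory figure:

    # Toy capacity check (illustrative only; not OpenNebula's scheduler).
    Host = Struct.new(:name, :free_mem_kb, :free_cpu)
    VM   = Struct.new(:mem_kb, :cpu)

    def fits?(host, vm)
      vm.mem_kb <= host.free_mem_kb && vm.cpu <= host.free_cpu
    end

    # What custom7 really has: ~1.2 GB free after the hypervisor's 800 MB.
    host = Host.new('custom7.sns.it', 1_200_000, 200)
    vm   = VM.new(1_048_576, 100)  # MEMORY=1024 (1 GB), CPU=1 -> 100 units

    puts fits?(host, vm)           # true:  the first 1 GB VM fits
    host.free_mem_kb -= vm.mem_kb
    puts fits?(host, vm)           # false: a second 1 GB VM must be refused
    # With FMEM stuck at 0K, no meaningful check is possible: depending on
    # the template, VMs either never get scheduled or pile onto a single
    # host until they freeze in the "boot" state.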
>> > THIRD TEST:
>> > Here I followed Ruben's suggestion... The VM template:
>> >
>> > oneadmin@custom2:/srv/cloud/templates/vm$ cat debian.vm
>> > NAME = "Debian Server"
>> > CPU = 1
>> > VCPU = 1
>> > MEMORY = 1024
>> > OS = [ ARCH = "i686" ]
>> > DISK = [ IMAGE = "Debian Server" ]
>> >
>> > Both CPU/VCPU and MEMORY info. Output with 3 VMs:
>> >
>> > oneadmin@custom2:~/var$ onevm list
>> >   ID USER     NAME     STAT CPU  MEM HOSTNAME        TIME
>> >   81 oneadmin Debian S pend   0   0K           00 00:02:32
>> >   82 oneadmin Debian S pend   0   0K           00 00:02:30
>> >   83 oneadmin Debian S pend   0   0K           00 00:02:29
>> >
>> > As in the FIRST TEST, the VMs don't get scheduled and remain in the
>> > "pending" state. sched.log repeats this message:
>> >
>> > Tue Feb 8 12:00:05 2011 [HOST][D]: Discovered Hosts (enabled): 1 2
>> > Tue Feb 8 12:00:05 2011 [VM][D]: Pending virtual machines : 81 82 83
>> > Tue Feb 8 12:00:05 2011 [RANK][W]: No rank defined for VM
>> > Tue Feb 8 12:00:05 2011 [RANK][W]: No rank defined for VM
>> > Tue Feb 8 12:00:05 2011 [RANK][W]: No rank defined for VM
>> > Tue Feb 8 12:00:05 2011 [SCHED][I]: Select hosts
>> >         PRI     HID
>> >         -------------------
>> > Virtual Machine: 81
>> > Virtual Machine: 82
>> > Virtual Machine: 83
>> >
>> > From this I concluded that I probably should not declare the number
>> > of physical CPUs in the VM template.
>> > One last test...
>> >
>> > FOURTH TEST:
>> > Here I disabled one host, custom6, and started 3 VMs.
>> > The VM template is the one that worked before:
>> >
>> > oneadmin@custom2:/srv/cloud/templates/vm$ cat debian.vm
>> > NAME = "Debian Server"
>> > VCPU = 1
>> > MEMORY = 1024
>> > OS = [ ARCH = "i686" ]
>> > DISK = [ IMAGE = "Debian Server" ]
>> >
>> > Output:
>> >
>> > oneadmin@custom2:~$ onehost list
>> >   ID NAME            CLUSTER  RVM  TCPU  FCPU  ACPU  TMEM  FMEM STAT
>> >    2 custom7.sns.it  default    3   200   200   200    2G    0K   on
>> >    1 custom6.sns.it  default    0   200   200   200    2G    0K  off
>> > oneadmin@custom2:~$ onevm list
>> >   ID USER     NAME     STAT CPU  MEM HOSTNAME        TIME
>> >   92 oneadmin Debian S runn   0   0K custom7.sns.it  00 00:12:53
>> >   93 oneadmin Debian S runn   0   0K custom7.sns.it  00 00:12:46
>> >   94 oneadmin Debian S runn   0   0K custom7.sns.it  00 00:12:46
>> >
>> > I verified that the VMs were up and running by logging in to the
>> > console of each of them through vSphere Client; they were all
>> > running, each reporting 1 GB of total memory. Since less than
>> > 1.2 GB of memory is effectively free on a cluster node before the
>> > VMs are instantiated, how can those VMs run consistently? And why
>> > does OpenNebula schedule those VMs on the same machine, exceeding
>> > even the host's resource capacity?
>> >
>> > On Fri, Feb 4, 2011 at 11:04 PM, Ruben S. Montero <[email protected]> wrote:
>> >>
>> >> Hi,
>> >>
>> >> You also have to add the CPU capacity for the VM (apart from the
>> >> number of virtual CPUs). The CPU value is used at the allocation
>> >> phase. However, you are specifying MEMORY, and that should be
>> >> included in the allocated memory (USED MEMORY in onehost show),
>> >> so I guess there must be some other problem with your template.
>> >>
>> >> Cheers
>> >>
>> >> Ruben
>> >>
>> >> On Fri, Feb 4, 2011 at 10:50 AM, Luigi Fortunati <[email protected]> wrote:
>> >>>
>> >>> I can post the VM template content on Monday. However, as far as
>> >>> I remember, the VM template was really simple:
>> >>>
>> >>> NAME = "Debian"
>> >>> VCPU = 2
>> >>> MEMORY = 1024
>> >>> DISK = [ IMAGE = "Debian5-i386" ]
>> >>> OS = [ ARCH = i686 ]
>> >>>
>> >>> The VMs can boot and run, and I can log on to the console of the
>> >>> newly created VMs through vSphere Client.
>> >>> I noticed that if you don't declare the number of VCPUs, the VM
>> >>> doesn't get scheduled on a cluster node. This option seems
>> >>> mandatory, but I didn't find any mention of it in the
>> >>> documentation.
>> >>> Another thing that seems mandatory is declaring the CPU
>> >>> architecture as i686; otherwise OpenNebula returns an error when
>> >>> writing the deployment.0 file.
>> >>>
>> >>> On Thu, Feb 3, 2011 at 5:42 PM, Ruben S. Montero <[email protected]> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I am not sure this is related to the VMware monitoring... Can
>> >>>> you send the VM templates?
>> >>>>
>> >>>> Thanks
>> >>>>
>> >>>> Ruben
>> >>>>
>> >>>> On Thu, Feb 3, 2011 at 5:10 PM, Luigi Fortunati <[email protected]> wrote:
>> >>>>>
>> >>>>> Hi,
>> >>>>> I noticed a serious problem with the use of VMware ESXi 4.1 and
>> >>>>> OpenNebula 2.0.1.
>> >>>>> I'm using the VMware driver add-on that can be found on the
>> >>>>> OpenNebula website (ver. 1.0) and libvirt (ver. 0.8.7).
>> >>>>> OpenNebula can't get information about the usage of resources
>> >>>>> on the cluster nodes.
>> >>>>> Running 2 VMs (each one requiring 2 VCPUs and 1 GB of memory)
>> >>>>> and executing some commands, I get this output:
>> >>>>>
>> >>>>> oneadmin@custom2:~/src$ onehost list
>> >>>>>   ID NAME            CLUSTER  RVM  TCPU  FCPU  ACPU  TMEM  FMEM STAT
>> >>>>>    2 custom7.sns.it  default    0   200   200   200    2G    0K  off
>> >>>>>    1 custom6.sns.it  default    2   200   200   200    2G    0K   on
>> >>>>> oneadmin@custom2:~/src$ onehost show 1
>> >>>>> HOST 1 INFORMATION
>> >>>>> ID                    : 1
>> >>>>> NAME                  : custom6.sns.it
>> >>>>> CLUSTER               : default
>> >>>>> STATE                 : MONITORED
>> >>>>> IM_MAD                : im_vmware
>> >>>>> VM_MAD                : vmm_vmware
>> >>>>> TM_MAD                : tm_vmware
>> >>>>>
>> >>>>> HOST SHARES
>> >>>>> MAX MEM               : 2096460
>> >>>>> USED MEM (REAL)       : 0
>> >>>>> USED MEM (ALLOCATED)  : 0
>> >>>>> MAX CPU               : 200
>> >>>>> USED CPU (REAL)       : 0
>> >>>>> USED CPU (ALLOCATED)  : 0
>> >>>>> RUNNING VMS           : 2
>> >>>>>
>> >>>>> MONITORING INFORMATION
>> >>>>> CPUSPEED=1992
>> >>>>> HYPERVISOR=vmware
>> >>>>> TOTALCPU=200
>> >>>>> TOTALMEMORY=2096460
>> >>>>>
>> >>>>> As you can see, OpenNebula is unable to get correct information
>> >>>>> about the usage of resources on the cluster nodes.
>> >>>>> As this information is used by the VM scheduler, OpenNebula is
>> >>>>> unable to schedule the VMs correctly.
>> >>>>> I tried to create several VMs, and all of them were placed on
>> >>>>> the same host, even though that host was unable to satisfy the
>> >>>>> resource requirements of all the VMs.
>> >>>>> I think this problem is strongly related to libvirt, as
>> >>>>> OpenNebula uses it to retrieve information about hosts and VMs.
>> >>>>> Do you get the same behavior? Do you know if there is a way to
>> >>>>> solve this big issue?
>> >>>>>
>> >>>>> --
>> >>>>> Luigi Fortunati
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Users mailing list
>> >>>>> [email protected]
>> >>>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>> >>>>
>> >>>> --
>> >>>> Dr. Ruben Santiago Montero
>> >>>> Associate Professor (Profesor Titular), Complutense University of Madrid
>> >>>>
>> >>>> URL: http://dsa-research.org/doku.php?id=people:ruben
>> >>>> Weblog: http://blog.dsa-research.org/?author=7
>> >>>
>> >>> --
>> >>> Luigi Fortunati
>> >>
>> >> --
>> >> Dr. Ruben Santiago Montero
>> >> Associate Professor (Profesor Titular), Complutense University of Madrid
>> >>
>> >> URL: http://dsa-research.org/doku.php?id=people:ruben
>> >> Weblog: http://blog.dsa-research.org/?author=7
>> >
>> > --
>> > Luigi Fortunati
>> >
>> > _______________________________________________
>> > Users mailing list
>> > [email protected]
>> > http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>
> --
> Luigi Fortunati

_______________________________________________
Users mailing list
[email protected]
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
