Re: [Users] DL380 G5 - Fails to Activate
I have a couple of old DL380 G5s and I am putting them into their own cluster for testing various things out. The install of 3.1 from dreyou goes fine onto them, but when they try to activate I get the following:

Host xxx.xxx.net.uk moved to Non-Operational state as host does not meet the cluster's minimum CPU level. Missing CPU features : model_Conroe, nx

KVM appears to run just fine on these hosts, and their CPUs are Intel(R) Xeon(R) CPU 5140 @ 2.33GHz.

Is it possible to add these to a 3.1 cluster?

And now I have managed to find a similar post:

# vdsClient -s 0 getVdsCaps | grep -i flags
cpuFlags = fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,aperfmperf,pni,dtes64,monitor,ds_cpl,vmx,est,tm2,ssse3,cx16,xtpr,pdcm,dca,lahf_lm,dts,tpr_shadow

# virsh -r capabilities
<capabilities>
  <host>
    <uuid>134bd567-da9f-43f9-8a2b-c259ed34f938</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>kvm32</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='2' threads='1'/>
      <feature name='lahf_lm'/>
      <feature name='lm'/>
      <feature name='syscall'/>
      <feature name='dca'/>
      <feature name='pdcm'/>
      <feature name='xtpr'/>
      <feature name='cx16'/>
      <feature name='ssse3'/>
      <feature name='tm2'/>
      <feature name='est'/>
      <feature name='vmx'/>
      <feature name='ds_cpl'/>
      <feature name='monitor'/>
      <feature name='dtes64'/>
      <feature name='pbe'/>
      <feature name='tm'/>
      <feature name='ht'/>
      <feature name='ss'/>
      <feature name='acpi'/>
      <feature name='ds'/>
      <feature name='vme'/>
    </cpu>
    <power_management>
      <suspend_disk/>
    </power_management>
    <migration_features>
      <live/>
      <uri_transports>
        <uri_transport>tcp</uri_transport>
      </uri_transports>
    </migration_features>
    <topology>
      <cells num='1'>
        <cell id='0'>
          <cpus num='2'>
            <cpu id='0'/>
            <cpu id='1'/>
          </cpus>
        </cell>
      </cells>
    </topology>
  </host>
  <guest>
    <os_type>hvm</os_type>
    <arch name='i686'>
      <wordsize>32</wordsize>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <machine>rhel6.3.0</machine>
      <machine canonical='rhel6.3.0'>pc</machine>
      <machine>rhel6.2.0</machine>
      <machine>rhel6.1.0</machine>
      <machine>rhel6.0.0</machine>
      <machine>rhel5.5.0</machine>
      <machine>rhel5.4.4</machine>
      <machine>rhel5.4.0</machine>
      <domain type='qemu'>
      </domain>
      <domain type='kvm'>
        <emulator>/usr/libexec/qemu-kvm</emulator>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <pae/>
      <nonpae/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
    </features>
  </guest>
  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <wordsize>64</wordsize>
      <emulator>/usr/libexec/qemu-kvm</emulator>
      <machine>rhel6.3.0</machine>
      <machine canonical='rhel6.3.0'>pc</machine>
      <machine>rhel6.2.0</machine>
      <machine>rhel6.1.0</machine>
      <machine>rhel6.0.0</machine>
      <machine>rhel5.5.0</machine>
      <machine>rhel5.4.4</machine>
      <machine>rhel5.4.0</machine>
      <domain type='qemu'>
      </domain>
      <domain type='kvm'>
        <emulator>/usr/libexec/qemu-kvm</emulator>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
    </features>
  </guest>
</capabilities>

Hi - any clues here or am I out of luck with these hosts?

thanks

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
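The engine error and the cpuFlags line above can be compared mechanically. The following is a minimal sketch, not anything vdsm or the engine actually runs: missing_flags is a hypothetical helper, the flag string is copied from the vdsClient output in this message, and only the plain nx flag is checked (model_Conroe is left out of this sketch).

```python
# cpuFlags value copied from the `vdsClient -s 0 getVdsCaps` output in this thread.
CPU_FLAGS = (
    "fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,mtrr,pge,mca,cmov,pat,pse36,"
    "clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,lm,constant_tsc,"
    "arch_perfmon,pebs,bts,rep_good,aperfmperf,pni,dtes64,monitor,ds_cpl,vmx,"
    "est,tm2,ssse3,cx16,xtpr,pdcm,dca,lahf_lm,dts,tpr_shadow"
)


def missing_flags(cpu_flags, required):
    """Return the required flags that are absent from a comma-separated flag string.

    Hypothetical helper for illustration; the order of `required` is preserved.
    """
    present = set(flag.strip() for flag in cpu_flags.split(","))
    return [flag for flag in required if flag not in present]


if __name__ == "__main__":
    # "nx" is the flag named in the engine error; "vmx" is included for contrast.
    print(missing_flags(CPU_FLAGS, ["vmx", "nx"]))
```

With the flags reported by this host, vmx is present and nx is not, which matches the "Missing CPU features" message. An absent nx flag is often the result of a disabled no-execute option in the firmware rather than a hardware limit, though that is an assumption here and not something this message states.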
Re: [Users] DL380 G5 - Fails to Activate
2013/1/25 Tom Brown t...@ng23.net:

I have a couple of old DL380 G5's [...] Host xxx.xxx.net.uk moved to Non-Operational state as host does not meet the cluster's minimum CPU level. Missing CPU features : model_Conroe, nx [...] Hi - any clues here or am i out of luck with these hosts? thanks

Hi, how about the kvm-ok tool result? Is it responding:

INFO: /dev/kvm exists
KVM acceleration can be used

Alex
[Users] Glusterfs HA doubts
In oVirt 3.1 GlusterFS support was added. It is an easy way to replicate your virtual machine storage without too much hassle. There are two main howtos:

* http://www.middleswarth.net/content/installing-ovirt-31-and-glusterfs-using-either-nfs-or-posix-native-file-system-engine (Robert Middleswarth)
* http://blog.jebpages.com/archives/ovirt-3-1-glusterized/ (Jason Brooks)

1) What about performance?

I've done some tests with rsync backups that involve many small files (even using the suggested --inplace rsync switch). These backups were written to locally mounted glusterfs volumes. Instead of taking about 2 hours, the backups took about 15 hours. Is this maybe something that only happens with small files, while performance with big files is OK?

2) How to know the current status?

In DRBD you can check a proc file, if I remember well. I also remember that GlusterFS doesn't have an equivalent, and there's no evident way to know whether all the files are synced. If you have tried it, how do you know whether both sets of virtual disk images are synced?

3) Mount DNS resolution

If you check Jason Brooks' howto you will see that it uses a hostname to refer to the nfs mount. For HA you need your storage to be mounted, and if the server1 host is down it doesn't help that the nfs mount point associated with the storage is server1:/vms/ and not server2:/vms/. Checking Middleswarth's howto, I think he does the same thing.

Let me explain a bit more. My example setup has two host machines: one runs a set of virtual machines and the other doesn't run any. The virtual machines' storage is located on the glusterfs volume.

Say the first machine mounts the glusterfs volume as nfs (it's an example). If it uses its own hostname for the nfs mount, then if it goes down, the second host isn't going to be able to mount it when the virtual machines are restarted in HA mode. If it uses the second host's hostname for the nfs mount instead, then if the second host goes down the virtual machines cannot access their virtual disks.

A workaround I have thought of is to use /etc/hosts on both machines so that whatever.domain.com resolves on each host to that host's own IP.

I think glusterfs has a way of mounting its shares through -t glusterfs that can somehow ignore these hostname problems, but I haven't read much about it so I'm not too sure.

4) So my doubts basically are:

* Has anyone set up a two-host glusterfs HA oVirt cluster where storage is a replicated GlusterFS volume that is shared and stored by both hosts?
* Does HA work when one of the hosts goes down?
* Or does it complain about the hostname, as I suspect?
* Any tips to ensure the best performance?

Thank you.

--
Adrián Gibanel
I.T. Manager
+34 675 683 301
www.btactic.com

Before printing this message, think of the environment. The environment is everyone's concern.

NOTICE: The content of this message and its attachments is confidential. If you are not the intended recipient, please be aware that using, disclosing and/or copying it without the corresponding authorization is prohibited. If you have received this message in error, please notify the sender immediately and destroy the message.
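On the -t glusterfs idea in point 3: as far as I know, the native client uses the hostname in the mount command only to fetch the volume description and then talks to the bricks directly, so the named host matters mainly at mount time. A minimal sketch of building such a mount command follows. glusterfs_mount_argv is a hypothetical helper, server1, server2 and the vms volume come from the example above, /mnt/vms is a made-up mount point, and the backupvolfile-server option name is an assumption that depends on the GlusterFS version, so check mount.glusterfs on your hosts before relying on it.

```python
def glusterfs_mount_argv(primary, backup, volume, mount_point):
    """Build a native-client mount command that also names a fallback server.

    Hypothetical helper: it only assembles the argument list and runs nothing.
    """
    return [
        "mount", "-t", "glusterfs",
        # Assumed option name; some GlusterFS versions spell it differently.
        "-o", "backupvolfile-server=%s" % backup,
        "%s:/%s" % (primary, volume),
        mount_point,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, since mounting needs root.
    print(" ".join(glusterfs_mount_argv("server1", "server2", "vms", "/mnt/vms")))
```

Whether oVirt 3.1 passes such options through for a POSIX-compliant FS storage domain is not something this message confirms.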
Re: [Users] DL380 G5 - Fails to Activate
I have a couple of old DL380 G5's [...] Host xxx.xxx.net.uk moved to Non-Operational state as host does not meet the cluster's minimum CPU level. Missing CPU features : model_Conroe, nx [...]

Hi, how about the kvm-ok tool result? Is it responding: INFO: /dev/kvm exists KVM acceleration can be used

For posterity: http://www.linkedin.com/groups/Rhev-3-Beta-Proliant-DL380-2536011.S.93285316

This solved it for me - cheers
[Users] How to update VdcBootStrapUrl (not using DB) ?
I've recently updated my oVirt engine migration howto (http://www.ovirt.org/User:Adrian15/oVirt_engine_migration) with the Update VdcBootStrapUrl section (http://www.ovirt.org/User:Adrian15/oVirt_engine_migration#Update_VdcBootStrapUrl).

My next move is to move this section into How to change engine host name (http://www.ovirt.org/How_to_change_engine_host_name), because I think it's a needed step.

But I don't like that currently you have to issue a database update like this:

psql -c "update vdc_options set option_value = 'http://new.manager.com:80/Components/vds/' where option_name = 'VdcBootStrapUrl'" -U postgres engine

So I was wondering if there is a proper way, such as a command like vdsClient or something similar. I mean a command that stays the same even if the vdc_options table gets renamed in the future.

I CC jhernand because I think he wrote the original How to change engine host name on the mailing list, and he also answered with the VdcBootStrapUrl update statement to someone who couldn't add a new host after, I think, restoring an ovirt-engine.

Thank you.

--
Adrián Gibanel
I.T. Manager
+34 675 683 301
www.btactic.com
Re: [Users] Best practice to resize a WM disk image
Hey,

I wanted to report that trying to dd from the storage side always makes the VM's OS see two identically small HDDs. The only workaround I've found that works is to create a new, bigger drive, boot the VM from a live CD and dd from there. When rebooted after completion, the VM's OS then sees a bigger drive that you can extend your filesystem on. It is a slightly slower procedure, since the mirroring goes over the network, but it works, and that's what's important in the end :)

/Karli

Mon 2013-01-14 at 08:37 +, Karli Sjöberg wrote:

Wed 2013-01-09 at 13:04 -0500, Yeela Kaplan wrote:

----- Original Message -----
From: Karli Sjöberg karli.sjob...@slu.se
To: Yeela Kaplan ykap...@redhat.com
Cc: Rocky rockyba...@gmail.com, Users@ovirt.org
Sent: Wednesday, January 9, 2013 4:30:35 PM
Subject: Re: [Users] Best practice to resize a WM disk image

Wed 2013-01-09 at 09:13 -0500, Yeela Kaplan wrote:

----- Original Message -----
From: Karli Sjöberg karli.sjob...@slu.se
To: Yeela Kaplan ykap...@redhat.com
Cc: Rocky rockyba...@gmail.com, Users@ovirt.org
Sent: Wednesday, January 9, 2013 1:56:32 PM
Subject: Re: [Users] Best practice to resize a WM disk image

Tue 2013-01-08 at 11:03 -0500, Yeela Kaplan wrote:

So, first of all, you should know that resizing a disk is not yet supported in oVirt. If you decide that you must use it anyway, you should know in advance that it's not recommended, and that your data is at risk when you perform this kind of action.

There are several ways to perform this. One of them is to create a second (larger) disk for the VM, run the VM from a live CD and use dd to copy the first disk's contents into the second one, and finally remove the first disk and make sure that the new disk is configured as your system disk.
Here you describe the dd operation being done from within the guest system, booted from a live CD. Can this be done directly from the NFS storage itself instead?

Karli, it can be done by using dd (or rsync), when your source is the volume of the current disk image and the destination is the volume of the newly created disk image. You just have to find the images in the internals of the vdsm host, which is a bit more tricky and can cause more damage if done wrong.

You mean that since the VMs and disks have names like c3dbfb5f-7b3b-4602-961f-624c69618734, you have to query the API to figure out what's what, but other than that, you're saying it'll just work. That's good to know, since I think letting the storage itself do the dd copy locally is going to be much, much faster than through the VM, over the network. Thanks! Will it matter if the disks are Thin Provision or Preallocated?

As long as it's done on the base volume it doesn't matter.

Well, I've now tested the suggested procedure and it didn't really go all the way:

1. Created a new virtual disk, bigger than the original 40GB one.
2. Booted the Win2008R2 guest and could see from DiskManager that a new, bigger drive, 80GB, had appeared.
3. Shut the guest down and issued a dd from the old source to the new, bigger destination.
4. When started, DiskManager now sees an offline drive that is as small as the original, 40GB. There is no free space in the new drive to expand into; Windows only sees it as being 40GB. I have tried Refresh and Rescan, but Windows just sees two identically small disks.

Suggestions?

The second, riskier, option is to export the VM to an export domain, resize the image volume to the new larger size using qemu-img, and also modify the VM's metadata in its OVF. As you can see, this option is more complicated and requires deeper understanding and altering of the metadata. Finally, you'll need to import the VM back.
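One possible explanation for the two identically small disks, which nobody in this thread confirms: when the destination is a plain file (as on NFS file storage), dd truncates the output file before writing unless conv=notrunc is given, so the new image can end up exactly as large as the old one. The sketch below imitates that behaviour with ordinary Python file modes. copy_image and demo are hypothetical names, and the 4096 and 8192 byte files are tiny stand-ins for the 40GB and 80GB images.

```python
import os
import shutil
import tempfile


def copy_image(src, dst, truncate):
    """Copy src over dst, either truncating dst first or writing in place."""
    # "wb" empties the destination first; "r+b" overwrites from offset 0
    # and leaves any bytes beyond the copied data untouched.
    mode = "wb" if truncate else "r+b"
    with open(src, "rb") as fin, open(dst, mode) as fout:
        shutil.copyfileobj(fin, fout)


def demo():
    """Return the destination size after a truncating and an in-place copy."""
    workdir = tempfile.mkdtemp()
    try:
        old = os.path.join(workdir, "old.img")
        new = os.path.join(workdir, "new.img")
        with open(old, "wb") as f:
            f.write(b"\x01" * 4096)      # stand-in for the small source image
        sizes = {}
        for truncate in (True, False):
            with open(new, "wb") as f:
                f.truncate(8192)         # stand-in for the bigger destination
            copy_image(old, new, truncate)
            sizes[truncate] = os.path.getsize(new)
        return sizes
    finally:
        shutil.rmtree(workdir)


if __name__ == "__main__":
    print(demo())
```

The truncating copy leaves the destination at the source size, while the in-place copy keeps the larger size. This only models raw files; for other image formats or block devices the cause may be different.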
----- Original Message -----
From: Rocky rockyba...@gmail.com
To: Yeela Kaplan ykap...@redhat.com
Cc: Users@ovirt.org
Sent: Tuesday, January 8, 2013 11:30:00 AM
Subject: Re: [Users] Best practice to resize a WM disk image

It's just a theoretical question, as I think the issue will come up for us and other users. I think there can be one or more snapshots in the VM over time. But if that is an issue we can always collapse them, I think. If it's a base image it should be RAW, right? In this case it's on file storage (NFS).

Regards
//Ricky

On 2013-01-08 10:07, Yeela Kaplan wrote:

Hi Ricky,

In order to give you a detailed answer I need additional details regarding the disk:

- Is the disk image composed as a chain of volumes or just a base volume? (if it's a chain it will be more complicated, you might want to
Re: [Users] How to update VdcBootStrapUrl (not using DB) ?
On 01/25/2013 12:49 PM, Adrian Gibanel wrote:

[...] But I don't like that currently you have to issue a database update like this:

psql -c "update vdc_options set option_value = 'http://new.manager.com:80/Components/vds/' where option_name = 'VdcBootStrapUrl'" -U postgres engine

So I was wondering if there was a proper way like using a command like vdsClient or something similar. [...]

The alternative to the SQL statement is the engine-config tool:

engine-config -s VdcBootStrapUrl=http://...

In version 3.2 this parameter has been removed.

--
Business address: C/Jose Bardasano Baos, 9, Edif. Gorbea 3, planta 3ºD, 28016 Madrid, Spain
Registered in the Madrid Mercantile Registry – C.I.F. B82657941 - Red Hat S.L.
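For a migration script, the engine-config call above could be assembled from the new engine host name. This is a minimal sketch under stated assumptions: bootstrap_url and engine_config_argv are hypothetical helpers, the URL layout (port 80 and the /Components/vds/ path) is copied from the SQL statement quoted in this thread, and the code only builds the command rather than running it.

```python
def bootstrap_url(engine_host, port=80):
    """Build the bootstrap URL in the layout used by the quoted SQL statement."""
    return "http://%s:%d/Components/vds/" % (engine_host, port)


def engine_config_argv(engine_host, port=80):
    """Return the engine-config argument list that sets VdcBootStrapUrl."""
    return ["engine-config", "-s",
            "VdcBootStrapUrl=" + bootstrap_url(engine_host, port)]


if __name__ == "__main__":
    # Print the command; on a 3.1 engine host it could be passed to
    # subprocess.check_call() instead.
    print(" ".join(engine_config_argv("new.manager.com")))
```

As noted above, this only applies to 3.1, since the parameter was removed in 3.2.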
Re: [Users] latest vdsm cannot read ib device speeds causing storage attach fail
On Fri 25 Jan 2013 05:23:24 PM CST, Royce Lv wrote:

I patched the python source managers.py to retry recv() after EINTR; supervdsm works well and the issue is gone.

The python doc even declares that: "only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads)." (http://docs.python.org/2/library/signal.html)

But according to my test script below, the thread that forks the child process also gets SIGCHLD, even though it is not the main thread. Is this a python bug or a feature?

Thanks for sharing your test code. After testing your script and looking into the python signal module, I found that the signal can be received by any thread. Python installs a signal handler wrapper for every user-defined signal handler. The wrapper just adds the actual handler to the queue of pending calls, and the main thread checks the pending calls and runs the actual handler before executing the next instruction. So the signal interruption can happen to any thread, and therefore the wrapper can run in any thread, but the user-defined handler only runs in the main thread. And if the signal occurs in a non-main thread, the main thread will not be interrupted, and the signal will only be handled when the main thread finishes the current instruction. Please correct me if I am wrong.

I agree with Mark: maybe we should use a synchronised way to deal with child processes instead of a signal handler.

    import threading
    import signal
    import time
    import os
    from multiprocessing import Process, Pipe

    def _zombieReaper(signum, frame):
        print 'sigchild!!!'

    def child():
        time.sleep(5)

    def sleepThread():
        proc = Process(target=child)
        proc.start()
        pip, pop = Pipe()
        pip.recv()  # <-- This line will get IOError.EINTR by SIGCHLD

    def main():
        signal.signal(signal.SIGCHLD, _zombieReaper)
        servThread = threading.Thread(target=sleepThread)
        servThread.setDaemon(True)
        servThread.start()
        time.sleep(30)

    if __name__ == '__main__':
        main()

On 01/25/2013 03:20 PM, Mark Wu wrote:

Great work! The default action for SIGCHLD is ignore, so no problems were reported before the zombie reaper installed a signal handler. But I still have one question: the python multiprocessing.manager code runs in a new thread, and according to the implementation of python's signal module, only the main thread can receive the signal. So how is the signal delivered to the server thread?

On Fri 25 Jan 2013 12:30:39 PM CST, Royce Lv wrote:

Hi, I reproduced this issue, and I believe it's a python bug.

1. How to reproduce: with the test case attached, put it under /usr/share/vdsm/tests/ and run

# ./run_tests.sh superVdsmTests.py

and this issue will be reproduced.

2. Log analysis: We notice a strange pattern in this log: connectStorageServer is called twice; the first supervdsm call succeeds, the second fails because of validateAccess(). That is because the first validateAccess call returns normally and leaves a child behind. When the second validateAccess call arrives and the multiprocessing manager is receiving the method message, the first child exits at just that moment and SIGCHLD arrives. This signal interrupts the multiprocessing receive system call. Python's managers.py should handle EINTR and retry recv() like we do in vdsm, but it does not, so the second call raises an error.
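The retry-after-EINTR approach mentioned in this thread can be sketched generically as follows. This is not the actual patch to managers.py or vdsm's own code: retry_on_eintr is a hypothetical wrapper, and the flaky function in the usage example only simulates an interrupted recv().

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func and repeat the call for as long as it fails with EINTR.

    Any other error is re-raised unchanged.
    """
    while True:
        try:
            return func(*args, **kwargs)
        except (IOError, OSError) as e:
            if e.errno != errno.EINTR:
                raise
            # Interrupted by a signal such as SIGCHLD: just try again.


if __name__ == "__main__":
    attempts = []

    def flaky_recv():
        # Simulate a recv() that is interrupted twice before succeeding.
        attempts.append(1)
        if len(attempts) < 3:
            raise OSError(errno.EINTR, "Interrupted system call")
        return "payload"

    print(retry_on_eintr(flaky_recv), len(attempts))
```

In the setting described here the wrapped call would be something like the connection's recv(). Newer Python 3 releases retry most interrupted system calls on their own, so a wrapper like this mainly matters on older interpreters such as the python2.6 seen in the traceback.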
Thread-18::DEBUG::2013-01-22 10:41:03,570::misc::85::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=3 192.168.0.1:/ovirt/silvermoon /rhev/data-center/mnt/192.168.0.1:_ovirt_silvermoon' (cwd None)
Thread-18::DEBUG::2013-01-22 10:41:03,607::misc::85::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=3 192.168.0.1:/ovirt/undercity /rhev/data-center/mnt/192.168.0.1:_ovirt_undercity' (cwd None)
Thread-18::ERROR::2013-01-22 10:41:03,627::hsm::2215::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2211, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 303, in connect
    return self._mountCon.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 209, in connect
    fileSD.validateDirAccess(self.getMountObj().getRecord().fs_file)
  File "/usr/share/vdsm/storage/fileSD.py", line 55, in validateDirAccess
    (os.R_OK | os.X_OK))
  File "/usr/share/vdsm/supervdsm.py", line 81, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 72, in <lambda>
    **kwargs)
  File "<string>", line 2, in validateAccess
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
    raise convert_to_error(kind, result)
the
Re: [Users] Community feedback on the new UI-plugin Framework
----- Original Message -----
From: Oved Ourfalli ov...@redhat.com

Hey all, We had an oVirt workshop this week, which included a few sessions about the new oVirt UI Plugin framework, including a Hackathon and a BOF session.

Was any video of this workshop recorded in the end?

If you find the feedback above true, or you have other comments that weren't mentioned here, please share it with us!

5. Everything should be a plugin

One trend in platform design is that everything should be a plugin. I'm not sure how it would go with oVirt, but the idea is that:

* Hosts tab
* Virtual machines tab
* GlusterFS volumes tab
* Disks tab

are each a plugin. Note that as a side effect you gain two things:

* People can study the standard plugins (which you could make impossible to uninstall or disable) to learn how to build their own plugins
* Your plugin system would be better, because it would need to be improved to support all the capabilities of the current default plugins

6. Plugin logs

Maybe a standard way of saving plugin logs, both through the standard ovirt-engine logs and in plugin-specific logs.

7. Wiki page

Is there any wiki page about this UI-plugin framework, so that I can add a "New ideas" or "New feature requests" page link there and add these same ideas?

I think that's all.

Thank you, Oved

P.S: I guess the slides will be uploaded sometime next week (I guess someone would have asked it soon... so now you have your answer :-)

I will, for sure, take a look at them.

--
Adrián Gibanel
I.T. Manager
+34 675 683 301
www.btactic.com
Re: [Users] How to update VdcBootStrapUrl (not using DB) ?
----- Original Message -----
From: Juan Hernandez jhern...@redhat.com

[...] But I don't like that currently you have to issue a database update like this: psql -c "update vdc_options set option_value = 'http://new.manager.com:80/Components/vds/' where option_name = 'VdcBootStrapUrl'" -U postgres engine

The alternative to the SQL statement is the engine-config tool: engine-config -s VdcBootStrapUrl=http://... In version 3.2 this parameter has been removed.

And what does that mean exactly? Is the parameter no longer needed because the URL resolution is done in another way? Or does some other parameter in oVirt 3.2 need to be updated? With which command, then?

For now I'm going to update the wiki with an 'only needed in 3.1 version' note, and I will update it as needed.

Thank you.

--
Adrián Gibanel
I.T. Manager
+34 675 683 301
www.btactic.com
Re: [Users] How to update VdcBootStrapUrl (not using DB) ?
On 01/25/2013 01:15 PM, Adrian Gibanel wrote:

From: Juan Hernandez jhern...@redhat.com

The alternative to the SQL statement is the engine-config tool: engine-config -s VdcBootStrapUrl=http://... In version 3.2 this parameter has been removed.

And what does that mean exactly? The parameter is not longer needed because the url resolution is done in another way? Maybe other parametre in ovirt 3.2 needs to be updated? With which command then? So far I'm going to update the wiki with 'only needed in 3.1 version' reference and I will update it as needed.

Up to version 3.1 that parameter was needed because, during host registration, the host initiated connections to the engine web server to download several files (certificates, scripts, etc). This parameter was the base URL for some of those files (mostly scripts). Starting with version 3.2 the engine sends all the required files to the hosts using SSH, so there is no need for this parameter. See the following URL for more details:

http://www.ovirt.org/Featrues/Bootstrap_Improvements
Re: [Users] BFA FC driver no stable on Fedora
Hi Mike,

Thanks for your reply. I'm using oVirt Node Hypervisor release 2.5.5 (0.1.fc17). With the latest Fedora 17 kernel the problem is still there. I have other issues with this node image, so I will test on fc18 with vdsm from the repo.

Kevin

2013/1/23 Mike Burns mbu...@redhat.com

On Wed, 2013-01-23 at 10:25 +0100, Kevin Maziere Aubry wrote:

Hi,

I've spent hours trying to get my Brocade FC card working on Fedora 17 and on the oVirt Node build. The card is only seen by the system at random, which is really painful. So I downloaded, compiled and installed the latest driver from Brocade, and now when I load the module the card is seen. I installed:

bfa_util_linux_noioctl-3.2.0.0-0.noarch
bfa_driver_linux-3.2.0.0-0.noarch

Hi Kevin,

A few things:

1. What version of ovirt-node are you using?
2. You can use the plugin tooling to install an updated kmod package, as long as it's in rpm format. This will ensure that it gets into both the system and the initramfs. It's an offline process that produces a new iso that you can install or upgrade to. The tool is called edit-node and is available in the ovirt-node-tools rpm (from the ovirt.org repos).
3. Do you know if it's fixed in a recent release of the kernel on Fedora? If it is, then we can spin an updated version of the image and pick up the fix directly from Fedora.

Mike

And the module info is:

# modinfo bfa
filename:       /lib/modules/3.3.4-5.fc17.x86_64/kernel/drivers/scsi/bfa.ko
version:        3.2.0.0
author:         Brocade Communications Systems, Inc.
description:    Brocade Fibre Channel HBA Driver fcpim ipfc
license:        GPL
srcversion:     5C0FBDF3571ABCA9632B9CA
alias:          pci:v1657d0022sv*sd*bc0Csc04i00*
alias:          pci:v1657d0021sv*sd*bc0Csc04i00*
alias:          pci:v1657d0014sv*sd*bc0Csc04i00*
alias:          pci:v1657d0017sv*sd*bc*sc*i*
alias:          pci:v1657d0013sv*sd*bc*sc*i*
depends:        scsi_transport_fc
vermagic:       3.3.4-5.fc17.x86_64 SMP mod_unload
parm:           os_name:OS name of the hba host machine (charp)
parm:           os_patch:OS patch level of the hba host machine (charp)
parm:           host_name:Hostname of the hba host machine (charp)
parm:           num_rports:Max number of rports supported per port (physical/logical), default=1024 (int)
parm:           num_ioims:Max number of ioim requests, default=2000 (int)
parm:           num_tios:Max number of fwtio requests, default=0 (int)
parm:           num_tms:Max number of task im requests, default=128 (int)
parm:           num_fcxps:Max number of fcxp requests, default=64 (int)
parm:           num_ufbufs:Max number of unsolicited frame buffers, default=64 (int)
parm:           reqq_size:Max number of request queue elements, default=256 (int)
parm:           rspq_size:Max number of response queue elements, default=64 (int)
parm:           num_sgpgs:Number of scatter/gather pages, default=2048 (int)
parm:           rport_del_timeout:Rport delete timeout, default=90 secs, Range[0] (int)
parm:           bfa_lun_queue_depth:Lun queue depth, default=32, Range[0] (int)
parm:           bfa_io_max_sge:Max io scatter/gather elements , default=255 (int)
parm:           log_level:Driver log level, default=3, Range[Critical:1|Error:2|Warning:3|Info:4] (int)
parm:           ioc_auto_recover:IOC auto recovery, default=1, Range[off:0|on:1] (int)
parm:           linkup_delay:Link up delay, default=30 secs for boot port. Otherwise 10 secs in RHEL4 0 for [RHEL5, SLES10, ESX40] Range[0] (int)
parm:           msix_disable_cb:Disable Message Signaled Interrupts for Brocade-415/425/815/825 cards, default=0, Range[false:0|true:1] (int)
parm:           msix_disable_ct:Disable Message Signaled Interrupts if possible for Brocade-1010/1020/804/1007/1741 cards, default=0, Range[false:0|true:1] (int)
parm:           fdmi_enable:Enables fdmi registration, default=1, Range[false:0|true:1] (int)
parm:           pcie_max_read_reqsz:PCIe max read request size, default=0 (use system setting), Range[128|256|512|1024|2048|4096] (int)
parm:           max_xfer_size:default=32MB, Range[64k|128k|256k|512k|1024k|2048k] (int)
parm:           max_rport_logins:Max number of logins to initiator and target rports on a port (physical/logical), default=1024 (int)

I guess it should be possible to update the driver inside the oVirt Node build?

Kevin

--
Kevin Mazière
Infrastructure Manager
Alter Way, Hosting
http://www.alterway.fr
Re: [Users] best disk type for WIn XP guests
On Thu, Jan 24, 2013 at 9:12 AM, Vadim Rozenfeld wrote:

On Wednesday, January 23, 2013 06:17:16 PM Gianluca Cecchi wrote:

Hello,

I have a Win XP guest configured with one IDE disk. I would like to switch it to VirtIO. Is VirtIO supported/usable as a disk type for Win XP on oVirt? If not, what are others using apart from IDE?

My plan is to add a second 1 GB disk configured as VirtIO and then, if that works, change the disk type of the first disk too. But when the guest powers up and finds new hardware for the second disk, I point it to the WXP\X86 directory of the iso using virtio-win-1.1.16.vfd. It finds the viostor.xxx files, but in the end it fails to install the driver (see https://docs.google.com/file/d/0BwoPbcrMv8mvMUQ2SWxYZWhSV0E/edit ).

Any help/suggestion is welcome.

Error code 39 means that the OS cannot load the device driver. On 32-bit platforms it usually happens with corrupted installation media or a platform/architecture mismatch.

Vadim.

Actually, despite what I wrote, I was trying to use the files from the CD iso virtio-win-0.1-49.iso, under the WXP\X86 directory, and it seems they are not good. If I instead use the files on the floppy image virtio-win-1.1.16.vfd, under the i386/WinXP directory, I can complete the driver installation and convert the boot IDE disk to VirtIO.

Thanks,
Gianluca
[Users] Run once for windows xp vm very slow: correct?
Hello,

I have a Windows XP VM on an f18 oVirt all-in-one setup, with rpms from the nightly repo (3.2.0-1.20130123.git2ad65d0). Disk and NIC are VirtIO.

When I run it normally (SPICE), I almost immediately get the icon to open the SPICE connection and the status of the VM becomes Powering Up. In the SPICE window I can see the boot process, which completes in less than 2 minutes.

When I select Run once, it remains in the executing phase for about 10 minutes. See this image for a comparison of the timings:
https://docs.google.com/file/d/0BwoPbcrMv8mvb3FIeHExVHFibms/edit

In the VM line the status appears as Down, so I don't get the icon to connect to the console. Only when it completes, after 10 minutes, do I get the console link, and I find the VM already at its final desktop prompt.

Is this expected, or should I send anything to debug/investigate?
Re: [Users] Local storage domain fails to attach after host reboot
On 24.01.2013 18:05, Patrick Hurrelmann wrote:

Hi list,

after rebooting one host (single-host DC with local storage), the local storage domain can't be attached again. The host was set to maintenance mode and all running VMs were shut down prior to the reboot.

Vdsm keeps logging the following errors:

Thread-1266::ERROR::2013-01-24 17:51:46,042::task::853::TaskManager.Task::(_setError) Task=`a0c11f61-8bcf-4f76-9923-43e8b9cc1424`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 817, in connectStoragePool
    return self._connectStoragePool(spUUID, hostID, scsiKey, msdUUID, masterVersion, options)
  File "/usr/share/vdsm/storage/hsm.py", line 859, in _connectStoragePool
    res = pool.connect(hostID, scsiKey, msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 641, in connect
    self.__rebuild(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1109, in __rebuild
    self.masterDomain = self.getMasterDomain(msdUUID=msdUUID, masterVersion=masterVersion)
  File "/usr/share/vdsm/storage/sp.py", line 1448, in getMasterDomain
    raise se.StoragePoolMasterNotFound(self.spUUID, msdUUID)
StoragePoolMasterNotFound: Cannot find master domain: 'spUUID=c9b86219-0d51-44c3-a7de-e0fe07e2c9e6, msdUUID=00ed91f3-43be-41be-8c05-f3786588a1ad'

and

Thread-1268::ERROR::2013-01-24 17:51:49,073::task::853::TaskManager.Task::(_setError) Task=`95b7f58b-afe0-47bd-9ebd-21d3224f5165`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 528, in getSpmStatus
    pool = self.getPool(spUUID)
  File "/usr/share/vdsm/storage/hsm.py", line 265, in getPool
    raise se.StoragePoolUnknown(spUUID)
StoragePoolUnknown: Unknown pool id, pool not connected: ('c9b86219-0d51-44c3-a7de-e0fe07e2c9e6',)

while engine logs:

2013-01-24 17:51:46,050 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-43) [49026692] Command org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand return value
 Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
 mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
 mCode 304
 mMessage Cannot find master domain: 'spUUID=c9b86219-0d51-44c3-a7de-e0fe07e2c9e6, msdUUID=00ed91f3-43be-41be-8c05-f3786588a1ad'

Vdsm and engine logs are also attached. I set the affected host back to maintenance. How can I recover from this and attach the storage domain again? If more information is needed, please do not hesitate to request it.

This is on CentOS 6.3 using Dreyou's rpms. Installed versions on host:

vdsm.x86_64                                4.10.0-0.44.14.el6
vdsm-cli.noarch                            4.10.0-0.44.14.el6
vdsm-python.x86_64                         4.10.0-0.44.14.el6
vdsm-xmlrpc.noarch                         4.10.0-0.44.14.el6

Engine:

ovirt-engine.noarch                        3.1.0-3.19.el6
ovirt-engine-backend.noarch                3.1.0-3.19.el6
ovirt-engine-cli.noarch                    3.1.0.7-1.el6
ovirt-engine-config.noarch                 3.1.0-3.19.el6
ovirt-engine-dbscripts.noarch              3.1.0-3.19.el6
ovirt-engine-genericapi.noarch             3.1.0-3.19.el6
ovirt-engine-jbossas711.x86_64             1-0
ovirt-engine-notification-service.noarch   3.1.0-3.19.el6
ovirt-engine-restapi.noarch                3.1.0-3.19.el6
ovirt-engine-sdk.noarch                    3.1.0.5-1.el6
ovirt-engine-setup.noarch                  3.1.0-3.19.el6
ovirt-engine-tools-common.noarch           3.1.0-3.19.el6
ovirt-engine-userportal.noarch             3.1.0-3.19.el6
ovirt-engine-webadmin-portal.noarch        3.1.0-3.19.el6
ovirt-image-uploader.noarch                3.1.0-16.el6
ovirt-iso-uploader.noarch                  3.1.0-16.el6
ovirt-log-collector.noarch                 3.1.0-16.el6

Thanks and regards
Patrick

Ok, managed to solve it. I force-removed the datacenter and reinstalled the host. I added a new local storage domain to it and re-created the VMs (the disk images were moved over and renamed from the old, non-working local storage). So this host is up and running again.

Regards
Patrick
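When chasing this error across vdsm.log and engine.log it can help to pull the two UUIDs out of the message so they can be compared with what is actually on disk. A minimal sketch, assuming the message text always has the shape shown in the logs above; the function name is made up for illustration and is not part of vdsm.

```python
import re

# Matches the message text seen in both the vdsm and the engine log above.
_MASTER_NOT_FOUND = re.compile(
    r"Cannot find master domain: "
    r"'spUUID=(?P<spUUID>[0-9a-f-]{36}), msdUUID=(?P<msdUUID>[0-9a-f-]{36})'"
)


def parse_master_not_found(line):
    """Return {'spUUID': ..., 'msdUUID': ...} for a matching log line, else None."""
    match = _MASTER_NOT_FOUND.search(line)
    return match.groupdict() if match else None
```

The msdUUID is the master storage domain the pool expects to find; the spUUID is the same pool id that the second traceback reports as not connected.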
Re: [Users] Run once for windows xp vm very slow: correct?
On Fri, Jan 25, 2013 at 6:10 PM, Gianluca Cecchi wrote:

When I select Run once it remains for about 10 minutes in executing phase: see this image for timings comparison:
https://docs.google.com/file/d/0BwoPbcrMv8mvb3FIeHExVHFibms/edit

Sorry, I was not complete. It happens only if I attach a floppy as an option of Run once. If I select Run once and don't attach anything, it boots at the same speed as in the normal way.

Gianluca
Re: [Users] Run once for windows xp vm very slow: correct?
2013/1/25 Gianluca Cecchi gianluca.cec...@gmail.com:

On Fri, Jan 25, 2013 at 6:10 PM, Gianluca Cecchi wrote:

When I select Run once it remains for about 10 minutes in executing phase: see this image for timings comparison:
https://docs.google.com/file/d/0BwoPbcrMv8mvb3FIeHExVHFibms/edit

Sorry, I was not complete. It happens only if I attach a floppy as an option of Run once. If I select Run once and don't attach anything, it boots at the same speed as in the normal way.

Gianluca

Hi Gianluca,

I once faced the same situation when I tried to boot a VM with an ISO image that was corrupted. It hung and I couldn't get to the SPICE console. Only when I tried to install from that same ISO with VNC could I see the error messages about the corrupted ISO.

Alex
Re: [Users] Community feedback on the new UI-plugin Framework
On 24/01/2013 17:41, Oved Ourfalli wrote:

Hey all,

We had an oVirt workshop this week, which included a few sessions about the new oVirt UI Plugin framework, including a Hackathon and a BOF session. I've gathered some feedback we got from the different participants about the framework, and what they would like to see in its future.

1. People liked the fact that it is a simple framework, allowing you to do nice extensions rapidly, without the need to know complex technologies (simple JavaScript knowledge is all you need).

2. People want the framework to provide tools for adding UI components (main/sub tabs, dialogs, etc.) that aren't URL based, but are based on components we currently have in oVirt, such as grids, key-value pairs (such as the general sub-tab), action buttons in these custom tabs, etc. The main reason for that is to easily develop a plugin with an oVirt-like look-and-feel. Chris Morrissey from NetApp showed a very nice plugin he wrote that did have an oVirt-like look-and-feel, but it wasn't easy, and it required him to develop something specific in the 3rd party application for the plugin to interact with (something similar to the work we did in the oVirt-Foreman UI-plugin).

that would be great, i hope some will be contributed by ui plugin developers for others to benefit from.

3. Support adding tasks to the system: plugins may trigger asynchronous tasks behind the scenes, both oVirt and external ones. oVirt tasks and their progress will be reflected in the tasks management view, but if the flows contain external tasks as well, then it would be hard to track them through the oVirt UI.

4. Plugin management
   * The ability to see what plugins are installed, install new plugins and remove existing ones.
   * Change the plugin configuration through webadmin.
   * Distinguish between public plugin configuration entries (entries the user can change) and private ones (entries the user can't).

showing which plugins are installed is probably easy. deployment of plugins could be a bit more tricky for distros, assuming code is distributed via packages, which are deployed by a root user.

I guess that this point will be relevant for engine-plugins as well (once support for such plugins is available), so we should consider providing a similar solution for both. Also, Chris pointed out that this should be taken into consideration when working on supporting HA-oVirt-engine, as plugins are a vital part of the oVirt environment.

If you find the feedback above true, or you have other comments that weren't mentioned here, please share them with us!

Thank you,
Oved

P.S: I guess the slides will be uploaded sometime next week (I guess someone would have asked soon... so now you have your answer :-) )
[Users] Fwd: Re: latest vdsm cannot read ib device speeds causing storage attach fail
Hi,

I reproduced this issue, and I believe it's a Python bug.

1. How to reproduce: put the attached test case under /usr/share/vdsm/tests/ and run

# ./run_tests.sh superVdsmTests.py

and the issue will be reproduced.

2. Log analysis: We notice a strange pattern in this log: connectStorageServer is called twice; the first supervdsm call succeeds and the second fails because of validateAccess(). The first validateAccess call returns normally but leaves a child process behind. When the second validateAccess call arrives and the multiprocessing manager is receiving the method message, the first child exits at just that moment and SIGCHLD arrives. The signal interrupts the receive system call in multiprocessing. Python's managers.py should handle EINTR and retry recv(), as we do in vdsm, but it does not, so the second call raises an error.

Thread-18::DEBUG::2013-01-22 10:41:03,570::misc::85::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=3 192.168.0.1:/ovirt/silvermoon /rhev/data-center/mnt/192.168.0.1:_ovirt_silvermoon' (cwd None)
Thread-18::DEBUG::2013-01-22 10:41:03,607::misc::85::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=3 192.168.0.1:/ovirt/undercity /rhev/data-center/mnt/192.168.0.1:_ovirt_undercity' (cwd None)
Thread-18::ERROR::2013-01-22 10:41:03,627::hsm::2215::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2211, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 303, in connect
    return self._mountCon.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 209, in connect
    fileSD.validateDirAccess(self.getMountObj().getRecord().fs_file)
  File "/usr/share/vdsm/storage/fileSD.py", line 55, in validateDirAccess
    (os.R_OK | os.X_OK))
  File "/usr/share/vdsm/supervdsm.py", line 81, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 72, in <lambda>
    **kwargs)
  File "<string>", line 2, in validateAccess
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
    raise convert_to_error(kind, result)

(The vdsm side receives RemoteError because the multiprocessing manager in the supervdsm server raised an error with kind='TRACEBACK'.)

RemoteError:

(The upper part is the traceback from the client side; the following part is from the server side.)

---
Traceback (most recent call last):
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 214, in serve_client
    request = recv()
IOError: [Errno 4] Interrupted system call
---

Corresponding Python source code, managers.py (server side):

    def serve_client(self, conn):
        '''
        Handle requests from the proxies in a particular process/thread
        '''
        util.debug('starting server thread to service %r',
                   threading.current_thread().name)

        recv = conn.recv
        send = conn.send
        id_to_obj = self.id_to_obj

        while not self.stop:

            try:
                methodname = obj = None
                request = recv()    # <-- this line is interrupted by SIGCHLD
                ident, methodname, args, kwds = request
                obj, exposed, gettypeid = id_to_obj[ident]

                if methodname not in exposed:
                    raise AttributeError(
                        'method %r of %r object is not in exposed=%r' %
                        (methodname, type(obj), exposed)
                        )

                function = getattr(obj, methodname)

                try:
                    res = function(*args, **kwds)
                except Exception, e:
                    msg = ('#ERROR', e)
                else:
                    typeid = gettypeid and gettypeid.get(methodname, None)
                    if typeid:
                        rident, rexposed = self.create(conn, typeid, res)
                        token = Token(typeid, self.address, rident)
                        msg = ('#PROXY', (rexposed, token))
                    else:
                        msg = ('#RETURN', res)

            except AttributeError:
                if methodname is None:
                    msg = ('#TRACEBACK', format_exc())
                else:
                    try:
                        fallback_func = self.fallback_mapping[methodname]
                        result = fallback_func(
                            self, conn, ident, obj, *args, **kwds
                            )
                        msg = ('#RETURN', result)
                    except Exception:
                        msg = ('#TRACEBACK', format_exc())

            except EOFError:
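The retry that the analysis above says managers.py is missing can be sketched in a few lines. This is an illustration of the general EINTR-retry idiom, not the code vdsm or Python actually ships; the helper name is made up. It is written so that it runs on both Python 2.6 and Python 3 (newer Python 3 releases already retry interrupted system calls internally, so there the wrapper is simply harmless).

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs), retrying for as long as it fails with EINTR.

    Hypothetical helper: any other error is re-raised unchanged.
    """
    while True:
        try:
            return func(*args, **kwargs)
        except (IOError, OSError) as e:
            if e.errno != errno.EINTR:
                raise
            # Interrupted by a signal such as SIGCHLD: just try again.
```

With such a wrapper, the interrupted line in serve_client would read request = retry_on_eintr(recv), so a SIGCHLD from an exiting child no longer turns into a '#TRACEBACK' reply to the caller.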
Re: [Users] default mutipath.conf config for fedora 18 invalid
On Thu, Jan 24, 2013 at 10:44:48AM -0500, Yeela Kaplan wrote:

Hi,

I've tested the new patch on a Fedora 18 vdsm host (created an iSCSI storage domain, attached, activated) and it works well. Even though multipath.conf no longer uses getuid_callout to recognize the device's wwid, it still knows how to deal with the attribute's existence in the conf file when running the multipath command (the only output is to stdout, which we don't use anyway; stderr is empty and rc=0).

The relevant patch is: http://gerrit.ovirt.org/#/c/10824/

Given your verification, and the fact that this patch is a step forward, I've taken it into vdsm master and acked it for ovirt-3.2.

I trust Ben Marzinski to shout at us loudly if keeping the outdated verb is terribly wrong. I'd expect to see a future patch, adding getuid_callout only for multipath versions that actually need it.
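The follow-up patch mentioned above would presumably boil down to a version check when the conf file is generated. A rough sketch of that idea, under loud assumptions: the function name is invented, and both the cutoff version and the callout command are passed in by the caller because the thread states neither.

```python
def getuid_callout_lines(multipath_version, first_version_without_callout, callout):
    """Return the multipath.conf lines for getuid_callout, if any.

    Both version arguments are tuples such as (0, 4, 9). The cutoff is an
    input on purpose: the thread does not say which multipath versions still
    need the option, so nothing is hard-coded here.
    """
    if multipath_version < first_version_without_callout:
        return ['getuid_callout "%s"' % callout]
    # Newer multipath no longer uses the option, so emit nothing.
    return []
```

The caller would splice the returned lines into the section of the generated file where the option belongs and write nothing extra when the list is empty.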