Re: [vdsm] About vdsmd init script
Hi,

on 05/28/2013 17:26, Yaniv Bronheim wrote:
> Hey,
>
> I think that the libvirt_configure part can be an external module (maybe
> a Python module) that can be initiated by vdsm-tool. It should work with
> a template or conf.default as you mentioned, and we should call it before
> starting the service. I think it should be a module because it should
> also include all of the libvirtd_sysv2upstart, libvirtd_reconfigure,
> libvirtd_configure and test_conflicting_conf scripts.
>
> Also, keep in mind that we plan to split vdsm into 2 services, one for
> vdsmd and one for supervdsmd. Both should be initiated at startup and
> should depend on each other (http://gerrit.ovirt.org/#/c/11051/).

Yes. After supervdsm starts as a service, we can add dependency
declarations easily. It does not conflict with refactoring the vdsm init
script. I can help review the supervdsm patch to get it done faster.

> The other parts that you want to take out of the vdsmd script are:
>
> shutdown-conflict-srv - could also be part of the tool
>
> nwfilter, dummybr - both are Python scripts that we run, why not part of
> the tool as well?
>
> start_needed_srv, load-needed-modules - only sysv and debian need them,
> if I understand correctly. systemd, upstart and openrc can use their init
> script parameters, so why take them out? In each start function we'll
> start and load the needed services and modules. systemd, upstart and
> openrc don't need a custom start function anyway.

Debian ships with /lib/lsb/init-functions, and the Red Hat family (such as
CentOS, RHEL6) ships with /etc/init.d/functions. To print error messages
and daemonize the service process, we call different utility functions on
different systems, even though they are all SysV. The service script
boilerplate in Debian is different from that of the Red Hat family as
well. So we want to provide a dedicated init script for each system. To
re-use start_needed_srv and load-needed-modules in the different SysV init
scripts, I moved them out.

> gencerts, syslog_available, tune_system, test_space_and_lo, prepare_dirs
> - can be scripts that we run before start, as you did.
>
> Regards,
> Yaniv Bronhaim.

I agree that some of the initialization operations can be moved to
vdsm-tool. I think we can do this in a future patch, after we port the
VDSM init script to Ubuntu. I'd prefer to start small and not do
everything in one batch. Once we have VDSM running on Ubuntu, we can
improve it step by step.

--
Thanks and best regards!

Zhou Zheng Sheng / 周征晟
E-mail: zhshz...@linux.vnet.ibm.com
Telephone: 86-10-82454397

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
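The thread names two SysV helper libraries, /lib/lsb/init-functions on Debian and /etc/init.d/functions on the Red Hat family. If the choice between them were ever made from a Python helper (for instance inside vdsm-tool, as suggested above), it might look roughly like the sketch below. The function name `detect_sysv_flavor` and the flavor labels are hypothetical and are not part of vdsm; the probing order is also an assumption, chosen because an LSB helper file may be present on a Red Hat system as well.

```python
import os

# Helper libraries named in the thread. The order is an assumption: the
# Red Hat specific file is probed first, because an LSB helper may also be
# installed on Red Hat systems.
SYSV_HELPERS = (
    ("redhat", "/etc/init.d/functions"),
    ("debian", "/lib/lsb/init-functions"),
)


def detect_sysv_flavor(exists=os.path.exists):
    """Return (flavor, helper_path) for the first helper found, else None.

    `exists` is injectable so the logic can be exercised without touching
    the real filesystem.
    """
    for flavor, path in SYSV_HELPERS:
        if exists(path):
            return flavor, path
    return None


if __name__ == "__main__":
    print(detect_sysv_flavor())
```

A dedicated init script per distribution, as proposed in the thread, avoids this runtime probing entirely; the sketch only illustrates where the two families differ.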
[vdsm] oVirt updates - April 28th, 2013
1. From the Web
- interview with Theron on oVirt (Chinese)
  http://www.infoq.com/cn/news/2013/05/conrey-on-ovirt
- interview with Dave Neary about his work (and oVirt)
  http://www.techradar.com/news/software/what-went-wrong-with-meego-nokia-lost-faith-in-the-project--1147770
- Nagios monitoring plugin check_rhev3 1.2 released
  http://lists.ovirt.org/pipermail/users/2013-May/014389.html
- a blog on how to do HA for engine (written for rhev, should be relevant
  to oVirt as well)
  http://captainkvm.com/2013/05/providing-high-availability-for-rhev-m/

2. Video
- YouTube video available for IBM's session on Connected Communities,
  Innovative Technologies: OpenStack, oVirt, and KVM
  http://www.youtube.com/watch?v=Pg7ShV-HvCE
- fog/foreman by Ohad Levy (fog supports oVirt)
  http://www.youtube.com/watch?v=JgaQ_ekR2JA

3. Conferences
- FOSDEM presentations page uploaded
  http://www.ovirt.org/FOSDEM_2013
- some of the Shanghai presentations uploaded
  http://www.ovirt.org/Intel_Workshop_May_2013
- upcoming - LinuxCon Japan
  oVirt session in LinuxCon Japan (this week)
- upcoming - oVirt Developer days (with KVM Forum)
  Edinburgh, UK - October 21 - 23, 2013

4. Other
- help test the new oVirt installer and developer setup environments
  http://www.ovirt.org/OVirtEngineDevelopmentEnvironment
- RC for oVirt Node 3.0.0 is now available (but not compatible with
  ovirt-engine yet)

Thanks,
Itamar
Re: [vdsm] Potential Bug in cpopen
On Wed 29 May 2013 12:02:47 PM CST, Zhou Zheng Sheng wrote:
> Hi,
>
> Recently the Jenkins unit test sometimes fails on PidStatTests.test, for
> example http://gerrit.ovirt.org/#/c/14670/
>
> After it runs execCmd on a sleep command with sync=False, we should see
> in /proc/[xxxpid]/stat that the name is sleep, but in this case we get
> python, which means there is a possible race condition. The most likely
> situation is that execCmd returns before the child process has called
> execvp on sleep; the parent process then reads the stat and sees that
> the process name is still python.
>
> cpopen is designed to avoid this kind of race. It uses a pipe to
> synchronize the child and the parent. It sets FD_CLOEXEC on the child
> side of the pipe, so that once execvp succeeds, the child side of the
> pipe is closed. If execvp fails, the child writes the error code to the
> pipe. The parent reads the other end of the pipe as follows.
>
>     if (read(errnofd[0], &childErrno, sizeof(int)) == sizeof(int)) {
>         PyErr_SetString(PyExc_OSError, strerror(childErrno));
>         goto fail;
>     }
>
> It assumes that when read returns, the return value is either
> sizeof(int), which indicates an error on the child side, or 0, which
> indicates that the child side of the pipe is closed and execvp
> succeeded. However, this assumption may not always be true. If the
> parent process gets a signal, the read invocation is interrupted, and
> the code treats this interruption the same as the case where execvp
> succeeds. If the system is very busy, like our Jenkins slave executing
> jobs concurrently, it's possible that the parent gets interrupted before
> the child's execvp succeeds, so cpopen returns to execCmd and causes the
> race.
>
> To reproduce this problem, we can add a sleep(10); before the exec
> invocation in cpopen.c, which simulates a busy/slow system. Then, in a
> Python interpreter, register a signal handler and call execCmd.
>
>     >>> from vdsm.utils import execCmd
>     >>> from vdsm import utils
>     >>> def handler(signum, frame):
>     ...     print 'Signal handler called with signal', signum
>     ...
>     >>> import signal
>     >>> signal.signal(signal.SIGALRM, handler)
>     >>> signal.signal(signal.SIGCHLD, handler)
>     >>> p = execCmd(['sleep', '3'], sync=False, sudo=False); s = utils.pidStat(p.pid); print s; p.wait()
>
> Then send kill -SIGCHLD or kill -SIGALRM to this Python interpreter
> process, and we can see the output.
>
>     Signal handler called with signal 17
>     (9541, 'python', 'S', 9509, 9509, 6843, 34817, 9509, 4218944, 66, 0, 0, 0, 0, 0, 0, 0, 20, 0, 1, 0, 663767, 217919488, 1676, 18446744073709551615L, 4194304, 4197004, 140735665131088, 140735665124456, 268226504400, 0, 0, 16781312, 73730, 18446744071579398713L, 0, 0, 17, 3, 0, 0, 0, 0, 0, 6294952, 6297616, 19152896, 140735665132425, 140735665132432, 140735665132432, 140735665135592, 0)
>     True
>
> I have not found other ways to reproduce it, just this method. I am not
> sure it is a bug. Is it reasonable to check the read() return value and
> retry on EAGAIN/EINTR to fix this?

Good catch! If you can't find out how the race is triggered on the Jenkins
slave, you could just add a patch to fix the EAGAIN/EINTR issue in cpopen,
and add some fake patches to trigger multiple Jenkins jobs. Then you could
see whether the EAGAIN/EINTR fix helps.
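The retry proposed in this thread can be sketched in Python as follows. This is only an illustration of the idea, not the cpopen patch: cpopen itself is C, and the function name `read_exec_status` and its injectable `read` parameter are hypothetical. The sketch distinguishes three outcomes of the read on the status pipe: interrupted (retry), end-of-file (exec succeeded), and a full int (the child's errno). It retries only on EINTR, on the assumption that the pipe is blocking; EAGAIN would matter only for a non-blocking descriptor.

```python
import errno
import os
import struct

INT_SIZE = struct.calcsize("i")  # size of a C int, as written by the child


def read_exec_status(fd, read=os.read):
    """Return None if the child's exec succeeded, else the child's errno.

    An interrupted read is retried instead of being mistaken for
    end-of-file. `read` is injectable so the retry path can be tested.
    """
    while True:
        try:
            data = read(fd, INT_SIZE)
        except OSError as e:
            if e.errno == errno.EINTR:
                continue  # a signal arrived; the child has not reported yet
            raise
        if len(data) == 0:
            return None  # EOF: close-on-exec closed the pipe, exec succeeded
        if len(data) == INT_SIZE:
            return struct.unpack("i", data)[0]  # exec failed in the child
        raise OSError(errno.EIO, "short read on exec status pipe")


if __name__ == "__main__":
    r, w = os.pipe()
    os.close(w)  # simulate a successful exec: write end closed, nothing written
    print(read_exec_status(r))
    os.close(r)
```

Note that Python 3.5 and later already retry os.read after EINTR unless the signal handler raises, so the explicit loop mainly mirrors what the C code would need to do around read().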
Re: [vdsm] Migration regression on master
On 05/29/2013 07:17 AM, Dan Kenigsberg wrote:
> On Tue, May 28, 2013 at 11:54:45AM -0400, Giuseppe Vallarelli wrote:
>> | - Original Message -
>> | | From: Assaf Muller amul...@redhat.com
>> | | To: Michal Skrivanek michal.skriva...@redhat.com
>> | | Cc: vdsm-devel@lists.fedorahosted.org Development
>> | | vdsm-devel@lists.fedorahosted.org
>> | | Sent: Thursday, May 23, 2013 2:12:47 PM
>> | | Subject: Re: [vdsm] Migration regression on master
>> | |
>> | | As you can see in a previous patch set I checked if the alias attribute
>> | | exists instead of assuming it exists.
>> | | I then changed my mind with Dan's blessing, and decided to assume it does
>> | | exist, exactly for cases like this.
>> | | Even if we check if the alias exists, what do we do if we find out it
>> | | doesn't? We're at a problem and need to understand why the alias doesn't
>> | | exist because it should - For all devices.
>> | |
>> | | We definitely need to deal with this issue - Can you provide the domxml of
>> | | the VM during creation, and during migration?
>>
>> Hi Assaf,
>> Peter today has reproduced the same error and provided me the output
>> log, you can find it here:
>> http://etherpad.ovirt.org/p/migration-errors
>
> Would you add the original vmCreate line, and domxml, that was used to
> create the VM at the source host?

http://pastebin.test.redhat.com/17
http://pastebin.test.redhat.com/16

--
Peter V. Saveliev
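When comparing the domxml at creation with the one at migration, a quick way to spot the devices in question is to list the children of `<devices>` that carry no `<alias name='...'/>` element. The following is a minimal sketch using only the standard library; it is not vdsm code, and the function name `devices_without_alias`, the skip list and the sample XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Children of <devices> that are not devices and never carry an alias.
# Assumption: <emulator> is the only such element worth skipping here.
NON_DEVICE_TAGS = frozenset(["emulator"])


def devices_without_alias(domxml):
    """Return the tag names of devices that lack an <alias name=...> child."""
    root = ET.fromstring(domxml)
    devices = root.find("devices")
    if devices is None:
        return []
    missing = []
    for dev in devices:
        if dev.tag in NON_DEVICE_TAGS:
            continue
        alias = dev.find("alias")
        if alias is None or not alias.get("name"):
            missing.append(dev.tag)
    return missing


# Hypothetical, heavily trimmed domain XML for demonstration only.
SAMPLE = """
<domain type='kvm'>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <alias name='virtio-disk0'/>
    </disk>
    <interface type='bridge'/>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(devices_without_alias(SAMPLE))  # prints ['interface']
```

Running it against both dumps would show whether the alias is already missing on the source host or only disappears on the way to the destination.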
Re: [vdsm] supervdsm broken on master
Hey,
I'm happy to see it merged :)
I assumed that we create this directory as part of the spec.
http://gerrit.ovirt.org/15170 fixes the issue.
Also, don't forget to check /var/log/vdsm/supervdsm.log for such errors.

Thanks!
Yaniv.

- Original Message -
From: Dan Kenigsberg dan...@redhat.com
To: vdsm-devel@lists.fedorahosted.org
Cc: Yaniv Bronheim ybron...@redhat.com
Sent: Wednesday, May 29, 2013 6:31:46 PM
Subject: supervdsm broken on master

I've just taken Yaniv's http://gerrit.ovirt.org/11051 "Supervdsm as
external service" to master, but unfortunately I decided to test it myself
only afterwards.

Currently, supervdsm fails to start unless you manually create the
directory /var/run/vdsm/ as root. The fix should not be complex, but I'm
on the run. I'm confident that Yaniv will fix it soon.

Regards,
Dan.
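One defensive option for the failure described above is for the service to create its run directory itself before using it, rather than relying on packaging alone. The sketch below shows that idea only; it is not the content of the gerrit 15170 patch, and the function name `ensure_run_dir` and the default mode are assumptions.

```python
import errno
import os


def ensure_run_dir(path, mode=0o755):
    """Create `path` (and missing parents) unless it already is a directory."""
    try:
        os.makedirs(path, mode)
    except OSError as e:
        # An existing directory is fine; anything else (permission denied,
        # a regular file in the way) is a real error and is re-raised.
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
    return path


if __name__ == "__main__":
    # /var/run/vdsm is the directory named in the thread; creating it
    # requires root, so a failure here is expected for ordinary users.
    try:
        ensure_run_dir("/var/run/vdsm")
    except OSError as e:
        print("could not create run directory: %s" % e)
```

The try/except form is used instead of an exists-then-create check so that two processes starting at the same time do not race on the directory.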