On Mon, Nov 12, 2018 at 7:03 AM Germano Veit Michel <germ...@redhat.com>
wrote:

>
>
> On Fri, Nov 9, 2018 at 3:25 PM Germano Veit Michel <germ...@redhat.com>
> wrote:
>
>>
>>
>> On Wed, Nov 7, 2018 at 9:34 PM Nir Soffer <nsof...@redhat.com> wrote:
>>
>>> On Wed, Nov 7, 2018 at 6:27 AM Germano Veit Michel <germ...@redhat.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have seen CI failing with some "random" errors which makes submitting
>>>> a series of patches a bit annoying, as some patch always fails with some
>>>> error, then next run another patch fails.
>>>>
>>>> One of these problems is TestMount.testSymlinkMount. Looks like its
>>>> getting EBUSY when doing a rmdir. So probably something still using it?
>>>>
>>>> =================================== FAILURES 
>>>> ===================================
>>>> __________________________ TestMount.testSymlinkMount 
>>>> __________________________
>>>>
>>>> args = (<storage.mount_test.TestMount testMethod=testSymlinkMount>,)
>>>> kwargs = {}
>>>>
>>>>     @wraps(f)
>>>>     def wrapper(*args, **kwargs):
>>>>         if cond:
>>>>             raise SkipTest(reason)
>>>> >       return f(*args, **kwargs)
>>>>
>>>> testValidation.py:274:
>>>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>>>> _ _ _
>>>> testValidation.py:333: in wrapper
>>>>     return f(*args, **kwargs)
>>>> storage/mount_test.py:167: in testSymlinkMount
>>>>     m.umount()
>>>> /usr/lib64/python2.7/contextlib.py:35: in __exit__
>>>>     self.gen.throw(type, value, traceback)
>>>> testlib.py:218: in namedTemporaryDir
>>>>     shutil.rmtree(tmpDir)
>>>> /usr/lib64/python2.7/shutil.py:261: in rmtree
>>>>     rmtree(fullname, ignore_errors, onerror)
>>>> /usr/lib64/python2.7/shutil.py:270: in rmtree
>>>>     onerror(os.rmdir, path, sys.exc_info())
>>>> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>>>> _ _ _
>>>>
>>>> path = '/var/tmp/tmpOYMiHx/mountpoint', ignore_errors = False
>>>> onerror = <function onerror at 0x7fd68afc9938>
>>>>
>>>>     def rmtree(path, ignore_errors=False, onerror=None):
>>>>         """Recursively delete a directory tree.
>>>>
>>>>         If ignore_errors is set, errors are ignored; otherwise, if onerror
>>>>         is set, it is called to handle the error with arguments (func,
>>>>         path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
>>>>         path is the argument to that function that caused it to fail; and
>>>>         exc_info is a tuple returned by sys.exc_info().  If ignore_errors
>>>>         is false and onerror is None, an exception is raised.
>>>>
>>>>         """
>>>>         if ignore_errors:
>>>>             def onerror(*args):
>>>>                 pass
>>>>         elif onerror is None:
>>>>             def onerror(*args):
>>>>                 raise
>>>>         try:
>>>>             if os.path.islink(path):
>>>>                 # symlinks to directories are forbidden, see bug #1669
>>>>                 raise OSError("Cannot call rmtree on a symbolic link")
>>>>         except OSError:
>>>>             onerror(os.path.islink, path, sys.exc_info())
>>>>             # can't continue even if onerror hook returns
>>>>             return
>>>>         names = []
>>>>         try:
>>>>             names = os.listdir(path)
>>>>         except os.error, err:
>>>>             onerror(os.listdir, path, sys.exc_info())
>>>>         for name in names:
>>>>             fullname = os.path.join(path, name)
>>>>             try:
>>>>                 mode = os.lstat(fullname).st_mode
>>>>             except os.error:
>>>>                 mode = 0
>>>>             if stat.S_ISDIR(mode):
>>>>                 rmtree(fullname, ignore_errors, onerror)
>>>>             else:
>>>>                 try:
>>>>                     os.remove(fullname)
>>>>                 except os.error, err:
>>>>                     onerror(os.remove, fullname, sys.exc_info())
>>>>         try:
>>>> >           os.rmdir(path)
>>>> E           OSError: [Errno 16] Device or resource busy: 
>>>> '/var/tmp/tmpOYMiHx/mountpoint'
>>>>
>>>> /usr/lib64/python2.7/shutil.py:268: OSError
>>>> ----------------------------- Captured stderr call 
>>>> -----------------------------
>>>> 2018-11-07 00:41:55,518 DEBUG (MainThread) [storage.Misc.excCmd] 
>>>> /usr/bin/taskset --cpu-list 0-1 /sbin/mkfs.ext2 -F 
>>>> /var/tmp/tmpOYMiHx/backing.img (cwd None) (commands:68)
>>>> 2018-11-07 00:41:55,561 DEBUG (MainThread) [storage.Misc.excCmd] SUCCESS: 
>>>> <err> = 'mke2fs 1.44.2 (14-May-2018)\n'; <rc> = 0 (commands:89)
>>>> 2018-11-07 00:41:55,562 INFO  (MainThread) [storage.Mount] mounting 
>>>> /var/tmp/tmpOYMiHx/link_to_image at /var/tmp/tmpOYMiHx/mountpoint 
>>>> (mount:204)
>>>> 2018-11-07 00:41:55,563 DEBUG (MainThread) [root] /usr/bin/taskset 
>>>> --cpu-list 0-1 /usr/bin/mount -o loop /var/tmp/tmpOYMiHx/link_to_image 
>>>> /var/tmp/tmpOYMiHx/mountpoint (cwd None) (commands:68)
>>>> 2018-11-07 00:41:55,598 DEBUG (MainThread) [root] SUCCESS: <err> = ''; 
>>>> <rc> = 0 (commands:89)
>>>> 2018-11-07 00:41:55,599 DEBUG (MainThread) [storage.Mount] 
>>>> /var/tmp/tmpOYMiHx/mountpoint mounted: 0.03 seconds (utils:454)
>>>> 2018-11-07 00:41:55,599 DEBUG (MainThread) [root] /usr/bin/taskset 
>>>> --cpu-list 0-1 /sbin/udevadm settle --timeout=5 (cwd None) (commands:68)
>>>> 2018-11-07 00:41:55,624 DEBUG (MainThread) [root] SUCCESS: <err> = ''; 
>>>> <rc> = 0 (commands:89)
>>>> 2018-11-07 00:41:55,625 DEBUG (MainThread) [storage.Mount] Waiting for 
>>>> udev mount events: 0.03 seconds (utils:454)
>>>> 2018-11-07 00:41:55,628 INFO  (MainThread) [storage.Mount] unmounting 
>>>> /var/tmp/tmpOYMiHx/mountpoint (mount:212)
>>>> 2018-11-07 00:41:55,629 DEBUG (MainThread) [root] /usr/bin/taskset 
>>>> --cpu-list 0-1 /usr/bin/umount /var/tmp/tmpOYMiHx/mountpoint (cwd None) 
>>>> (commands:68)
>>>> 2018-11-07 00:41:55,658 DEBUG (MainThread) [root] FAILED: <err> = 'umount: 
>>>> /var/tmp/tmpOYMiHx/mountpoint: target is busy.\n'; <rc> = 32 (commands:89)
>>>>
>>>>
>>> The test is doing:
>>>
>>> 163             m.mount(mntOpts="loop")
>>> 164             try:
>>> 165                 self.assertTrue(m.isMounted())
>>> 166             finally:
>>> 167                 m.umount()
>>>
>>> So nothing can use this mount - except systemd, creating a mount unit
>>> for every mount
>>> created by "mount".
>>>
>>> If you can reproduce this with calling "mount" and "umount" in a loop,
>>> we can open
>>> a systemd bug for this.
>>>
>>> Adding little sleep before unmounting will probably avoid this issue. In
>>> production we
>>> never unmount write after mount.
>>>
>>> A possible fix is to use m.umount(force=True, lazy=True) - this is how
>>> we unmount
>>> file based storage mounts in lib/vdsm/storage/storageServer.py:
>>>
>>> 201     def disconnect(self):
>>> 202         self._mount.umount(True, True)
>>>
>>> Nir
>>>
>>
>> Thanks Nir!
>>
>> I'll see if I can reproduce it next week. I have only seen this one on
>> FC28, so it could be a FC28 bug.
>> Yesterday the rebase CI also failed on 2 patches, but the errors were
>> different and I did not have time to investigate.
>>
>
> Hit it again, but now on testLoopMount() instead of testSymlinkMount().
>
> Tried reproducing it on F29 (not 28), did not hit it.
>
> #!/bin/bash
>
> truncate -s 100M /tmp/a
> mkfs.ext2 /tmp/a
> mkdir /tmp/b
> while [ True ]
> do
>     mount -o loop /tmp/a /tmp/b
>     umount /tmp/b
> done
>
> Do you want me to submit a patch with m.umount(force=True, lazy=True) for
> all those tests? Would it be correct to do this?
>

I reproduce these issue today in the CI, and look at git history.

We had force=True in the past, but it was removed in:

commit c8a216677e82aa61aab0c98a1021265121f8db57
Author: Nir Soffer <nsof...@redhat.com>
Date:   Sun Apr 24 01:05:33 2016 +0300

    mount: Remove pointless force=True usage

    Using force=True when unmounting loop device is pointless - this option
    will cause umount(8) to call the umount(2) syscall with the MNT_FORCE
    flag, which is useful only for NFS mounts.

    Change-Id: I31ff97d882f292f9c6325903ed9072b7146f682e
    Signed-off-by: Nir Soffer <nsof...@redhat.com>
    Reviewed-on: https://gerrit.ovirt.org/56524
    Reviewed-by: Dan Kenigsberg <dan...@redhat.com>
    Reviewed-by: Allon Mureinik <amure...@redhat.com>
    Continuous-Integration: Jenkins CI

We fixed similar issue later in:

commit d158b1a6f54ba67d81cb163593e3c330c77d36fd
Author: Nir Soffer <nsof...@redhat.com>
Date:   Sun Apr 24 22:36:47 2016 +0300

    mount: Wait for systemd temporary mount unit

    Systemd is monitoring mountpoints created during runtime by creating a
    temporary mount units. For example:

        # systemctl status /tmp/mount-1.DUNxySxI2R/wrk
        ● tmp-mount\x2d1.DUNxySxI2R-wrk.mount - /tmp/mount-1.DUNxySxI2R/wrk
           Loaded: loaded (/proc/self/mountinfo)
           Active: active (mounted) since Sun 2016-04-24 22:37:55 IDT; 14ms
ago
            Where: /tmp/mount-1.DUNxySxI2R/wrk
             What: /dev/loop2

    Mounting and unmounting quickly as we do in the tests, or when creating
    a floppy image may confuse systemd, and lead to various errors:

    - mountpoint is not mounted when trying to access files or unmount
    - mountpoint is busy when trying to unmount

    These are probably systemd or mount bugs. Until these issues are
    resolve, we need to find a workaround to avoid failures in the tests or
    when creating a floppy images.

    This patch add udevadm.settle() call after successful mount and umount
    operations.

    Change-Id: I9b0318eb534f781f127882983527e1a221820478
    Signed-off-by: Nir Soffer <nsof...@redhat.com>
    Reviewed-on: https://gerrit.ovirt.org/56544
    Reviewed-by: Dan Kenigsberg <dan...@redhat.com>
    Continuous-Integration: Jenkins CI
    Reviewed-by: Francesco Romani <from...@redhat.com>

Based on this, I posted
https://gerrit.ovirt.org/c/95719/

Please try to rebase your patches on top of this, and report if it fixes
this issue.

Nir
_______________________________________________
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/OQTMHDXUATEFHY3MB47QLM766FEOGOKP/

Reply via email to