Comment: Hi Canonical,
@taco-screen-team

The attached patches are for Zesty, Xenial, and Debian sid (which I plan to submit if the UID/GID allocation request is granted).

afaik @cjwatson is the maintainer of base-passwd on Debian, and could review/grant/deny the allocation request.
Per Ubuntu Policy, we'd need it ack'ed on Debian first.

Then the libvirt patches.. all packages were verified.
For Z, X, and sid, the test-case follows this pattern, and has correct behavior/results.

Thanks!

** Description changed:

- ---Problem Description---
- We setup 2 Ubuntu KVM host with the same mount point and try to migration the guest between 2 HOST. The migration is success, the guest appear on other Host after the migration but it shows some I/O error on the guest.
+ <...>
- On the first host, run this
- root@micro:~# virsh migrate --live --domain microg5 qemu+ssh://10.33.10.115/system --verbose --undefinesource --persistent --timeout 60
- Migration: [100 %]
+ Please see comments for the problem description, and summary of
+ originally bridged comments in the description.
- The guest appear on other HOST:
- root@tiny:~# virsh list --all
- Id Name State
+ Sorry about the inconvenience.
- 2 tinyg1 running
- 3 tinyg2 running
- 5 tinyg4 running
- 6 tinyg5 running
- 7 tinyg6 running
- 9 tinyg3 running
- 12 microg5 running <<< this guest is from HOST "Micro"
-
- Checking status of the guest, I can see this error....
- root@microg5:~# dmesg |tail -20
- [ 60.818955] blk_update_request: I/O error, dev vdc, sector 96749232
- [ 60.819113] Aborting journal on device vdc2-8.
- [ 60.820121] blk_update_request: I/O error, dev vdc, sector 9084320
- [ 60.820643] EXT4-fs warning (device vdc2): ext4_end_bio:329: I/O error -5 writing to inode 393279 (offset 0 size 0 starting block 1135541)
- [ 60.820652] Buffer I/O error on device vdc2, logical block 1133492
- [ 60.820655] EXT4-fs (vdc2): previous I/O error to superblock detected
- [ 60.821394] blk_update_request: I/O error, dev vdc, sector 96747520
- [ 60.821397] blk_update_request: I/O error, dev vdc, sector 96747520
- [ 60.821402] Buffer I/O error on dev vdc2, logical block 12091392, lost sync page write
- [ 60.821466] JBD2: Error -5 detected when updating journal superblock for vdc2-8.
- [ 60.822214] blk_update_request: I/O error, dev vdc, sector 16384
- [ 60.822216] blk_update_request: I/O error, dev vdc, sector 16384
- [ 60.822218] Buffer I/O error on dev vdc2, logical block 0, lost sync page write
- [ 60.822227] EXT4-fs error (device vdc2): ext4_journal_check_start:56: Detected aborted journal
- [ 60.822228] EXT4-fs (vdc2): Remounting filesystem read-only
- [ 60.822229] EXT4-fs (vdc2): previous I/O error to superblock detected
- [ 60.823201] blk_update_request: I/O error, dev vdc, sector 16384
- [ 60.823203] blk_update_request: I/O error, dev vdc, sector 16384
- [ 60.823204] Buffer I/O error on dev vdc2, logical block 0, lost sync page write
- [ 96.736959] nfsd4: failed to purge old clients from recovery directory v4recovery
- root@microg5:~#
- @haochanh
- haochanh commented 18 days ago
-
- Moving the guest back to original host successfully but we still see the I/O error
- root@tiny:~# virsh migrate --live --domain microg5 qemu+ssh://10.33.9.187/system --verbose --undefinesource --persistent --timeout 60
- Migration: [100 %]
- root@tiny:~# virsh list --all
- Id Name State
-
- 2 tinyg1 running
- 3 tinyg2 running
- 5 tinyg4 running
- 6 tinyg5 running
- 7 tinyg6 running
- 9 tinyg3 running
-
- On the orginal host:
- root@micro:~# virsh list --all
- Id Name State
-
- 2 microg1 running
- 3 microg2 running
- 4 microg3 running
- 5 microg4 running
- 9 microg6 running
- 16 microg5 running
-
- Here is our config: both HOST (micro & tiny) are sharing the same NFS mount /kvm_nfs/
- Micro KVM:
- root@micro:~# ls -l /kvm_nfs/microg6.raw.img
- -rw-r--r-- 1 nobody 4294967294 107374182400 Aug 3 18:23 /kvm_nfs/microg6.raw.img
-
- Tiny KVM:
- root@tiny:~# ls -l /kvm_nfs/microg6.raw.img
- -rw-r--r-- 1 nobody 4294967294 107374182400 Aug 3 2016 /kvm_nfs/microg6.raw.img
-
- We try to do the migration the guest microg6 from "micro" to "tiny"
- root@micro:~# virsh domblklist microg6
- Target Source
-
- vda /kvm_nfs/microg6.raw.img
-
- root@micro:~# virsh migrate --live --domain microg6 qemu+ssh://10.33.10.115/system --verbose --undefinesource --persistent --timeout 60
- Migration: [100 %] <<<< it successfully goes to tiny KVM.
-
- We can see guest "microg6" on tiny KVM now.
- root@tiny:~# virsh domblklist microg6
- Target Source
-
- vda /kvm_nfs/microg6.raw.img
-
- Checking on the guest "microg6", we see these error.....
- root@microg6:~# dmesg |tail
- [24371.936814] blk_update_request: I/O error, dev vda, sector 16384
- [24371.936900] Buffer I/O error on dev vda2, logical block 0, lost sync page write
- [24373.661328] blk_update_request: I/O error, dev vda, sector 16416
- [24373.661552] Buffer I/O error on dev vda2, logical block 4, lost async page write
- [24373.661778] Buffer I/O error on dev vda2, logical block 6, lost async page write
- [24373.662023] Buffer I/O error on dev vda2, logical block 13107201, lost async page write
- [24373.662253] Buffer I/O error on dev vda2, logical block 21004406, lost async page write
- [24373.662427] Buffer I/O error on dev vda2, logical block 21010963, lost async page write
- [24373.662713] Buffer I/O error on dev vda2, logical block 21495820, lost async page write
- [24373.662957] Buffer I/O error on dev vda2, logical block 16777222, lost async page write
-
- Both sharing the same NFS mount
- root@tiny:~# df -T |grep "kvm_nfs"
- tmp-lte:/kvm_lpm nfs4 1238417408 786754560 388756480 67% /kvm_nfs
- root@tiny:~# rsh micro "df -T |grep kvm_nfs"
- tmp-lte:/kvm_lpm nfs4 1238417408 786754560 388756480 67% /kvm_nfs
-
- root@micro:~# df -T |grep "kvm_nfs"
- tmp-lte:/kvm_lpm nfs4 1238417408 786754560 388756480 67% /kvm_nfs
-
- The problem is due to a configuration problem in the NFSv4 server and/or clients.
- Probably related to the NFSv4 ID <-> Name Mapping (idmap).
-
- Thus, this is not an I/O-related problem.
-
- The solution/requirement is to make sure that the libvirt-qemu user has the same UID/GID on all systems that it might be migrated too.
- (workaround is to use NFSv3, for example, which has no ID mapping).
-
- "NFSv4 mount incorrectly shows all files with ownership as nobody:nobody"
- https://access.redhat.com/solutions/33455
-
- - Ensure the client and server have matching UID's and GID's.
- It is a common misconception that the UID's and GID's can differ when using NFSv4.
-
- The sole purpose of id mapping is to map an id to a name and vice-versa.
- ID mapping is not intended as some sort of replacement for managing id's.
-
- - On a non-Ubuntu linux, if the above settings have been applied and UID/GID's are matched on server and client
- and users are still being mapped to nobody:nobody than a clearing of the idmapd cache may be required:
-
- # nfsidmap -c
-
- There are some articles on google on 'linux how to change uid'.
-
-
- Details:
- ---
-
- The user 'developer' on NFS server has UID 529.
-
- [developer@tmp-lte ~]$ grep developer /etc/passwd
- developer:x:529:531:developer login id:/home/developer:/bin/bash
-
- Create an user 'developer' on tiny w/ UID non-529.
- The change owner operation to the 'developer' user does not work (user remains 'nobody').
-
- root@tiny:~# useradd --uid 1234 developer
-
- root@tiny:~# chown developer /kvm_nfs/test.mauricfo
-
- root@tiny:~# ls -l /kvm_nfs/test.mauricfo
- -rw-r--r-- 1 nobody root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
-
- Remove the user 'developer' for the next test.
-
- root@tiny:~# userdel developer
-
- Create an user 'developer' on tiny w/ UID 529 (same as in NFS server).
- The change owner operation to the 'developer' user DOES work (user is no longer 'nobody').
-
-
- root@tiny:~# useradd --uid=529 developer
-
- root@tiny:~# chown developer /kvm_nfs/test.mauricfo
-
- root@tiny:~# ls -l /kvm_nfs/test.mauricfo
- -rw-r--r-- 1 developer root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
-
-
- And that works on tiny, but does NOT work on micro UNTIL you clear the nfsidmap cache.
-
- root@micro:~# ls -l /kvm_nfs/test.mauricfo
- -rw-r--r-- 1 nobody root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
-
- root@micro:~# useradd --uid 529 developer
-
- root@micro:~# chown developer /kvm_nfs/test.mauricfo
-
- root@micro:~# ls -l /kvm_nfs/test.mauricfo
- -rw-r--r-- 1 nobody root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
-
- root@micro:~# nfsidmap -c
- nfsidmap: clearing '2a251810 I------ 1 perm 1f030000 0 0 keyring .id_resolver: 8'
-
- root@micro:~# ls -l /kvm_nfs/test.mauricfo
- -rw-r--r-- 1 developer root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
-
-
- So, this is clearly NFSv4-specific.
-
-
- Details 2:
- ---
-
- The idmapd uses the domain from 'hostname --domain', which is the same
- across all hosts.
-
- [developer@tmp-lte ~]$ head -n5 /etc/idmapd.conf
- [General]
- #Verbosity = 0
- # The following should be set to the local NFSv4 domain name
- # The default is the host's DNS domain name.
- #Domain = local.domain.edu
-
- [developer@tmp-lte ~]$ hostname --domain
- isst.aus.stglabs.ibm.com
-
- root@tiny:~# hostname --domain
- isst.aus.stglabs.ibm.com
-
- root@micro:~# hostname --domain
- isst.aus.stglabs.ibm.com
-
- Started the nfs-idmapd.service in tiny and micro (apt-get install nfs-server; systemctl start nfs-idmapd.service).
- Increased verbosity in /etc/idmapd.conf to 3.
- Verified that the domain values used by nfs-idmapd service are the same on all hosts (journalctl -u nfs-idmapd)
-
-
- References:
- ---
-
- These links discuss a bit about this problem (user/group
- nobody/nogroup); in case they're useful for the next person.
-
-
- https://access.redhat.com/solutions/33455
-
- - Ensure the client and server have matching UID's and GID's.
- It is a common misconception that the UID's and GID's can differ when using NFSv4.
- The sole purpose of id mapping is to map an id to a name and vice-versa.
- ID mapping is not intended as some sort of replacement for managing id's.
-
- - On non-Ubuntu linux, if the above settings have been applied and UID/GID's are matched on server and client
- and users are still being mapped to nobody:nobody than a clearing of the idmapd cache may be required:
-
- # nfsidmap -c
-
- https://help.ubuntu.com/community/NFSv4Howto
-
- If all directory listings show just "nobody" and "nogroup" instead of real user and group names,
- then you might want to check the Domain parameter set in /etc/idmapd.conf.
-
- NFSv4 client and server should be in the same domain.
- Other operating systems might derive the NFSv4 domain name from the domain name mentioned in /etc/resolv.conf (e.g. Solaris 10).
-
- http://www.enterprisenetworkingplanet.com/netos/article.php/3644471/Implement-NFSv4-Domains-and-Authentication.htm
- http://unix.stackexchange.com/questions/138479/mounting-nfs-owners-are-nobodynogroup
- https://www.novell.com/support/kb/doc.php?id=7005060
- https://community.netapp.com/t5/Network-Storage-Protocols-Discussions/NFSv4-Linux-client-Netapp-Server-gt-Problem-with-id-mapping/td-p/17895
-
- The reservation of an UID/GID in Debian/Ubuntu follows an allocation process governed by Debian.
- I have submitted an allocation request, and will prepare the patches for libvirt-qemu in Debian and Ubuntu.
-
- Hi Chanh,
-
- Can you please test the libvirt-bin & libvirt0 packages?
- http://ausgsa.ibm.com/~mauricfo/public/bugs/bz145069/v2/
-
- Please confirm if they resolve the issue.
-
- Thanks!
-
-
- Details
- ---
-
- The test packages assume that the UID & GID 64055 will be allocated by
- Debian, and user this value for libvirt-qemu user/group.
-
- You can remove any trace of the current libvirt-bin package (which creates user/group) and user/group with:
- # apt-get purge libvirt-bin
- # userdel libvirt-qemu
- # groupdel libvirt-qemu
-
- Existing files assigned to libvirt-qemu user/group will not have its
- permissions changed, so not sure that's a clean transition from an
- existing system/files, but the permissions should be correct for all the
- installed and new files created from there on.
-
- Hi Mauricio,
- Yes, after applied this patch, the migration is working fine without any IO error.
-
- I am able to migrate between 2 Host (Tiny & Micro) using the NFS mount method without any IO issue.
- root@micro:~# id libvirt-qemu
- uid=64055(libvirt-qemu) gid=117(kvm) groups=117(kvm),64055(libvirt-qemu)
-
- root@tiny:~# id libvirt-qemu
- uid=64055(libvirt-qemu) gid=116(kvm) groups=116(kvm),64055(libvirt-qemu)
-
- Hi Canonical,
- @taco-screen-team
-
- The attached patches are for Zesty, Xenial, and Debian sid (which I plan
- to submit if the UID/GID allocation request is granted).
-
- afaik @cjwatson is the maintainer of base-passwd on Debian, and could review/grant/deny the allocation request.
- Per Ubuntu Policy, we'd need it ack'ed on Debian first.
-
- Then the libvirt patches.. all packages were verified.
- For Z, X, and sid, the test-case follows this pattern, and has correct behavior/results.
-
- Thanks!
-
-
- Test-case
- ===
-
- Package is not installed -- no libvirt-qemu user/group:
- ---
-
- # getent passwd libvirt-qemu
- #
-
- # getent group libvirt-qemu
- #
-
-
- Package is installed -- libvirt-qemu user/group is created w/ allocated uid/gid:
- ---
-
- # dpkg -i libvirt{-bin,0}_1.3.1-1ubuntu10.5uidgid1_*.deb
-
- # getent passwd libvirt-qemu
- libvirt-qemu:x:64055:117:Libvirt Qemu,,,:/var/lib/libvirt:/bin/false
-
- # getent group libvirt-qemu
- libvirt-qemu:x:64055:libvirt-qemu
-
-
- Package is uninstalled -- libvirt-qemu user/group is removed:
- ---
-
- # apt-get --yes purge libvirt-bin
-
- # getent passwd libvirt-qemu
- #
-
- # getent group libvirt-qemu
- #
-
-
- Package is installed with uid/gid taken -- libvirt-qemu user/group is created with other uid/gid:
- ---
-
- # useradd --uid 64055 testuser
- # groupadd --gid 64055 testgroup
-
- # dpkg -i libvirt{-bin,0}_1.3.1-1ubuntu10.5uidgid1_*.deb
- # echo $?
- 0
-
- # getent passwd libvirt-qemu
- libvirt-qemu:x:113:117:Libvirt Qemu,,,:/var/lib/libvirt:/bin/false
-
- # getent group libvirt-qemu
- libvirt-qemu:x:118:libvirt-qemu
+ <...>

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1637601

Title:
  UbuntuKVM: migration using NFS mount fails #190
