Comment:

Hi Canonical,
@taco-screen-team

The attached patches are for Zesty, Xenial, and Debian sid (which I plan
to submit if the UID/GID allocation request is granted).

As far as I know, @cjwatson maintains base-passwd in Debian and could
review and grant or deny the allocation request.
Per Ubuntu policy, the allocation needs to be acked in Debian first.

As for the libvirt patches: all packages were verified.
For Zesty, Xenial, and sid, the test case follows this pattern and gives
the correct behavior/results.

Thanks!

** Description changed:

- ---Problem Description---
- We set up two Ubuntu KVM hosts with the same mount point and tried to migrate a guest between them. The migration succeeds and the guest appears on the other host, but the guest then shows I/O errors.
+ <...>
  
- On the first host, run this
- root@micro:~# virsh migrate --live --domain microg5 qemu+ssh://10.33.10.115/system --verbose --undefinesource --persistent --timeout 60
- Migration: [100 %]
+ Please see comments for the problem description, and summary of
+ originally bridged comments in the description.
  
- The guest appear on other HOST:
- root@tiny:~# virsh list --all
- Id Name State
+ Sorry about the inconvenience.
  
- 2 tinyg1 running
- 3 tinyg2 running
- 5 tinyg4 running
- 6 tinyg5 running
- 7 tinyg6 running
- 9 tinyg3 running
- 12 microg5 running <<< this guest is from HOST "Micro"
- 
- Checking status of the guest, I can see this error....
- root@microg5:~# dmesg |tail -20
- [ 60.818955] blk_update_request: I/O error, dev vdc, sector 96749232
- [ 60.819113] Aborting journal on device vdc2-8.
- [ 60.820121] blk_update_request: I/O error, dev vdc, sector 9084320
- [ 60.820643] EXT4-fs warning (device vdc2): ext4_end_bio:329: I/O error -5 writing to inode 393279 (offset 0 size 0 starting block 1135541)
- [ 60.820652] Buffer I/O error on device vdc2, logical block 1133492
- [ 60.820655] EXT4-fs (vdc2): previous I/O error to superblock detected
- [ 60.821394] blk_update_request: I/O error, dev vdc, sector 96747520
- [ 60.821397] blk_update_request: I/O error, dev vdc, sector 96747520
- [ 60.821402] Buffer I/O error on dev vdc2, logical block 12091392, lost sync page write
- [ 60.821466] JBD2: Error -5 detected when updating journal superblock for vdc2-8.
- [ 60.822214] blk_update_request: I/O error, dev vdc, sector 16384
- [ 60.822216] blk_update_request: I/O error, dev vdc, sector 16384
- [ 60.822218] Buffer I/O error on dev vdc2, logical block 0, lost sync page write
- [ 60.822227] EXT4-fs error (device vdc2): ext4_journal_check_start:56: Detected aborted journal
- [ 60.822228] EXT4-fs (vdc2): Remounting filesystem read-only
- [ 60.822229] EXT4-fs (vdc2): previous I/O error to superblock detected
- [ 60.823201] blk_update_request: I/O error, dev vdc, sector 16384
- [ 60.823203] blk_update_request: I/O error, dev vdc, sector 16384
- [ 60.823204] Buffer I/O error on dev vdc2, logical block 0, lost sync page write
- [ 96.736959] nfsd4: failed to purge old clients from recovery directory v4recovery
- root@microg5:~#
- @haochanh
- haochanh commented 18 days ago
- 
- Moving the guest back to the original host succeeds, but we still see the I/O error
- root@tiny:~# virsh migrate --live --domain microg5 qemu+ssh://10.33.9.187/system --verbose --undefinesource --persistent --timeout 60
- Migration: [100 %]
- root@tiny:~# virsh list --all
- Id Name State
- 
- 2 tinyg1 running
- 3 tinyg2 running
- 5 tinyg4 running
- 6 tinyg5 running
- 7 tinyg6 running
- 9 tinyg3 running
- 
- On the original host:
- root@micro:~# virsh list --all
- Id Name State
- 
- 2 microg1 running
- 3 microg2 running
- 4 microg3 running
- 5 microg4 running
- 9 microg6 running
- 16 microg5 running
- 
- Here is our config: both hosts (micro & tiny) share the same NFS mount, /kvm_nfs/
- Micro KVM:
- root@micro:~# ls -l /kvm_nfs/microg6.raw.img
- -rw-r--r-- 1 nobody 4294967294 107374182400 Aug 3 18:23 /kvm_nfs/microg6.raw.img
- 
- Tiny KVM:
- root@tiny:~# ls -l /kvm_nfs/microg6.raw.img
- -rw-r--r-- 1 nobody 4294967294 107374182400 Aug 3 2016 /kvm_nfs/microg6.raw.img
- 
- We try to migrate the guest microg6 from "micro" to "tiny":
- root@micro:~# virsh domblklist microg6
- Target Source
- 
- vda /kvm_nfs/microg6.raw.img
- 
- root@micro:~# virsh migrate --live --domain microg6 qemu+ssh://10.33.10.115/system --verbose --undefinesource --persistent --timeout 60
- Migration: [100 %] <<<< it successfully goes to tiny KVM.
- 
- We can see guest "microg6" on tiny KVM now.
- root@tiny:~# virsh domblklist microg6
- Target Source
- 
- vda /kvm_nfs/microg6.raw.img
- 
- Checking on the guest "microg6", we see these errors:
- root@microg6:~# dmesg |tail
- [24371.936814] blk_update_request: I/O error, dev vda, sector 16384
- [24371.936900] Buffer I/O error on dev vda2, logical block 0, lost sync page write
- [24373.661328] blk_update_request: I/O error, dev vda, sector 16416
- [24373.661552] Buffer I/O error on dev vda2, logical block 4, lost async page write
- [24373.661778] Buffer I/O error on dev vda2, logical block 6, lost async page write
- [24373.662023] Buffer I/O error on dev vda2, logical block 13107201, lost async page write
- [24373.662253] Buffer I/O error on dev vda2, logical block 21004406, lost async page write
- [24373.662427] Buffer I/O error on dev vda2, logical block 21010963, lost async page write
- [24373.662713] Buffer I/O error on dev vda2, logical block 21495820, lost async page write
- [24373.662957] Buffer I/O error on dev vda2, logical block 16777222, lost async page write
- 
- Both sharing the same NFS mount
- root@tiny:~# df -T |grep "kvm_nfs"
- tmp-lte:/kvm_lpm nfs4 1238417408 786754560 388756480 67% /kvm_nfs
- root@tiny:~# rsh micro "df -T |grep kvm_nfs"
- tmp-lte:/kvm_lpm nfs4 1238417408 786754560 388756480 67% /kvm_nfs
- 
- root@micro:~# df -T |grep "kvm_nfs"
- tmp-lte:/kvm_lpm nfs4 1238417408 786754560 388756480 67% /kvm_nfs
- 
- The problem is due to a configuration problem in the NFSv4 server and/or 
clients.
- Probably related to the NFSv4 ID <-> Name Mapping (idmap).
- 
- Thus, this is not an I/O-related problem.
- 
- The solution/requirement is to make sure that the libvirt-qemu user has the same UID/GID on all systems that a guest might be migrated to.
- (A workaround is to use NFSv3, for example, which has no ID mapping.)
- 
-       "NFSv4 mount incorrectly shows all files with ownership as 
nobody:nobody"
-       https://access.redhat.com/solutions/33455
- 
-        - Ensure the client and server have matching UID's and GID's. 
-          It is a common misconception that the UID's and GID's can differ 
when using NFSv4. 
- 
-          The sole purpose of id mapping is to map an id to a name and 
vice-versa. 
-          ID mapping is not intended as some sort of replacement for managing 
id's.
- 
-        - On a non-Ubuntu linux, if the above settings have been applied and 
UID/GID's are matched on server and client 
-          and users are still being mapped to nobody:nobody than a clearing of 
the idmapd cache may be required:
- 
-            # nfsidmap -c
- 
- There are some articles on google on 'linux how to change uid'.
- 
- 
- Details:
- ---
- 
- The user 'developer' on NFS server has UID 529.
- 
-       [developer@tmp-lte ~]$ grep developer /etc/passwd
-       developer:x:529:531:developer login id:/home/developer:/bin/bash
- 
- Create a user 'developer' on tiny with a UID other than 529.
- The change-owner operation to the 'developer' user does not work (the owner remains 'nobody').
- 
-         root@tiny:~# useradd --uid 1234 developer
- 
-         root@tiny:~# chown developer /kvm_nfs/test.mauricfo
- 
-       root@tiny:~# ls -l /kvm_nfs/test.mauricfo 
-       -rw-r--r-- 1 nobody root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
- 
- Remove the user 'developer' for the next test.
- 
-         root@tiny:~# userdel developer
- 
- Create a user 'developer' on tiny with UID 529 (same as on the NFS server).
- The change-owner operation to the 'developer' user DOES work (the owner is no longer 'nobody').
- 
- 
-       root@tiny:~# useradd --uid=529 developer
- 
-         root@tiny:~# chown developer /kvm_nfs/test.mauricfo
- 
-       root@tiny:~# ls -l /kvm_nfs/test.mauricfo 
-       -rw-r--r-- 1 developer root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
- 
- 
- And that works on tiny, but does NOT work on micro UNTIL you clear the 
nfsidmap cache.
- 
-       root@micro:~# ls -l /kvm_nfs/test.mauricfo 
-       -rw-r--r-- 1 nobody root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
- 
-         root@micro:~# useradd --uid 529 developer
- 
-         root@micro:~# chown developer /kvm_nfs/test.mauricfo
- 
-       root@micro:~# ls -l /kvm_nfs/test.mauricfo 
-       -rw-r--r-- 1 nobody root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
- 
-       root@micro:~# nfsidmap -c
-       nfsidmap: clearing '2a251810 I------     1 perm 1f030000     0     0 
keyring   .id_resolver: 8'
- 
-       root@micro:~# ls -l /kvm_nfs/test.mauricfo 
-       -rw-r--r-- 1 developer root 18 Oct 11 17:17 /kvm_nfs/test.mauricfo
- 
- 
- So, this is clearly NFSv4-specific.
- 
- 
- Details 2:
- ---
- 
- The idmapd uses the domain from 'hostname --domain', which is the same
- across all hosts.
- 
-       [developer@tmp-lte ~]$ head -n5 /etc/idmapd.conf 
-       [General]
-       #Verbosity = 0
-       # The following should be set to the local NFSv4 domain name
-       # The default is the host's DNS domain name.
-       #Domain = local.domain.edu
- 
-       [developer@tmp-lte ~]$ hostname --domain
-       isst.aus.stglabs.ibm.com
- 
-       root@tiny:~# hostname --domain
-       isst.aus.stglabs.ibm.com
- 
-       root@micro:~# hostname --domain
-       isst.aus.stglabs.ibm.com
- 
- Started the nfs-idmapd.service in tiny and micro (apt-get install nfs-server; systemctl start nfs-idmapd.service).
- Increased verbosity in /etc/idmapd.conf to 3.
- Verified that the domain values used by nfs-idmapd service are the same on all hosts (journalctl -u nfs-idmapd)
- 
- 
- References:
- ---
- 
- These links discuss a bit about this problem (user/group
- nobody/nogroup); in case they're useful for the next person.
- 
- 
- https://access.redhat.com/solutions/33455
- 
-  - Ensure the client and server have matching UID's and GID's. 
-    It is a common misconception that the UID's and GID's can differ when 
using NFSv4. 
-    The sole purpose of id mapping is to map an id to a name and vice-versa. 
-    ID mapping is not intended as some sort of replacement for managing id's.
- 
-  - On non-Ubuntu linux, if the above settings have been applied and UID/GID's 
are matched on server and client 
-    and users are still being mapped to nobody:nobody than a clearing of the 
idmapd cache may be required:
- 
-    # nfsidmap -c
- 
- https://help.ubuntu.com/community/NFSv4Howto
- 
-       If all directory listings show just "nobody" and "nogroup" instead of 
real user and group names, 
-       then you might want to check the Domain parameter set in 
/etc/idmapd.conf. 
- 
-       NFSv4 client and server should be in the same domain. 
-       Other operating systems might derive the NFSv4 domain name from the 
domain name mentioned in /etc/resolv.conf (e.g. Solaris 10).
- 
- http://www.enterprisenetworkingplanet.com/netos/article.php/3644471/Implement-NFSv4-Domains-and-Authentication.htm
- http://unix.stackexchange.com/questions/138479/mounting-nfs-owners-are-nobodynogroup
- https://www.novell.com/support/kb/doc.php?id=7005060
- https://community.netapp.com/t5/Network-Storage-Protocols-Discussions/NFSv4-Linux-client-Netapp-Server-gt-Problem-with-id-mapping/td-p/17895
- 
- The reservation of a UID/GID in Debian/Ubuntu follows an allocation process governed by Debian.
- I have submitted an allocation request, and will prepare the patches for libvirt-qemu in Debian and Ubuntu.
- 
- Hi Chanh,
- 
- Can you please test the libvirt-bin & libvirt0 packages?
-     http://ausgsa.ibm.com/~mauricfo/public/bugs/bz145069/v2/
- 
- Please confirm if they resolve the issue.
- 
- Thanks!
- 
- 
- Details
- ---
- 
- The test packages assume that the UID & GID 64055 will be allocated by
- Debian, and use this value for the libvirt-qemu user/group.
- 
- You can remove any trace of the current libvirt-bin package (which creates user/group) and user/group with:
- # apt-get purge libvirt-bin
- # userdel libvirt-qemu
- # groupdel libvirt-qemu
- 
- Existing files assigned to libvirt-qemu user/group will not have its
- permissions changed, so not sure that's a clean transition from an
- existing system/files, but the permissions should be correct for all the
- installed and new files created from there on.
- 
- Hi Mauricio,
- Yes, after applying this patch, the migration works fine without any I/O error.
- 
- I am able to migrate between the two hosts (Tiny & Micro) using the NFS mount method without any I/O issue.
- root@micro:~# id libvirt-qemu
- uid=64055(libvirt-qemu) gid=117(kvm) groups=117(kvm),64055(libvirt-qemu)
- 
- root@tiny:~# id libvirt-qemu
- uid=64055(libvirt-qemu) gid=116(kvm) groups=116(kvm),64055(libvirt-qemu)
- 
- Hi Canonical,
- @taco-screen-team
- 
- The attached patches are for Zesty, Xenial, and Debian sid (which I plan
- to submit if the UID/GID allocation request is granted).
- 
- afaik @cjwatson is the maintainer of base-passwd on Debian, and could 
review/grant/deny the allocation request. 
- Per Ubuntu Policy, we'd need it ack'ed on Debian first.
- 
- Then the libvirt patches.. all packages were verified.
- For Z, X, and sid, the test-case follows this pattern, and has correct 
behavior/results.
- 
- Thanks!
- 
- 
- Test-case
- ===
- 
- Package is not installed -- no libvirt-qemu user/group:
- ---
- 
- # getent passwd libvirt-qemu
- # 
- 
- # getent group libvirt-qemu
- # 
- 
- 
- Package is installed -- libvirt-qemu user/group is created w/ allocated uid/gid:
- ---
- 
- # dpkg -i libvirt{-bin,0}_1.3.1-1ubuntu10.5uidgid1_*.deb
- 
- # getent passwd libvirt-qemu
- libvirt-qemu:x:64055:117:Libvirt Qemu,,,:/var/lib/libvirt:/bin/false
- 
- # getent group libvirt-qemu
- libvirt-qemu:x:64055:libvirt-qemu
- 
- 
- Package is uninstalled -- libvirt-qemu user/group is removed:
- ---
- 
- # apt-get --yes purge libvirt-bin
- 
- # getent passwd libvirt-qemu
- # 
- 
- # getent group libvirt-qemu
- # 
- 
- 
- Package is installed with uid/gid taken -- libvirt-qemu user/group is created with other uid/gid:
- ---
- 
- # useradd --uid 64055 testuser
- # groupadd --gid 64055 testgroup
- 
- # dpkg -i libvirt{-bin,0}_1.3.1-1ubuntu10.5uidgid1_*.deb
- # echo $?
- 0
- 
- # getent passwd libvirt-qemu
- libvirt-qemu:x:113:117:Libvirt Qemu,,,:/var/lib/libvirt:/bin/false
- 
- # getent group libvirt-qemu
- libvirt-qemu:x:118:libvirt-qemu
+ <...>

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1637601

Title:
  UbuntuKVM: migration using NFS mount fails #190

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1637601/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
