RE: [PATCH 0/2] virtiofsd: Enable xattr if xattrmap is used
Thank you both for the suggestions. I had not taken a look recently; I will check today in my afternoon.

Cheers.

> -----Original Message-----
> From: Dr. David Alan Gilbert
> Sent: Thursday, May 6, 2021 10:53 AM
> To: Venegas Munoz, Jose Carlos
> Cc: virtio...@redhat.com; qemu-devel@nongnu.org; vgo...@redhat.com
> Subject: Re: [PATCH 0/2] virtiofsd: Enable xattr if xattrmap is used
>
> * Carlos Venegas (jose.carlos.venegas.mu...@intel.com) wrote:
> >
> > Using xattrmap for Kata Containers we found that xattr should be
> > enabled or xattrmap won't work. These patches enable xattr when
> > -o xattrmap is used. Also, they add help for the xattrmap option in
> > the `virtiofsd --help` output.
>
> Queued. You might like to submit some more patches to give the error
> that Greg was suggesting and/or update some docs.
>
> Dave
>
> > Carlos Venegas (2):
> >   virtiofsd: Allow use "-o xattrmap" without "-o xattr"
> >   virtiofsd: Add help for -o xattr-mapping
> >
> >  tools/virtiofsd/helper.c         | 3 +++
> >  tools/virtiofsd/passthrough_ll.c | 1 +
> >  2 files changed, 4 insertions(+)
> >
> > --
> > 2.25.1
> >
>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
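For readers landing here from the archive, the option combination these patches make work looks roughly like the sketch below. The socket path, source directory, and the mapping rule itself are illustrative assumptions, not values from the thread (the rule follows virtiofsd's ":type:scope:key:prepend:" xattr-mapping syntax):

```shell
# Hypothetical virtiofsd invocation: before these patches, passing
# -o xattrmap without also passing -o xattr left xattr support disabled,
# so the mapping silently had no effect.
cmd='virtiofsd --socket-path=/tmp/vhost-fs.sock -o source=/srv/share'
cmd="$cmd -o xattrmap=:prefix:all::user.virtiofs.:"
echo "$cmd"
```

With the patches applied, giving -o xattrmap alone is enough to enable xattr support.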
Re: [Virtio-fs] [PATCH] virtiofsd: Use --thread-pool-size=0 to mean no thread pool
Cool, thanks. I will bring some results this week.

On 17/11/20 16:24, "Vivek Goyal" wrote:

On Tue, Nov 17, 2020 at 04:00:09PM +0000, Venegas Munoz, Jose Carlos wrote:
> > Not sure what the default is for 9p, but comparing
> > default to default will definitely not be apples to apples since this
> > mode is nonexistent in 9p.
>
> In Kata we are looking for the best config for fs compatibility and
> performance. So even if they are not apples to apples, we are looking
> for the best config for both and comparing the best that each of them
> can do.

Can we run tests in more than one configuration? Right now you are using
cache=mmap for virtio-9p and cache=auto for virtiofs, and as Miklos said
this is not an apples-to-apples comparison. So you can continue to run the
above, but also run additional tests if time permits.

    virtio-9p       virtio-fs
    ----------      ----------------
    cache=mmap      cache=none + DAX
    cache=none      cache=none
    cache=loose     cache=always

Given you are using cache=mmap for virtio-9p, "cache=none + DAX" is
somewhat equivalent to that: it provides strong coherency as well as
allowing mmap() to work.

Kernel virtiofs DAX support got merged in 5.10-rc1. For qemu, you can use
the virtio-fs-dev branch.

https://gitlab.com/virtio-fs/qemu/-/commits/virtio-fs-dev

Thanks
Vivek
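When scripting a test matrix like the one Vivek suggests, the pairing table can be expressed as a small lookup helper. A sketch only; the mapping itself comes straight from the table above:

```shell
# Map a virtio-9p cache mode to the roughly-equivalent virtiofs config,
# per the apples-to-apples table in this thread.
virtiofs_equiv() {
  case "$1" in
    cache=mmap)  echo "cache=none + DAX" ;;
    cache=none)  echo "cache=none" ;;
    cache=loose) echo "cache=always" ;;
    *)           echo "no direct match" ;;
  esac
}

virtiofs_equiv cache=mmap
```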
Re: [Virtio-fs] [PATCH] virtiofsd: Use --thread-pool-size=0 to mean no thread pool
For all the cases the memory for the guest is 2G.

On 17/11/20 12:55, "Vivek Goyal" wrote:

On Tue, Nov 17, 2020 at 04:00:09PM +0000, Venegas Munoz, Jose Carlos wrote:
> > Not sure what the default is for 9p, but comparing
> > default to default will definitely not be apples to apples since this
> > mode is nonexistent in 9p.
>
> In Kata we are looking for the best config for fs compatibility and
> performance. So even if they are not apples to apples, we are looking
> for the best config for both and comparing the best that each of them
> can do.
>
> In the case of Kata for 9pfs (this is the config we have found has
> better performance and fs compatibility in general) we have:
> ```
> -device virtio-9p-pci            # device type
>   ,disable-modern=false
>   ,fsdev=extra-9p-kataShared     # attr: device id for fsdev
>   ,mount_tag=kataShared          # attr: tag under which the shared fs is found
>   ,romfile=
> -fsdev local                     # local: lets QEMU call the individual VFS
>                                  #   functions (more or less) directly on the host
>   ,id=extra-9p-kataShared
>   ,path=${SHARED_PATH}           # attr: path to share
>   ,security_model=none
>   # passthrough: files are stored using the same credentials as they are
>   #   created with on the guest. This requires QEMU to run as root.
>   # none: same as "passthrough", except the server won't report failures
>   #   if it fails to set file attributes like ownership (chown). This makes
>   #   a passthrough-like security model usable for people who run kvm as
>   #   non-root.
>   ,multidevs=remap
> ```
>
> The mount options are:
> ```
> trans=virtio
> ,version=9p2000.L
> ,cache=mmap
> ,nodev        # security: the filesystem cannot contain special devices
> ,msize=8192   # msize: maximum packet size including any headers
> ```

How much RAM are you giving to these containers when using virtio-9p?

Vivek
Re: [Virtio-fs] [PATCH] virtiofsd: Use --thread-pool-size=0 to mean no thread pool
> Not sure what the default is for 9p, but comparing
> default to default will definitely not be apples to apples since this
> mode is nonexistent in 9p.

In Kata we are looking for the best config for fs compatibility and
performance. So even if they are not apples to apples, we are looking for
the best config for both and comparing the best that each of them can do.

In the case of Kata for 9pfs (this is the config we have found has better
performance and fs compatibility in general) we have:

```
-device virtio-9p-pci            # device type
  ,disable-modern=false
  ,fsdev=extra-9p-kataShared     # attr: device id for fsdev
  ,mount_tag=kataShared          # attr: tag under which the shared fs is found
  ,romfile=
-fsdev local                     # local: lets QEMU call the individual VFS
                                 #   functions (more or less) directly on the host
  ,id=extra-9p-kataShared
  ,path=${SHARED_PATH}           # attr: path to share
  ,security_model=none
  # passthrough: files are stored using the same credentials as they are
  #   created with on the guest. This requires QEMU to run as root.
  # none: same as "passthrough", except the server won't report failures if
  #   it fails to set file attributes like ownership (chown). This makes a
  #   passthrough-like security model usable for people who run kvm as
  #   non-root.
  ,multidevs=remap
```

The mount options are:

```
trans=virtio
,version=9p2000.L
,cache=mmap
,nodev        # security: the filesystem cannot contain special devices
,msize=8192   # msize: maximum packet size including any headers
```

Additionally we use this patch:
https://github.com/kata-containers/packaging/blob/stable-1.12/qemu/patches/5.0.x/0001-9p-removing-coroutines-of-9p-to-increase-the-I-O-per.patch

In Kata, for virtiofs I am testing with:

```
-chardev socket
  ,id=ID-SOCKET
  ,path=.../vhost-fs.sock        # path to vhost socket
-device vhost-user-fs-pci
  ,chardev=ID-SOCKET
  ,tag=kataShared
  ,romfile=
-object memory-backend-file      # force use of memory sharable with virtiofsd
  ,id=dimm1
  ,size=2048M
  ,mem-path=/dev/shm
  ,share=on
```

Virtiofsd:

```
-o cache=auto
-o no_posix_lock                 # enable/disable remote posix lock
--thread-pool-size=0
```

And virtiofs mount options:

```
source:"kataShared" fstype:"virtiofs"
```

With this patch, comparing these two configurations, I have seen better
performance with virtiofs on different hosts.

- Carlos -

On 12/11/20 3:06, "Miklos Szeredi" wrote:

On Fri, Nov 6, 2020 at 11:35 PM Vivek Goyal wrote:
>
> On Fri, Nov 06, 2020 at 08:33:50PM +0000, Venegas Munoz, Jose Carlos wrote:
> > Hi Vivek,
> >
> > I have tested with Kata 1.12-alpha0; the results seem to be better for
> > the fio config I am tracking.
> >
> > The fio config does randrw:
> >
> > fio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=200M --readwrite=randrw --rwmixread=75
>
> Hi Carlos,
>
> Thanks for the testing.
>
> So basically two conclusions from your tests:
>
> - For virtiofs, --thread-pool-size=0 is performing better as compared
>   to --thread-pool-size=1 as well as --thread-pool-size=64.
>   Approximately 35-40% better.
>
> - virtio-9p is still approximately 30% better than virtiofs with
>   --thread-pool-size=0.
>
> As I had done the analysis, this particular workload (mixed read and
> write) is bad with virtiofs because after every write we are
> invalidating attrs and cache, so the next read ends up fetching attrs
> again. I had posted patches to gain some of the performance.
>
> https://lore.kernel.org/linux-fsdevel/20200929185015.gg220...@redhat.com/
>
> But I got the feedback to look into implementing file leases instead.

Hmm, the FUSE_AUTO_INVAL_DATA feature is buggy; how about turning it off
for now? 9p doesn't have it, so no point in enabling it for virtiofs by
default.

Also I think some confusion comes from cache=auto being the default for
virtiofs. Not sure what the default is for 9p, but comparing default to
default will definitely not be apples to apples since this mode is
nonexistent in 9p.

    9p:cache=none  <-> virtiofs:cache=none
    9p:cache=loose <-> virtiofs:cache=always

"9p:cache=mmap" and "virtiofs:cache=auto" have no match.

Untested patch attached.

Thanks,
Miklos
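For reference, the 9p mount options Carlos lists assemble into a guest-side mount invocation along these lines. A sketch: the /run/kata-shared mount point is a hypothetical example, not a path from the thread:

```shell
# Assemble the guest-side 9p mount command from the options quoted above.
# tag must match the mount_tag given on the qemu -device virtio-9p-pci line.
tag=kataShared
opts="trans=virtio,version=9p2000.L,cache=mmap,nodev,msize=8192"
mount_cmd="mount -t 9p -o $opts $tag /run/kata-shared"
echo "$mount_cmd"
```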
Re: [Virtio-fs] [PATCH] virtiofsd: Use --thread-pool-size=0 to mean no thread pool
Hi Vivek,

I have tested with Kata 1.12-alpha0; the results seem to be better for the
fio config I am tracking.

The fio config does randrw:

fio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=200M --readwrite=randrw --rwmixread=75

- I can see better results with this patch.
- 9pfs is still better in the case of Kata because of the use of:
  https://github.com/kata-containers/packaging/blob/stable-1.12/qemu/patches/5.0.x/0001-9p-removing-coroutines-of-9p-to-increase-the-I-O-per.patch

Results:

./fio-results-run_virtiofs_tread_pool_0
   READ: bw=42.8MiB/s (44.8MB/s), 42.8MiB/s-42.8MiB/s (44.8MB/s-44.8MB/s), io=150MiB (157MB), run=3507-3507msec
  WRITE: bw=14.3MiB/s (14.9MB/s), 14.3MiB/s-14.3MiB/s (14.9MB/s-14.9MB/s), io=49.0MiB (52.4MB), run=3507-3507msec

./fio-results-run_9pfs
   READ: bw=55.1MiB/s (57.8MB/s), 55.1MiB/s-55.1MiB/s (57.8MB/s-57.8MB/s), io=150MiB (157MB), run=2722-2722msec
  WRITE: bw=18.4MiB/s (19.3MB/s), 18.4MiB/s-18.4MiB/s (19.3MB/s-19.3MB/s), io=49.0MiB (52.4MB), run=2722-2722msec

./fio-results-run_virtiofs_tread_pool_1
   READ: bw=34.5MiB/s (36.1MB/s), 34.5MiB/s-34.5MiB/s (36.1MB/s-36.1MB/s), io=150MiB (157MB), run=4354-4354msec
  WRITE: bw=11.5MiB/s (12.0MB/s), 11.5MiB/s-11.5MiB/s (12.0MB/s-12.0MB/s), io=49.0MiB (52.4MB), run=4354-4354msec

./fio-results-run_virtiofs_tread_pool_64
   READ: bw=32.3MiB/s (33.8MB/s), 32.3MiB/s-32.3MiB/s (33.8MB/s-33.8MB/s), io=150MiB (157MB), run=4648-4648msec
  WRITE: bw=10.8MiB/s (11.3MB/s), 10.8MiB/s-10.8MiB/s (11.3MB/s-11.3MB/s), io=49.0MiB (52.4MB), run=4648-4648msec

Next:
- Run https://github.com/rhvgoyal/virtiofs-tests for tread_pool_0, tread_pool_64, and 9pfs
- Test https://lore.kernel.org/linux-fsdevel/20201009181512.65496-1-vgo...@redhat.com/

All the testing for Kata is based on:
https://github.com/jcvenegas/mrunner/blob/master/scripts/bash_workloads/build-qemu-and-run-fio-test.sh

I ran this using an Azure VM.
Cheers,
Carlos

On 05/11/20 13:53, "Vivek Goyal" wrote:

On Thu, Nov 05, 2020 at 02:44:16PM -0500, Vivek Goyal wrote:
> Right now we create a thread pool and the main thread hands over the
> request to a thread in the thread pool to process. The number of threads
> in the thread pool can be managed by the option --thread-pool-size.
>
> There is a chance that for some workloads we might get better
> performance if we don't hand over the request to a different thread and
> instead process it in the context of the thread receiving the request.
>
> To implement that, redefine the meaning of --thread-pool-size=0 to mean
> don't use a thread pool. Instead, process the request in the context of
> the thread receiving it from the queue.
>
> I can't think how --thread-pool-size=0 is useful currently, hence using
> that. If it is already useful somehow, I could look at defining a new
> option, say "--no-thread-pool".
>
> I think this patch will be used more as a debug help to compare when it
> is more efficient to not hand over the requests to a thread pool.

I ran virtiofs-tests to compare --thread-pool-size=0 and
--thread-pool-size=64, and the results seem to be all over the place. In
some cases the thread pool performs better and in other cases
no-thread-pool performs better. But in general it looks like, except for
the libaio workload, no-thread-pool is helping.
Thanks
Vivek

NAME            WORKLOAD                Bandwidth       IOPS
thread-pool     seqread-psync           682.4mb         170.6k
no-thread-pool  seqread-psync           679.3mb         169.8k
thread-pool     seqread-psync-multi     2415.9mb        603.9k
no-thread-pool  seqread-psync-multi     2528.5mb        632.1k
thread-pool     seqread-mmap            591.7mb         147.9k
no-thread-pool  seqread-mmap            595.6mb         148.9k
thread-pool     seqread-mmap-multi      2195.3mb        548.8k
no-thread-pool  seqread-mmap-multi      2286.1mb        571.5k
thread-pool     seqread-libaio          329.1mb         82.2k
no-thread-pool  seqread-libaio          271.5mb         67.8k
thread-pool     seqread-libaio-multi    1387.1mb        346.7k
no-thread-pool  seqread-libaio-multi    1508.2mb        377.0k
thread-pool     randread-psync          59.0mb          14.7k
no-thread-pool  randread-psync          78.5mb          19.6k
thread-pool     randread-psync-multi    226.4mb         56.6k
no-thread-pool  randread-psync-multi    289.2mb         72.3k
thread-pool     randread
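As an aside, when collecting numbers like the fio READ/WRITE summary lines earlier in this thread, the MiB/s figure can be pulled out with plain awk. A small sketch, assuming fio's "bw=" summary-line format; no fio installation needed to parse logs:

```shell
# Extract the aggregate bandwidth (MiB/s) from a fio summary line such as:
#   READ: bw=42.8MiB/s (44.8MB/s), io=150MiB (157MB), run=3507-3507msec
parse_fio_bw() {
  awk -F'bw=' '/bw=/ { split($2, a, "MiB"); print a[1]; exit }'
}

echo 'READ: bw=42.8MiB/s (44.8MB/s), io=150MiB (157MB), run=3507-3507msec' | parse_fio_bw
# -> 42.8
```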
Re: tools/virtiofs: Multi threading seems to hurt performance
Hi Folks,

Sorry for the delay about how to reproduce the `fio` data.

I have some code to automate testing for multiple Kata configs and collect
info like:

- kata-env, kata configuration.toml, qemu command, virtiofsd command.

See: https://github.com/jcvenegas/mrunner/

Last time we agreed to narrow the cases and configs to compare virtiofs and
9pfs. The configs were the following:

- qemu + virtiofs (cache=auto, dax=0), a.k.a. `kata-qemu-virtiofs`, WITHOUT xattr
- qemu + 9pfs, a.k.a. `kata-qemu`

Please take a look at the html and raw results I attach in this mail.

## Can I say that the current status is:

- As David's tests and Vivek point out, for the fio workload you are using,
  it seems the best candidate should be cache=none.
- In the comparison I took cache=auto as Vivek suggested; this makes sense
  as it seems that will be the default for Kata.
- Even if for this case cache=none works better, can I assume that
  cache=auto dax=0 will be better than any 9pfs config (once we find the
  root cause)?
- Vivek is taking a look at mmap mode from 9pfs, to see how different it is
  from the current virtiofs implementations. In 9pfs for Kata, this is what
  we use by default.

## I'd like to identify what should be next in the debug/testing:

- Should I try to narrow it down by only trying with qemu?
- Should I try first with a new patch you already have?
- Probably try with qemu without a static build?
- Do the same test with thread-pool-size=1?

Please let me know how I can help.

Cheers.

On 22/09/20 12:47, "Vivek Goyal" wrote:

On Tue, Sep 22, 2020 at 11:25:31AM +0100, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (dgilb...@redhat.com) wrote:
> > Hi,
> >   I've been doing some of my own perf tests and I think I agree
> > about the thread pool size; my test is a kernel build
> > and I've tried a bunch of different options.
> >
> > My config:
> >   Host: 16 core AMD EPYC (32 thread), 128G RAM,
> >     5.9.0-rc4 kernel, rhel 8.2ish userspace.
> >     5.1.0 qemu/virtiofsd built from git.
> >   Guest: Fedora 32 from cloud image with just enough extra installed
> >     for a kernel build.
> >
> >   git cloned and checked out v5.8 of Linux into /dev/shm/linux on the
> > host fresh before each test. Then log into the guest, make defconfig,
> > time make -j 16 bzImage, make clean; time make -j 16 bzImage.
> > The numbers below are the 'real' time in the guest from the initial
> > make (the subsequent makes don't vary much).
> >
> > Below are the details of what each of these means, but here are the
> > numbers first:
> >
> >   virtiofsdefault       4m0.978s
> >   9pdefault             9m41.660s
> >   virtiofscache=none    10m29.700s
> >   9pmmappass            9m30.047s
> >   9pmbigmsize           12m4.208s
> >   9pmsecnone            9m21.363s
> >   virtiofscache=noneT1  7m17.494s
> >   virtiofsdefaultT1     3m43.326s
> >
> > So the winner there by far is the 'virtiofsdefaultT1' - that's
> > the default virtiofs settings, but with --thread-pool-size=1 - so
> > yes it gives a small benefit.
> > But interestingly the cache=none virtiofs performance is pretty bad,
> > but thread-pool-size=1 on that makes a BIG improvement.
>
> Here are fio runs that Vivek asked me to run in my same environment
> (there are some 0's in some of the mmap cases, and I've not investigated
> why yet).

cache=none does not allow mmap in the case of virtiofs. That's when you
are seeing 0.

> virtiofs is looking good here in I think all of the cases;
> there's some division over which config; cache=none
> seems faster in some cases which surprises me.

I know cache=none is faster in the case of write workloads. It forces
direct write, where we don't call file_remove_privs(). While cache=auto
goes through file_remove_privs(), and that adds a GETXATTR request to
every WRITE request.

Vivek

[Attachment: results.tar.gz]

virtiofs vs 9pfs: fio comparison

- qemu + virtiofs (cache=auto, dax=0), a.k.a. kata-qemu-virtiofs
- qemu + 9pfs, a.k.a. kata-qemu

Platform: Packet c1.small.x86-01
  PROC:  1 x Intel E3-1240 v3
  RAM:   32GB
  DISK:  2 x 120GB SSD
  NIC:   2 x 1Gbps Bonded Port
  Nproc: 8

Env             kata-qemu-virtiofs                      kata-qemu
Kata version    1.12.0-alpha1                           1.12.0-alpha1
Qemu version    5.0.0 (kata-static)                     5.0.0 (kata-static)
Qemu code repo  https://gitlab.com/virtio-fs/qemu.git   https://github.com/qemu/qemu
Qemu tag        qemu5.0-virtiofs-with51bits-dax         v5.0.0
Kernel code     https://gitlab.com/virtio-fs/linux.git  https://cdn.kernel.org/pub/linux/kernel/v4.x/
Kernel tag      kata-v5.6-april-09-2020                 v5.4.60
OS              18.04.2 LTS (Bionic Beaver)
Host kernel     4.15.0-50-generic #54-Ubuntu

fio workload:
fio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75 --output=/output/fio.tx

Results:

kata-qemu (9pfs):
  READ: bw=211MiB/s (222