> -----Original Message----- > From: Hao Xiang <hao.xi...@bytedance.com> > Sent: Thursday, January 11, 2024 2:40 PM > To: Fabiano Rosas <faro...@suse.de> > Cc: Liu, Yuan1 <yuan1....@intel.com>; Bryan Zhang > <bryan.zh...@bytedance.com>; qemu-devel@nongnu.org; > marcandre.lur...@redhat.com; pet...@redhat.com; quint...@redhat.com; > peter.mayd...@linaro.org; berra...@redhat.com > Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce unimplemented > 'qatzip' compression method > > On Mon, Jan 8, 2024 at 12:28 PM Fabiano Rosas <faro...@suse.de> wrote: > > > > "Liu, Yuan1" <yuan1....@intel.com> writes: > > > > >> -----Original Message----- > > >> From: Hao Xiang <hao.xi...@bytedance.com> > > >> Sent: Saturday, January 6, 2024 7:53 AM > > >> To: Fabiano Rosas <faro...@suse.de> > > >> Cc: Bryan Zhang <bryan.zh...@bytedance.com>; qemu-devel@nongnu.org; > > >> marcandre.lur...@redhat.com; pet...@redhat.com; > > >> quint...@redhat.com; peter.mayd...@linaro.org; Liu, Yuan1 > > >> <yuan1....@intel.com>; berra...@redhat.com > > >> Subject: Re: [External] Re: [PATCH 3/5] migration: Introduce > > >> unimplemented 'qatzip' compression method > > >> > > >> On Fri, Jan 5, 2024 at 12:07 PM Fabiano Rosas <faro...@suse.de> > wrote: > > >> > > > >> > Bryan Zhang <bryan.zh...@bytedance.com> writes: > > >> > > > >> > +cc Yuan Liu, Daniel Berrangé > > >> > > > >> > > Adds support for 'qatzip' as an option for the multifd > > >> > > compression method parameter, but copy-pastes the no-op logic > > >> > > to leave the actual methods effectively unimplemented. This is > > >> > > in preparation of a subsequent commit that will implement > > >> > > actually using QAT for compression and decompression. > > >> > > > > >> > > Signed-off-by: Bryan Zhang <bryan.zh...@bytedance.com> > > >> > > Signed-off-by: Hao Xiang <hao.xi...@bytedance.com> > > >> > > --- > > >> > > hw/core/qdev-properties-system.c | 6 ++- > > >> > > migration/meson.build | 1 + > > >> > > migration/multifd-qatzip.c | 81 > > >> ++++++++++++++++++++++++++++++++ > > >> > > migration/multifd.h | 1 + > > >> > > qapi/migration.json | 5 +- > > >> > > 5 files changed, 92 insertions(+), 2 deletions(-) create mode > > >> > > 100644 migration/multifd-qatzip.c > > >> > > > > >> > > diff --git a/hw/core/qdev-properties-system.c > > >> > > b/hw/core/qdev-properties-system.c > > >> > > index 1a396521d5..d8e48dcb0e 100644 > > >> > > --- a/hw/core/qdev-properties-system.c > > >> > > +++ b/hw/core/qdev-properties-system.c > > >> > > @@ -658,7 +658,11 @@ const PropertyInfo > > >> > > qdev_prop_fdc_drive_type = { const PropertyInfo > qdev_prop_multifd_compression = { > > >> > > .name = "MultiFDCompression", > > >> > > .description = "multifd_compression values, " > > >> > > - "none/zlib/zstd", > > >> > > + "none/zlib/zstd" > > >> > > +#ifdef CONFIG_QATZIP > > >> > > + "/qatzip" > > >> > > +#endif > > >> > > + , > > >> > > .enum_table = &MultiFDCompression_lookup, > > >> > > .get = qdev_propinfo_get_enum, > > >> > > .set = qdev_propinfo_set_enum, diff --git > > >> > > a/migration/meson.build b/migration/meson.build index > > >> > > 92b1cc4297..e20f318379 100644 > > >> > > --- a/migration/meson.build > > >> > > +++ b/migration/meson.build > > >> > > @@ -40,6 +40,7 @@ if get_option('live_block_migration').allowed() > > >> > > system_ss.add(files('block.c')) endif > > >> > > system_ss.add(when: zstd, if_true: files('multifd-zstd.c')) > > >> > > +system_ss.add(when: qatzip, if_true: > > >> > > +files('multifd-qatzip.c')) > > >> > > > > >> > > specific_ss.add(when: 'CONFIG_SYSTEM_ONLY', > > >> > > if_true: files('ram.c', diff --git > > >> > > a/migration/multifd-qatzip.c b/migration/multifd-qatzip.c new > file > > >> > > mode 100644 index 0000000000..1733bbddb7 > > >> > > --- /dev/null > > >> > > +++ b/migration/multifd-qatzip.c > > >> > > @@ -0,0 +1,81 @@ > > >> > > +/* > > >> > > + * Multifd QATzip compression implementation > > >> > > + * > > >> > > + * Copyright (c) Bytedance > > >> > > + * > > >> > > + * Authors: > > >> > > + * Bryan Zhang <bryan.zh...@bytedance.com> > > >> > > + * Hao Xiang <hao.xi...@bytedance.com> > > >> > > + * > > >> > > + * This work is licensed under the terms of the GNU GPL, version > 2 or > > >> later. > > >> > > + * See the COPYING file in the top-level directory. > > >> > > + */ > > >> > > + > > >> > > +#include "qemu/osdep.h" > > >> > > +#include "exec/ramblock.h" > > >> > > +#include "exec/target_page.h" > > >> > > +#include "qapi/error.h" > > >> > > +#include "migration.h" > > >> > > +#include "options.h" > > >> > > +#include "multifd.h" > > >> > > + > > >> > > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) > { > > >> > > + return 0; > > >> > > +} > > >> > > + > > >> > > +static void qatzip_send_cleanup(MultiFDSendParams *p, Error > **errp) > > >> > > +{}; > > >> > > + > > >> > > +static int qatzip_send_prepare(MultiFDSendParams *p, Error > **errp) > > >> > > +{ > > >> > > + MultiFDPages_t *pages = p->pages; > > >> > > + > > >> > > + for (int i = 0; i < p->normal_num; i++) { > > >> > > + p->iov[p->iovs_num].iov_base = pages->block->host + p- > > >> >normal[i]; > > >> > > + p->iov[p->iovs_num].iov_len = p->page_size; > > >> > > + p->iovs_num++; > > >> > > + } > > >> > > + > > >> > > + p->next_packet_size = p->normal_num * p->page_size; > > >> > > + p->flags |= MULTIFD_FLAG_NOCOMP; > > >> > > + return 0; > > >> > > +} > > >> > > + > > >> > > +static int qatzip_recv_setup(MultiFDRecvParams *p, Error **errp) > { > > >> > > + return 0; > > >> > > +} > > >> > > + > > >> > > +static void qatzip_recv_cleanup(MultiFDRecvParams *p) {}; > > >> > > + > > >> > > +static int qatzip_recv_pages(MultiFDRecvParams *p, Error **errp) > { > > >> > > + uint32_t flags = p->flags & MULTIFD_FLAG_COMPRESSION_MASK; > > >> > > + > > >> > > + if (flags != MULTIFD_FLAG_NOCOMP) { > > >> > > + error_setg(errp, "multifd %u: flags received %x flags > > >> expected %x", > > >> > > + p->id, flags, MULTIFD_FLAG_NOCOMP); > > >> > > + return -1; > > >> > > + } > > >> > > + for (int i = 0; i < p->normal_num; i++) { > > >> > > + p->iov[i].iov_base = p->host + p->normal[i]; > > >> > > + p->iov[i].iov_len = p->page_size; > > >> > > + } > > >> > > + return qio_channel_readv_all(p->c, p->iov, p->normal_num, > > >> > > +errp); } > > >> > > + > > >> > > +static MultiFDMethods multifd_qatzip_ops = { > > >> > > + .send_setup = qatzip_send_setup, > > >> > > + .send_cleanup = qatzip_send_cleanup, > > >> > > + .send_prepare = qatzip_send_prepare, > > >> > > + .recv_setup = qatzip_recv_setup, > > >> > > + .recv_cleanup = qatzip_recv_cleanup, > > >> > > + .recv_pages = qatzip_recv_pages }; > > >> > > + > > >> > > +static void multifd_qatzip_register(void) { > > >> > > + multifd_register_ops(MULTIFD_COMPRESSION_QATZIP, > > >> > > +&multifd_qatzip_ops); } > > >> > > + > > >> > > +migration_init(multifd_qatzip_register); > > >> > > diff --git a/migration/multifd.h b/migration/multifd.h index > > >> > > a835643b48..5600f7fc82 100644 > > >> > > --- a/migration/multifd.h > > >> > > +++ b/migration/multifd.h > > >> > > @@ -33,6 +33,7 @@ int multifd_queue_page(QEMUFile *f, RAMBlock > > >> > > *block, ram_addr_t offset); #define MULTIFD_FLAG_NOCOMP (0 << 1) > > >> > > #define MULTIFD_FLAG_ZLIB (1 << 1) #define MULTIFD_FLAG_ZSTD (2 > << > > >> > > 1) > > >> > > +#define MULTIFD_FLAG_QATZIP (3 << 1) > > >> > > > > >> > > /* This value needs to be a multiple of qemu_target_page_size() > */ > > >> > > #define MULTIFD_PACKET_SIZE (512 * 1024) diff --git > > >> > > a/qapi/migration.json b/qapi/migration.json index > > >> > > 6d5a4b0489..e3cc195aed 100644 > > >> > > --- a/qapi/migration.json > > >> > > +++ b/qapi/migration.json > > >> > > @@ -625,11 +625,14 @@ > > >> > > # > > >> > > # @zstd: use zstd compression method. > > >> > > # > > >> > > +# @qatzip: use qatzip compression method. > > >> > > +# > > >> > > # Since: 5.0 > > >> > > ## > > >> > > { 'enum': 'MultiFDCompression', > > >> > > 'data': [ 'none', 'zlib', > > >> > > - { 'name': 'zstd', 'if': 'CONFIG_ZSTD' } ] } > > >> > > + { 'name': 'zstd', 'if': 'CONFIG_ZSTD' }, > > >> > > + { 'name': 'qatzip', 'if': 'CONFIG_QATZIP'} ] } > > >> > > > >> > In another thread adding support to another Intel accelerator (IAA) > we > > >> > decided that it was better to select the offloading as an > accelerator > > >> > method to multifd zlib rather than as an entirely new compression > > >> > format. Take a look at that discussion: > > >> > https://lore.kernel.org/r/ztfcnqbbqlmsu...@redhat.com > > >> > > >> We had some early discussion with Intel folks (probably a different > team > > >> than the one with the above patchset). The understanding at the time > is > > >> that QAT is good at both bulk data compression and decompression. IAA > is > > >> good at decompression with smaller data size but compression > performance > > >> is not the best. In multifd, we are compressing up to 128 4k pages at > a > > >> time and potentially this can increase by configuring the packet > size, at > > >> the time we thought QAT could be a better fit in the multifd live > > >> migration scenario. We would like to hear more from Intel. > > >> > > >> From our benchmark testing, with two QAT devices, we can get deflate > > >> compression throughout to around 7GB/s with ~160% CPU. That's beating > the > > >> current software implementation (zlib and zstd) by a lot. We are > still > > >> tuning the configuration in QEMU live migration now. > > >> > > >> > > > >> > As I understand it, QAT + QATzip would be compatible with both zlib > > >> > and IAA + QPL, so we'd add another accelerator method like this: > > >> > > > >> > https://lore.kernel.org/r/20240103112851.908082-3- > yuan1....@intel.com > > >> > > > >> > > >> I quickly read over the IAA patchset and I saw this: > > >> > > >> "However, due to some reasons, QPL is currently not compatible with > the > > >> existing Zlib method that Zlib compressed data can be decompressed by > QPl > > >> and vice versa." > > >> > > >> The above probably means the current zlib software implementation and > IAA > > >> are not compatible. > > >> > > >> For QAT, although, both Intel's QATzip and zlib are internally using > > >> deflate, QATzip only supports deflate with a 4 byte header, deflate > > >> wrapped by Gzip header and footer, or deflate wrapped by Intel® QAT > > >> Gzip* extension header and footer. None of the headers can be > recognized > > >> by zlib software implementation is my understanding. So if we want to > make > > >> them compatible with zlib, the QATzip library needs to support that. > > > > > > The QPL library currently cannot support the Z_SYNC_FLULSH operation > of zlib steaming. > > > This is the reason why it is not compatible with zlib. > > > > > When doing the QPL compression, if we set both QPL_FLAG_FIRST and > QPL_FLAG_LAST flags and run a single job for all pages in a multifd > packet, would it generate the same format as Z_SYNC_FLULSH in the > current zlib-based code?
Yes, When we want to use an IAA physics engine for data compression, We can use a QPL job and set the QPL_FLAG_FIRST and QPL_FLAG_LAST flags, which will generate compressed data according to the deflate method. Z_SYNC_FLUSH not only flushes the buffer, but also maintains data alignment and maintains the current compression state. QPL is currently unable to do this. > > > > I had understood from previous discussion that we'd be able to at least > > support compression with QPL and decompression with the existing > > zlib-based code. Is that not correct? I was about to suggest > > experimenting with the window size in the existing code to hopefully > > solve the 4kb window size issue. If there are other limitations, then > > that will not be enough. > > > > > Also, can you point to the source of that statement about Z_SYNC_FLUSH, > > I couldn't find more information about it in the documentation. > > > > >> > All that, of course, assuming we even want to support both > > >> > accelerators. They're addressing the same problem after all. I > wonder > > >> > how we'd choose a precedence, since both seem to be present in the > > >> > same processor family. > > >> > > > >> > > > >> > > >> That's an interesting question :-) I think overall performance > (throughput > > >> and CPU overhead) should both be considered. IAA and QAT accelerators > > >> don't present on all systems. We Bytedance choose to have both on our > > >> platform when purchasing from Intel.