On 7/10/20 6:50 AM, Richard W.M. Jones wrote:
Turn the existing nbdkit-gzip-plugin into a filter so it can be
applied on top of files or other sources:

   nbdkit file --filter=gzip file.gz
   nbdkit curl --filter=gzip https://example.com/disk.gz

Because of the nature of the gzip format, which is not block-based
and thus not seekable, this filter caches the whole uncompressed file
in a hidden temporary file.  This is required in order to implement
.get_size.  See this link for a more detailed explanation:
https://stackoverflow.com/a/9213826

This commit deprecates nbdkit-gzip-plugin and suggests removal in
nbdkit 1.26.
---

Nice - this one seems like a fairly straight-forward conversion.


+++ b/filters/tar/nbdkit-tar-filter.pod
@@ -42,11 +42,13 @@ server use:
   nbdkit -r curl https://example.com/file.tar \
          --filter=tar tar-entry=disk.img
-=head2 Open an xz-compressed tar file (read-only)
+=head2 Open an gzip-compressed tar file (read-only)

Should this heading s/gzip-// ?  (The section now covers both gzip
and xz, and dropping it would also fix the "an gzip" grammar.)

This filter cannot handle compressed tar files itself, but you can
-combine it with L<nbdkit-xz-filter(1)>:
+combine it with L<nbdkit-gzip-filter(1)> or L<nbdkit-xz-filter(1)>:
+ nbdkit file filename.tar.gz \
+        --filter=tar tar-entry=disk.img --filter=gzip
   nbdkit file filename.tar.xz \
          --filter=tar tar-entry=disk.img --filter=xz

+++ b/tests/Makefile.am
@@ -557,23 +557,6 @@ EXTRA_DIST += test-floppy.sh
  TESTS += test-full.sh
  EXTRA_DIST += test-full.sh
-# gzip plugin test.
-if HAVE_MKE2FS_WITH_D
-if HAVE_ZLIB
-LIBGUESTFS_TESTS += test-gzip
-check_DATA += disk.gz
-CLEANFILES += disk.gz

Is it worth keeping this around until we actually retire the plugin?

+++ b/filters/gzip/gzip.c
@@ -0,0 +1,347 @@

+/* The first thread to call gzip_prepare uncompresses the whole plugin. */
+static int
+do_uncompress (struct nbdkit_next_ops *next_ops, void *nxdata)
+{
+  int64_t compressed_size;
+  z_stream strm;
+  int zerr;
+  const char *tmpdir;
+  size_t len;
+  char *template;
+  CLEANUP_FREE char *in_block = NULL, *out_block = NULL;
+
+  /* This was the same buffer size as used in the old plugin.  As far
+   * as I know it was chosen at random.
+   */
+  const size_t block_size = 128 * 1024;
+
+  assert (size == -1);
+
+  /* Get the size of the underlying plugin. */
+  compressed_size = next_ops->get_size (nxdata);

If we wanted, we could save this...


+/* Similar to above, whatever the plugin says, extents are not
+ * supported.
+ */
+static int
+gzip_can_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
+                  void *handle)

Well, since we stored it as a temp file, we could actually report extents off of lseek(SEEK_HOLE). But that can be added separately, if we want it (it goes back to your notion of a library for file-based accessor functions).

+{
+  return 0;
+}
+
+/* We are already operating as a cache regardless of the plugin's
+ * underlying .can_cache, but it's easiest to just rely on nbdkit's
+ * behavior of calling .pread for caching.
+ */
+static int
+gzip_can_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
+                void *handle)
+{
+  return NBDKIT_CACHE_EMULATE;
+}
+
+/* Get the file size. */
+static int64_t
+gzip_get_size (struct nbdkit_next_ops *next_ops, void *nxdata,
+               void *handle)
+{
+  /* This must be true because gzip_prepare must have been called. */
+  assert (size >= 0);
+
+  /* We must call underlying get_size even though we don't use the
+   * result, because it caches the plugin size in server/backend.c.
+   */
+  if (next_ops->get_size (nxdata) == -1)
+    return -1;

...and verify here that the underlying plugin isn't changing size. But probably not necessary.

With the tar filter, you HAD to call next_ops->get_size() per connection, because you called next_ops->pread() per connection. But with this filter, you don't actually have to call next_ops->get_size(), because your only use of next_ops->pread() is during the initial decompression, and the rest of this filter reads from local cache rather than deferring to the plugin.

Overall looks good.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

_______________________________________________
Libguestfs mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/libguestfs
