Re: [Qemu-block] [PATCH v4 00/21] nbd: add qemu-nbd --list

2019-01-20 Thread Richard W.M. Jones
On Sat, Jan 19, 2019 at 11:39:41AM +, Richard W.M. Jones wrote:
> 
> Attached is a NON-working patch to nbdkit-partitioning-plugin which
> adds logical partition support.  I don't think I've fully understood
> how the EBR fields are supposed to be initialized (or else they don't
> work how is described in online documentation).  This actually causes
> parted to print an internal error, while fdisk gives a warning and the
> size of the logical partition is wrong.
> 
> Anyway I've run out of time to work on it this weekend, but I may be
> able to have another look next week.

Sheesh.  Apparently the type byte in the link field must
be the extended partition type byte (eg. 0x5 or 0xf), NOT
the type of the next partition.

This fixes it:

$ git diff
diff --git a/plugins/partitioning/partition-mbr.c 
b/plugins/partitioning/partition-mbr.c
index 6f76432..6b256d1 100644
--- a/plugins/partitioning/partition-mbr.c
+++ b/plugins/partitioning/partition-mbr.c
@@ -129,7 +129,7 @@ create_mbr_layout (void)
  */
 region.start = enext->start - eptr0->start;
 region.len = rnext->end - enext->start + 1;
-create_mbr_partition_table_entry (, false, files[i+1].mbr_id,
+create_mbr_partition_table_entry (, false, 0xf,
   [i-3][0x1ce]);
   }
 }


I'll try to come up with a better patch with tests etc.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



Re: [Qemu-block] [PATCH v4 00/21] nbd: add qemu-nbd --list

2019-01-19 Thread Richard W.M. Jones

Attached is a NON-working patch to nbdkit-partitioning-plugin which
adds logical partition support.  I don't think I've fully understood
how the EBR fields are supposed to be initialized (or else they don't
work how is described in online documentation).  This actually causes
parted to print an internal error, while fdisk gives a warning and the
size of the logical partition is wrong.

Anyway I've run out of time to work on it this weekend, but I may be
able to have another look next week.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
>From 8d6dd4288064e9b880eccb1a6b1b7a6b03f2ba96 Mon Sep 17 00:00:00 2001
From: "Richard W.M. Jones" 
Date: Sat, 19 Jan 2019 10:30:28 +
Subject: [PATCH] partitioning: Support MBR logical partitions.

XXX NEEDS TESTS
---
 .../nbdkit-partitioning-plugin.pod|  25 +---
 plugins/partitioning/virtual-disk.h   |  15 +-
 plugins/partitioning/partition-gpt.c  |   2 +-
 plugins/partitioning/partition-mbr.c  | 134 +++---
 plugins/partitioning/partitioning.c   |  31 ++--
 plugins/partitioning/virtual-disk.c   |  42 +-
 6 files changed, 184 insertions(+), 65 deletions(-)

diff --git a/plugins/partitioning/nbdkit-partitioning-plugin.pod 
b/plugins/partitioning/nbdkit-partitioning-plugin.pod
index 57a1133..edb0776 100644
--- a/plugins/partitioning/nbdkit-partitioning-plugin.pod
+++ b/plugins/partitioning/nbdkit-partitioning-plugin.pod
@@ -27,23 +27,12 @@ access use the I<-r> flag.
 
 =head2 Partition table type
 
-You can choose either an MBR partition table, which is limited to 4
-partitions, or a GPT partition table.  In theory GPT supports an
-unlimited number of partitions.
-
-The rule for selecting the partition table type is:
+Using the C parameter you can choose either an MBR or
+a GPT partition table.  If this parameter is I present then:
 
 =over 4
 
-=item C parameter on the command line
-
-⇒ MBR is selected
-
-=item C parameter on the command line
-
-⇒ GPT is selected
-
-=item else, number of files E 4
+=item number of files E 4
 
 ⇒ GPT
 
@@ -131,7 +120,9 @@ C<./> (absolute paths do not need modification).
 =item B
 
 Add an MBR (DOS-style) partition table.  The MBR format is maximally
-compatible with all clients, but only supports up to 4 partitions.
+compatible with all clients.  If there are E 4 partitions then an
+extended partition is created
+(L).
 
 =item B
 
@@ -163,10 +154,6 @@ a Linux filesystem.
 
 =head1 LIMITS
 
-This plugin only supports B MBR partitions, hence the limit
-of 4 partitions with MBR.  This might be increased in future if we
-implement support for logical/extended MBR partitions.
-
 Although this plugin can create GPT partition tables containing more
 than 128 GPT partitions (in fact, unlimited numbers of partitions),
 some clients will not be able to handle this.
diff --git a/plugins/partitioning/virtual-disk.h 
b/plugins/partitioning/virtual-disk.h
index 3860f46..f59df70 100644
--- a/plugins/partitioning/virtual-disk.h
+++ b/plugins/partitioning/virtual-disk.h
@@ -34,6 +34,7 @@
 #ifndef NBDKIT_VIRTUAL_DISK_H
 #define NBDKIT_VIRTUAL_DISK_H
 
+#include 
 #include 
 #include 
 
@@ -103,7 +104,7 @@ extern struct file *files;
 extern size_t nr_files;
 
 extern struct regions regions;
-extern unsigned char *primary, *secondary;
+extern unsigned char *primary, *secondary, **ebr;
 
 /* Main entry point called after files array has been populated. */
 extern int create_virtual_disk_layout (void);
@@ -114,16 +115,16 @@ extern int create_virtual_disk_layout (void);
 extern int parse_guid (const char *str, char *out)
   __attribute__((__nonnull__ (1, 2)));
 
-/* Internal functions for creating MBR and GPT layouts.  These are
- * published here because the GPT code calls into the MBR code, but
- * are not meant to be called from the main plugin.
+/* Internal function for creating a single MBR PTE.  The GPT code
+ * calls this for creating the protective MBR.
  */
-extern void create_mbr_partition_table (unsigned char *out)
-  __attribute__((__nonnull__ (1)));
 extern void create_mbr_partition_table_entry (const struct region *,
-  int bootable, int partition_id,
+  bool bootable, int partition_id,
   unsigned char *)
   __attribute__((__nonnull__ (1, 4)));
+
+/* Create MBR or GPT layout. */
+extern void create_mbr_layout (void);
 extern void create_gpt_layout (void);
 
 #endif /* NBDKIT_VIRTUAL_DISK_H */
diff --git a/plugins/partitioning/partition-gpt.c 
b/plugins/partitioning/partition-gpt.c
index 5fb7602..be52e72 100644
--- a/plugins/partitioning/partition-gpt.c
+++ b/plugins/partitioning/partition-gpt.c
@@ 

Re: [Qemu-block] [PATCH v4 00/21] nbd: add qemu-nbd --list

2019-01-19 Thread Richard W.M. Jones
On Fri, Jan 18, 2019 at 04:47:42PM -0600, Eric Blake wrote:
> It matches the code, but I just learned the code is buggy for anything
> larger than 5.  According to
> http://tldp.org/HOWTO/Large-Disk-HOWTO-13.html, MBR Extended/Logical
> partitions form a linked-list, something like:
> 
> MBR:  EBR1  EBR2
> +--+  +--+  +--+
> + Part 1   +   |->+ Part 5   +   |->+ Part 6   +
> + Part 2   +   |  + Next |  + 0+
> + Part 3   +   |  + 0+  + 0+
> + Part 4 ---  + 0+  + 0+
> +--+  +--+  +--+

Wikipedia's description is a bit better:

https://en.wikipedia.org/wiki/Extended_boot_record

In fact it's not strictly a linked list because the start record of
each entry is relative to the extended partition start.  It's all a
bit weird.

But you're right, nbdkit can neither read (nbdkit-partition-filter)
nor write (nbdkit-partitioning-plugin) MBR logical partitions so we
can't test interop or tell people to use nbdkit for this corner case.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW



Re: [Qemu-block] [PATCH v4 00/21] nbd: add qemu-nbd --list

2019-01-18 Thread Eric Blake
On 1/18/19 7:47 AM, Vladimir Sementsov-Ogievskiy wrote:

> 
>  > Only expose MBR partition @var{num}.  Understands physical partitions
>  > 1-4 and logical partitions 5-8.
> 
> I'm afraid, I'm too lazy to sort out these things I don't know, so just 
> believe. It at least
> corresponds to limits in code that it should be 1 <= partition <= 8 ;)

It matches the code, but I just learned the code is buggy for anything
larger than 5.  According to
http://tldp.org/HOWTO/Large-Disk-HOWTO-13.html, MBR Extended/Logical
partitions form a linked-list, something like:

MBR:  EBR1  EBR2
+--+  +--+  +--+
+ Part 1   +   |->+ Part 5   +   |->+ Part 6   +
+ Part 2   +   |  + Next |  + 0+
+ Part 3   +   |  + 0+  + 0+
+ Part 4 ---  + 0+  + 0+
+--+  +--+  +--+

In reality, ANY of part 1-4 can point to the first EBR, as long as there
is at most one primary partition pointing to the extended partition -
even crazier is that there are different magic numbers in historical
use, such as type 5 for Part 3 and type 85 for Part 4, where MS-DOS
would treat Partition 3 as the extended partition and ignore Part 4, but
Linux would follow both chains (where the second extended partition thus
allowed Linux access to logical partitions residing in space beyond what
DOS could access).  But once you start the extended partition chain, all
logical partitions within the chain each have their own EBR table with
one entry for the partition and the next entry pointing to the next EBR,
meaning you can have more than 8 logical partitions (provided your disk
is big enough).

But our find_partition() code stupidly assumes that if there is any
extended partition type in the MBR (recognizing only type 5 or type F,
but not Linux' type 85), then all four entries in that EBR are logical
partitions 5-8 (and no more, rather than a linked list chain, as in:

MBR:  EBR
+--+  +--+
+ Part 1   +   |->+ Part 5   +
+ Part 2   +   |  + Part 6   +
+ Part 3   +   |  + Part 7   +
+ Part 4 ---  + Part 8   +
+--+  +--+

It's highly unlikely that there are any BIOS implementations that would
actually recognize such a partition beyond 5 as being bootable (there
might be OSs which are a bit more tolerant, since MBR doesn't seem to
have any one hard canonical specification).  I'd have to compare what
the Linux kernel MBR code does, to see if we even stand a chance of
being interoperable in any manner.  It doesn't help that nbdkit does not
yet support Logical MBR partitions.

Oh well, another project for another day; this documentation change is
going in as-is because it at least matches the code, even if the code is
buggy.  (I'm tempted to fix nbdkit to fully support MBR logical
partitions, then rip out the partitioning code in qemu as redundant as
you could get it via nbdkit if you really need it - qemu-nbd's -P 8 has
unchanged, and thus buggy, since at least commit 7a5ca8648 in May 2008,
with no user complaining of a bug for 11 years)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 00/21] nbd: add qemu-nbd --list

2019-01-18 Thread Eric Blake
On 1/18/19 7:47 AM, Vladimir Sementsov-Ogievskiy wrote:

>>
>> Interesting, I don't get it again. Searched in outlook online, I found only 
>> v2 of it,
>> checked spam folder and filters. Magic.

Weird indeed.  Maybe there's enough @ signs due to .texi content in
there that a spam filter somewhere along the line ate it silently?

>>
>> anyway, I can look at 
>> https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg04328.html
>>
>>
> 
> just look at it in your qemu-nbd-list-v4 tag:

Thanks for that extra effort.

> 
>  > Only expose MBR partition @var{num}.  Understands physical partitions
>  > 1-4 and logical partitions 5-8.
> 
> I'm afraid, I'm too lazy to sort out these things I don't know, so just 
> believe. It at least
> corresponds to limits in code that it should be 1 <= partition <= 8 ;)
> 
>  > @item -c, --connect=@var{dev}
>  > Connect @var{filename} to NBD device @var{dev} (Linux only).
>  > @item -d, --disconnect
>  > Disconnect the device @var{dev} (Linux only).
> 
> Yes, and we now have clean code which establish this limitation
> 
>  > @item -e, --shared=@var{num}
>  > Allow up to @var{num} clients to share the device (default
>  > @samp{1}). Safe for readers, but for now, consistency is not
>  > guaranteed between multiple writers.
> 
> Hmm, don't understand, why you decided to move @samp{1}). on the following 
> line, it looks
> better as it was. And with a period it's only 69 symbols, the file have a lot 
> longer lines.

Emacs auto-reflowed the line as I edited the paragraph.  End-of-lines
doesn't really matter in .texi files (it gets reflowed again in the
creation of output documentation from the .texi input), so if my editor
tries to stick to certain column lengths, I don't really fight it.


>  > @example
>  > qemu-nbd \
>  >   --object tls-creds-x509,id=tls0,endpoint=server,dir=/path/to/qemutls \
>  >   --tls-creds tls0 -t -x subset -p 10810 \
>  >   --image-opts 
> driver=raw,offset=1M,length=1M,file.driver=file,file.filename=file.raw
>  > @end example
> 
> I don't know tls-related, the other options looks fine.

I looked through past git logs to see the command lines that Dan used
when initially implementing things; as well as comparison to iotest 233.

> 
> upd (decided to run the command, with dropped tls):
> looks fine but not work, as s/length/size.

Oh, good catch!  Will fix.

> 
>  >
>  > Serve a read-only copy of just the first MBR partition of a guest
>  > image over a Unix socket with as many as 5 simultaneous readers, with
>  > a persistent process forked as a daemon:
>  >
>  > @example
>  > qemu-nbd --fork -t -e 5 -s /path/to/sock -p 1 -r -f qcow2 file.qcow2
>  > @end example
> 
> Oops. s/-s/-k/, s/-p/-P/,

Ouch. Yes, thanks for catching that.

 and idea: may be, use self-descriptive long option names in
> examples?

Hmm. The nice part about short options is that they are less typing and
shorter lines; but verbose options are also useful.  Maybe listing the
same example twice, in both the short and long form?

> 
>  >
>  > Expose the guest-visible contents of a qcow2 file via a block device
>  > /dev/nbd0 (and possibly creating /dev/nbd0p1 and friends for
>  > partitions found within), then disconnect the device when done.
>  > @emph{CAUTION}: Do not use this method to mount filesystems from an
>  > untrusted guest image - a malicious guest may have prepared the image
>  > to attempt to trigger kernel bugs in partition probing or file system
>  > mounting.
>  >
>  > @example
> 
> should we note, that nbd kernel-module should be loaded for this to work?

Yes, that's worth mentioning (some distros include that module to be
running or autoload already, but it's not universal, so such a note is
indeed warranted).

> 
>  > qemu-nbd -c /dev/nbd0 -f qcow2 file.qcow2
>  > qemu-nbd -d /dev/nbd0
>  > @end example
> 
> With fixed wrong option names and s/length/size:
> for 03/21:
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v4 00/21] nbd: add qemu-nbd --list

2019-01-18 Thread Vladimir Sementsov-Ogievskiy
18.01.2019 11:15, Vladimir Sementsov-Ogievskiy wrote:
> 17.01.2019 22:36, Eric Blake wrote:
>> I got tired of debugging whether a server was advertising the
>> correct things during negotiation by inspecting the trace
>> logs of qemu-io as client - not to mention that without SOME
>> sort of client tracing particular commands, we can't easily
>> regression test the server for correct behavior.  The final
>> straw was at KVM Forum, when Nir asked me to make sure there
>> was a way to easily determine if an NBD server is exposing what
>> we really want (and fixing x-dirty-bitmap to behave saner fell
>> out as a result of answering that question).
>>
>> I note that upstream NBD has 'nbd-client -l $host' for querying
>> just export names (with no quoting, so you have to know that
>> a blank line means the default export), but it wasn't powerful
>> enough, so I implemented 'qemu-nbd -L' to document everything.
>> Upstream NBD has separate 'nbd-client' and 'nbd-server' binaries,
>> while we only have 'qemu-nbd' (which is normally just a server,
>> but 'qemu-nbd -c' also operates a second thread as a client).
>> Our other uses of qemu as NBD client are for consuming a block
>> device (as in qemu-io, qemu-img, or a drive to qemu) - but those
>> binaries are less suited to something so specific to the NBD
>> protocol.
>>
>> Bonus: As a result of my work on this series, nbdkit now supports
>> NBD_OPT_INFO (my interoperability testing between server
>> implementations has been paying off, both at fixing server bugs,
>> and at making this code more reliable across difference in valid
>> servers).
>>
>> Also available at:
>> https://repo.or.cz/qemu/ericb.git qemu-nbd-list-v4
>>
>> Currently based on master.
>>
>> Since v3:
>> - 1 new patch (1/21)
>> - split old patch 15/19 into two (16,17/21)
>> - retitle two patches (git backport-diff doesn't do too well at showing
>> the diff on a retitled patch; 5,11/21)
>> - fix review comments from Rich, Vladimir
>>
>> 001/21:[down] 'iotests: Make 233 output more reliable'
>> 002/21:[] [--] 'maint: Allow for EXAMPLES in texi2pod'
>> 003/21:[] [--] 'qemu-nbd: Enhance man page'
> 
> Interesting, I don't get it again. Searched in outlook online, I found only 
> v2 of it,
> checked spam folder and filters. Magic.
> 
> anyway, I can look at 
> https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg04328.html
> 
> 

just look at it in your qemu-nbd-list-v4 tag:

 > Only expose MBR partition @var{num}.  Understands physical partitions
 > 1-4 and logical partitions 5-8.

I'm afraid, I'm too lazy to sort out these things I don't know, so just 
believe. It at least
corresponds to limits in code that it should be 1 <= partition <= 8 ;)

 > @item -c, --connect=@var{dev}
 > Connect @var{filename} to NBD device @var{dev} (Linux only).
 > @item -d, --disconnect
 > Disconnect the device @var{dev} (Linux only).

Yes, and we now have clean code which establish this limitation

 > @item -e, --shared=@var{num}
 > Allow up to @var{num} clients to share the device (default
 > @samp{1}). Safe for readers, but for now, consistency is not
 > guaranteed between multiple writers.

Hmm, don't understand, why you decided to move @samp{1}). on the following 
line, it looks
better as it was. And with a period it's only 69 symbols, the file have a lot 
longer lines.

 > Start a server listening on port 10809 that exposes only the
 > guest-visible contents of a qcow2 file, with no TLS encryption, and
 > with the default export name (an empty string). The command is
 > one-shot, and will block until the first successful client
 > disconnects:
 >
 > @example
 > qemu-nbd -f qcow2 file.qcow2
 > @end example
 >
 > Start a long-running server listening with encryption on port 10810,
 > and require clients to have a correct X.509 certificate to connect to
 > a 1 megabyte subset of a raw file, using the export name 'subset':
 >
 > @example
 > qemu-nbd \
 >   --object tls-creds-x509,id=tls0,endpoint=server,dir=/path/to/qemutls \
 >   --tls-creds tls0 -t -x subset -p 10810 \
 >   --image-opts 
 > driver=raw,offset=1M,length=1M,file.driver=file,file.filename=file.raw
 > @end example

I don't know tls-related, the other options looks fine.

upd (decided to run the command, with dropped tls):
looks fine but not work, as s/length/size.

 >
 > Serve a read-only copy of just the first MBR partition of a guest
 > image over a Unix socket with as many as 5 simultaneous readers, with
 > a persistent process forked as a daemon:
 >
 > @example
 > qemu-nbd --fork -t -e 5 -s /path/to/sock -p 1 -r -f qcow2 file.qcow2
 > @end example

Oops. s/-s/-k/, s/-p/-P/, and idea: may be, use self-descriptive long option 
names in
examples?

 >
 > Expose the guest-visible contents of a qcow2 file via a block device
 > /dev/nbd0 (and possibly creating /dev/nbd0p1 and friends for
 > partitions found within), then disconnect the device when done.
 > @emph{CAUTION}: Do not use this method to mount filesystems from an
 > untrusted guest image 

Re: [Qemu-block] [PATCH v4 00/21] nbd: add qemu-nbd --list

2019-01-18 Thread Vladimir Sementsov-Ogievskiy
17.01.2019 22:36, Eric Blake wrote:
> I got tired of debugging whether a server was advertising the
> correct things during negotiation by inspecting the trace
> logs of qemu-io as client - not to mention that without SOME
> sort of client tracing particular commands, we can't easily
> regression test the server for correct behavior.  The final
> straw was at KVM Forum, when Nir asked me to make sure there
> was a way to easily determine if an NBD server is exposing what
> we really want (and fixing x-dirty-bitmap to behave saner fell
> out as a result of answering that question).
> 
> I note that upstream NBD has 'nbd-client -l $host' for querying
> just export names (with no quoting, so you have to know that
> a blank line means the default export), but it wasn't powerful
> enough, so I implemented 'qemu-nbd -L' to document everything.
> Upstream NBD has separate 'nbd-client' and 'nbd-server' binaries,
> while we only have 'qemu-nbd' (which is normally just a server,
> but 'qemu-nbd -c' also operates a second thread as a client).
> Our other uses of qemu as NBD client are for consuming a block
> device (as in qemu-io, qemu-img, or a drive to qemu) - but those
> binaries are less suited to something so specific to the NBD
> protocol.
> 
> Bonus: As a result of my work on this series, nbdkit now supports
> NBD_OPT_INFO (my interoperability testing between server
> implementations has been paying off, both at fixing server bugs,
> and at making this code more reliable across difference in valid
> servers).
> 
> Also available at:
> https://repo.or.cz/qemu/ericb.git qemu-nbd-list-v4
> 
> Currently based on master.
> 
> Since v3:
> - 1 new patch (1/21)
> - split old patch 15/19 into two (16,17/21)
> - retitle two patches (git backport-diff doesn't do too well at showing
> the diff on a retitled patch; 5,11/21)
> - fix review comments from Rich, Vladimir
> 
> 001/21:[down] 'iotests: Make 233 output more reliable'
> 002/21:[] [--] 'maint: Allow for EXAMPLES in texi2pod'
> 003/21:[] [--] 'qemu-nbd: Enhance man page'

Interesting, I don't get it again. Searched in outlook online, I found only v2 
of it,
checked spam folder and filters. Magic.

anyway, I can look at 
https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg04328.html


-- 
Best regards,
Vladimir