Re: [Libguestfs] [libnbd PATCH v4 1/4] states: Document our reliance on type overlaps
On 6/12/23 20:10, Richard W.M. Jones wrote: > On Fri, Jun 09, 2023 at 03:39:19PM -0500, Eric Blake wrote: >> [Bah - I typed up a longer response, but lost it when accidentally >> trying to send through the wrong SMTP server, so now I have to >> remember what I had...] >> >> On Fri, Jun 09, 2023 at 02:45:56PM +0200, Laszlo Ersek wrote: >>> On 6/9/23 04:17, Eric Blake wrote: When I added structured replies to the NBD spec, I intentionally chose a wire layout where the magic number and cookie overlap, even while the middle member changes from uint32_t error to the pair uint16_t flags and type. Based only on a strict reading of C rules on effective types and compatible type prefixes, it's probably questionable on whether my reliance on type aliasing to reuse cookie from the same offset of a union, or even the fact that a structured reply is built by first reading bytes into sbuf.simple_reply then following up with only bytes into the tail of sbuf.sr.structured_reply is strictly portable. But since it works in practice, it's worth at least adding some compile- and run-time assertions that our (ab)use of aliasing is accessing the bytes we want under the types we expect. Upcoming patches will restructure part of the sbuf layout to hopefully be a little easier to tie back to strict C standards. Suggested-by: Laszlo Ersek Signed-off-by: Eric Blake --- generator/states-reply.c| 17 + generator/states-reply-structured.c | 13 + 2 files changed, 22 insertions(+), 8 deletions(-) diff --git a/generator/states-reply.c b/generator/states-reply.c index 511e5cb1..2c77658b 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -17,6 +17,7 @@ */ #include +#include /* State machine for receiving reply messages from the server. * @@ -63,9 +64,15 @@ REPLY.START: ssize_t r; /* We read all replies initially as if they are simple replies, but - * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. - * This works because the structured_reply header is larger. + * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. This + * works because the structured_reply header is larger, and because + * the last member of a simple reply, cookie, is coincident between + * the two structs (an intentional design decision in the NBD spec + * when structured replies were added). */ + STATIC_ASSERT (offsetof (struct nbd_handle, sbuf.simple_reply.cookie) == + offsetof (struct nbd_handle, sbuf.sr.structured_reply.cookie), + cookie_aliasing); >>> >>> Can you perhaps append >>> >>> ... && >>> sizeof h->sbuf.simple_reply.cookie == >>> sizeof h->sbuf.sr.structured_reply.cookie >>> >>> (if you agree)? >> >> Yes, that makes sense, and I did so for what got pushed as 29342fedb53 >> >>> >>> Also, the commit message and the comment talk about the magic number as >>> well, not just the cookie, and the static assertion ignores magic. >>> However, I can see the magic handling changes in the next patch. >> >> I was a bit less concerned about magic (it is easy to see that it is >> at offset 0 in both types and could satisfy the common prefix rules, >> while seeing cookie's location and a non-common prefix makes the >> latter more imporant to assert). But checking two members instead of >> one shouldn't hurt, and in fact, once extended types are in (plus >> patch 4/4 of this series also adds an anonymous sub-struct in 'union >> reply_header' which is also worth validating), it may make sense to do >> a followup patch that adds: >> >> #define ASSERT_MEMBER_OVERLAP(TypeA, memberA, TypeB, memberB) \ >> STATIC_ASSERT (offsetof (TypeA, memberA) == offsetof (TypeB, memberB) && \ >> sizeof ((TypeA *)NULL)->memberA == sizeof ((TypeB >> *)NULL)->memberB, \ >> member_overlap) >> >> to be used either as: >> >> ASSERT_MEMBER_OVERLAP (struct nbd_simple_reply, cookie, >>struct nbd_structured_reply, cookie); >> >> or as >> >> ASSERT_MEMBER_OVERLAP (struct nbd_handle, sbuf.simple_reply.magic, >>struct nbd_handle, sbuf.sr.structured_reply.magic); > > This is a nice idea! > >> Would it make sense to have the macro take only three arguments (since >> both of those invocations repeat an argument); if so, is it better to >> share the common type name, or the common member name? > > We can always start with the 3 arg version and change it if we need to > later. At the moment I can't think of a reason to check that fields > in two unrelated types overlap, since you'd presumably always want to > use them through an actual union type, but I suppose it could happen. That's a good point! > >> I also note that our
Re: [Libguestfs] [libnbd PATCH v4 1/4] states: Document our reliance on type overlaps
On 6/9/23 22:39, Eric Blake wrote: > [Bah - I typed up a longer response, but lost it when accidentally > trying to send through the wrong SMTP server, so now I have to > remember what I had...] > > On Fri, Jun 09, 2023 at 02:45:56PM +0200, Laszlo Ersek wrote: >> On 6/9/23 04:17, Eric Blake wrote: >>> When I added structured replies to the NBD spec, I intentionally chose >>> a wire layout where the magic number and cookie overlap, even while >>> the middle member changes from uint32_t error to the pair uint16_t >>> flags and type. Based only on a strict reading of C rules on >>> effective types and compatible type prefixes, it's probably >>> questionable on whether my reliance on type aliasing to reuse cookie >>> from the same offset of a union, or even the fact that a structured >>> reply is built by first reading bytes into sbuf.simple_reply then >>> following up with only bytes into the tail of sbuf.sr.structured_reply >>> is strictly portable. But since it works in practice, it's worth at >>> least adding some compile- and run-time assertions that our (ab)use of >>> aliasing is accessing the bytes we want under the types we expect. >>> Upcoming patches will restructure part of the sbuf layout to hopefully >>> be a little easier to tie back to strict C standards. >>> >>> Suggested-by: Laszlo Ersek >>> Signed-off-by: Eric Blake >>> --- >>> generator/states-reply.c| 17 + >>> generator/states-reply-structured.c | 13 + >>> 2 files changed, 22 insertions(+), 8 deletions(-) >>> >>> diff --git a/generator/states-reply.c b/generator/states-reply.c >>> index 511e5cb1..2c77658b 100644 >>> --- a/generator/states-reply.c >>> +++ b/generator/states-reply.c >>> @@ -17,6 +17,7 @@ >>> */ >>> >>> #include >>> +#include >>> >>> /* State machine for receiving reply messages from the server. >>> * >>> @@ -63,9 +64,15 @@ REPLY.START: >>>ssize_t r; >>> >>>/* We read all replies initially as if they are simple replies, but >>> - * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. >>> - * This works because the structured_reply header is larger. >>> + * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. This >>> + * works because the structured_reply header is larger, and because >>> + * the last member of a simple reply, cookie, is coincident between >>> + * the two structs (an intentional design decision in the NBD spec >>> + * when structured replies were added). >>> */ >>> + STATIC_ASSERT (offsetof (struct nbd_handle, sbuf.simple_reply.cookie) == >>> + offsetof (struct nbd_handle, >>> sbuf.sr.structured_reply.cookie), >>> + cookie_aliasing); >> >> Can you perhaps append >> >> ... && >> sizeof h->sbuf.simple_reply.cookie == >> sizeof h->sbuf.sr.structured_reply.cookie >> >> (if you agree)? > > Yes, that makes sense, and I did so for what got pushed as 29342fedb53 > >> >> Also, the commit message and the comment talk about the magic number as >> well, not just the cookie, and the static assertion ignores magic. >> However, I can see the magic handling changes in the next patch. > > I was a bit less concerned about magic (it is easy to see that it is > at offset 0 in both types and could satisfy the common prefix rules, > while seeing cookie's location and a non-common prefix makes the > latter more imporant to assert). But checking two members instead of > one shouldn't hurt, and in fact, once extended types are in (plus > patch 4/4 of this series also adds an anonymous sub-struct in 'union > reply_header' which is also worth validating), it may make sense to do > a followup patch that adds: > > #define ASSERT_MEMBER_OVERLAP(TypeA, memberA, TypeB, memberB) \ > STATIC_ASSERT (offsetof (TypeA, memberA) == offsetof (TypeB, memberB) && \ > sizeof ((TypeA *)NULL)->memberA == sizeof ((TypeB > *)NULL)->memberB, \ > member_overlap) > > to be used either as: > > ASSERT_MEMBER_OVERLAP (struct nbd_simple_reply, cookie, >struct nbd_structured_reply, cookie); > > or as > > ASSERT_MEMBER_OVERLAP (struct nbd_handle, sbuf.simple_reply.magic, >struct nbd_handle, sbuf.sr.structured_reply.magic); > > Would it make sense to have the macro take only three arguments (since > both of those invocations repeat an argument); if so, is it better to > share the common type name, or the common member name? Both 4-arg invocations look fine to me, so I wouldn't push for a 3-arg variant at this time. > I also note that our "static-assert.h" file defines STATIC_ASSERT() as > a do/while statement (that is, it MUST appear inside a function body, > so we can't use it easily in .h files); contrast that with C11's > _Static_assert() or qemu's QEMU_BUILD_BUG_ON() that behave more as a > type declaration (and can therefore appear outside of a function body; > C23 will take it one step further by adding static_assert(expr) >
Re: [Libguestfs] [libnbd PATCH v4 1/4] states: Document our reliance on type overlaps
On Fri, Jun 09, 2023 at 03:39:19PM -0500, Eric Blake wrote: > [Bah - I typed up a longer response, but lost it when accidentally > trying to send through the wrong SMTP server, so now I have to > remember what I had...] > > On Fri, Jun 09, 2023 at 02:45:56PM +0200, Laszlo Ersek wrote: > > On 6/9/23 04:17, Eric Blake wrote: > > > When I added structured replies to the NBD spec, I intentionally chose > > > a wire layout where the magic number and cookie overlap, even while > > > the middle member changes from uint32_t error to the pair uint16_t > > > flags and type. Based only on a strict reading of C rules on > > > effective types and compatible type prefixes, it's probably > > > questionable on whether my reliance on type aliasing to reuse cookie > > > from the same offset of a union, or even the fact that a structured > > > reply is built by first reading bytes into sbuf.simple_reply then > > > following up with only bytes into the tail of sbuf.sr.structured_reply > > > is strictly portable. But since it works in practice, it's worth at > > > least adding some compile- and run-time assertions that our (ab)use of > > > aliasing is accessing the bytes we want under the types we expect. > > > Upcoming patches will restructure part of the sbuf layout to hopefully > > > be a little easier to tie back to strict C standards. > > > > > > Suggested-by: Laszlo Ersek > > > Signed-off-by: Eric Blake > > > --- > > > generator/states-reply.c| 17 + > > > generator/states-reply-structured.c | 13 + > > > 2 files changed, 22 insertions(+), 8 deletions(-) > > > > > > diff --git a/generator/states-reply.c b/generator/states-reply.c > > > index 511e5cb1..2c77658b 100644 > > > --- a/generator/states-reply.c > > > +++ b/generator/states-reply.c > > > @@ -17,6 +17,7 @@ > > > */ > > > > > > #include > > > +#include > > > > > > /* State machine for receiving reply messages from the server. > > > * > > > @@ -63,9 +64,15 @@ REPLY.START: > > >ssize_t r; > > > > > >/* We read all replies initially as if they are simple replies, but > > > - * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. > > > - * This works because the structured_reply header is larger. > > > + * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. This > > > + * works because the structured_reply header is larger, and because > > > + * the last member of a simple reply, cookie, is coincident between > > > + * the two structs (an intentional design decision in the NBD spec > > > + * when structured replies were added). > > > */ > > > + STATIC_ASSERT (offsetof (struct nbd_handle, sbuf.simple_reply.cookie) > > > == > > > + offsetof (struct nbd_handle, > > > sbuf.sr.structured_reply.cookie), > > > + cookie_aliasing); > > > > Can you perhaps append > > > > ... && > > sizeof h->sbuf.simple_reply.cookie == > > sizeof h->sbuf.sr.structured_reply.cookie > > > > (if you agree)? > > Yes, that makes sense, and I did so for what got pushed as 29342fedb53 > > > > > Also, the commit message and the comment talk about the magic number as > > well, not just the cookie, and the static assertion ignores magic. > > However, I can see the magic handling changes in the next patch. > > I was a bit less concerned about magic (it is easy to see that it is > at offset 0 in both types and could satisfy the common prefix rules, > while seeing cookie's location and a non-common prefix makes the > latter more imporant to assert). But checking two members instead of > one shouldn't hurt, and in fact, once extended types are in (plus > patch 4/4 of this series also adds an anonymous sub-struct in 'union > reply_header' which is also worth validating), it may make sense to do > a followup patch that adds: > > #define ASSERT_MEMBER_OVERLAP(TypeA, memberA, TypeB, memberB) \ > STATIC_ASSERT (offsetof (TypeA, memberA) == offsetof (TypeB, memberB) && \ > sizeof ((TypeA *)NULL)->memberA == sizeof ((TypeB > *)NULL)->memberB, \ > member_overlap) > > to be used either as: > > ASSERT_MEMBER_OVERLAP (struct nbd_simple_reply, cookie, >struct nbd_structured_reply, cookie); > > or as > > ASSERT_MEMBER_OVERLAP (struct nbd_handle, sbuf.simple_reply.magic, >struct nbd_handle, sbuf.sr.structured_reply.magic); This is a nice idea! > Would it make sense to have the macro take only three arguments (since > both of those invocations repeat an argument); if so, is it better to > share the common type name, or the common member name? We can always start with the 3 arg version and change it if we need to later. At the moment I can't think of a reason to check that fields in two unrelated types overlap, since you'd presumably always want to use them through an actual union type, but I suppose it could happen. > I also note that our "static-assert.h" file defines STATIC_ASSERT()
Re: [Libguestfs] [libnbd PATCH v4 1/4] states: Document our reliance on type overlaps
[Bah - I typed up a longer response, but lost it when accidentally trying to send through the wrong SMTP server, so now I have to remember what I had...] On Fri, Jun 09, 2023 at 02:45:56PM +0200, Laszlo Ersek wrote: > On 6/9/23 04:17, Eric Blake wrote: > > When I added structured replies to the NBD spec, I intentionally chose > > a wire layout where the magic number and cookie overlap, even while > > the middle member changes from uint32_t error to the pair uint16_t > > flags and type. Based only on a strict reading of C rules on > > effective types and compatible type prefixes, it's probably > > questionable on whether my reliance on type aliasing to reuse cookie > > from the same offset of a union, or even the fact that a structured > > reply is built by first reading bytes into sbuf.simple_reply then > > following up with only bytes into the tail of sbuf.sr.structured_reply > > is strictly portable. But since it works in practice, it's worth at > > least adding some compile- and run-time assertions that our (ab)use of > > aliasing is accessing the bytes we want under the types we expect. > > Upcoming patches will restructure part of the sbuf layout to hopefully > > be a little easier to tie back to strict C standards. > > > > Suggested-by: Laszlo Ersek > > Signed-off-by: Eric Blake > > --- > > generator/states-reply.c| 17 + > > generator/states-reply-structured.c | 13 + > > 2 files changed, 22 insertions(+), 8 deletions(-) > > > > diff --git a/generator/states-reply.c b/generator/states-reply.c > > index 511e5cb1..2c77658b 100644 > > --- a/generator/states-reply.c > > +++ b/generator/states-reply.c > > @@ -17,6 +17,7 @@ > > */ > > > > #include > > +#include > > > > /* State machine for receiving reply messages from the server. > > * > > @@ -63,9 +64,15 @@ REPLY.START: > >ssize_t r; > > > >/* We read all replies initially as if they are simple replies, but > > - * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. > > - * This works because the structured_reply header is larger. > > + * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. This > > + * works because the structured_reply header is larger, and because > > + * the last member of a simple reply, cookie, is coincident between > > + * the two structs (an intentional design decision in the NBD spec > > + * when structured replies were added). > > */ > > + STATIC_ASSERT (offsetof (struct nbd_handle, sbuf.simple_reply.cookie) == > > + offsetof (struct nbd_handle, > > sbuf.sr.structured_reply.cookie), > > + cookie_aliasing); > > Can you perhaps append > > ... && > sizeof h->sbuf.simple_reply.cookie == > sizeof h->sbuf.sr.structured_reply.cookie > > (if you agree)? Yes, that makes sense, and I did so for what got pushed as 29342fedb53 > > Also, the commit message and the comment talk about the magic number as > well, not just the cookie, and the static assertion ignores magic. > However, I can see the magic handling changes in the next patch. I was a bit less concerned about magic (it is easy to see that it is at offset 0 in both types and could satisfy the common prefix rules, while seeing cookie's location and a non-common prefix makes the latter more imporant to assert). But checking two members instead of one shouldn't hurt, and in fact, once extended types are in (plus patch 4/4 of this series also adds an anonymous sub-struct in 'union reply_header' which is also worth validating), it may make sense to do a followup patch that adds: #define ASSERT_MEMBER_OVERLAP(TypeA, memberA, TypeB, memberB) \ STATIC_ASSERT (offsetof (TypeA, memberA) == offsetof (TypeB, memberB) && \ sizeof ((TypeA *)NULL)->memberA == sizeof ((TypeB *)NULL)->memberB, \ member_overlap) to be used either as: ASSERT_MEMBER_OVERLAP (struct nbd_simple_reply, cookie, struct nbd_structured_reply, cookie); or as ASSERT_MEMBER_OVERLAP (struct nbd_handle, sbuf.simple_reply.magic, struct nbd_handle, sbuf.sr.structured_reply.magic); Would it make sense to have the macro take only three arguments (since both of those invocations repeat an argument); if so, is it better to share the common type name, or the common member name? I also note that our "static-assert.h" file defines STATIC_ASSERT() as a do/while statement (that is, it MUST appear inside a function body, so we can't use it easily in .h files); contrast that with C11's _Static_assert() or qemu's QEMU_BUILD_BUG_ON() that behave more as a type declaration (and can therefore appear outside of a function body; C23 will take it one step further by adding static_assert(expr) alongside static_assert(expr, msg). I consider myself too tainted, not only by helping with qemu's implementation, but also by reviewing gnulib's implementation (which uses __VA_ARGS__ to emulate C23 semantics of an optional
Re: [Libguestfs] [libnbd PATCH v4 1/4] states: Document our reliance on type overlaps
On 6/9/23 04:17, Eric Blake wrote: > When I added structured replies to the NBD spec, I intentionally chose > a wire layout where the magic number and cookie overlap, even while > the middle member changes from uint32_t error to the pair uint16_t > flags and type. Based only on a strict reading of C rules on > effective types and compatible type prefixes, it's probably > questionable on whether my reliance on type aliasing to reuse cookie > from the same offset of a union, or even the fact that a structured > reply is built by first reading bytes into sbuf.simple_reply then > following up with only bytes into the tail of sbuf.sr.structured_reply > is strictly portable. But since it works in practice, it's worth at > least adding some compile- and run-time assertions that our (ab)use of > aliasing is accessing the bytes we want under the types we expect. > Upcoming patches will restructure part of the sbuf layout to hopefully > be a little easier to tie back to strict C standards. > > Suggested-by: Laszlo Ersek > Signed-off-by: Eric Blake > --- > generator/states-reply.c| 17 + > generator/states-reply-structured.c | 13 + > 2 files changed, 22 insertions(+), 8 deletions(-) > > diff --git a/generator/states-reply.c b/generator/states-reply.c > index 511e5cb1..2c77658b 100644 > --- a/generator/states-reply.c > +++ b/generator/states-reply.c > @@ -17,6 +17,7 @@ > */ > > #include > +#include > > /* State machine for receiving reply messages from the server. > * > @@ -63,9 +64,15 @@ REPLY.START: >ssize_t r; > >/* We read all replies initially as if they are simple replies, but > - * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. > - * This works because the structured_reply header is larger. > + * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. This > + * works because the structured_reply header is larger, and because > + * the last member of a simple reply, cookie, is coincident between > + * the two structs (an intentional design decision in the NBD spec > + * when structured replies were added). > */ > + STATIC_ASSERT (offsetof (struct nbd_handle, sbuf.simple_reply.cookie) == > + offsetof (struct nbd_handle, > sbuf.sr.structured_reply.cookie), > + cookie_aliasing); Can you perhaps append ... && sizeof h->sbuf.simple_reply.cookie == sizeof h->sbuf.sr.structured_reply.cookie (if you agree)? Also, the commit message and the comment talk about the magic number as well, not just the cookie, and the static assertion ignores magic. However, I can see the magic handling changes in the next patch. Reviewed-by: Laszlo Ersek Thanks Laszlo >assert (h->reply_cmd == NULL); >assert (h->rlen == 0); > > @@ -135,7 +142,8 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: >} > >/* NB: This works for both simple and structured replies because the > - * handle (our cookie) is stored at the same offset. > + * handle (our cookie) is stored at the same offset. See the > + * STATIC_ASSERT above in state REPLY.START that confirmed this. > */ >h->chunks_received++; >cookie = be64toh (h->sbuf.simple_reply.cookie); > @@ -155,7 +163,8 @@ REPLY.FINISH_COMMAND: >bool retire; > >/* NB: This works for both simple and structured replies because the > - * handle (our cookie) is stored at the same offset. > + * handle (our cookie) is stored at the same offset. See the > + * STATIC_ASSERT above in state REPLY.START that confirmed this. > */ >cookie = be64toh (h->sbuf.simple_reply.cookie); >/* Find the command amongst the commands in flight. */ > diff --git a/generator/states-reply-structured.c > b/generator/states-reply-structured.c > index 5aca7262..205a236d 100644 > --- a/generator/states-reply-structured.c > +++ b/generator/states-reply-structured.c > @@ -19,6 +19,7 @@ > /* State machine for parsing structured replies from the server. */ > > #include > +#include > #include > #include > > @@ -45,11 +46,15 @@ structured_reply_in_bounds (uint64_t offset, uint32_t > length, > > STATE_MACHINE { > REPLY.STRUCTURED_REPLY.START: > - /* We've only read the simple_reply. The structured_reply is longer, > - * so read the remaining part. > + /* We've only read the bytes that fill simple_reply. The > + * structured_reply is longer, so read the remaining part. We > + * depend on the memory aliasing in union sbuf to overlay the two > + * reply types. > */ > - h->rbuf = >sbuf; > - h->rbuf = (char *)h->rbuf + sizeof h->sbuf.simple_reply; > + STATIC_ASSERT (sizeof h->sbuf.simple_reply == > + offsetof (struct nbd_structured_reply, length), > + simple_structured_overlap); > + assert (h->rbuf == (char *)>sbuf + sizeof h->sbuf.simple_reply); >h->rlen = sizeof h->sbuf.sr.structured_reply; >h->rlen -= sizeof h->sbuf.simple_reply; >SET_NEXT_STATE (%RECV_REMAINING);