So gcc-9 has a new warning about doing memset() across pointer boundaries.

We didn't have a lot of these things, and most of them got fixed
pretty quickly. The main remaining ones are oin FCoE, and look like
this:

In function ‘memset’,
    inlined from ‘fcoe_ctlr_vlan_parse’ at drivers/scsi/fcoe/fcoe_ctlr.c:2830:2,
    inlined from ‘fcoe_ctlr_vlan_recv’ at drivers/scsi/fcoe/fcoe_ctlr.c:3005:7:
./include/linux/string.h:344:9: warning: ‘__builtin_memset’ offset
[569, 600] from the object at ‘buf’ is out of the bounds of referenced
subobject ‘rdata’ with type ‘struct fc_rport_priv’ at offset 0
[-Warray-bounds]
  344 |  return __builtin_memset(p, c, size);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘memset’,
    inlined from ‘fcoe_ctlr_vn_parse’ at drivers/scsi/fcoe/fcoe_ctlr.c:2299:2,
    inlined from ‘fcoe_ctlr_vn_recv’ at drivers/scsi/fcoe/fcoe_ctlr.c:2772:7:
./include/linux/string.h:344:9: warning: ‘__builtin_memset’ offset
[569, 600] from the object at ‘buf’ is out of the bounds of referenced
subobject ‘rdata’ with type ‘struct fc_rport_priv’ at offset 0
[-Warray-bounds]
  344 |  return __builtin_memset(p, c, size);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

and honestly, when I look at the code, I cannot help but agree with
the new warning in this case (we had a few other cases where the
warning was understandable but annoying, but in the FCoE case it's
really "yeah, that code is wrong").

In particular, in fcoe_ctlr_vlan_parse(), the function is passed a
"struct fc_rport_priv *rdata" pointer, and then it does

        memset(rdata, 0, sizeof(*rdata) + sizeof(*frport));

and honestly, that should make people go "WTF?". You got passed a
pointer to one type, and then you clear not just that type, but
another entirely unrelated type after it. That's just completely bogus
and wrong.

In fact, what people *are* passing that thing is this:

        struct {
                struct fc_rport_priv rdata;
                struct fcoe_rport frport;
        } buf;

but that doesn't actually make that "memset()" any more correct. It's
still entirely wrong, because it's possible that "struct fcoe_rport"
could have different alignment requirements etc, so maybe there's a
gap between the two structures, and the memset() with just adding the
sizes may be entirely bogus.

Now, in practice I think the code works, but I have to say that in
this case the compiler warning is really quite correct. The code is
wrong, even if it might happen to work.

That "combined struct of two types" needs to be made into a type of
its own, and people need to use that.

I started doing that, but it turns out this mis-feature is deeply
embedded in that file, and also exposed indirectly in the whole
"struct fc_rport_operations", which uses a "event_callback" function
pointer that has the bogus type.

End result: I don't know how to fix it, but the compiler is right.
This code is wrong, and it needs to be fixed. Maybe "struct
fc_rport_priv" could just be made to include that "struct fcoe_rport"
as part of it? Maybe the event_callback() can be typed differently.
But passing around a "struct fc_rport_priv" pointer that is actually
*not* a pointer to just that thing, but must be a pointer to the
combined data, is wrong. Something like

 struct fc_rport_combined_data {
        struct fc_rport_priv rdata;
        struct fcoe_rport frport;
 };

may be needed, and then you pass a pointer to that. But it expands
quite quickly because this seems to be common.

Related to this thing, the whole

   static inline struct fcoe_rport *fcoe_ctlr_rport(struct fc_rport_priv *rdata)

is part of the disease, and needs to go away. Passing the right
"combined data" pointer around would have made that thing useless to
begin with (ie "fcoe_ctlr_rport(rdata)" should become just
"&buf->frport" instead if the types were done correctly).

Hmm? Please can some FCoE person look at this, because right now it
causes tens of lines of warnings that make me go "yeah, the compiler
is right".

                 Linus

Reply via email to