On 10/7/25 23:56, James Bottomley wrote:
On Tue, 2025-10-07 at 15:18 +0100, Gustavo A. R. Silva wrote:


On 10/7/25 12:59, James Bottomley wrote:
On Tue, 2025-10-07 at 11:43 +0100, Gustavo A. R. Silva wrote:
Hi all,

Friendly ping: who can take this, please?

After what happened with the qla2xxx driver, everyone is a bit wary
of these changes, particularly when they affect structures shared
with the hardware. Megaraid is a broadcom acquisition so although
maintained it might take them a while to check this.

I've been in constant communication with the people involved. So far,
none of them has expressed any concerns about this to me. However, I
appreciate your feedback.

In any case, I promptly submitted a bugfix minutes after getting the
report.

I'm not criticizing the response, just saying that problems like this
cause me to think that someone who understands and can test the
hardware needs to look at this.  However ...

That's true. I agree.


However, you could help us with this: as I understand it (there is
a bit of a no documentation problem here), the TRAILING_OVERLAP
formalism merely gets the compiler not to warn about the situation
rather than actually changing anything in the layout of the
structure?  In which case you should be able to demonstrate the
binary produced before and after this patch is the same, which
would very much reduce the risk of taking it.

This is quite simple. Here you go the pahole output before and after
changes.

BEFORE CHANGES:

pahole -C MR_FW_RAID_MAP_ALL drivers/scsi/megaraid/megaraid_sas_fp.o
struct MR_FW_RAID_MAP_ALL {
          struct MR_FW_RAID_MAP      raidMap;              /*     0
10408 */
          /* --- cacheline 162 boundary (10368 bytes) was 40 bytes ago
--- */
          struct MR_LD_SPAN_MAP      ldSpanMap[64];        /* 10408
161792 */

          /* size: 172200, cachelines: 2691, members: 2 */
          /* last cacheline: 40 bytes */
};

AFTER CHANGES:

pahole -C MR_FW_RAID_MAP_ALL drivers/scsi/megaraid/megaraid_sas_fp.o
struct MR_FW_RAID_MAP_ALL {
          union {
                  struct MR_FW_RAID_MAP raidMap;           /*     0
10408 */
                  struct {
                          unsigned char __offset_to_FAM[10408]; /*
0 10408 */
                          /* --- cacheline 162 boundary (10368 bytes)
was 40 bytes ago --- */
                          struct MR_LD_SPAN_MAP ldSpanMap[64]; /*
10408 161792 */
                  };                                       /*     0
172200 */
          };                                               /*     0
172200 */

          /* size: 172200, cachelines: 2691, members: 1 */
          /* last cacheline: 40 bytes */
};

As you can see, the size is exactly the same, as are the offsets for
both members raidMap and ldSpanMap.

... this is good enough to prove that the binary before and after is
identical and thus there's no change to the structures, which means the
risk of accepting the patch is significantly lower.

  The trick is that, thanks to the union and __offset_to_FAM, the
flexible-array member raidMap.ldSpanMap[] now appears as the last
member instead of somewhere in the middle.

So both ldSpanMap and raidMap.ldSpanMap[] now cleanly overlap, as
seems to have been intended.

(Exactly the same applies for struct MR_DRV_RAID_MAP_ALL)

I can include this explanation to the changelog text if you'd like.

I'll leave it up to Martin, but I think going forwards it would be
helpful if you could indicate that you've checked that the binary
layout before and after is unchanged and thus the risk of merging the
patch is low.

Absolutely. I'll do that.

Thanks for the feedback.
-Gustavo


Reply via email to