Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

2020-08-08 Thread Samuel Holland
On 8/5/20 2:17 AM, Arnd Bergmann wrote:
> On Wed, Aug 5, 2020 at 3:44 AM Samuel Holland  wrote:
>> On 8/3/20 9:02 AM, Arnd Bergmann wrote:
>>> On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland  wrote:
>>>> All of the command structures are packed, due to the "#pragma pack(1)" earlier
>>>> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
>>>> explicit padding to make sizeof(TW_Command) -
>>>> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
>>>> indeed the structure is expected to be a different size depending on
>>>> sizeof(dma_addr_t).
>>>
>>> Ah, so only the first few members are accessed by hardware and the
>>> last union is only accessed by the OS then? In that case I suppose it is
>>> all fine, but I would also suggest removing the "#pragma packed"
>>> to get somewhat more efficient access on systems that have  problems
>>> with misaligned accesses.
>>
>> I don't know what part the hardware accesses; everything I know about the
>> hardware comes from reading the driver.
> 
> I see now from your explanation below that this is a hardware-defined
> structure. I was confused by how it can be either 32 or 64 bits wide but
> found the
> 
> tw_initconnect->features |= sizeof(dma_addr_t) > 4 ? 1 : 0;
> 
> line now that tells the hardware about which format is used.
> 
>> The problem with removing the "#pragma pack(1)" is that the structure is
>> inherently misaligned: byte8_offset.io.sgl starts at offset 12, but it may begin
>> with a __le64.
> 
> I think a fairly clean way to handle this would be to remove the pragma
> and instead define a local type like
> 
> #if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT)
> typedef __le64 twa_addr_t __packed;
> #else
> typedef __le32 twa_addr_t;
> #endif

I would be happy to implement this... but __packed only works on enums, structs,
and unions[1]:

In file included from drivers/scsi/3w-9xxx.c:100:
drivers/scsi/3w-9xxx.h:474:1: warning: 'packed' attribute ignored [-Wattributes]
  474 | typedef __le64 twa_addr_t __packed;
  | ^~~

[1]:
https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-packed-type-attribute
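
One possible way around the ignored attribute, offered only as a sketch: GCC
documents that "aligned" on a typedef may lower the alignment of a scalar
type, which "packed" cannot do. The userspace example below uses uint64_t as
a stand-in for __le64; the names sketch_addr_t and sketch_cmd are
hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-bit address type that only needs
 * 4-byte alignment. On a typedef, "aligned" may lower the alignment.
 */
typedef uint64_t __attribute__((aligned(4))) sketch_addr_t;

/* Simplified command layout: 8 header bytes, a 32-bit LBA, then an address. */
struct sketch_cmd {
	uint8_t		header[8];
	uint32_t	lba;
	sketch_addr_t	address;	/* expected at offset 12, no padding */
};

_Static_assert(offsetof(struct sketch_cmd, address) == 12,
	       "address must follow lba without padding");

/* Helper so callers can check the layout at run time as well. */
static inline size_t sketch_addr_offset(void)
{
	return offsetof(struct sketch_cmd, address);
}
```

With this typedef the structure keeps 4-byte alignment overall, so the other
members can still be accessed with ordinary aligned loads and stores.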

> The problem with marking the entire structure as packed, rather than
> just individual members is that you end up with very inefficient bytewise
> access on some architectures (especially those without cache-coherent
> DMA or hardware unaligned access in the CPU), so I would recommend
> avoiding that in portable driver code.

I agree, but I think this is a separate issue from what this patch is fixing. I
would prefer to save this change for a separate patch.

Regards,
Samuel


Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

2020-08-05 Thread Arnd Bergmann
On Wed, Aug 5, 2020 at 3:44 AM Samuel Holland  wrote:
> On 8/3/20 9:02 AM, Arnd Bergmann wrote:
> > On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland  wrote:
> >> All of the command structures are packed, due to the "#pragma pack(1)" earlier
> >> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
> >> explicit padding to make sizeof(TW_Command) -
> >> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
> >> indeed the structure is expected to be a different size depending on
> >> sizeof(dma_addr_t).
> >
> > Ah, so only the first few members are accessed by hardware and the
> > last union is only accessed by the OS then? In that case I suppose it is
> > all fine, but I would also suggest removing the "#pragma packed"
> > to get somewhat more efficient access on systems that have  problems
> > with misaligned accesses.
>
> I don't know what part the hardware accesses; everything I know about the
> hardware comes from reading the driver.

I see now from your explanation below that this is a hardware-defined
structure. I was confused by how it can be either 32 or 64 bits wide but
found the

tw_initconnect->features |= sizeof(dma_addr_t) > 4 ? 1 : 0;

line now that tells the hardware about which format is used.

> The problem with removing the "#pragma pack(1)" is that the structure is
> inherently misaligned: byte8_offset.io.sgl starts at offset 12, but it may begin
> with a __le64.

I think a fairly clean way to handle this would be to remove the pragma
and instead define a local type like

#if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT)
typedef __le64 twa_addr_t __packed;
#else
typedef __le32 twa_addr_t;
#endif

The problem with marking the entire structure as packed, rather than
just individual members is that you end up with very inefficient bytewise
access on some architectures (especially those without cache-coherent
DMA or hardware unaligned access in the CPU), so I would recommend
avoiding that in portable driver code.
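
To illustrate the difference, here is a small userspace sketch (GCC/Clang
attribute syntax, hypothetical struct names). Packing the whole structure
drops its alignment to 1, so the compiler has to assume that every member may
be misaligned; packing only the 64-bit member keeps the structure at 4-byte
alignment while producing the same layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Whole structure packed: alignment of the struct itself becomes 1. */
struct whole_packed {
	uint32_t lba;
	uint64_t address;
} __attribute__((packed));

/* Only the misaligned member is packed: the struct stays 4-byte aligned. */
struct member_packed {
	uint32_t lba;
	uint64_t address __attribute__((packed));
};

/* Both variants place the members at the same offsets. */
_Static_assert(offsetof(struct whole_packed, address) == 4, "layout");
_Static_assert(offsetof(struct member_packed, address) == 4, "layout");

static inline size_t whole_packed_align(void)
{
	return _Alignof(struct whole_packed);
}

static inline size_t member_packed_align(void)
{
	return _Alignof(struct member_packed);
}
```

In the second variant, accesses to "lba" can use a plain 32-bit load on
strict-alignment machines, and only "address" needs the slower path.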

  Arnd


Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

2020-08-04 Thread Samuel Holland
On 8/3/20 9:02 AM, Arnd Bergmann wrote:
> On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland  wrote:
>> On 7/31/20 2:29 AM, Arnd Bergmann wrote:
>>> On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland  wrote:

>>>> The main issue observed was at the call to scsi_set_resid, where the
>>>> byteswapped parameter would eventually trigger the alignment check at
>>>> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
>>>> complain about an "Unaligned partial completion", and no further I/O
>>>> could occur.
>>>>
>>>> This gets the controller working on big endian powerpc64.
>>>>
>>>> Signed-off-by: Samuel Holland 
>>>> ---
>>>>
>>>> Changes since v1:
>>>>  - Include changes to use __le?? types in command structures
>>>>  - Use an object literal for the intermediate "schedulertime" value
>>>>  - Use local "error" variable to avoid repeated byte swapping
>>>>  - Create a local "length" variable to avoid very long lines
>>>>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines

>>>
>>> Looks much better, thanks for the update. I see one more issue here
>>>>  /* Command Packet */
>>>>  typedef struct TW_Command {
>>>> -   unsigned char opcode__sgloffset;
>>>> -   unsigned char size;
>>>> -   unsigned char request_id;
>>>> -   unsigned char unit__hostid;
>>>> +   u8  opcode__sgloffset;
>>>> +   u8  size;
>>>> +   u8  request_id;
>>>> +   u8  unit__hostid;
>>>> /* Second DWORD */
>>>> -   unsigned char status;
>>>> -   unsigned char flags;
>>>> +   u8  status;
>>>> +   u8  flags;
>>>> union {
>>>> -   unsigned short block_count;
>>>> -   unsigned short parameter_count;
>>>> +   __le16  block_count;
>>>> +   __le16  parameter_count;
>>>> } byte6_offset;
>>>> union {
>>>> struct {
>>>> -   u32 lba;
>>>> -   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>>>> -   dma_addr_t padding;
>>>> +   __le32  lba;
>>>> +   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>>>> +   dma_addr_t  padding;
>>>
>>>
>>> The use of dma_addr_t here seems odd, since this is neither endian-safe nor
>>> fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
>>> a variable-length fixed-endian word. I guess there is a chance that this is
>>> correct, but it is really confusing. On top of that, it seems that there is
>>> implied padding in the structure when built with a 64-bit dma_addr_t
>>> on most architectures but not on x86-32 (which uses 32-bit alignment for
>>> 64-bit integers). I don't know what the hardware definition is for TW_Command,
>>> but ideally this would be expressed using only fixed-endian fixed-length
>>> members and explicit padding.
>>
>> All of the command structures are packed, due to the "#pragma pack(1)" earlier
>> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
>> explicit padding to make sizeof(TW_Command) -
>> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
>> indeed the structure is expected to be a different size depending on
>> sizeof(dma_addr_t).
> 
> Ah, so only the first few members are accessed by hardware and the
> last union is only accessed by the OS then? In that case I suppose it is
> all fine, but I would also suggest removing the "#pragma packed"
> to get somewhat more efficient access on systems that have  problems
> with misaligned accesses.

I don't know what part the hardware accesses; everything I know about the
hardware comes from reading the driver.

The problem with removing the "#pragma pack(1)" is that the structure is
inherently misaligned: byte8_offset.io.sgl starts at offset 12, but it may begin
with a __le64.
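
A minimal userspace sketch of that offset problem, using simplified
hypothetical structures rather than the real TW_Command: with
"#pragma pack(1)" the 64-bit field sits at offset 12, while natural alignment
lets the compiler insert padding in front of it on ABIs that align 64-bit
integers to 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packed layout: members are placed back to back. */
#pragma pack(push, 1)
struct packed_cmd {
	uint8_t		header[8];		/* bytes 0..7 */
	uint32_t	lba;			/* bytes 8..11 */
	uint64_t	first_sgl_address;	/* starts at byte 12 */
};
#pragma pack(pop)

/* Same members with natural alignment: the ABI decides the offset. */
struct natural_cmd {
	uint8_t		header[8];
	uint32_t	lba;
	uint64_t	first_sgl_address;
};

static inline size_t packed_sgl_offset(void)
{
	return offsetof(struct packed_cmd, first_sgl_address);
}

static inline size_t natural_sgl_offset(void)
{
	return offsetof(struct natural_cmd, first_sgl_address);
}
```

On ABIs with 8-byte alignment for 64-bit integers, natural_sgl_offset()
would be expected to come out as 16 instead of 12, which no longer matches a
layout where the list starts at offset 12.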

Regards,
Samuel


Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

2020-08-03 Thread Arnd Bergmann
On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland  wrote:
>
> On 7/31/20 2:29 AM, Arnd Bergmann wrote:
> > On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland  wrote:
> >>
> >> The main issue observed was at the call to scsi_set_resid, where the
> >> byteswapped parameter would eventually trigger the alignment check at
> >> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
> >> complain about an "Unaligned partial completion", and no further I/O
> >> could occur.
> >>
> >> This gets the controller working on big endian powerpc64.
> >>
> >> Signed-off-by: Samuel Holland 
> >> ---
> >>
> >> Changes since v1:
> >>  - Include changes to use __le?? types in command structures
> >>  - Use an object literal for the intermediate "schedulertime" value
> >>  - Use local "error" variable to avoid repeated byte swapping
> >>  - Create a local "length" variable to avoid very long lines
> >>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines
> >>
> >
> > Looks much better, thanks for the update. I see one more issue here
> >>  /* Command Packet */
> >>  typedef struct TW_Command {
> >> -   unsigned char opcode__sgloffset;
> >> -   unsigned char size;
> >> -   unsigned char request_id;
> >> -   unsigned char unit__hostid;
> >> +   u8  opcode__sgloffset;
> >> +   u8  size;
> >> +   u8  request_id;
> >> +   u8  unit__hostid;
> >> /* Second DWORD */
> >> -   unsigned char status;
> >> -   unsigned char flags;
> >> +   u8  status;
> >> +   u8  flags;
> >> union {
> >> -   unsigned short block_count;
> >> -   unsigned short parameter_count;
> >> +   __le16  block_count;
> >> +   __le16  parameter_count;
> >> } byte6_offset;
> >> union {
> >> struct {
> >> -   u32 lba;
> >> -   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
> >> -   dma_addr_t padding;
> >> +   __le32  lba;
> >> +   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
> >> +   dma_addr_t  padding;
> >
> >
> > The use of dma_addr_t here seems odd, since this is neither endian-safe nor
> > fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
> > a variable-length fixed-endian word. I guess there is a chance that this is
> > correct, but it is really confusing. On top of that, it seems that there is
> > implied padding in the structure when built with a 64-bit dma_addr_t
> > on most architectures but not on x86-32 (which uses 32-bit alignment for
> > 64-bit integers). I don't know what the hardware definition is for TW_Command,
> > but ideally this would be expressed using only fixed-endian fixed-length
> > members and explicit padding.
>
> All of the command structures are packed, due to the "#pragma pack(1)" earlier
> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
> explicit padding to make sizeof(TW_Command) -
> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
> indeed the structure is expected to be a different size depending on
> sizeof(dma_addr_t).

Ah, so only the first few members are accessed by hardware and the
last union is only accessed by the OS then? In that case I suppose it is
all fine, but I would also suggest removing the "#pragma packed"
to get somewhat more efficient access on systems that have  problems
with misaligned accesses.

  Arnd


Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

2020-08-02 Thread Samuel Holland
On 7/31/20 2:29 AM, Arnd Bergmann wrote:
> On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland  wrote:
>>
>> The main issue observed was at the call to scsi_set_resid, where the
>> byteswapped parameter would eventually trigger the alignment check at
>> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
>> complain about an "Unaligned partial completion", and no further I/O
>> could occur.
>>
>> This gets the controller working on big endian powerpc64.
>>
>> Signed-off-by: Samuel Holland 
>> ---
>>
>> Changes since v1:
>>  - Include changes to use __le?? types in command structures
>>  - Use an object literal for the intermediate "schedulertime" value
>>  - Use local "error" variable to avoid repeated byte swapping
>>  - Create a local "length" variable to avoid very long lines
>>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines
>>
> 
> Looks much better, thanks for the update. I see one more issue here
>>  /* Command Packet */
>>  typedef struct TW_Command {
>> -   unsigned char opcode__sgloffset;
>> -   unsigned char size;
>> -   unsigned char request_id;
>> -   unsigned char unit__hostid;
>> +   u8  opcode__sgloffset;
>> +   u8  size;
>> +   u8  request_id;
>> +   u8  unit__hostid;
>> /* Second DWORD */
>> -   unsigned char status;
>> -   unsigned char flags;
>> +   u8  status;
>> +   u8  flags;
>> union {
>> -   unsigned short block_count;
>> -   unsigned short parameter_count;
>> +   __le16  block_count;
>> +   __le16  parameter_count;
>> } byte6_offset;
>> union {
>> struct {
>> -   u32 lba;
>> -   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>> -   dma_addr_t padding;
>> +   __le32  lba;
>> +   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>> +   dma_addr_t  padding;
> 
> 
> The use of dma_addr_t here seems odd, since this is neither endian-safe nor
> fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
> a variable-length fixed-endian word. I guess there is a chance that this is
> correct, but it is really confusing. On top of that, it seems that there is
> implied padding in the structure when built with a 64-bit dma_addr_t
> on most architectures but not on x86-32 (which uses 32-bit alignment for
> 64-bit integers). I don't know what the hardware definition is for TW_Command,
> but ideally this would be expressed using only fixed-endian fixed-length
> members and explicit padding.

All of the command structures are packed, due to the "#pragma pack(1)" earlier
in the file. So alignment is not an issue. This dma_addr_t member _is_ the
explicit padding to make sizeof(TW_Command) -
sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
indeed the structure is expected to be a different size depending on
sizeof(dma_addr_t).

I left the padding member alone to avoid the #ifdef; since it's never accessed,
the endianness doesn't matter. In fact, since in both cases it's at the end of
the structure, it could probably be removed entirely. I don't see
sizeof(TW_Command) being used anywhere, but I'm not 100% certain. The downside
of removing it would be TW_COMMAND_SIZE becoming a slightly more magic number.
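
The size relationship can be written down as a compile-time check. This is
only a sketch under stated assumptions: it models the "io" branch of the
union alone, assumes a 64-bit dma_addr_t, and uses a made-up list length
(SKETCH_SGL_LEN) and field order for the scatter-gather entry. The value 5
for SKETCH_COMMAND_SIZE is what the arithmetic implies for that case
((8 + 4 + 8) / 4), not a number taken from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SGL_LEN		4	/* hypothetical list length */
#define SKETCH_COMMAND_SIZE	5	/* in 32-bit words, 64-bit case */

typedef uint64_t sketch_dma_addr_t;	/* assumption: 64-bit DMA addresses */

#pragma pack(push, 1)
struct sketch_sg_entry {
	uint64_t address;
	uint32_t length;
};

struct sketch_command {
	uint8_t			opcode__sgloffset;
	uint8_t			size;
	uint8_t			request_id;
	uint8_t			unit__hostid;
	uint8_t			status;
	uint8_t			flags;
	uint16_t		block_count;
	uint32_t		lba;
	struct sketch_sg_entry	sgl[SKETCH_SGL_LEN];
	sketch_dma_addr_t	padding;	/* never accessed */
};
#pragma pack(pop)

/* Bytes of the command that are not part of the scatter-gather list. */
static inline size_t sketch_fixed_bytes(void)
{
	return sizeof(struct sketch_command) -
	       sizeof(((struct sketch_command *)0)->sgl);
}

_Static_assert(sizeof(struct sketch_command) -
	       sizeof(((struct sketch_command *)0)->sgl) ==
	       SKETCH_COMMAND_SIZE * 4, "fixed part must be 5 words");
```

A check like this would also catch the size changing by accident if the
padding member were ever removed.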

Regards,
Samuel


Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

2020-07-31 Thread Arnd Bergmann
On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland  wrote:
>
> The main issue observed was at the call to scsi_set_resid, where the
> byteswapped parameter would eventually trigger the alignment check at
> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
> complain about an "Unaligned partial completion", and no further I/O
> could occur.
>
> This gets the controller working on big endian powerpc64.
>
> Signed-off-by: Samuel Holland 
> ---
>
> Changes since v1:
>  - Include changes to use __le?? types in command structures
>  - Use an object literal for the intermediate "schedulertime" value
>  - Use local "error" variable to avoid repeated byte swapping
>  - Create a local "length" variable to avoid very long lines
>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines
>

Looks much better, thanks for the update. I see one more issue here
>  /* Command Packet */
>  typedef struct TW_Command {
> -   unsigned char opcode__sgloffset;
> -   unsigned char size;
> -   unsigned char request_id;
> -   unsigned char unit__hostid;
> +   u8  opcode__sgloffset;
> +   u8  size;
> +   u8  request_id;
> +   u8  unit__hostid;
> /* Second DWORD */
> -   unsigned char status;
> -   unsigned char flags;
> +   u8  status;
> +   u8  flags;
> union {
> -   unsigned short block_count;
> -   unsigned short parameter_count;
> +   __le16  block_count;
> +   __le16  parameter_count;
> } byte6_offset;
> union {
> struct {
> -   u32 lba;
> -   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
> -   dma_addr_t padding;
> +   __le32  lba;
> +   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
> +   dma_addr_t  padding;


The use of dma_addr_t here seems odd, since this is neither endian-safe nor
fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
a variable-length fixed-endian word. I guess there is a chance that this is
correct, but it is really confusing. On top of that, it seems that there is
implied padding in the structure when built with a 64-bit dma_addr_t
on most architectures but not on x86-32 (which uses 32-bit alignment for
64-bit integers). I don't know what the hardware definition is for TW_Command,
but ideally this would be expressed using only fixed-endian fixed-length
members and explicit padding.

Arnd


[PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

2020-07-30 Thread Samuel Holland
The main issue observed was at the call to scsi_set_resid, where the
byteswapped parameter would eventually trigger the alignment check at
drivers/scsi/sd.c:2009. At that point, the kernel would continuously
complain about an "Unaligned partial completion", and no further I/O
could occur.

This gets the controller working on big endian powerpc64.

Signed-off-by: Samuel Holland 
---

Changes since v1:
 - Include changes to use __le?? types in command structures
 - Use an object literal for the intermediate "schedulertime" value
 - Use local "error" variable to avoid repeated byte swapping
 - Create a local "length" variable to avoid very long lines
 - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines

I verified this patch with `make C=1`, and there were no warnings from
these files.
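
For readers unfamiliar with the conversions involved, here is a userspace
sketch of what cpu_to_le32()/le32_to_cpu() accomplish and of the compound
literal pattern mentioned in the changelog. The sketch_* names are
hypothetical stand-ins; the kernel helpers are implemented differently and
are additionally type-checked by sparse.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for __le32: a 32-bit value stored in little-endian byte order. */
typedef uint32_t sketch_le32;

/* Convert from host order to little-endian, independent of host endianness. */
static inline sketch_le32 sketch_cpu_to_le32(uint32_t v)
{
	uint8_t bytes[4] = {
		(uint8_t)v, (uint8_t)(v >> 8),
		(uint8_t)(v >> 16), (uint8_t)(v >> 24),
	};
	sketch_le32 out;

	memcpy(&out, bytes, sizeof(out));
	return out;
}

/* Convert a little-endian value back to host order. */
static inline uint32_t sketch_le32_to_cpu(sketch_le32 v)
{
	uint8_t bytes[4];

	memcpy(bytes, &v, sizeof(bytes));
	return (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) |
	       ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
}

/*
 * Copy a converted value into a byte buffer through a compound literal,
 * so no host-order variable is ever reused to hold swapped data.
 */
static inline void sketch_store_le32(uint8_t *dst, uint32_t value)
{
	memcpy(dst, &(sketch_le32){ sketch_cpu_to_le32(value) },
	       sizeof(sketch_le32));
}
```

Keeping host-order and little-endian values in differently typed variables
is what lets a checker flag a swapped value being passed where a host-order
one is expected.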

---
 drivers/scsi/3w-9xxx.c |  56 +
 drivers/scsi/3w-9xxx.h | 112 ++---
 2 files changed, 85 insertions(+), 83 deletions(-)

diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c
index 3337b1e80412..8f56beefa338 100644
--- a/drivers/scsi/3w-9xxx.c
+++ b/drivers/scsi/3w-9xxx.c
@@ -303,10 +303,10 @@ static int twa_aen_drain_queue(TW_Device_Extension *tw_dev, int no_check_reset)
 
/* Initialize sglist */
memset(&sglist, 0, sizeof(TW_SG_Entry));
-   sglist[0].length = TW_SECTOR_SIZE;
-   sglist[0].address = tw_dev->generic_buffer_phys[request_id];
+   sglist[0].length = cpu_to_le32(TW_SECTOR_SIZE);
+   sglist[0].address = TW_CPU_TO_SGL(tw_dev->generic_buffer_phys[request_id]);
 
-   if (sglist[0].address & TW_ALIGNMENT_9000_SGL) {
+   if (tw_dev->generic_buffer_phys[request_id] & TW_ALIGNMENT_9000_SGL) {
TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1, "Found unaligned address during AEN drain");
goto out;
}
@@ -440,8 +440,8 @@ static int twa_aen_read_queue(TW_Device_Extension *tw_dev, int request_id)
 
/* Initialize sglist */
memset(&sglist, 0, sizeof(TW_SG_Entry));
-   sglist[0].length = TW_SECTOR_SIZE;
-   sglist[0].address = tw_dev->generic_buffer_phys[request_id];
+   sglist[0].length = cpu_to_le32(TW_SECTOR_SIZE);
+   sglist[0].address = TW_CPU_TO_SGL(tw_dev->generic_buffer_phys[request_id]);
 
/* Mark internal command */
tw_dev->srb[request_id] = NULL;
@@ -501,9 +501,8 @@ static void twa_aen_sync_time(TW_Device_Extension *tw_dev, int request_id)
Sunday 12:00AM */
local_time = (ktime_get_real_seconds() - (sys_tz.tz_minuteswest * 60));
div_u64_rem(local_time - (3 * 86400), 604800, &schedulertime);
-   schedulertime = cpu_to_le32(schedulertime % 604800);
 
-   memcpy(param->data, &schedulertime, sizeof(u32));
+   memcpy(param->data, &(__le32){cpu_to_le32(schedulertime)}, sizeof(__le32));
 
/* Mark internal command */
tw_dev->srb[request_id] = NULL;
@@ -1000,19 +999,13 @@ static int twa_fill_sense(TW_Device_Extension *tw_dev, int request_id, int copy_
if (print_host)
printk(KERN_WARNING "3w-9xxx: scsi%d: ERROR: (0x%02X:0x%04X): %s:%s.\n",
   tw_dev->host->host_no,
-  TW_MESSAGE_SOURCE_CONTROLLER_ERROR,
-  full_command_packet->header.status_block.error,
-  error_str[0] == '\0' ?
-  twa_string_lookup(twa_error_table,
-  full_command_packet->header.status_block.error) : error_str,
+  TW_MESSAGE_SOURCE_CONTROLLER_ERROR, error,
+  error_str[0] ? error_str : twa_string_lookup(twa_error_table, error),
   full_command_packet->header.err_specific_desc);
else
printk(KERN_WARNING "3w-9xxx: ERROR: (0x%02X:0x%04X): %s:%s.\n",
-  TW_MESSAGE_SOURCE_CONTROLLER_ERROR,
-  full_command_packet->header.status_block.error,
-  error_str[0] == '\0' ?
-  twa_string_lookup(twa_error_table,
-  full_command_packet->header.status_block.error) : error_str,
+  TW_MESSAGE_SOURCE_CONTROLLER_ERROR, error,
+  error_str[0] ? error_str : twa_string_lookup(twa_error_table, error),
   full_command_packet->header.err_specific_desc);
}
 
@@ -1129,12 +1122,11 @@ static int twa_initconnection(TW_Device_Extension *tw_dev, int message_credits,
tw_initconnect->opcode__reserved = TW_OPRES_IN(0, TW_OP_INIT_CONNECTION);
tw_initconnect->request_id = request_id;
tw_initconnect->message_credits = cpu_to_le16(message_credits);
-   tw_initconnect->features = set_features;
 
/* Turn on 64-bit sgl support if we need to */
-