Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-03-01 Thread Martin K. Petersen

James,

>> Cc:  #4.4+

Ugh.

> Martin, can you fix this up with a rebase and I'll resync my tree?

Done and pushed.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-03-01 Thread James Bottomley
On Mon, 2018-02-12 at 10:28 -0800, Himanshu Madhani wrote:
[...]
> Cc:  #4.4+

This is the wrong stable tag, which would lead to stable not picking up
the patch automatically.  The correct stable address is

Cc:  #4.4+
                 ^^

Please be more careful in future.

Martin, can you fix this up with a rebase and I'll resync my tree?

Thanks,

James



Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-22 Thread Martin K. Petersen

Himanshu,

> Can you please queue this patch for 4.16-rc3

Applied to 4.16/scsi-fixes with the missing Fixes: tag. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-22 Thread Madhani, Himanshu
Hi Martin, 

> On Feb 14, 2018, at 11:58 AM, Madhani, Himanshu  
> wrote:
> 
> Hi John,
> 
>> On Feb 13, 2018, at 2:54 AM, John Garry  wrote:
>> 
>> On 12/02/2018 18:28, Himanshu Madhani wrote:
>>> This patch fixes NULL pointer crash due to active timer running
>>> for abort IOCB.
>>> 
 From crash dump analysis it was discoverd that get_next_timer_interrupt()
>>> encountered a corrupted entry on the timer list.
>>> 
>>> #9 [95e1f6f0fd40] page_fault at 914fe8f8
>>>   [exception RIP: get_next_timer_interrupt+440]
>>>   RIP: 90ea3088  RSP: 95e1f6f0fdf0  RFLAGS: 00010013
>>>   RAX: 95e1f6451028  RBX: 000218e2389e5f40  RCX: 0001232ad600
>>>   RDX: 0001  RSI: 95e1f6f0fdf0  RDI: 01232ad6
>>>   RBP: 95e1f6f0fe40   R8: 95e1f6451188   R9: 0001
>>>   R10: 0016  R11: 0016  R12: 0001232ad5f6
>>>   R13: 95e1f645  R14: 95e1f6f0fdf8  R15: 95e1f6f0fe10
>>>   ORIG_RAX:   CS: 0010  SS: 0018
>>> 
>>> Looking at the assembly of get_next_timer_interrupt(), address came
>>> from %r8 (95e1f6451188) which is pointing to list_head with single
>>> entry at 95e5ff621178.
>>> 
>>> 0x90ea307a :  mov(%r8),%rdx
>>> 0x90ea307d :  cmp%r8,%rdx
>>> 0x90ea3080 :  je 
>>> 0x90ea30a7 
>>> 0x90ea3082 :  nopw   
>>> 0x0(%rax,%rax,1)
>>> 0x90ea3088 :  testb  
>>> $0x1,0x18(%rdx)
>>> 
>>> crash> rd 95e1f6451188 10
>>> 95e1f6451188:  95e5ff621178 95e5ff621178   x.b.x.b.
>>> 95e1f6451198:  95e1f6451198 95e1f6451198   ..E...E.
>>> 95e1f64511a8:  95e1f64511a8 95e1f64511a8   ..E...E.
>>> 95e1f64511b8:  95e77cf509a0 95e77cf509a0   ...|...|
>>> 95e1f64511c8:  95e1f64511c8 95e1f64511c8   ..E...E.
>>> 
>>> crash> rd 95e5ff621178 10
>>> 95e5ff621178:  0001 95e15936aa00   ..6Y
>>> 95e5ff621188:      
>>> 95e5ff621198:  00a0 0010   
>>> 95e5ff6211a8:  95e5ff621198 000c   ..b.
>>> 95e5ff6211b8:  0f58 95e751f8d720   X... ..Q
>>> 
>>> 95e5ff621178 belongs to freed mempool object at 95e5ff621080.
>>> 
>>> CACHENAME OBJSIZE  ALLOCATED TOTAL  SLABS  
>>> SSIZE
>>> 95dc7fd74d00 mnt_cache384  19785 24948594   
>>>  16k
>>>  SLAB  MEMORYNODE  TOTAL  ALLOCATED  FREE
>>>  dc5dabfd8800  95e5ff62 1 42 2913
>>>  FREE / [ALLOCATED]
>>>   95e5ff621080  (cpu 6 cache)
>>> 
>>> Examining the contents of that memory reveals a pointer to a constant
>>> string in the driver, "abort\0", which is set by qla24xx_async_abort_cmd().
>>> 
>>> crash> rd c059277c 20
>>> c059277c:  6e490074726f6261 0074707572726574   abort.Interrupt.
>>> c059278c:  00676e696c6c6f50 6920726576697244   Polling.Driver i
>>> c059279c:  646f6d207325206e 6974736554000a65   n %s mode..Testi
>>> c05927ac:  636976656420676e 786c252074612065   ng device at %lx
>>> c05927bc:  6b63656843000a2e 646f727020676e69   ...Checking prod
>>> c05927cc:  6f20444920746375 0a2e706968632066   uct ID of chip..
>>> c05927dc:  5120646e756f4600 204130303232414c   .Found QLA2200A
>>> c05927ec:  43000a2e70696843 20676e696b636568   Chip...Checking
>>> c05927fc:  65786f626c69616d 6c636e69000a2e73   mailboxes...incl
>>> c059280c:  756e696c2f656475 616d2d616d642f78   ude/linux/dma-ma
>>> 
>>> crash> struct -ox srb_iocb
>>> struct srb_iocb {
>>>  union {
>>>  struct {...} logio;
>>>  struct {...} els_logo;
>>>  struct {...} tmf;
>>>  struct {...} fxiocb;
>>>  struct {...} abt;
>>>  struct ct_arg ctarg;
>>>  struct {...} mbx;
>>>  struct {...} nack;
>>>   [0x0 ] } u;
>>>   [0xb8] struct timer_list timer;
>>>   [0x108] void (*timeout)(void *);
>>> }
>>> SIZE: 0x110
>>> 
>>> crash> ! bc
>>> ibase=16
>>> obase=10
>>> B8+40
>>> F8
>>> 
>>> The object is a srb_t, and at offset 0xf8 within that structure
>>> (i.e. 95e5ff621080 + f8 -> 95e5ff621178) is a struct timer_list.
>>> 
>>> Cc:  #4.4+
>>> Signed-off-by: Himanshu Madhani 
>>> ---
>>> Hi Martin,
>>> 
>>> This patch addresses crash due to NULL pointer access because driver
>>> left active timer running for abort IOCB.
>>> 
>>> Please apply this patch to 4.16/scsi-fixes at your 

Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-14 Thread Madhani, Himanshu
Hi John,

> On Feb 13, 2018, at 2:54 AM, John Garry  wrote:
> 
> On 12/02/2018 18:28, Himanshu Madhani wrote:
>> This patch fixes NULL pointer crash due to active timer running
>> for abort IOCB.
>> 
>>> From crash dump analysis it was discoverd that get_next_timer_interrupt()
>> encountered a corrupted entry on the timer list.
>> 
>> #9 [95e1f6f0fd40] page_fault at 914fe8f8
>>[exception RIP: get_next_timer_interrupt+440]
>>RIP: 90ea3088  RSP: 95e1f6f0fdf0  RFLAGS: 00010013
>>RAX: 95e1f6451028  RBX: 000218e2389e5f40  RCX: 0001232ad600
>>RDX: 0001  RSI: 95e1f6f0fdf0  RDI: 01232ad6
>>RBP: 95e1f6f0fe40   R8: 95e1f6451188   R9: 0001
>>R10: 0016  R11: 0016  R12: 0001232ad5f6
>>R13: 95e1f645  R14: 95e1f6f0fdf8  R15: 95e1f6f0fe10
>>ORIG_RAX:   CS: 0010  SS: 0018
>> 
>> Looking at the assembly of get_next_timer_interrupt(), address came
>> from %r8 (95e1f6451188) which is pointing to list_head with single
>> entry at 95e5ff621178.
>> 
>> 0x90ea307a :  mov(%r8),%rdx
>> 0x90ea307d :  cmp%r8,%rdx
>> 0x90ea3080 :  je 
>> 0x90ea30a7 
>> 0x90ea3082 :  nopw   
>> 0x0(%rax,%rax,1)
>> 0x90ea3088 :  testb  
>> $0x1,0x18(%rdx)
>> 
>> crash> rd 95e1f6451188 10
>> 95e1f6451188:  95e5ff621178 95e5ff621178   x.b.x.b.
>> 95e1f6451198:  95e1f6451198 95e1f6451198   ..E...E.
>> 95e1f64511a8:  95e1f64511a8 95e1f64511a8   ..E...E.
>> 95e1f64511b8:  95e77cf509a0 95e77cf509a0   ...|...|
>> 95e1f64511c8:  95e1f64511c8 95e1f64511c8   ..E...E.
>> 
>> crash> rd 95e5ff621178 10
>> 95e5ff621178:  0001 95e15936aa00   ..6Y
>> 95e5ff621188:      
>> 95e5ff621198:  00a0 0010   
>> 95e5ff6211a8:  95e5ff621198 000c   ..b.
>> 95e5ff6211b8:  0f58 95e751f8d720   X... ..Q
>> 
>> 95e5ff621178 belongs to freed mempool object at 95e5ff621080.
>> 
>> CACHENAME OBJSIZE  ALLOCATED TOTAL  SLABS  
>> SSIZE
>> 95dc7fd74d00 mnt_cache384  19785 24948594
>> 16k
>>   SLAB  MEMORYNODE  TOTAL  ALLOCATED  FREE
>>   dc5dabfd8800  95e5ff62 1 42 2913
>>   FREE / [ALLOCATED]
>>95e5ff621080  (cpu 6 cache)
>> 
>> Examining the contents of that memory reveals a pointer to a constant
>> string in the driver, "abort\0", which is set by qla24xx_async_abort_cmd().
>> 
>> crash> rd c059277c 20
>> c059277c:  6e490074726f6261 0074707572726574   abort.Interrupt.
>> c059278c:  00676e696c6c6f50 6920726576697244   Polling.Driver i
>> c059279c:  646f6d207325206e 6974736554000a65   n %s mode..Testi
>> c05927ac:  636976656420676e 786c252074612065   ng device at %lx
>> c05927bc:  6b63656843000a2e 646f727020676e69   ...Checking prod
>> c05927cc:  6f20444920746375 0a2e706968632066   uct ID of chip..
>> c05927dc:  5120646e756f4600 204130303232414c   .Found QLA2200A
>> c05927ec:  43000a2e70696843 20676e696b636568   Chip...Checking
>> c05927fc:  65786f626c69616d 6c636e69000a2e73   mailboxes...incl
>> c059280c:  756e696c2f656475 616d2d616d642f78   ude/linux/dma-ma
>> 
>> crash> struct -ox srb_iocb
>> struct srb_iocb {
>>   union {
>>   struct {...} logio;
>>   struct {...} els_logo;
>>   struct {...} tmf;
>>   struct {...} fxiocb;
>>   struct {...} abt;
>>   struct ct_arg ctarg;
>>   struct {...} mbx;
>>   struct {...} nack;
>>[0x0 ] } u;
>>[0xb8] struct timer_list timer;
>>[0x108] void (*timeout)(void *);
>> }
>> SIZE: 0x110
>> 
>> crash> ! bc
>> ibase=16
>> obase=10
>> B8+40
>> F8
>> 
>> The object is a srb_t, and at offset 0xf8 within that structure
>> (i.e. 95e5ff621080 + f8 -> 95e5ff621178) is a struct timer_list.
>> 
>> Cc:  #4.4+
>> Signed-off-by: Himanshu Madhani 
>> ---
>> Hi Martin,
>> 
>> This patch addresses crash due to NULL pointer access because driver
>> left active timer running for abort IOCB.
>> 
>> Please apply this patch to 4.16/scsi-fixes at your earliest convenience.
>> 
>> Thanks,
>> Himanshu
>> ---
>> drivers/scsi/qla2xxx/qla_init.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/drivers/scsi/qla2xxx/qla_init.c 
>> 

Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-14 Thread Johannes Thumshirn
On Tue, 2018-02-13 at 18:13 +, Madhani, Himanshu wrote:
> Thanks for asking (I did not add fixes commit ID because it was pre
> 4.x kernel) 
> 
> Here’s commit id 4440e46d5db7b445a961a8849b2a31fa7fd1 which is
> fixed by this patch.

Thanks for the info. In general having a 
Fixes: 4440e46d5db7 ("[SCSI] qla2xxx: Add IOCB Abort command
asynchronous handling.")
line is extremly valuable as many distributions have automation in
place to check whether a fix for a backport arrived upstream.

Thanks,
Johannes

-- 
Johannes Thumshirn                                          Storage
jthu
msh...@suse.de                                +49 911 74053 689
SUSE
LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane
Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38
9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-13 Thread Madhani, Himanshu
Hi John, 

> On Feb 13, 2018, at 2:54 AM, John Garry  wrote:
> 
> On 12/02/2018 18:28, Himanshu Madhani wrote:
>> This patch fixes NULL pointer crash due to active timer running
>> for abort IOCB.
>> 
>>> From crash dump analysis it was discoverd that get_next_timer_interrupt()
>> encountered a corrupted entry on the timer list.
>> 
>> #9 [95e1f6f0fd40] page_fault at 914fe8f8
>>[exception RIP: get_next_timer_interrupt+440]
>>RIP: 90ea3088  RSP: 95e1f6f0fdf0  RFLAGS: 00010013
>>RAX: 95e1f6451028  RBX: 000218e2389e5f40  RCX: 0001232ad600
>>RDX: 0001  RSI: 95e1f6f0fdf0  RDI: 01232ad6
>>RBP: 95e1f6f0fe40   R8: 95e1f6451188   R9: 0001
>>R10: 0016  R11: 0016  R12: 0001232ad5f6
>>R13: 95e1f645  R14: 95e1f6f0fdf8  R15: 95e1f6f0fe10
>>ORIG_RAX:   CS: 0010  SS: 0018
>> 
>> Looking at the assembly of get_next_timer_interrupt(), address came
>> from %r8 (95e1f6451188) which is pointing to list_head with single
>> entry at 95e5ff621178.
>> 
>> 0x90ea307a :  mov(%r8),%rdx
>> 0x90ea307d :  cmp%r8,%rdx
>> 0x90ea3080 :  je 
>> 0x90ea30a7 
>> 0x90ea3082 :  nopw   
>> 0x0(%rax,%rax,1)
>> 0x90ea3088 :  testb  
>> $0x1,0x18(%rdx)
>> 
>> crash> rd 95e1f6451188 10
>> 95e1f6451188:  95e5ff621178 95e5ff621178   x.b.x.b.
>> 95e1f6451198:  95e1f6451198 95e1f6451198   ..E...E.
>> 95e1f64511a8:  95e1f64511a8 95e1f64511a8   ..E...E.
>> 95e1f64511b8:  95e77cf509a0 95e77cf509a0   ...|...|
>> 95e1f64511c8:  95e1f64511c8 95e1f64511c8   ..E...E.
>> 
>> crash> rd 95e5ff621178 10
>> 95e5ff621178:  0001 95e15936aa00   ..6Y
>> 95e5ff621188:      
>> 95e5ff621198:  00a0 0010   
>> 95e5ff6211a8:  95e5ff621198 000c   ..b.
>> 95e5ff6211b8:  0f58 95e751f8d720   X... ..Q
>> 
>> 95e5ff621178 belongs to freed mempool object at 95e5ff621080.
>> 
>> CACHENAME OBJSIZE  ALLOCATED TOTAL  SLABS  
>> SSIZE
>> 95dc7fd74d00 mnt_cache384  19785 24948594
>> 16k
>>   SLAB  MEMORYNODE  TOTAL  ALLOCATED  FREE
>>   dc5dabfd8800  95e5ff62 1 42 2913
>>   FREE / [ALLOCATED]
>>95e5ff621080  (cpu 6 cache)
>> 
>> Examining the contents of that memory reveals a pointer to a constant
>> string in the driver, "abort\0", which is set by qla24xx_async_abort_cmd().
>> 
>> crash> rd c059277c 20
>> c059277c:  6e490074726f6261 0074707572726574   abort.Interrupt.
>> c059278c:  00676e696c6c6f50 6920726576697244   Polling.Driver i
>> c059279c:  646f6d207325206e 6974736554000a65   n %s mode..Testi
>> c05927ac:  636976656420676e 786c252074612065   ng device at %lx
>> c05927bc:  6b63656843000a2e 646f727020676e69   ...Checking prod
>> c05927cc:  6f20444920746375 0a2e706968632066   uct ID of chip..
>> c05927dc:  5120646e756f4600 204130303232414c   .Found QLA2200A
>> c05927ec:  43000a2e70696843 20676e696b636568   Chip...Checking
>> c05927fc:  65786f626c69616d 6c636e69000a2e73   mailboxes...incl
>> c059280c:  756e696c2f656475 616d2d616d642f78   ude/linux/dma-ma
>> 
>> crash> struct -ox srb_iocb
>> struct srb_iocb {
>>   union {
>>   struct {...} logio;
>>   struct {...} els_logo;
>>   struct {...} tmf;
>>   struct {...} fxiocb;
>>   struct {...} abt;
>>   struct ct_arg ctarg;
>>   struct {...} mbx;
>>   struct {...} nack;
>>[0x0 ] } u;
>>[0xb8] struct timer_list timer;
>>[0x108] void (*timeout)(void *);
>> }
>> SIZE: 0x110
>> 
>> crash> ! bc
>> ibase=16
>> obase=10
>> B8+40
>> F8
>> 
>> The object is a srb_t, and at offset 0xf8 within that structure
>> (i.e. 95e5ff621080 + f8 -> 95e5ff621178) is a struct timer_list.
>> 
>> Cc:  #4.4+
>> Signed-off-by: Himanshu Madhani 
>> ---
>> Hi Martin,
>> 
>> This patch addresses crash due to NULL pointer access because driver
>> left active timer running for abort IOCB.
>> 
>> Please apply this patch to 4.16/scsi-fixes at your earliest convenience.
>> 
>> Thanks,
>> Himanshu
>> ---
>> drivers/scsi/qla2xxx/qla_init.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/drivers/scsi/qla2xxx/qla_init.c 
>> 

Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-13 Thread Madhani, Himanshu
Hi Johannes, 

> On Feb 13, 2018, at 2:38 AM, Johannes Thumshirn  wrote:
> 
> Thanks Himanshu,
> Reviewed-by: Johannes Thumshirn 
> 
> Do you happen to know which commit it actually fixes?
> 

Thanks for asking (I did not add fixes commit ID because it was pre 4.x kernel) 

Here’s commit id 4440e46d5db7b445a961a8849b2a31fa7fd1 which is fixed by 
this patch.

> -- 
> Johannes Thumshirn  Storage
> jthu
> msh...@suse.de+49 911 74053 689
> SUSE
> LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane
> Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38
> 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Thanks,
- Himanshu



Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-13 Thread John Garry

On 12/02/2018 18:28, Himanshu Madhani wrote:

This patch fixes NULL pointer crash due to active timer running
for abort IOCB.


From crash dump analysis it was discoverd that get_next_timer_interrupt()

encountered a corrupted entry on the timer list.

 #9 [95e1f6f0fd40] page_fault at 914fe8f8
[exception RIP: get_next_timer_interrupt+440]
RIP: 90ea3088  RSP: 95e1f6f0fdf0  RFLAGS: 00010013
RAX: 95e1f6451028  RBX: 000218e2389e5f40  RCX: 0001232ad600
RDX: 0001  RSI: 95e1f6f0fdf0  RDI: 01232ad6
RBP: 95e1f6f0fe40   R8: 95e1f6451188   R9: 0001
R10: 0016  R11: 0016  R12: 0001232ad5f6
R13: 95e1f645  R14: 95e1f6f0fdf8  R15: 95e1f6f0fe10
ORIG_RAX:   CS: 0010  SS: 0018

 Looking at the assembly of get_next_timer_interrupt(), address came
 from %r8 (95e1f6451188) which is pointing to list_head with single
 entry at 95e5ff621178.

 0x90ea307a :  mov(%r8),%rdx
 0x90ea307d :  cmp%r8,%rdx
 0x90ea3080 :  je 0x90ea30a7 

 0x90ea3082 :  nopw   0x0(%rax,%rax,1)
 0x90ea3088 :  testb  $0x1,0x18(%rdx)

 crash> rd 95e1f6451188 10
 95e1f6451188:  95e5ff621178 95e5ff621178   x.b.x.b.
 95e1f6451198:  95e1f6451198 95e1f6451198   ..E...E.
 95e1f64511a8:  95e1f64511a8 95e1f64511a8   ..E...E.
 95e1f64511b8:  95e77cf509a0 95e77cf509a0   ...|...|
 95e1f64511c8:  95e1f64511c8 95e1f64511c8   ..E...E.

 crash> rd 95e5ff621178 10
 95e5ff621178:  0001 95e15936aa00   ..6Y
 95e5ff621188:      
 95e5ff621198:  00a0 0010   
 95e5ff6211a8:  95e5ff621198 000c   ..b.
 95e5ff6211b8:  0f58 95e751f8d720   X... ..Q

 95e5ff621178 belongs to freed mempool object at 95e5ff621080.

 CACHENAME OBJSIZE  ALLOCATED TOTAL  SLABS  
SSIZE
 95dc7fd74d00 mnt_cache384  19785 24948594
16k
   SLAB  MEMORYNODE  TOTAL  ALLOCATED  FREE
   dc5dabfd8800  95e5ff62 1 42 2913
   FREE / [ALLOCATED]
95e5ff621080  (cpu 6 cache)

 Examining the contents of that memory reveals a pointer to a constant
 string in the driver, "abort\0", which is set by qla24xx_async_abort_cmd().

 crash> rd c059277c 20
 c059277c:  6e490074726f6261 0074707572726574   abort.Interrupt.
 c059278c:  00676e696c6c6f50 6920726576697244   Polling.Driver i
 c059279c:  646f6d207325206e 6974736554000a65   n %s mode..Testi
 c05927ac:  636976656420676e 786c252074612065   ng device at %lx
 c05927bc:  6b63656843000a2e 646f727020676e69   ...Checking prod
 c05927cc:  6f20444920746375 0a2e706968632066   uct ID of chip..
 c05927dc:  5120646e756f4600 204130303232414c   .Found QLA2200A
 c05927ec:  43000a2e70696843 20676e696b636568   Chip...Checking
 c05927fc:  65786f626c69616d 6c636e69000a2e73   mailboxes...incl
 c059280c:  756e696c2f656475 616d2d616d642f78   ude/linux/dma-ma

 crash> struct -ox srb_iocb
 struct srb_iocb {
   union {
   struct {...} logio;
   struct {...} els_logo;
   struct {...} tmf;
   struct {...} fxiocb;
   struct {...} abt;
   struct ct_arg ctarg;
   struct {...} mbx;
   struct {...} nack;
[0x0 ] } u;
[0xb8] struct timer_list timer;
[0x108] void (*timeout)(void *);
 }
 SIZE: 0x110

 crash> ! bc
 ibase=16
 obase=10
 B8+40
 F8

 The object is a srb_t, and at offset 0xf8 within that structure
 (i.e. 95e5ff621080 + f8 -> 95e5ff621178) is a struct timer_list.

Cc:  #4.4+
Signed-off-by: Himanshu Madhani 
---
Hi Martin,

This patch addresses crash due to NULL pointer access because driver
left active timer running for abort IOCB.

Please apply this patch to 4.16/scsi-fixes at your earliest convenience.

Thanks,
Himanshu
---
 drivers/scsi/qla2xxx/qla_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 2dea1129d396..04870621e712 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -1527,6 +1527,7 @@ qla24xx_abort_sp_done(void *ptr, int res)
srb_t *sp = ptr;
struct srb_iocb *abt = >u.iocb_cmd;

+   del_timer(>u.iocb_cmd.timer);


Hi,

I am not familiar with this 

Re: [PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-13 Thread Johannes Thumshirn
Thanks Himanshu,
Reviewed-by: Johannes Thumshirn 

Do you happen to know which commit it actually fixes?

-- 
Johannes Thumshirn                                          Storage
jthu
msh...@suse.de                                +49 911 74053 689
SUSE
LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane
Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38
9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


[PATCH] qla2xxx: Fix NULL pointer crash due to active timer for ABTS

2018-02-12 Thread Himanshu Madhani
This patch fixes NULL pointer crash due to active timer running
for abort IOCB.

>From crash dump analysis it was discoverd that get_next_timer_interrupt()
encountered a corrupted entry on the timer list.

 #9 [95e1f6f0fd40] page_fault at 914fe8f8
[exception RIP: get_next_timer_interrupt+440]
RIP: 90ea3088  RSP: 95e1f6f0fdf0  RFLAGS: 00010013
RAX: 95e1f6451028  RBX: 000218e2389e5f40  RCX: 0001232ad600
RDX: 0001  RSI: 95e1f6f0fdf0  RDI: 01232ad6
RBP: 95e1f6f0fe40   R8: 95e1f6451188   R9: 0001
R10: 0016  R11: 0016  R12: 0001232ad5f6
R13: 95e1f645  R14: 95e1f6f0fdf8  R15: 95e1f6f0fe10
ORIG_RAX:   CS: 0010  SS: 0018

 Looking at the assembly of get_next_timer_interrupt(), address came
 from %r8 (95e1f6451188) which is pointing to list_head with single
 entry at 95e5ff621178.

 0x90ea307a :  mov(%r8),%rdx
 0x90ea307d :  cmp%r8,%rdx
 0x90ea3080 :  je 
0x90ea30a7 
 0x90ea3082 :  nopw   0x0(%rax,%rax,1)
 0x90ea3088 :  testb  $0x1,0x18(%rdx)

 crash> rd 95e1f6451188 10
 95e1f6451188:  95e5ff621178 95e5ff621178   x.b.x.b.
 95e1f6451198:  95e1f6451198 95e1f6451198   ..E...E.
 95e1f64511a8:  95e1f64511a8 95e1f64511a8   ..E...E.
 95e1f64511b8:  95e77cf509a0 95e77cf509a0   ...|...|
 95e1f64511c8:  95e1f64511c8 95e1f64511c8   ..E...E.

 crash> rd 95e5ff621178 10
 95e5ff621178:  0001 95e15936aa00   ..6Y
 95e5ff621188:      
 95e5ff621198:  00a0 0010   
 95e5ff6211a8:  95e5ff621198 000c   ..b.
 95e5ff6211b8:  0f58 95e751f8d720   X... ..Q

 95e5ff621178 belongs to freed mempool object at 95e5ff621080.

 CACHENAME OBJSIZE  ALLOCATED TOTAL  SLABS  
SSIZE
 95dc7fd74d00 mnt_cache384  19785 24948594
16k
   SLAB  MEMORYNODE  TOTAL  ALLOCATED  FREE
   dc5dabfd8800  95e5ff62 1 42 2913
   FREE / [ALLOCATED]
95e5ff621080  (cpu 6 cache)

 Examining the contents of that memory reveals a pointer to a constant
 string in the driver, "abort\0", which is set by qla24xx_async_abort_cmd().

 crash> rd c059277c 20
 c059277c:  6e490074726f6261 0074707572726574   abort.Interrupt.
 c059278c:  00676e696c6c6f50 6920726576697244   Polling.Driver i
 c059279c:  646f6d207325206e 6974736554000a65   n %s mode..Testi
 c05927ac:  636976656420676e 786c252074612065   ng device at %lx
 c05927bc:  6b63656843000a2e 646f727020676e69   ...Checking prod
 c05927cc:  6f20444920746375 0a2e706968632066   uct ID of chip..
 c05927dc:  5120646e756f4600 204130303232414c   .Found QLA2200A
 c05927ec:  43000a2e70696843 20676e696b636568   Chip...Checking
 c05927fc:  65786f626c69616d 6c636e69000a2e73   mailboxes...incl
 c059280c:  756e696c2f656475 616d2d616d642f78   ude/linux/dma-ma

 crash> struct -ox srb_iocb
 struct srb_iocb {
   union {
   struct {...} logio;
   struct {...} els_logo;
   struct {...} tmf;
   struct {...} fxiocb;
   struct {...} abt;
   struct ct_arg ctarg;
   struct {...} mbx;
   struct {...} nack;
[0x0 ] } u;
[0xb8] struct timer_list timer;
[0x108] void (*timeout)(void *);
 }
 SIZE: 0x110

 crash> ! bc
 ibase=16
 obase=10
 B8+40
 F8

 The object is a srb_t, and at offset 0xf8 within that structure
 (i.e. 95e5ff621080 + f8 -> 95e5ff621178) is a struct timer_list.

Cc:  #4.4+
Signed-off-by: Himanshu Madhani 
---
Hi Martin, 

This patch addresses crash due to NULL pointer access because driver
left active timer running for abort IOCB.

Please apply this patch to 4.16/scsi-fixes at your earliest convenience.

Thanks,
Himanshu
---
 drivers/scsi/qla2xxx/qla_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index 2dea1129d396..04870621e712 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -1527,6 +1527,7 @@ qla24xx_abort_sp_done(void *ptr, int res)
srb_t *sp = ptr;
struct srb_iocb *abt = >u.iocb_cmd;
 
+   del_timer(>u.iocb_cmd.timer);
complete(>u.abt.comp);
 }
 
-- 
2.12.0