Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-21 Thread Leon Romanovsky
On Wed, Jun 17, 2020 at 03:23:00PM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 17, 2020 at 08:17:39AM +0300, Leon Romanovsky wrote:
> >
> > My thought is that everything here hints that the state machine and
> > locking are implemented wrongly. In an ideal world, the expectation
> > is that the REQ message will have a state in it (PREPARED, SENT, ACK,
> > etc.) and list manipulations are done accordingly under proper
> > locks, while rdma_nl_multicast() is done outside of the locks.
>
> It can't be done outside the lock without creating races - once
> rdma_nl_multicast happens it is possible for the other leg of the
> operation to begin processing.

It means that the state machine is wrong, or at least not complete.
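
For illustration only, a minimal sketch of the explicit per-request state
described above might look like the following; the enum and its values are
hypothetical, not taken from the existing sa_query.c:

enum ib_nl_req_state {
	IB_NL_REQ_PREPARED,	/* skb built, request not yet on the wire */
	IB_NL_REQ_SENT,		/* rdma_nl_multicast() succeeded */
	IB_NL_REQ_ACKED,	/* response received from ibacm */
};

With such a field guarded by ib_nl_request_lock, the timeout and response
handlers could distinguish a not-yet-sent request from a sent one, and
rdma_nl_multicast() itself could stay outside the lock.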

>
> The list must be updated before this happens.
>
> What is missing here is refcounting - the lifetime model of this data
> is too implicit, but it is not worth adding I think

I have the same feeling for now, but that will flip if new fixes are needed in this area.

Thanks

>
> Jason


Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-19 Thread Divya Indi
Hi Jason,

Thanks for taking the time to review!

On 6/17/20 11:24 AM, Jason Gunthorpe wrote:
> On Tue, Jun 16, 2020 at 10:56:53AM -0700, Divya Indi wrote:
>> The other option might be to use GFP_NOWAIT conditionally, i.e. only
>> use GFP_NOWAIT when GFP_ATOMIC is not specified in gfp_mask, else use
>> GFP_ATOMIC. The eventual goal is to not have a blocking memory
>> allocation.
> This is probably safest for now, unless you can audit all callers and
> see if they can switch to GFP_NOWAIT as well

At present the callers passing GFP_ATOMIC appear to be in ipoib. It might
not be feasible to change them all to GFP_NOWAIT.

Will incorporate the review comments and send out v3 early next week.

Thanks,
Divya

>
> Jason


Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-17 Thread Jason Gunthorpe
On Tue, Jun 16, 2020 at 10:56:53AM -0700, Divya Indi wrote:
> The other option might be to use GFP_NOWAIT conditionally, i.e. only use
> GFP_NOWAIT when GFP_ATOMIC is not specified in gfp_mask, else use
> GFP_ATOMIC. The eventual goal is to not have a blocking memory allocation.

This is probably safest for now, unless you can audit all callers and
see if they can switch to GFP_NOWAIT as well
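
A minimal sketch of that conditional choice (the helper name is
hypothetical; the v3 patch itself simply passes GFP_NOWAIT unconditionally):

#include <linux/gfp.h>

/* Honor an explicit GFP_ATOMIC from the caller; otherwise fall back to
 * GFP_NOWAIT so the allocation never sleeps.
 */
static gfp_t ib_nl_pick_gfp(gfp_t gfp_mask)
{
	return (gfp_mask & GFP_ATOMIC) == GFP_ATOMIC ? GFP_ATOMIC : GFP_NOWAIT;
}

The netlink send would then become
rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, ib_nl_pick_gfp(gfp_mask)).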

Jason


Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-17 Thread Jason Gunthorpe
On Wed, Jun 17, 2020 at 08:17:39AM +0300, Leon Romanovsky wrote:
> 
> My thought is that everything here hints that the state machine and
> locking are implemented wrongly. In an ideal world, the expectation
> is that the REQ message will have a state in it (PREPARED, SENT, ACK,
> etc.) and list manipulations are done accordingly under proper
> locks, while rdma_nl_multicast() is done outside of the locks.

It can't be done outside the lock without creating races - once
rdma_nl_multicast happens it is possible for the other leg of the
operation to begin processing. 

The list must be updated before this happens.

What is missing here is refcounting - the lifetime model of this data
is too implicit, but it is not worth adding I think
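
For what it's worth, the conventional shape of such refcounting would be a
kref embedded in the query; a sketch under the assumption that struct
ib_sa_query could grow such a field (the names here are hypothetical):

#include <linux/kref.h>
#include <linux/slab.h>

struct ib_sa_query_demo {
	struct kref ref;	/* hypothetical; not in today's sa_query.c */
	/* ... existing ib_sa_query fields would live here ... */
};

/* Runs once the last reference drops, so neither the timeout path nor
 * the send path can free the query while the other still holds a ref.
 */
static void ib_sa_query_release(struct kref *kref)
{
	struct ib_sa_query_demo *query =
		container_of(kref, struct ib_sa_query_demo, ref);

	kfree(query);
}

Each leg (send, timeout, response) would kref_get(&query->ref) before
touching the query and kref_put(&query->ref, ib_sa_query_release) when
done, so neither side can free it under the other.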

Jason


Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-16 Thread Leon Romanovsky
On Tue, Jun 16, 2020 at 10:56:53AM -0700, Divya Indi wrote:
> Hi Leon,
>
> Please find my comments inline -
>
> On 6/13/20 11:41 PM, Leon Romanovsky wrote:
> > On Tue, Jun 09, 2020 at 07:45:21AM -0700, Divya Indi wrote:
> >> Hi Leon,
> >>
> >> Thanks for taking the time to review.
> >>
> >> Please find my comments inline -
> >>
> >> On 6/9/20 12:00 AM, Leon Romanovsky wrote:
> >>> On Mon, Jun 08, 2020 at 07:46:16AM -0700, Divya Indi wrote:
>  Commit 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list 
>  before sending")
>  -
>  1. Adds the query to the request list before ib_nl_send_msg.
>  2. Removes ib_nl_send_msg from within the spinlock which also makes it
>  possible to allocate memory with GFP_KERNEL.
> 
>  However, if there is a delay in sending out the request (for example,
>  a delay due to a low-memory situation), the timer to handle request
>  timeout might kick in before the request is sent out to ibacm via netlink.
>  ib_nl_request_timeout may then release the query, causing a use-after-free
>  when the query is accessed in ib_nl_send_msg.
> 
>  Call Trace for the above race:
> 
>  [] ? ib_pack+0x17b/0x240 [ib_core]
>  [] ib_sa_path_rec_get+0x181/0x200 [ib_sa]
>  [] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm]
>  [] ? cma_bind_port+0xa0/0xa0 [rdma_cm]
>  [] ? rds_rdma_cm_event_handler_cmn+0x850/0x850
>  [rds_rdma]
>  [] rds_rdma_cm_event_handler_cmn+0x22c/0x850
>  [rds_rdma]
>  [] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
>  [] addr_handler+0x9e/0x140 [rdma_cm]
>  [] process_req+0x134/0x190 [ib_addr]
>  [] process_one_work+0x169/0x4a0
>  [] worker_thread+0x5b/0x560
>  [] ? flush_delayed_work+0x50/0x50
>  [] kthread+0xcb/0xf0
>  [] ? __schedule+0x24a/0x810
>  [] ? __schedule+0x24a/0x810
>  [] ? kthread_create_on_node+0x180/0x180
>  [] ret_from_fork+0x47/0x90
>  [] ? kthread_create_on_node+0x180/0x180
>  
>  RIP  [] send_mad+0x33d/0x5d0 [ib_sa]
> 
>  To resolve the above issue -
>  1. Add the req to the request list only after the request has been sent out.
>  2. To handle the race where the response comes in before the request is
>  added to the request list, send (rdma_nl_multicast) and add it to the list
>  while holding the spinlock - request_lock.
>  3. Use GFP_NOWAIT for rdma_nl_multicast since it is called while holding
>  a spinlock. In case of memory allocation failure, the request will go out
>  to the SA instead.
> 
>  Signed-off-by: Divya Indi 
>  Fixes: 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
>  before sending")
> >>> Author SOB should be after "Fixes" line.
> >> My bad. Noted.
> >>
>  ---
>   drivers/infiniband/core/sa_query.c | 34 +++++++++++++++++-----------------
>   1 file changed, 17 insertions(+), 17 deletions(-)
> 
>  diff --git a/drivers/infiniband/core/sa_query.c 
>  b/drivers/infiniband/core/sa_query.c
>  index 74e0058..042c99b 100644
>  --- a/drivers/infiniband/core/sa_query.c
>  +++ b/drivers/infiniband/core/sa_query.c
>  @@ -836,6 +836,9 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
>  gfp_t gfp_mask)
>   void *data;
>   struct ib_sa_mad *mad;
>   int len;
>  +unsigned long flags;
>  +unsigned long delay;
>  +int ret;
> 
>   mad = query->mad_buf->mad;
>   len = ib_nl_get_path_rec_attrs_len(mad->sa_hdr.comp_mask);
>  @@ -860,35 +863,32 @@ static int ib_nl_send_msg(struct ib_sa_query 
>  *query, gfp_t gfp_mask)
>   /* Repair the nlmsg header length */
>   nlmsg_end(skb, nlh);
> 
>  -return rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, gfp_mask);
>  +spin_lock_irqsave(&ib_nl_request_lock, flags);
>  +ret = rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, GFP_NOWAIT);
> >>> It is hard to be convinced that this is correct solution. The mix of
> >>> gfp_flags and GFP_NOWAIT at the same time and usage of
> >>> ib_nl_request_lock to protect lists and suddenly rdma_nl_multicast() too
> >>> makes this code unreadable/non-maintainable.
> >> Prior to 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
> >> before sending"), we had ib_nl_send_msg under the spinlock 
> >> ib_nl_request_lock.
> >>
> >> ie we had -
> >>
> >> 1. Get spinlock - ib_nl_request_lock
> >> 2. ib_nl_send_msg
> >>2.a) rdma_nl_multicast
> >> 3. Add request to the req list
> >> 4. Arm the timer if needed.
> >> 5. Release spinlock
> >>
> >> However, ib_nl_send_msg involved a memory allocation using GFP_KERNEL
> >> and hence was moved out of the spinlock. In addition, the req was now
> >> added prior to ib_nl_send_msg [to handle the race where a response can
> >> come in before we get a chance to add the request back to the list].

Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-16 Thread Divya Indi
Hi Leon,

Please find my comments inline -

On 6/13/20 11:41 PM, Leon Romanovsky wrote:
> On Tue, Jun 09, 2020 at 07:45:21AM -0700, Divya Indi wrote:
>> Hi Leon,
>>
>> Thanks for taking the time to review.
>>
>> Please find my comments inline -
>>
>> On 6/9/20 12:00 AM, Leon Romanovsky wrote:
>>> On Mon, Jun 08, 2020 at 07:46:16AM -0700, Divya Indi wrote:
 Commit 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list 
 before sending")'
 -
 1. Adds the query to the request list before ib_nl_send_msg.
 2. Removes ib_nl_send_msg from within the spinlock which also makes it
 possible to allocate memory with GFP_KERNEL.

 However, if there is a delay in sending out the request (for example, a
 delay due to a low-memory situation), the timer to handle request timeout
 might kick in before the request is sent out to ibacm via netlink.
 ib_nl_request_timeout may then release the query, causing a use-after-free
 when the query is accessed in ib_nl_send_msg.

 Call Trace for the above race:

 [] ? ib_pack+0x17b/0x240 [ib_core]
 [] ib_sa_path_rec_get+0x181/0x200 [ib_sa]
 [] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm]
 [] ? cma_bind_port+0xa0/0xa0 [rdma_cm]
 [] ? rds_rdma_cm_event_handler_cmn+0x850/0x850
 [rds_rdma]
 [] rds_rdma_cm_event_handler_cmn+0x22c/0x850
 [rds_rdma]
 [] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
 [] addr_handler+0x9e/0x140 [rdma_cm]
 [] process_req+0x134/0x190 [ib_addr]
 [] process_one_work+0x169/0x4a0
 [] worker_thread+0x5b/0x560
 [] ? flush_delayed_work+0x50/0x50
 [] kthread+0xcb/0xf0
 [] ? __schedule+0x24a/0x810
 [] ? __schedule+0x24a/0x810
 [] ? kthread_create_on_node+0x180/0x180
 [] ret_from_fork+0x47/0x90
 [] ? kthread_create_on_node+0x180/0x180
 
 RIP  [] send_mad+0x33d/0x5d0 [ib_sa]

 To resolve the above issue -
 1. Add the req to the request list only after the request has been sent out.
 2. To handle the race where the response comes in before the request is
 added to the request list, send (rdma_nl_multicast) and add it to the list
 while holding the spinlock - request_lock.
 3. Use GFP_NOWAIT for rdma_nl_multicast since it is called while holding
 a spinlock. In case of memory allocation failure, the request will go out
 to the SA instead.

 Signed-off-by: Divya Indi 
 Fixes: 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
 before sending")
>>> Author SOB should be after "Fixes" line.
>> My bad. Noted.
>>
 ---
 drivers/infiniband/core/sa_query.c | 34 +++++++++++++++++-----------------
  1 file changed, 17 insertions(+), 17 deletions(-)

 diff --git a/drivers/infiniband/core/sa_query.c 
 b/drivers/infiniband/core/sa_query.c
 index 74e0058..042c99b 100644
 --- a/drivers/infiniband/core/sa_query.c
 +++ b/drivers/infiniband/core/sa_query.c
 @@ -836,6 +836,9 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
 gfp_t gfp_mask)
void *data;
struct ib_sa_mad *mad;
int len;
 +  unsigned long flags;
 +  unsigned long delay;
 +  int ret;

mad = query->mad_buf->mad;
len = ib_nl_get_path_rec_attrs_len(mad->sa_hdr.comp_mask);
 @@ -860,35 +863,32 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
 gfp_t gfp_mask)
/* Repair the nlmsg header length */
nlmsg_end(skb, nlh);

 -  return rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, gfp_mask);
 +  spin_lock_irqsave(&ib_nl_request_lock, flags);
 +  ret = rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, GFP_NOWAIT);
>>> It is hard to be convinced that this is correct solution. The mix of
>>> gfp_flags and GFP_NOWAIT at the same time and usage of
>>> ib_nl_request_lock to protect lists and suddenly rdma_nl_multicast() too
>>> makes this code unreadable/non-maintainable.
>> Prior to 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
>> before sending"), we had ib_nl_send_msg under the spinlock 
>> ib_nl_request_lock.
>>
>> ie we had -
>>
>> 1. Get spinlock - ib_nl_request_lock
>> 2. ib_nl_send_msg
>>  2.a) rdma_nl_multicast
>> 3. Add request to the req list
>> 4. Arm the timer if needed.
>> 5. Release spinlock
>>
>> However, ib_nl_send_msg involved a memory allocation using GFP_KERNEL
>> and hence was moved out of the spinlock. In addition, the req was now
>> added prior to ib_nl_send_msg [to handle the race where a response can
>> come in before we get a chance to add the request back to the list].
>>
>> This introduced another race resulting in a use-after-free [described
>> in the commit message].
>>
>> To resolve this, sending out the request and adding it to the list need
>> to happen while holding the request_lock.
>> To ensure minimum allocations while holding the lock, instead of having
>> the entire ib_nl_send_msg under the lock, we only have rdma_nl_multicast
>> under this spinlock.
>>
>> However, do you think it would be a good idea to split ib_nl_send_msg
>> into 2 functions -
>> 1. Prepare the req/query [Outside the spinlock]
>> 2. Sending the req - rdma_nl_multicast [while holding spinlock]
>>
>> Would this be more intuitive?

Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-14 Thread Leon Romanovsky
On Tue, Jun 09, 2020 at 07:45:21AM -0700, Divya Indi wrote:
> Hi Leon,
>
> Thanks for taking the time to review.
>
> Please find my comments inline -
>
> On 6/9/20 12:00 AM, Leon Romanovsky wrote:
> > On Mon, Jun 08, 2020 at 07:46:16AM -0700, Divya Indi wrote:
> >> Commit 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list 
> >> before sending")
> >> -
> >> 1. Adds the query to the request list before ib_nl_send_msg.
> >> 2. Removes ib_nl_send_msg from within the spinlock which also makes it
> >> possible to allocate memory with GFP_KERNEL.
> >>
> >> However, if there is a delay in sending out the request (for example, a
> >> delay due to a low-memory situation), the timer to handle request timeout
> >> might kick in before the request is sent out to ibacm via netlink.
> >> ib_nl_request_timeout may then release the query, causing a use-after-free
> >> when the query is accessed in ib_nl_send_msg.
> >>
> >> Call Trace for the above race:
> >>
> >> [] ? ib_pack+0x17b/0x240 [ib_core]
> >> [] ib_sa_path_rec_get+0x181/0x200 [ib_sa]
> >> [] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm]
> >> [] ? cma_bind_port+0xa0/0xa0 [rdma_cm]
> >> [] ? rds_rdma_cm_event_handler_cmn+0x850/0x850
> >> [rds_rdma]
> >> [] rds_rdma_cm_event_handler_cmn+0x22c/0x850
> >> [rds_rdma]
> >> [] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
> >> [] addr_handler+0x9e/0x140 [rdma_cm]
> >> [] process_req+0x134/0x190 [ib_addr]
> >> [] process_one_work+0x169/0x4a0
> >> [] worker_thread+0x5b/0x560
> >> [] ? flush_delayed_work+0x50/0x50
> >> [] kthread+0xcb/0xf0
> >> [] ? __schedule+0x24a/0x810
> >> [] ? __schedule+0x24a/0x810
> >> [] ? kthread_create_on_node+0x180/0x180
> >> [] ret_from_fork+0x47/0x90
> >> [] ? kthread_create_on_node+0x180/0x180
> >> 
> >> RIP  [] send_mad+0x33d/0x5d0 [ib_sa]
> >>
> >> To resolve the above issue -
> >> 1. Add the req to the request list only after the request has been sent out.
> >> 2. To handle the race where the response comes in before the request is
> >> added to the request list, send (rdma_nl_multicast) and add it to the list
> >> while holding the spinlock - request_lock.
> >> 3. Use GFP_NOWAIT for rdma_nl_multicast since it is called while holding
> >> a spinlock. In case of memory allocation failure, the request will go out
> >> to the SA instead.
> >>
> >> Signed-off-by: Divya Indi 
> >> Fixes: 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
> >> before sending")
> > Author SOB should be after "Fixes" line.
>
> My bad. Noted.
>
> >
> >> ---
> >>  drivers/infiniband/core/sa_query.c | 34 +++++++++++++++++-----------------
> >>  1 file changed, 17 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/core/sa_query.c 
> >> b/drivers/infiniband/core/sa_query.c
> >> index 74e0058..042c99b 100644
> >> --- a/drivers/infiniband/core/sa_query.c
> >> +++ b/drivers/infiniband/core/sa_query.c
> >> @@ -836,6 +836,9 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
> >> gfp_t gfp_mask)
> >>void *data;
> >>struct ib_sa_mad *mad;
> >>int len;
> >> +  unsigned long flags;
> >> +  unsigned long delay;
> >> +  int ret;
> >>
> >>mad = query->mad_buf->mad;
> >>len = ib_nl_get_path_rec_attrs_len(mad->sa_hdr.comp_mask);
> >> @@ -860,35 +863,32 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
> >> gfp_t gfp_mask)
> >>/* Repair the nlmsg header length */
> >>nlmsg_end(skb, nlh);
> >>
> >> -  return rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, gfp_mask);
> >> +  spin_lock_irqsave(&ib_nl_request_lock, flags);
> >> +  ret = rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, GFP_NOWAIT);
> > It is hard to be convinced that this is correct solution. The mix of
> > gfp_flags and GFP_NOWAIT at the same time and usage of
> > ib_nl_request_lock to protect lists and suddenly rdma_nl_multicast() too
> > makes this code unreadable/non-maintainable.
>
> Prior to 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
> before sending"), we had ib_nl_send_msg under the spinlock ib_nl_request_lock.
>
> ie we had -
>
> 1. Get spinlock - ib_nl_request_lock
> 2. ib_nl_send_msg
>   2.a) rdma_nl_multicast
> 3. Add request to the req list
> 4. Arm the timer if needed.
> 5. Release spinlock
>
> However, ib_nl_send_msg involved a memory allocation using GFP_KERNEL
> and hence was moved out of the spinlock. In addition, the req was now
> added prior to ib_nl_send_msg [to handle the race where a response can
> come in before we get a chance to add the request back to the list].
>
> This introduced another race resulting in a use-after-free [described
> in the commit message].
>
> To resolve this, sending out the request and adding it to the list need
> to happen while holding the request_lock.
> To ensure minimum allocations while holding the lock, instead of having
> the entire ib_nl_send_msg under the lock, we only have rdma_nl_multicast
> under this spinlock.
>
> However, do you think it would be a good idea to split ib_nl_send_msg
> into 2 functions -
> 1. Prepare the req/query [Outside the spinlock]
> 2. Sending the req - rdma_nl_multicast [while holding spinlock]
>
> Would this be more intuitive?

Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-09 Thread Divya Indi
Hi Leon,

Thanks for taking the time to review.

Please find my comments inline -

On 6/9/20 12:00 AM, Leon Romanovsky wrote:
> On Mon, Jun 08, 2020 at 07:46:16AM -0700, Divya Indi wrote:
>> Commit 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list 
>> before sending")
>> -
>> 1. Adds the query to the request list before ib_nl_send_msg.
>> 2. Removes ib_nl_send_msg from within the spinlock which also makes it
>> possible to allocate memory with GFP_KERNEL.
>>
>> However, if there is a delay in sending out the request (for example, a
>> delay due to a low-memory situation), the timer to handle request timeout
>> might kick in before the request is sent out to ibacm via netlink.
>> ib_nl_request_timeout may then release the query, causing a use-after-free
>> when the query is accessed in ib_nl_send_msg.
>>
>> Call Trace for the above race:
>>
>> [] ? ib_pack+0x17b/0x240 [ib_core]
>> [] ib_sa_path_rec_get+0x181/0x200 [ib_sa]
>> [] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm]
>> [] ? cma_bind_port+0xa0/0xa0 [rdma_cm]
>> [] ? rds_rdma_cm_event_handler_cmn+0x850/0x850
>> [rds_rdma]
>> [] rds_rdma_cm_event_handler_cmn+0x22c/0x850
>> [rds_rdma]
>> [] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
>> [] addr_handler+0x9e/0x140 [rdma_cm]
>> [] process_req+0x134/0x190 [ib_addr]
>> [] process_one_work+0x169/0x4a0
>> [] worker_thread+0x5b/0x560
>> [] ? flush_delayed_work+0x50/0x50
>> [] kthread+0xcb/0xf0
>> [] ? __schedule+0x24a/0x810
>> [] ? __schedule+0x24a/0x810
>> [] ? kthread_create_on_node+0x180/0x180
>> [] ret_from_fork+0x47/0x90
>> [] ? kthread_create_on_node+0x180/0x180
>> 
>> RIP  [] send_mad+0x33d/0x5d0 [ib_sa]
>>
>> To resolve the above issue -
>> 1. Add the req to the request list only after the request has been sent out.
>> 2. To handle the race where the response comes in before the request is
>> added to the request list, send (rdma_nl_multicast) and add it to the list
>> while holding the spinlock - request_lock.
>> 3. Use GFP_NOWAIT for rdma_nl_multicast since it is called while holding
>> a spinlock. In case of memory allocation failure, the request will go out
>> to the SA instead.
>>
>> Signed-off-by: Divya Indi 
>> Fixes: 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
>> before sending")
> Author SOB should be after "Fixes" line.

My bad. Noted.

>
>> ---
>>  drivers/infiniband/core/sa_query.c | 34 +++++++++++++++++-----------------
>>  1 file changed, 17 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/sa_query.c 
>> b/drivers/infiniband/core/sa_query.c
>> index 74e0058..042c99b 100644
>> --- a/drivers/infiniband/core/sa_query.c
>> +++ b/drivers/infiniband/core/sa_query.c
>> @@ -836,6 +836,9 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
>> gfp_t gfp_mask)
>>  void *data;
>>  struct ib_sa_mad *mad;
>>  int len;
>> +unsigned long flags;
>> +unsigned long delay;
>> +int ret;
>>
>>  mad = query->mad_buf->mad;
>>  len = ib_nl_get_path_rec_attrs_len(mad->sa_hdr.comp_mask);
>> @@ -860,35 +863,32 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
>> gfp_t gfp_mask)
>>  /* Repair the nlmsg header length */
>>  nlmsg_end(skb, nlh);
>>
>> -return rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, gfp_mask);
>> +spin_lock_irqsave(&ib_nl_request_lock, flags);
>> +ret = rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, GFP_NOWAIT);
> It is hard to be convinced that this is correct solution. The mix of
> gfp_flags and GFP_NOWAIT at the same time and usage of
> ib_nl_request_lock to protect lists and suddenly rdma_nl_multicast() too
> makes this code unreadable/non-maintainable.

Prior to 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
before sending"), we had ib_nl_send_msg under the spinlock ib_nl_request_lock.

ie we had -

1. Get spinlock - ib_nl_request_lock 
2. ib_nl_send_msg
2.a) rdma_nl_multicast
3. Add request to the req list
4. Arm the timer if needed.
5. Release spinlock
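
In code form, that older flow was roughly the following (a condensed
sketch for reference, not the exact pre-3ebd2fd0d011 source; the function
name is hypothetical):

static int ib_nl_make_request_old(struct ib_sa_query *query, gfp_t gfp_mask)
{
	unsigned long flags, delay;
	int ret;

	spin_lock_irqsave(&ib_nl_request_lock, flags);
	ret = ib_nl_send_msg(query, gfp_mask);	/* rdma_nl_multicast() inside */
	if (!ret) {
		delay = msecs_to_jiffies(sa_local_svc_timeout_ms);
		query->timeout = delay + jiffies;
		list_add_tail(&query->list, &ib_nl_request_list);
		/* Arm the timeout if this is the only pending request */
		if (ib_nl_request_list.next == &query->list)
			queue_delayed_work(ib_nl_wq, &ib_nl_timed_work, delay);
	}
	spin_unlock_irqrestore(&ib_nl_request_lock, flags);

	return ret;
}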

However, ib_nl_send_msg involved a memory allocation using GFP_KERNEL
and hence was moved out of the spinlock. In addition, the req was now
added prior to ib_nl_send_msg [to handle the race where a response can
come in before we get a chance to add the request back to the list].

This introduced another race resulting in a use-after-free [described
in the commit message].

To resolve this, sending out the request and adding it to the list need to
happen while holding the request_lock.
To ensure minimum allocations while holding the lock, instead of having
the entire ib_nl_send_msg under the lock, we only have rdma_nl_multicast
under this spinlock.

However, do you think it would be a good idea to split ib_nl_send_msg
into 2 functions -
1. Prepare the req/query [Outside the spinlock]
2. Sending the req - rdma_nl_multicast [while holding spinlock]

Would this be more intuitive? 
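
A rough sketch of that split, purely to illustrate the shape (function
names hypothetical, skb construction and error handling elided):

/* 1. Build the netlink message outside the lock; allocations here can
 *    still honor the caller's gfp_mask.
 */
static struct sk_buff *ib_nl_prepare_msg(struct ib_sa_query *query,
					 gfp_t gfp_mask);

/* 2. Send and enqueue atomically: rdma_nl_multicast() and the list add
 *    both happen under ib_nl_request_lock, so a response cannot arrive
 *    for a request that is not yet on the list.
 */
static int ib_nl_send_prepared_msg(struct ib_sa_query *query,
				   struct sk_buff *skb)
{
	unsigned long flags;
	int ret;

	spin_lock_irqsave(&ib_nl_request_lock, flags);
	ret = rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, GFP_NOWAIT);
	if (!ret)
		list_add_tail(&query->list, &ib_nl_request_list);
	spin_unlock_irqrestore(&ib_nl_request_lock, flags);

	return ret;
}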

>> +if (!ret) {
> Please use kernel coding style.
>
> if (ret) {
> 	spin_unlock_irqrestore(&ib_nl_request_lock, flags);
> 	return ret;
> }
>
>  

Noted. Will make the change.

Re: [PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-09 Thread Leon Romanovsky
On Mon, Jun 08, 2020 at 07:46:16AM -0700, Divya Indi wrote:
> Commit 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list before 
> sending")'
> -
> 1. Adds the query to the request list before ib_nl_send_msg.
> 2. Removes ib_nl_send_msg from within the spinlock which also makes it
> possible to allocate memory with GFP_KERNEL.
>
> However, if there is a delay in sending out the request (for example, a
> delay due to a low-memory situation), the timer to handle request timeout
> might kick in before the request is sent out to ibacm via netlink.
> ib_nl_request_timeout may then release the query, causing a use-after-free
> when the query is accessed in ib_nl_send_msg.
>
> Call Trace for the above race:
>
> [] ? ib_pack+0x17b/0x240 [ib_core]
> [] ib_sa_path_rec_get+0x181/0x200 [ib_sa]
> [] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm]
> [] ? cma_bind_port+0xa0/0xa0 [rdma_cm]
> [] ? rds_rdma_cm_event_handler_cmn+0x850/0x850
> [rds_rdma]
> [] rds_rdma_cm_event_handler_cmn+0x22c/0x850
> [rds_rdma]
> [] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
> [] addr_handler+0x9e/0x140 [rdma_cm]
> [] process_req+0x134/0x190 [ib_addr]
> [] process_one_work+0x169/0x4a0
> [] worker_thread+0x5b/0x560
> [] ? flush_delayed_work+0x50/0x50
> [] kthread+0xcb/0xf0
> [] ? __schedule+0x24a/0x810
> [] ? __schedule+0x24a/0x810
> [] ? kthread_create_on_node+0x180/0x180
> [] ret_from_fork+0x47/0x90
> [] ? kthread_create_on_node+0x180/0x180
> 
> RIP  [] send_mad+0x33d/0x5d0 [ib_sa]
>
> To resolve the above issue -
> 1. Add the req to the request list only after the request has been sent out.
> 2. To handle the race where the response comes in before the request is
> added to the request list, send (rdma_nl_multicast) and add it to the list
> while holding the spinlock - request_lock.
> 3. Use GFP_NOWAIT for rdma_nl_multicast since it is called while holding
> a spinlock. In case of memory allocation failure, the request will go out
> to the SA instead.
>
> Signed-off-by: Divya Indi 
> Fixes: 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
> before sending")

Author SOB should be after "Fixes" line.

> ---
>  drivers/infiniband/core/sa_query.c | 34 +++++++++++++++++-----------------
>  1 file changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/infiniband/core/sa_query.c 
> b/drivers/infiniband/core/sa_query.c
> index 74e0058..042c99b 100644
> --- a/drivers/infiniband/core/sa_query.c
> +++ b/drivers/infiniband/core/sa_query.c
> @@ -836,6 +836,9 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
> gfp_t gfp_mask)
>   void *data;
>   struct ib_sa_mad *mad;
>   int len;
> + unsigned long flags;
> + unsigned long delay;
> + int ret;
>
>   mad = query->mad_buf->mad;
>   len = ib_nl_get_path_rec_attrs_len(mad->sa_hdr.comp_mask);
> @@ -860,35 +863,32 @@ static int ib_nl_send_msg(struct ib_sa_query *query, 
> gfp_t gfp_mask)
>   /* Repair the nlmsg header length */
>   nlmsg_end(skb, nlh);
>
> - return rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, gfp_mask);
> + spin_lock_irqsave(&ib_nl_request_lock, flags);
> + ret = rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, GFP_NOWAIT);

It is hard to be convinced that this is correct solution. The mix of
gfp_flags and GFP_NOWAIT at the same time and usage of
ib_nl_request_lock to protect lists and suddenly rdma_nl_multicast() too
makes this code unreadable/non-maintainable.

> + if (!ret) {

Please use kernel coding style.

if (ret) {
	spin_unlock_irqrestore(&ib_nl_request_lock, flags);
	return ret;
}

 

> + /* Put the request on the list.*/
> + delay = msecs_to_jiffies(sa_local_svc_timeout_ms);
> + query->timeout = delay + jiffies;
> + list_add_tail(&query->list, &ib_nl_request_list);
> + /* Start the timeout if this is the only request */
> + if (ib_nl_request_list.next == &query->list)
> + queue_delayed_work(ib_nl_wq, &ib_nl_timed_work, delay);
> + }
> + spin_unlock_irqrestore(&ib_nl_request_lock, flags);
> +
> + return ret;
>  }
>
>  static int ib_nl_make_request(struct ib_sa_query *query, gfp_t gfp_mask)
>  {
> - unsigned long flags;
> - unsigned long delay;
>   int ret;
>
>   INIT_LIST_HEAD(&query->list);
>   query->seq = (u32)atomic_inc_return(&ib_nl_sa_request_seq);
>
> - /* Put the request on the list first.*/
> - spin_lock_irqsave(&ib_nl_request_lock, flags);
> - delay = msecs_to_jiffies(sa_local_svc_timeout_ms);
> - query->timeout = delay + jiffies;
> - list_add_tail(&query->list, &ib_nl_request_list);
> - /* Start the timeout if this is the only request */
> - if (ib_nl_request_list.next == &query->list)
> - queue_delayed_work(ib_nl_wq, &ib_nl_timed_work, delay);
> - spin_unlock_irqrestore(&ib_nl_request_lock, flags);
> -
>   ret = ib_nl_send_msg(query, gfp_mask);
>   if (ret) {
>   ret = -EIO;
> - /* Remove the request */
> - spin_lock_irqsave(&ib_nl_request_lock, flags);
> - list_del(&query->list);

[PATCH v3] IB/sa: Resolving use-after-free in ib_nl_send_msg

2020-06-08 Thread Divya Indi
Commit 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list before 
sending")'
-
1. Adds the query to the request list before ib_nl_send_msg.
2. Removes ib_nl_send_msg from within the spinlock which also makes it
possible to allocate memory with GFP_KERNEL.

However, if there is a delay in sending out the request (for example, a
delay due to a low-memory situation), the timer to handle request timeout
might kick in before the request is sent out to ibacm via netlink.
ib_nl_request_timeout may then release the query, causing a use-after-free
when the query is accessed in ib_nl_send_msg.

Call Trace for the above race:

[] ? ib_pack+0x17b/0x240 [ib_core]
[] ib_sa_path_rec_get+0x181/0x200 [ib_sa]
[] rdma_resolve_route+0x3c0/0x8d0 [rdma_cm]
[] ? cma_bind_port+0xa0/0xa0 [rdma_cm]
[] ? rds_rdma_cm_event_handler_cmn+0x850/0x850
[rds_rdma]
[] rds_rdma_cm_event_handler_cmn+0x22c/0x850
[rds_rdma]
[] rds_rdma_cm_event_handler+0x10/0x20 [rds_rdma]
[] addr_handler+0x9e/0x140 [rdma_cm]
[] process_req+0x134/0x190 [ib_addr]
[] process_one_work+0x169/0x4a0
[] worker_thread+0x5b/0x560
[] ? flush_delayed_work+0x50/0x50
[] kthread+0xcb/0xf0
[] ? __schedule+0x24a/0x810
[] ? __schedule+0x24a/0x810
[] ? kthread_create_on_node+0x180/0x180
[] ret_from_fork+0x47/0x90
[] ? kthread_create_on_node+0x180/0x180

RIP  [] send_mad+0x33d/0x5d0 [ib_sa]

To resolve the above issue -
1. Add the req to the request list only after the request has been sent out.
2. To handle the race where the response comes in before the request is
added to the request list, send (rdma_nl_multicast) and add it to the list
while holding the spinlock - request_lock.
3. Use GFP_NOWAIT for rdma_nl_multicast since it is called while holding
a spinlock. In case of memory allocation failure, the request will go out
to the SA instead.

Signed-off-by: Divya Indi 
Fixes: 3ebd2fd0d011 ("IB/sa: Put netlink request into the request list
before sending")
---
 drivers/infiniband/core/sa_query.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 74e0058..042c99b 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -836,6 +836,9 @@ static int ib_nl_send_msg(struct ib_sa_query *query, gfp_t gfp_mask)
void *data;
struct ib_sa_mad *mad;
int len;
+   unsigned long flags;
+   unsigned long delay;
+   int ret;
 
mad = query->mad_buf->mad;
len = ib_nl_get_path_rec_attrs_len(mad->sa_hdr.comp_mask);
@@ -860,35 +863,32 @@ static int ib_nl_send_msg(struct ib_sa_query *query, gfp_t gfp_mask)
/* Repair the nlmsg header length */
nlmsg_end(skb, nlh);
 
-   return rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, gfp_mask);
+   spin_lock_irqsave(&ib_nl_request_lock, flags);
+   ret = rdma_nl_multicast(&init_net, skb, RDMA_NL_GROUP_LS, GFP_NOWAIT);
+   if (!ret) {
+   /* Put the request on the list.*/
+   delay = msecs_to_jiffies(sa_local_svc_timeout_ms);
+   query->timeout = delay + jiffies;
+   list_add_tail(&query->list, &ib_nl_request_list);
+   /* Start the timeout if this is the only request */
+   if (ib_nl_request_list.next == &query->list)
+   queue_delayed_work(ib_nl_wq, &ib_nl_timed_work, delay);
+   }
+   spin_unlock_irqrestore(&ib_nl_request_lock, flags);
+
+   return ret;
 }
 
 static int ib_nl_make_request(struct ib_sa_query *query, gfp_t gfp_mask)
 {
-   unsigned long flags;
-   unsigned long delay;
int ret;
 
INIT_LIST_HEAD(&query->list);
query->seq = (u32)atomic_inc_return(&ib_nl_sa_request_seq);
 
-   /* Put the request on the list first.*/
-   spin_lock_irqsave(&ib_nl_request_lock, flags);
-   delay = msecs_to_jiffies(sa_local_svc_timeout_ms);
-   query->timeout = delay + jiffies;
-   list_add_tail(&query->list, &ib_nl_request_list);
-   /* Start the timeout if this is the only request */
-   if (ib_nl_request_list.next == &query->list)
-   queue_delayed_work(ib_nl_wq, &ib_nl_timed_work, delay);
-   spin_unlock_irqrestore(&ib_nl_request_lock, flags);
-
ret = ib_nl_send_msg(query, gfp_mask);
if (ret) {
ret = -EIO;
-   /* Remove the request */
-   spin_lock_irqsave(&ib_nl_request_lock, flags);
-   list_del(&query->list);
-   spin_unlock_irqrestore(&ib_nl_request_lock, flags);
}
 
return ret;
-- 
1.8.3.1