srp: Protect free_tx iu list from concurrent flows

Sagi Grimberg Thu, 27 Feb 2014 03:52:51 -0800

On 2/24/2014 5:38 PM, Bart Van Assche wrote:

On 02/24/14 15:30, Sagi Grimberg wrote:

From: Vu Pham <[email protected]>


srp_reconnect_rport() serializes calls of srp_rport_reconnect()
with srp_queuecommand(), srp_abort(), srp_reset_device(),
srp_reset_host() via rport->mutex and also blocks srp_queuecommand();
however, it cannot block scsi error handler commands (stu, tur).
This may introduce corruption in the free_tx IU list and in the IU itself.

Signed-off-by: Vu Pham <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
---
  drivers/infiniband/ulp/srp/ib_srp.c |    3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index b615135..656602b 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -859,6 +859,7 @@ static int srp_rport_reconnect(struct srp_rport *rport)
  {
        struct srp_target_port *target = rport->lld_data;
        int i, ret;
+       unsigned long flags;

        srp_disconnect_target(target);

        /*
@@ -882,9 +883,11 @@ static int srp_rport_reconnect(struct srp_rport *rport)
                srp_finish_req(target, req, DID_RESET << 16);
        }

+       spin_lock_irqsave(&target->lock, flags);
        INIT_LIST_HEAD(&target->free_tx);
        for (i = 0; i < target->queue_size; ++i)
                list_add(&target->tx_ring[i]->list, &target->free_tx);
+       spin_unlock_irqrestore(&target->lock, flags);

        if (ret == 0)
                ret = srp_connect_target(target);
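
The locking pattern in the hunk above can be modelled in user space. This
is only a sketch under stated assumptions, not the driver code: a pthread
mutex stands in for target->lock (the patch uses spin_lock_irqsave()), a
hand-rolled singly linked list stands in for the kernel list_head, and
struct iu, struct target, QUEUE_SIZE, target_init(), rebuild_free_tx() and
get_tx_iu() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_SIZE 4            /* hypothetical; stands in for target->queue_size */

struct iu {                     /* simplified stand-in for an SRP information unit */
    struct iu *next;
};

struct target {                 /* simplified stand-in for struct srp_target_port */
    pthread_mutex_t lock;       /* plays the role of target->lock */
    struct iu *free_tx;         /* head of the free TX IU list */
    struct iu tx_ring[QUEUE_SIZE];
};

static void target_init(struct target *t)
{
    assert(t != NULL);
    pthread_mutex_init(&t->lock, NULL);
    t->free_tx = NULL;
}

/* Reconnect path: reset and refill the free list while holding the lock,
 * so no other path can observe a half-built list. */
static void rebuild_free_tx(struct target *t)
{
    pthread_mutex_lock(&t->lock);
    t->free_tx = NULL;
    for (int i = 0; i < QUEUE_SIZE; ++i) {
        t->tx_ring[i].next = t->free_tx;
        t->free_tx = &t->tx_ring[i];
    }
    pthread_mutex_unlock(&t->lock);
}

/* Submission path: take one IU off the list under the same lock.
 * Returns NULL when the list is empty. */
static struct iu *get_tx_iu(struct target *t)
{
    pthread_mutex_lock(&t->lock);
    struct iu *iu = t->free_tx;
    if (iu)
        t->free_tx = iu->next;
    pthread_mutex_unlock(&t->lock);
    return iu;
}
```

Because both paths take the same lock, a caller of get_tx_iu() sees either
the old list or the fully rebuilt one. The model does not address whether
an IU handed out before the rebuild is still in use afterwards.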

Hello Sagi and Vu,

srp_rport_reconnect() should never get invoked concurrently with
srp_queuecommand() - see e.g. the "in_scsi_eh" variable in
srp_queuecommand(). Is the list corruption reproducible with the patch
mentioned in my reply to patch 1/3 ?

Thanks,

Bart.


I need to re-test this.

Regarding in_scsi_eh, can you still end up posting a send if you are in
an interrupt context? It's just that we have a *very* rare case (not easy
to reproduce) in RH6.5 where we end up posting on a just-destroyed QP
(a race right in between destroy_qp and the assignment of the new qp in
srp_create_target_ib).

We tested it with in_scsi_eh patch and it still happened.

As I see it, the SRP problems come in a distinct period when the rport is
in state BLOCKED. On one hand, all request processing is allowed (commands
are not failed), and on the other, the reconnect flow may be running
concurrently. Would it be acceptable to take the rport_mutex in
queue_command if the rport is in BLOCKED state?
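
To make the question concrete, here is a minimal user-space sketch of the
idea, assuming a pthread mutex in place of the rport mutex and C11 atomics
for the state. All names (struct rport, RPORT_RUNNING, RPORT_BLOCKED,
rport_init(), queue_command(), reconnect(), the posted counter) are
hypothetical and are not the ib_srp or scsi_transport_srp definitions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum { RPORT_RUNNING, RPORT_BLOCKED };  /* hypothetical state values */

struct rport {                  /* simplified stand-in for the SRP rport */
    pthread_mutex_t mutex;      /* plays the role of the rport mutex */
    atomic_int state;           /* RPORT_RUNNING or RPORT_BLOCKED */
    atomic_int posted;          /* counts "sends" posted by queue_command() */
};

static void rport_init(struct rport *r)
{
    assert(r != NULL);
    pthread_mutex_init(&r->mutex, NULL);
    atomic_init(&r->state, RPORT_RUNNING);
    atomic_init(&r->posted, 0);
}

/* Command submission: serialize against reconnect() only while the rport
 * is BLOCKED. Returns 1 if the mutex was taken, 0 otherwise. */
static int queue_command(struct rport *r)
{
    int blocked = atomic_load(&r->state) == RPORT_BLOCKED;

    if (blocked)
        pthread_mutex_lock(&r->mutex);
    atomic_fetch_add(&r->posted, 1);    /* stands in for posting the send */
    if (blocked)
        pthread_mutex_unlock(&r->mutex);
    return blocked;
}

/* Reconnect flow: runs entirely under the mutex and leaves the rport
 * in the RUNNING state. */
static void reconnect(struct rport *r)
{
    pthread_mutex_lock(&r->mutex);
    /* resources would be torn down and re-created here */
    atomic_store(&r->state, RPORT_RUNNING);
    pthread_mutex_unlock(&r->mutex);
}
```

Two caveats apply to this model. A command that reads RUNNING just before
the state changes to BLOCKED still proceeds without the mutex, so the check
alone does not close every window. And in the kernel a mutex may sleep, so
this could only be done where queue_command is allowed to sleep, which is
the interrupt-context concern raised above.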


Thoughts?

Sagi.
