------- Comment From heinz-werner_se...@de.ibm.com 2018-08-31 07:51 EDT-------
IBM bugzilla status -> closed, Fix Released by Canonical

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1780067

Title:
  zfcp: fix infinite iteration on ERP ready list

Status in Ubuntu on IBM z Systems:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  Please backport:
  commit fa89adba1941e4f3b213399b81732a5c12fd9131
      scsi: zfcp: fix infinite iteration on ERP ready list
      
      zfcp_erp_adapter_reopen() schedules blocking of all of the adapter's
      rports via zfcp_scsi_schedule_rports_block() and enqueues a reopen
      adapter ERP action via zfcp_erp_action_enqueue(). Both are separately
      processed asynchronously and concurrently.
      
      Blocking of rports is done in a kworker by zfcp_scsi_rport_work(). It
      calls zfcp_scsi_rport_block(), which then traces a DBF REC "scpdely" via
      zfcp_dbf_rec_trig().  zfcp_dbf_rec_trig() acquires the DBF REC spin lock
      and then iterates with list_for_each() over the adapter's ERP ready list
      without holding the ERP lock. This opens a race window in which the
      current list entry can be moved to another list, causing list_for_each()
      to iterate forever on the wrong list, as the erp_ready_head is never
      encountered as terminal condition.
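
The failure mode described above can be reproduced in a small userland sketch. This is not the zfcp code: the list helpers below are simplified stand-ins for the kernel's list primitives, `simulate_race()` and `steps_until_head()` are hypothetical names, and the step limit only exists so the "endless" walk terminates in a demo.

```c
#include <assert.h>

/* Minimal userland stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_del_entry(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry from its current list and add it right after head. */
static void list_move(struct list_head *entry, struct list_head *head)
{
	list_del_entry(entry);
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * Continue a list_for_each()-style walk from pos until head is reached.
 * Returns the number of steps taken, or -1 if head was not reached within
 * limit steps (the bounded stand-in for "iterates forever").
 */
static int steps_until_head(struct list_head *pos, struct list_head *head,
			    int limit)
{
	int steps = 0;

	while (pos != head) {
		if (steps++ >= limit)
			return -1;
		pos = pos->next;
	}
	return steps;
}

/*
 * Hypothetical scenario: one ERP action sits on the ready list and a reader
 * has just fetched it as its current entry. If move_during_walk is nonzero,
 * the entry is moved to the running list before the reader advances.
 */
int simulate_race(int move_during_walk)
{
	struct list_head ready, running, action;
	struct list_head *pos;

	list_init(&ready);
	list_init(&running);
	list_add_tail(&action, &ready);

	pos = ready.next;			/* reader now points at action */
	if (move_during_walk)
		list_move(&action, &running);	/* concurrent mover */

	return steps_until_head(pos, &ready, 1000);
}
```

Without the move, the walk reaches the ready list head after one step. With the move, the reader cycles between the action and the running list head, so the ready list head never shows up as the terminating condition.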
      
      Meanwhile the ERP action can be processed in the ERP thread by
      zfcp_erp_thread(). It calls zfcp_erp_strategy(), which acquires the ERP
      lock and then calls zfcp_erp_action_to_running() to move the ERP action
      from the ready to the running list.  zfcp_erp_action_to_running() can
      move the ERP action using list_move() just during the aforementioned
      race window. It then traces a REC RUN "erator1" via zfcp_dbf_rec_run().
      zfcp_dbf_rec_run() tries to acquire the DBF REC spin lock. If this is
      held by the infinitely looping kworker, it effectively spins forever.
      
      Example Sequence Diagram:
      
      Process                ERP Thread             rport_work
      -------------------    -------------------    -------------------
      zfcp_erp_adapter_reopen()
      zfcp_erp_adapter_block()
      zfcp_scsi_schedule_rports_block()
      lock ERP                                      zfcp_scsi_rport_work()
      zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER)
      list_add_tail() on ready                      !(rport_task==RPORT_ADD)
      wake_up() ERP thread                          zfcp_scsi_rport_block()
      zfcp_dbf_rec_trig()    zfcp_erp_strategy()    zfcp_dbf_rec_trig()
      unlock ERP                                    lock DBF REC
      zfcp_erp_wait()        lock ERP
      |                      zfcp_erp_action_to_running()
      |                                             list_for_each() ready
      |                      list_move()              current entry
      |                        ready to running
      |                      zfcp_dbf_rec_run()       endless loop over running
      |                      zfcp_dbf_rec_run_lvl()
      |                      lock DBF REC spins forever
      
      Any adapter recovery can trigger this, for example setting the device
      offline or rebooting.
      
      V4.9 commit 4eeaa4f3f1d6 ("zfcp: close window with unblocked rport
      during rport gone") introduced additional tracing of (un)blocking of
      rports. It missed that the adapter->erp_lock must be held when calling
      zfcp_dbf_rec_trig().
      
      This fix uses the approach formerly introduced by commit aa0fec62391c
      ("[SCSI] zfcp: Fix sparse warning by providing new entry in dbf") that was
      later removed by commit ae0904f60fab ("[SCSI] zfcp: Redesign of the debug
      tracing for recovery actions.").
      
      Introduce zfcp_dbf_rec_trig_lock(), a wrapper for zfcp_dbf_rec_trig() that
      acquires and releases the adapter->erp_lock for read.
      
      Reported-by: Sebastian Ott <seb...@linux.ibm.com>
      Signed-off-by: Jens Remus <jre...@linux.ibm.com>
      Fixes: 4eeaa4f3f1d6 ("zfcp: close window with unblocked rport during rport gone")
      Cc: <sta...@vger.kernel.org> # 2.6.32+
      Reviewed-by: Benjamin Block <bbl...@linux.vnet.ibm.com>
      Signed-off-by: Steffen Maier <ma...@linux.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.peter...@oracle.com>

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1780067/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
