Author: ken
Date: Mon Dec  2 19:57:39 2019
New Revision: 355299
URL: https://svnweb.freebsd.org/changeset/base/355299

Log:
  Fix a hang introduced in r351599.
  
  My changes in 351599 (kindly committed by avg) made the cd(4) media check
  asynchronous to avoid a sleep while holding a mutex.
  
  There was a difficult to reproduce bug with those changes that caused a
  hang on boot on some single processor machines/VMs.  Leandro Lupori
  managed to reproduce the bug, diagnose it, and supplied a patch!  Here is
  his analysis, from the PR:
  
  ======
  I was able to reproduce the problem described in comment#14.
  
  Actually, I wasn't trying to reproduce it, I just started seeing it a few
  weeks ago, in CURRENT.
  
  I can reproduce it consistently, by using QEMU to run a PowerPC64 VM with a
  single core/thread (-smp 1).
  
  It happens only when there is no media in the emulated CD-ROM, a device
  that QEMU adds by default, unless -nodefaults is specified in command line.
  
  I've debugged it and this is what I've found:
  
  1- After the CD probe is successful, GEOM will try to open the device,
  which will end up calling cdcheckmedia(), that sets CD state to
  CD_STATE_MEDIA_PREVENT.
  2- Next, scsi_prevent() is executed and succeeds, the CD_FLAG_DISC_LOCKED
  flag is set and CD state moves to CD_STATE_MEDIA_SIZE.
  3- Next, scsi_read_capacity() is executed and fails, state is set to
  CD_STATE_MEDIA_ALLOW, cdmediaprobedone() is called and wakes up
  cdcheckmedia().
  4- Then, when cdstart() is invoked to process CD_STATE_MEDIA_ALLOW, it
  first checks if CD_FLAG_DISC_LOCKED is set, and if so skips directly to
  CD_STATE_MEDIA_SIZE state. This will repeat the steps of bullet 3, entering
  an infinite MEDIA_SIZE command loop.
  
  When there is a least another core/thread, the GEOM thread that performed
  the initial cdopen() will get scheduled again, closing the CD device, that
  will call cdprevent(PR_ALLOW) that clears the CD_FLAG_DISC_LOCKED flag and
  breaks the loop.
  
  So, apparently, the problem is CD_STATE_MEDIA_ALLOW being skipped when
  CD_FLAG_DISC_LOCKED is set. If I understand correctly, in this case, the
  state should be advanced to CD_STATE_MEDIA size only when the current state
  is CD_STATE_MEDIA_PREVENT.
  =====
  
  PR:           kern/219857
  Submitted by: Leandro Lupori <leandro.lup...@gmail.com>
  MFC after:    1 week

Modified:
  head/sys/cam/scsi/scsi_cd.c

Modified: head/sys/cam/scsi/scsi_cd.c
==============================================================================
--- head/sys/cam/scsi/scsi_cd.c Mon Dec  2 19:57:20 2019        (r355298)
+++ head/sys/cam/scsi/scsi_cd.c Mon Dec  2 19:57:39 2019        (r355299)
@@ -1032,7 +1032,8 @@ cdstart(struct cam_periph *periph, union ccb *start_cc
                 * If the CD is already locked, we don't need to do this.
                 * Move on to the capacity check.
                 */
-               if ((softc->flags & CD_FLAG_DISC_LOCKED) != 0) {
+               if (softc->state == CD_STATE_MEDIA_PREVENT
+                && (softc->flags & CD_FLAG_DISC_LOCKED) != 0) {
                        softc->state = CD_STATE_MEDIA_SIZE;
                        xpt_release_ccb(start_ccb);
                        xpt_schedule(periph, CAM_PRIORITY_NORMAL);
_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to