Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-11 Thread Mike Belopuhov
On Fri, May 08, 2015 at 20:28 +0200, Mark Kettenis wrote:
  Date: Fri, 8 May 2015 20:15:58 +0200
  From: Mike Belopuhov m...@belopuhov.com
  
  On Fri, May 08, 2015 at 12:34 +0200, Mike Belopuhov wrote:
I think tsleep(9) and msleep(9) need to release and re-acquire the
kernel lock in the cold || panicstr case.
   
   Well, it's not hard to do really, but...
   
We might need this for
handling interrupts during autoconf as soon as we start distributing
interrupts over CPUs.
   
   
   ...cold might mean that interrupts are not ready yet.  So then we might
   need another flag for shutdown?
  
  This is what I have come up with.  Chunks were taken directly from
  mi_switch and it seems to do the job just fine.  Right now I'm not
  using any additional flags and it seems to work here.  I'll resume
  testing on Monday, but it looks fairly complete.  Any comments?
 
 Makes sense to me.  The msleep/tsleep code could be simplified to:
 
   if (__mp_lock_held(&kernel_lock)) {
           hold_count = __mp_release_all(&kernel_lock);
           __mp_acquire_count(&kernel_lock, hold_count);
   }


Indeed.

 I'm also wondering whether we should change __mp_release_all() to
 simply return 0 if the current CPU does not hold the lock, such that
 the __mp_lock_held() check isn't needed anymore.  But that's a
 separate issue.
 

It would also make __mp_release_all() safer, since it
wouldn't require an external check.
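
(A sketch of what such an early-return __mp_release_all() might look like;
the mpl_cpu/mpl_count field names are assumptions about the __mp_lock
layout, not code from the tree:)

int
__mp_release_all(struct __mp_lock *mpl)
{
	int rv;

	/* Not held by this CPU: nothing to release, report zero. */
	if (mpl->mpl_cpu != curcpu())
		return (0);

	rv = mpl->mpl_count;	/* remember the recursion depth */
	mpl->mpl_count = 0;
	mpl->mpl_cpu = NULL;	/* give up the lock */
	return (rv);
}

With that, the sleep paths could call __mp_release_all() and
__mp_acquire_count() unconditionally, provided __mp_acquire_count() also
treats a count of 0 as a no-op.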

I don't have any further input on this; the diff works fine here.

OK?

diff --git sys/kern/kern_synch.c sys/kern/kern_synch.c
index 03308b4..6f573fc 100644
--- sys/kern/kern_synch.c
+++ sys/kern/kern_synch.c
@@ -103,10 +103,13 @@ extern int safepri;
 int
 tsleep(const volatile void *ident, int priority, const char *wmesg, int timo)
 {
struct sleep_state sls;
int error, error1;
+#ifdef MULTIPROCESSOR
+   int hold_count;
+#endif
 
 	KASSERT((priority & ~(PRIMASK | PCATCH)) == 0);
 
 #ifdef MULTIPROCESSOR
 	KASSERT(timo || __mp_lock_held(&kernel_lock));
@@ -120,10 +123,16 @@ tsleep(const volatile void *ident, int priority, const char *wmesg, int timo)
 * don't run any other procs or panic below,
 * in case this is the idle process and already asleep.
 */
s = splhigh();
splx(safepri);
+#ifdef MULTIPROCESSOR
+	if (__mp_lock_held(&kernel_lock)) {
+		hold_count = __mp_release_all(&kernel_lock);
+		__mp_acquire_count(&kernel_lock, hold_count);
+	}
+#endif
splx(s);
return (0);
}
 
 	sleep_setup(&sls, ident, priority, wmesg);
@@ -149,10 +158,13 @@ int
 msleep(const volatile void *ident, struct mutex *mtx, int priority,
 const char *wmesg, int timo)
 {
struct sleep_state sls;
int error, error1, spl;
+#ifdef MULTIPROCESSOR
+   int hold_count;
+#endif
 
 	KASSERT((priority & ~(PRIMASK | PCATCH | PNORELOCK)) == 0);
KASSERT(mtx != NULL);
 
if (cold || panicstr) {
@@ -163,10 +175,16 @@ msleep(const volatile void *ident, struct mutex *mtx, int priority,
 * in case this is the idle process and already asleep.
 */
spl = MUTEX_OLDIPL(mtx);
MUTEX_OLDIPL(mtx) = safepri;
mtx_leave(mtx);
+#ifdef MULTIPROCESSOR
+	if (__mp_lock_held(&kernel_lock)) {
+		hold_count = __mp_release_all(&kernel_lock);
+		__mp_acquire_count(&kernel_lock, hold_count);
+	}
+#endif
 	if ((priority & PNORELOCK) == 0) {
mtx_enter(mtx);
MUTEX_OLDIPL(mtx) = spl;
} else
splx(spl);
diff --git sys/kern/vfs_subr.c sys/kern/vfs_subr.c
index a26fbe2..a373789 100644
--- sys/kern/vfs_subr.c
+++ sys/kern/vfs_subr.c
@@ -1664,10 +1664,13 @@ int
 vfs_syncwait(int verbose)
 {
struct buf *bp;
int iter, nbusy, dcount, s;
struct proc *p;
+#ifdef MULTIPROCESSOR
+   int hold_count;
+#endif
 
 	p = curproc? curproc : &proc0;
sys_sync(p, (void *)0, (register_t *)0);
 
/* Wait for sync to finish. */
@@ -1698,11 +1701,21 @@ vfs_syncwait(int verbose)
}
if (nbusy == 0)
break;
if (verbose)
 	printf("%d ", nbusy);
+#ifdef MULTIPROCESSOR
+	if (__mp_lock_held(&kernel_lock))
+		hold_count = __mp_release_all(&kernel_lock);
+	else
+		hold_count = 0;
+#endif
DELAY(4 * iter);
+#ifdef MULTIPROCESSOR
+	if (hold_count)
+		__mp_acquire_count(&kernel_lock, hold_count);
+#endif
}
 
return nbusy;
 }
 



Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-11 Thread Mark Kettenis
 Date: Mon, 11 May 2015 16:54:54 +0200
 From: Mike Belopuhov m...@belopuhov.com
 
 On Fri, May 08, 2015 at 20:28 +0200, Mark Kettenis wrote:
   Date: Fri, 8 May 2015 20:15:58 +0200
   From: Mike Belopuhov m...@belopuhov.com
   
   On Fri, May 08, 2015 at 12:34 +0200, Mike Belopuhov wrote:
 I think tsleep(9) and msleep(9) need to release and re-acquire the
 kernel lock in the cold || panicstr case.

Well, it's not hard to do really, but...

 We might need this for
 handling interrupts during autoconf as soon as we start distributing
 interrupts over CPUs.


...cold might mean that interrupts are not ready yet.  So then we might
need another flag for shutdown?
   
   This is what I have come up with.  Chunks were taken directly from
   mi_switch and it seems to do the job just fine.  Right now I'm not
   using any additional flags and it seems to work here.  I'll resume
   testing on Monday, but it looks fairly complete.  Any comments?
  
  Makes sense to me.  The msleep/tsleep code could be simplified to:
  
  if (__mp_lock_held(&kernel_lock)) {
          hold_count = __mp_release_all(&kernel_lock);
          __mp_acquire_count(&kernel_lock, hold_count);
  }
 
 
 Indeed.
 
  I'm also wondering whether we should change __mp_release_all() to
  simply return 0 if the current CPU does not hold the lock, such that
  the __mp_lock_held() check isn't needed anymore.  But that's a
  separate issue.
  
 
 It would also make __mp_release_all() safer, since it
 wouldn't require an external check.
 
 I don't have any further input on this; the diff works fine here.
 
 OK?

ok kettenis@


Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-08 Thread Mark Kettenis
 Date: Thu, 7 May 2015 20:58:53 +0200
 From: Mike Belopuhov m...@belopuhov.com
 
 As I've pointed out before, on panic we can be running on any
 CPU and our disk controller's interrupts can interrupt on the
 other one.  Since we'll most likely be holding a kernel lock,
 dealing with unlocking it might get hairy very fast.  Instead
 what we could do to improve the chances of a clean shutdown on
 panic is to instruct our disk subsystem to do polled I/O that
  will be run on the same CPU as the panic.
 
 Initially I wanted to move cold = 1 earlier in boot(), but
 after talking to Miod, it started to look like a bad idea.
 
 Thoughts?

I'm not necessarily against forcing polled I/O in this case, but it
sucks a bit that we have to add checks in fairly low-level code to
accomplish this.  And I'm not sure if disk I/O is the only thing that
we should worry about here.

I think tsleep(9) and msleep(9) need to release and re-acquire the
kernel lock in the cold || panicstr case.  We might need this for
handling interrupts during autoconf as soon as we start distributing
interrupts over CPUs.

 diff --git sys/dev/ata/ata_wdc.c sys/dev/ata/ata_wdc.c
 index 1f52488..aea9ec1 100644
 --- sys/dev/ata/ata_wdc.c
 +++ sys/dev/ata/ata_wdc.c
 @@ -199,20 +199,22 @@ wd_hibernate_io(dev_t dev, daddr_t blkno, vaddr_t addr, size_t size, int op, voi
   */
  int
  wdc_ata_bio(struct ata_drive_datas *drvp, struct ata_bio *ata_bio)
  {
   struct wdc_xfer *xfer;
 	struct channel_softc *chp = drvp->chnl_softc;
  
   xfer = wdc_get_xfer(WDC_NOSLEEP);
   if (xfer == NULL)
   return WDC_TRY_AGAIN;
 +	if (panicstr)
 +		ata_bio->flags |= ATA_POLL;
 	if (ata_bio->flags & ATA_POLL)
 		xfer->c_flags |= C_POLL;
 	if (!(ata_bio->flags & ATA_POLL) &&
 	    (drvp->drive_flags & (DRIVE_DMA | DRIVE_UDMA)) &&
 	    (ata_bio->flags & ATA_SINGLE) == 0 &&
 	    (ata_bio->bcount > 512 ||
 	    (chp->wdc->quirks & WDC_QUIRK_NOSHORTDMA) == 0))
 		xfer->c_flags |= C_DMA;
 	xfer->drive = drvp->drive;
 	xfer->cmd = ata_bio;
 diff --git sys/scsi/scsi_base.c sys/scsi/scsi_base.c
 index 9cf6b45..3afcc29 100644
 --- sys/scsi/scsi_base.c
 +++ sys/scsi/scsi_base.c
 @@ -1267,20 +1267,22 @@ scsi_report_luns(struct scsi_link *sc_link, int selectreport,
   return (error);
  }
  
  void
  scsi_xs_exec(struct scsi_xfer *xs)
  {
 	xs->error = XS_NOERROR;
 	xs->resid = xs->datalen;
 	xs->status = 0;
 	CLR(xs->flags, ITSDONE);
 +	if (panicstr)
 +		SET(xs->flags, SCSI_AUTOCONF);
  
  #ifdef SCSIDEBUG
 	if (xs->sc_link->flags & SDEV_DB1) {
 		scsi_xs_show(xs);
 		if (xs->datalen && (xs->flags & SCSI_DATA_OUT))
 			scsi_show_mem(xs->data, min(64, xs->datalen));
   }
  #endif
  
   /* The adapter's scsi_cmd() is responsible for calling scsi_done(). */
 
 



Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-08 Thread Martin Pieuchot
On 07/05/15(Thu) 20:58, Mike Belopuhov wrote:
 As I've pointed out before, on panic we can be running on any
 CPU and our disk controller's interrupts can interrupt on the
 other one.  Since we'll most likely be holding a kernel lock,
 dealing with unlocking it might get hairy very fast.  Instead
 what we could do to improve the chances of a clean shutdown on
 panic is to instruct our disk subsystem to do polled I/O that
 will be run on the same CPU as the panic.

Did you consider executing ddb's boot commands on cpu0?  I mean doing
an implicit "machine ddbcpu 0" before executing any boot command?

 Initially I wanted to move cold = 1 earlier in boot(), but
 after talking to Miod, it started to look like a bad idea.
 
 Thoughts?
 



Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-08 Thread Mike Belopuhov
On 8 May 2015 at 11:43, Mark Kettenis mark.kette...@xs4all.nl wrote:
 Date: Thu, 7 May 2015 20:58:53 +0200
 From: Mike Belopuhov m...@belopuhov.com

 As I've pointed out before, on panic we can be running on any
 CPU and our disk controller's interrupts can interrupt on the
 other one.  Since we'll most likely be holding a kernel lock,
 dealing with unlocking it might get hairy very fast.  Instead
 what we could do to improve the chances of a clean shutdown on
 panic is to instruct our disk subsystem to do polled I/O that
 will be run on the same CPU as the panic.

 Initially I wanted to move cold = 1 earlier in boot(), but
 after talking to Miod, it started to look like a bad idea.

 Thoughts?

 I'm not necessarily against forcing polled I/O in this case, but it
 sucks a bit that we have to add checks in fairly low-level code to
 accomplish this.

Yes, but it's the place where we decide what kind of I/O
we want to do.  The SCSI I/O path in the controller or the stack never
falls back to polled I/O (unless it's a specific method, i.e. sddump).
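
(As a rough illustration of that "specific method" case, assuming the
scsi_xs_get()/scsi_xs_sync() helpers: a dump-style caller requests polled
I/O itself when it builds the transfer, so it never depends on the
interrupt path.  The function below is a sketch, not code from sd.c.)

int
polled_write_sketch(struct scsi_link *link, void *buf, size_t len)
{
	struct scsi_xfer *xs;
	int error;

	/* Force the polled, non-sleeping path up front. */
	xs = scsi_xs_get(link, SCSI_POLL | SCSI_NOSLEEP | SCSI_DATA_OUT);
	if (xs == NULL)
		return (ENOMEM);

	/* cdb setup elided; point the transfer at the caller's buffer */
	xs->data = buf;
	xs->datalen = len;

	error = scsi_xs_sync(xs);	/* runs to completion by polling */
	scsi_xs_put(xs);

	return (error);
}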

 And I'm not sure if disk I/O is the only thing that
 we should worry about here.


What else?

 I think tsleep(9) and msleep(9) need to release and re-acquire the
 kernel lock in the cold || panicstr case.

Well, it's not hard to do really, but...

 We might need this for
 handling interrupts during autoconf as soon as we start distributing
 interrupts over CPUs.


...cold might mean that interrupts are not ready yet.  So then we might
need another flag for shutdown?
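
(A purely hypothetical sketch of that idea: a separate flag set once
shutdown starts, so sleep paths could tell "shutting down, interrupts
still usable" apart from "cold, interrupts not set up yet".  The
shutting_down name is made up and none of this is in the posted diff.)

int	shutting_down;		/* hypothetical flag, name made up */

void
start_shutdown_sketch(void)
{
	/*
	 * Unlike the cold case, interrupts are still configured and
	 * delivered at this point, so sleep paths could safely drop
	 * and retake the kernel lock when they see this flag.
	 */
	shutting_down = 1;
}

tsleep()/msleep() would then test (cold || shutting_down || panicstr)
instead of relying on cold alone.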



Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-08 Thread Mike Belopuhov
On Fri, May 08, 2015 at 12:34 +0200, Mike Belopuhov wrote:
  I think tsleep(9) and msleep(9) need to release and re-acquire the
  kernel lock in the cold || panicstr case.
 
 Well, it's not hard to do really, but...
 
  We might need this for
  handling interrupts during autoconf as soon as we start distributing
  interrupts over CPUs.
 
 
 ...cold might mean that interrupts are not ready yet.  So then we might
 need another flag for shutdown?

This is what I have come up with.  Chunks were taken directly from
mi_switch and it seems to do the job just fine.  Right now I'm not
using any additional flags and it seems to work here.  I'll resume
testing on Monday, but it looks fairly complete.  Any comments?
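
(For reference, the mi_switch() chunk being borrowed looks roughly like
this; paraphrased rather than quoted from the tree.)

#ifdef MULTIPROCESSOR
	/*
	 * Release the kernel lock, as we are about to yield the CPU.
	 */
	if (__mp_lock_held(&kernel_lock))
		hold_count = __mp_release_all(&kernel_lock);
	else
		hold_count = 0;
#endif

	/* ... cpu_switchto() runs another process here ... */

#ifdef MULTIPROCESSOR
	/*
	 * Reacquire the kernel lock if we were holding it before.
	 */
	if (hold_count)
		__mp_acquire_count(&kernel_lock, hold_count);
#endif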

diff --git sys/kern/kern_synch.c sys/kern/kern_synch.c
index 03308b4..4089adf 100644
--- sys/kern/kern_synch.c
+++ sys/kern/kern_synch.c
@@ -103,10 +103,13 @@ extern int safepri;
 int
 tsleep(const volatile void *ident, int priority, const char *wmesg, int timo)
 {
struct sleep_state sls;
int error, error1;
+#ifdef MULTIPROCESSOR
+   int hold_count;
+#endif
 
 	KASSERT((priority & ~(PRIMASK | PCATCH)) == 0);
 
 #ifdef MULTIPROCESSOR
 	KASSERT(timo || __mp_lock_held(&kernel_lock));
@@ -120,10 +123,18 @@ tsleep(const volatile void *ident, int priority, const char *wmesg, int timo)
 * don't run any other procs or panic below,
 * in case this is the idle process and already asleep.
 */
s = splhigh();
splx(safepri);
+#ifdef MULTIPROCESSOR
+	if (__mp_lock_held(&kernel_lock))
+		hold_count = __mp_release_all(&kernel_lock);
+	else
+		hold_count = 0;
+	if (hold_count)
+		__mp_acquire_count(&kernel_lock, hold_count);
+#endif
splx(s);
return (0);
}
 
 	sleep_setup(&sls, ident, priority, wmesg);
@@ -149,10 +160,13 @@ int
 msleep(const volatile void *ident, struct mutex *mtx, int priority,
 const char *wmesg, int timo)
 {
struct sleep_state sls;
int error, error1, spl;
+#ifdef MULTIPROCESSOR
+   int hold_count;
+#endif
 
 	KASSERT((priority & ~(PRIMASK | PCATCH | PNORELOCK)) == 0);
KASSERT(mtx != NULL);
 
if (cold || panicstr) {
@@ -163,10 +177,18 @@ msleep(const volatile void *ident, struct mutex *mtx, int priority,
 * in case this is the idle process and already asleep.
 */
spl = MUTEX_OLDIPL(mtx);
MUTEX_OLDIPL(mtx) = safepri;
mtx_leave(mtx);
+#ifdef MULTIPROCESSOR
+	if (__mp_lock_held(&kernel_lock))
+		hold_count = __mp_release_all(&kernel_lock);
+	else
+		hold_count = 0;
+	if (hold_count)
+		__mp_acquire_count(&kernel_lock, hold_count);
+#endif
 	if ((priority & PNORELOCK) == 0) {
mtx_enter(mtx);
MUTEX_OLDIPL(mtx) = spl;
} else
splx(spl);
diff --git sys/kern/vfs_subr.c sys/kern/vfs_subr.c
index a26fbe2..a373789 100644
--- sys/kern/vfs_subr.c
+++ sys/kern/vfs_subr.c
@@ -1664,10 +1664,13 @@ int
 vfs_syncwait(int verbose)
 {
struct buf *bp;
int iter, nbusy, dcount, s;
struct proc *p;
+#ifdef MULTIPROCESSOR
+   int hold_count;
+#endif
 
 	p = curproc? curproc : &proc0;
sys_sync(p, (void *)0, (register_t *)0);
 
/* Wait for sync to finish. */
@@ -1698,11 +1701,21 @@ vfs_syncwait(int verbose)
}
if (nbusy == 0)
break;
if (verbose)
 	printf("%d ", nbusy);
+#ifdef MULTIPROCESSOR
+	if (__mp_lock_held(&kernel_lock))
+		hold_count = __mp_release_all(&kernel_lock);
+	else
+		hold_count = 0;
+#endif
DELAY(4 * iter);
+#ifdef MULTIPROCESSOR
+	if (hold_count)
+		__mp_acquire_count(&kernel_lock, hold_count);
+#endif
}
 
return nbusy;
 }
 



Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-08 Thread Mark Kettenis
 Date: Fri, 8 May 2015 20:15:58 +0200
 From: Mike Belopuhov m...@belopuhov.com
 
 On Fri, May 08, 2015 at 12:34 +0200, Mike Belopuhov wrote:
   I think tsleep(9) and msleep(9) need to release and re-acquire the
   kernel lock in the cold || panicstr case.
  
  Well, it's not hard to do really, but...
  
   We might need this for
   handling interrupts during autoconf as soon as we start distributing
   interrupts over CPUs.
  
  
  ...cold might mean that interrupts are not ready yet.  So then we might
  need another flag for shutdown?
 
 This is what I have come up with.  Chunks were taken directly from
 mi_switch and it seems to do the job just fine.  Right now I'm not
 using any additional flags and it seems to work here.  I'll resume
 testing on Monday, but it looks fairly complete.  Any comments?

Makes sense to me.  The msleep/tsleep code could be simplified to:

if (__mp_lock_held(&kernel_lock)) {
        hold_count = __mp_release_all(&kernel_lock);
        __mp_acquire_count(&kernel_lock, hold_count);
}

I'm also wondering whether we should change __mp_release_all() to
simply return 0 if the current CPU does not hold the lock, such that
the __mp_lock_held() check isn't needed anymore.  But that's a
separate issue.




Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-08 Thread Mark Kettenis
 From: Mike Belopuhov m...@belopuhov.com
 Date: Fri, 8 May 2015 13:53:56 +0200
 
 On 8 May 2015 at 12:37, Martin Pieuchot m...@openbsd.org wrote:
  On 07/05/15(Thu) 20:58, Mike Belopuhov wrote:
  As I've pointed out before, on panic we can be running on any
  CPU and our disk controller's interrupts can interrupt on the
  other one.  Since we'll most likely be holding a kernel lock,
  dealing with unlocking it might get hairy very fast.  Instead
  what we could do to improve the chances of a clean shutdown on
  panic is to instruct our disk subsystem to do polled I/O that
  will be run on the same CPU as the panic.
 
  Did you consider executing ddb's boot commands on cpu0?  I mean doing
  an implicit "machine ddbcpu 0" before executing any boot command?
 
 
 But panic() can't really do that.

And not all our architectures support the "machine ddbcpu" command.



Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-08 Thread Mike Belopuhov
On 8 May 2015 at 04:51, Masao Uebayashi uebay...@tombiinc.com wrote:
 By doing the complex VFS shutdown operation, the system's memory image will
 be modified a lot after the panic was triggered.  I'd skip vfs_shutdown()
 entirely after a panic [1], then do our best to dump a kernel core
 for analysis.

 [1] At the moment, OpenBSD's panic(9) sets RB_NOSYNC only when panicstr is
 already set (i.e. on a recursive panic()).

This is also a valid point of view, and in fact I was surprised
to see sys_sync called from vfs_syncwait, especially since it does
uvm_vnp_sync.  But on the other hand it has been there for 20
years in OpenBSD, and who knows how many years before that in
*BSD, and we have come to rely on this behavior, haven't we?



Re: vfs_shutdown would like to do polled I/O at least on panic

2015-05-07 Thread Masao Uebayashi
By doing the complex VFS shutdown operation, the system's memory image will
be modified a lot after the panic was triggered.  I'd skip vfs_shutdown()
entirely after a panic [1], then do our best to dump a kernel core
for analysis.

[1] At the moment, OpenBSD's panic(9) sets RB_NOSYNC only when panicstr is
already set (i.e. on a recursive panic()).
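
(A minimal sketch of the suggested behavior, assuming the usual
boot()/RB_NOSYNC/dumpsys() conventions; an illustration of the idea rather
than a proposed diff.)

void
boot_nosync_sketch(int howto)
{
	if (panicstr != NULL)
		howto |= RB_NOSYNC;	/* never sync/unmount after a panic */

	if ((howto & RB_NOSYNC) == 0)
		vfs_shutdown();		/* sync and unmount file systems */

	if ((howto & RB_DUMP) != 0)
		dumpsys();		/* preserve the memory image for analysis */

	/* ... halt or reset the machine ... */
}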