Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Dag-Erling Smørgrav
Garrett Cooper gcoo...@freebsd.org writes:
 Agreed. Spinning down at reboot isn't smart and seems like a good way
 to kill a disk quicker.

*not* spinning down at halt is far worse.  Most modern disks are rated
for hundreds of thousands of load-unload cycles, but far fewer emergency
unloads (which is what happens when the drive loses power while still
spinning).

DES
-- 
Dag-Erling Smørgrav - d...@des.no
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: is vfs.lookup_shared unsafe in 7.3?

2010-09-16 Thread cronfy
 Hello,

 Trying to overtake high server load (sudden peaks of 15%us/85%sy, LA 
 40, very slow lstat() at these moments, looks like some kind of lock
 contention) I enabled vfs.lookup_shared=1 on two servers today. One is
 FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and other is
 FreeBSD-7.3 csup'ed and built Jul 16 2010.

 The server with more fresh kernel is running nice and does not show
 high load anymore. But on the second server it did not help. More,
 after a few hours of work with vfs.lookup_shared=1 I noticed processes
 stucked in ufs state. I tried to kill them with no luck. Disabling
 vfs.lookup_shared freezed the whole system.

 So, is vfs.lookup_shared=1 unsafe in 7.3? Did it become more stable
 between 16 Jul and 9 Sep (is it the reason why first system is still
 running?), or should I expect that it will freeze in a near time too?

 Thanks in advance!

 No, 7.3 has a bug that can cause these hangs that is probably made worse by
 vfs.lookup_shared=1, but can occur even if it is disabled.  You want
 these fixes applied (in order, one of them reverts part of another):

Thank you for the fix and for the explanation, that's exactly what I
wanted to know. Just to be sure: do these patches completely fix the
bug with hangs (even without vfs.lookup_shared=1)?

-- 
// cronfy
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Oliver Fromme

Alexander Best wrote:
  On Wed Sep 15 10, Oliver Fromme wrote:
   Warren Block wbl...@wonkity.com wrote:
[...]
8. Alexander Motin has an updated CAM version of the ATA system which 
will eventually replace the existing one.  In -CURRENT, anyway.  He was 
kind enough to look at my event handler.  My understanding is that he is 
looking at implementing the head parking/standby mechanism in that new 
code.
   
   The patch below will work with the new CAM ATA driver
   (i.e. ada(4) disks).  It adds a sysctl, so you can switch
   the spin-down off if you're going to just reboot:
   # sysctl kern.cam.ada.spindown_shutdown=0
  
  i haven't tested your patch yet, but i don't think deciding whether to spin
  down the hdd should be decided merely from the sysctl value.

It was the most simple and least intrusive way to introduce
some means to switch it on and off.  Of course there might
be better ways to do it.  You're welcome to submit your own
patch.

  the hdd should spindown when a shutdown has been issued and not spindown,
  if a reboot has been issued.

Right.  That's why my shutdown wrapper script sets the sysctl
to 0 when the -r option is present (I've got that wrapper
script for ages, for different reasons).

Also, there are cases where it is completely impossible to
decide automatically whether the disks should be spun down
or not.  For example, if the admin issues a shutdown -h
(halt), there's no way for the OS to know in advance whether
the admin is going to switch the machine off or reboot to
multi-user.  So there must be a way for the user to forcibly
enable/disable the spindown feature.  I think a sysctl is
the most appropriate way to do that, isn't it?

Actually, my plan is to have a mask of two bits for the
sysctl (the default value would be 3):

 - bit 0: enable (1) or disable (0) spindown
 - bit 1: automatic (1) or manual (0) setting

With the default setting (i.e. bit 1 == 1), at shutdown time
some facility would look at the reboot(2) howto flags and
then set bit 0 to either 0 or 1.

There are several ways where to handle that.  For example,
init(8) could be modified to pass the howto value to
rc.shutdown (which could be useful for other purposes, too).
Then a standard rc.d script could handle the spindown sysctl.
The advantage of that solution would be maximum flexibility,
because the actual logic is implemented in an rc.d script.

  deciding whether freebsd reboots or shuts down cannot be done from a script,
  since users might use the reboot or halt commands in which case (if i'm not
  mistaken) all shutdown scripts get skipped.

Right, which is why it is a rather bad idea to use halt(8)
or reboot(8), except in an emergency.  Actually I think the
manpages and handbook should strongly discourage it, and
recommend to use shutdown(8) or init(8) instead, both of
which send a signal to PID 1 by default, so rc.shutdown is
executed properly.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH  Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

If Java had true garbage collection, most programs
would delete themselves upon execution.
-- Robert Sewell
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: is vfs.lookup_shared unsafe in 7.3?

2010-09-16 Thread John Baldwin
On Thursday, September 16, 2010 3:53:47 am cronfy wrote:
  Hello,
 
  Trying to overtake high server load (sudden peaks of 15%us/85%sy, LA 
  40, very slow lstat() at these moments, looks like some kind of lock
  contention) I enabled vfs.lookup_shared=1 on two servers today. One is
  FreeBSD-7.3 kernel csup'ed and built Sep  9 2010 and other is
  FreeBSD-7.3 csup'ed and built Jul 16 2010.
 
  The server with more fresh kernel is running nice and does not show
  high load anymore. But on the second server it did not help. More,
  after a few hours of work with vfs.lookup_shared=1 I noticed processes
  stucked in ufs state. I tried to kill them with no luck. Disabling
  vfs.lookup_shared freezed the whole system.
 
  So, is vfs.lookup_shared=1 unsafe in 7.3? Did it become more stable
  between 16 Jul and 9 Sep (is it the reason why first system is still
  running?), or should I expect that it will freeze in a near time too?
 
  Thanks in advance!
 
  No, 7.3 has a bug that can cause these hangs that is probably made worse by
  vfs.lookup_shared=1, but can occur even if it is disabled.  You want
  these fixes applied (in order, one of them reverts part of another):
 
 Thank you for the fix and for the explanation, that's exactly what I
 wanted to know. Just to be sure: do these patches completely fix the
 bug with hangs (even without vfs.lookup_shared=1)?

Yes.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: odd issues with DDB vs GDB

2010-09-16 Thread John Baldwin
On Wednesday, September 15, 2010 8:01:19 pm Patrick Mahan wrote:
 All,
 
 I am trying to debug a system hang occurring on my HP Proliant G6 running 
 some of our
 kernel software.  I am seeing that under certain test loads, the system will 
 hang-up
 complete, no keyboard, no console, etc.  I suspect it is some of the kernel 
 code that
 I have inherited that contains a lot of locking (lots of data structure, each 
 having
 their own mutex lock (sleepable)).

You need to use 'kgdb' rather than 'gdb' on kernel.debug.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: traling whitespace in CFLAGS if make.conf:CPUTYPE is not defined/empty

2010-09-16 Thread John Baldwin
On Wednesday, September 15, 2010 9:01:20 pm Alexander Best wrote:
 hi there,
 
 after discovering PR #114082 i noticed that with CPUTYPE not being defined in
 make.conf, `make -VCFLAGS` reports a trailing whitespace for CFLAGS.
 the reason for this is that ${_CPUCFLAGS} gets added to CFLAGS even if it's
 empty.
 
 the following patch should take care of the problem. i also added the same
 logik to COPTFLAGS. although i wasn't able to trigger the trailing whitespace,
 it should still introduce a cleaner behaviour.

Does the trailing whitespace break anything?  In the past we have had a
non-empty default CPU CFLAGS (e.g. using '-mtune=pentiumpro' on i386 at one
point IIRC) which this change would break.  Unless the trailing whitespace
is causing non-cosmetic problems I'd probably just leave it as it is.

Also, if we were to go with this approach, I would not have changed
kern.pre.mk at all, but set both NO_CPU_CFLAGS and NO_CPU_COPTFLAGS in
bsd.cpu.mk when CPUTYPE was empty.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: traling whitespace in CFLAGS if make.conf:CPUTYPE is not defined/empty

2010-09-16 Thread Alexander Best
On Thu Sep 16 10, John Baldwin wrote:
 On Wednesday, September 15, 2010 9:01:20 pm Alexander Best wrote:
  hi there,
  
  after discovering PR #114082 i noticed that with CPUTYPE not being defined 
  in
  make.conf, `make -VCFLAGS` reports a trailing whitespace for CFLAGS.
  the reason for this is that ${_CPUCFLAGS} gets added to CFLAGS even if it's
  empty.
  
  the following patch should take care of the problem. i also added the same
  logik to COPTFLAGS. although i wasn't able to trigger the trailing 
  whitespace,
  it should still introduce a cleaner behaviour.
 
 Does the trailing whitespace break anything?  In the past we have had a
 non-empty default CPU CFLAGS (e.g. using '-mtune=pentiumpro' on i386 at one
 point IIRC) which this change would break.  Unless the trailing whitespace
 is causing non-cosmetic problems I'd probably just leave it as it is.

the PR claims that a few ports are having problems with trailing whitespaces
during ./configure, but personally i haven't experienced any problems.
however i don't use the port system a lot so i'm not really able to comment on
that.

cheers.
alex

 
 Also, if we were to go with this approach, I would not have changed
 kern.pre.mk at all, but set both NO_CPU_CFLAGS and NO_CPU_COPTFLAGS in
 bsd.cpu.mk when CPUTYPE was empty.
 
 -- 
 John Baldwin

-- 
a13x
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Tijl Coosemans
On Thursday 16 September 2010 10:41:07 Oliver Fromme wrote:
 Alexander Best wrote:
 On Wed Sep 15 10, Oliver Fromme wrote:
 The patch below will work with the new CAM ATA driver
 (i.e. ada(4) disks).  It adds a sysctl, so you can switch
 the spin-down off if you're going to just reboot:
 # sysctl kern.cam.ada.spindown_shutdown=0

 the hdd should spindown when a shutdown has been issued and not spindown,
 if a reboot has been issued.

 Right.  That's why my shutdown wrapper script sets the sysctl
 to 0 when the -r option is present (I've got that wrapper
 script for ages, for different reasons).
 
 Also, there are cases where it is completely impossible to
 decide automatically whether the disks should be spun down
 or not.  For example, if the admin issues a shutdown -h
 (halt), there's no way for the OS to know in advance whether
 the admin is going to switch the machine off or reboot to
 multi-user.  So there must be a way for the user to forcibly
 enable/disable the spindown feature.  I think a sysctl is
 the most appropriate way to do that, isn't it?

I would just spin down the disk in case of a halt. An unwanted spin
down is harmless compared to an emergency shutdown and usually the
intention is to power off rather than reboot.

Part of your patch modifies ada_shutdown. That function already gets
the reboot(2) howto flags passed to it, so you could test for
(howto  (RB_HALT | RB_POWEROFF)) != 0 before issuing the STANDBY
command. There's no need to make this more complicated with a sysctl
that can override this in my opinion.

Also command2 should be command1 in this line:

+   if (cgd-ident_data.support.command2  ATA_SUPPORT_POWERMGT)


signature.asc
Description: This is a digitally signed message part.


Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Oliver Fromme

Tijl Coosemans wrote:
  On Thursday 16 September 2010 10:41:07 Oliver Fromme wrote:
   Also, there are cases where it is completely impossible to
   decide automatically whether the disks should be spun down
   or not.  For example, if the admin issues a shutdown -h
   (halt), there's no way for the OS to know in advance whether
   the admin is going to switch the machine off or reboot to
   multi-user.  So there must be a way for the user to forcibly
   enable/disable the spindown feature.  I think a sysctl is
   the most appropriate way to do that, isn't it?
  
  I would just spin down the disk in case of a halt. An unwanted spin
  down is harmless compared to an emergency shutdown and usually the
  intention is to power off rather than reboot.

Is it?  When I intend to power-off, I use shutdown -p, not
shutdown -h.  Quite often (but not always) when I halt a
machine, I'm going to reboot to multi-user, not power off.

In that case I certainly wouldn't want to spin the drives
down and have them spun up immediately afterwards.  I don't
think that weartear caused by that procedure is completely
insignificant (although it's certainly less of a problem
than emergency unloads).

For that reason I definitely want to have a way to disable
the spindown function manually.

  Part of your patch modifies ada_shutdown. That function already gets
  the reboot(2) howto flags passed to it, so you could test for
  (howto  (RB_HALT | RB_POWEROFF)) != 0 before issuing the STANDBY
  command.

Right, good point.  I didn't notice because the shutdown
function in ad(4) doesn't get the howto flag, so I assumed
(without checking) that ada(4) doesn't get it either.

  There's no need to make this more complicated with a sysctl
  that can override this in my opinion.

I'm afraid I have to disagree (see above).  Apart from that,
there's nothing complicated at all about a sysctl.

  Also command2 should be command1 in this line:
  
  +   if (cgd-ident_data.support.command2  ATA_SUPPORT_POWERMGT)

Oops ...  You're right.  Thanks for pointing that out.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH  Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

I made up the term 'object-oriented', and I can tell you
I didn't have C++ in mind.
-- Alan Kay, OOPSLA '97
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Warren Block

On Thu, 16 Sep 2010, Alexander Best wrote:


On Wed Sep 15 10, Oliver Fromme wrote:

Warren Block wbl...@wonkity.com wrote:
 [...]
 8. Alexander Motin has an updated CAM version of the ATA system which
 will eventually replace the existing one.  In -CURRENT, anyway.  He was
 kind enough to look at my event handler.  My understanding is that he is
 looking at implementing the head parking/standby mechanism in that new
 code.

The patch below will work with the new CAM ATA driver
(i.e. ada(4) disks).  It adds a sysctl, so you can switch
the spin-down off if you're going to just reboot:
# sysctl kern.cam.ada.spindown_shutdown=0


i haven't tested your patch yet, but i don't think deciding whether to spin
down the hdd should be decided merely from the sysctl value.

the hdd should spindown when a shutdown has been issued and not spindown,
if a reboot has been issued.


It's been a while, but the problem I found when comparing the NetBSD 
code was that there didn't appear to be a way to tell from within the 
FreeBSD driver whether it was a shutdown or reboot.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Nathan Whitehorn
On 09/16/10 09:42, Warren Block wrote:
 On Thu, 16 Sep 2010, Alexander Best wrote:

 On Wed Sep 15 10, Oliver Fromme wrote:
 Warren Block wbl...@wonkity.com wrote:
  [...]
  8. Alexander Motin has an updated CAM version of the ATA system which
  will eventually replace the existing one.  In -CURRENT, anyway. 
 He was
  kind enough to look at my event handler.  My understanding is that
 he is
  looking at implementing the head parking/standby mechanism in that
 new
  code.

 The patch below will work with the new CAM ATA driver
 (i.e. ada(4) disks).  It adds a sysctl, so you can switch
 the spin-down off if you're going to just reboot:
 # sysctl kern.cam.ada.spindown_shutdown=0

 i haven't tested your patch yet, but i don't think deciding whether
 to spin
 down the hdd should be decided merely from the sysctl value.

 the hdd should spindown when a shutdown has been issued and not
 spindown,
 if a reboot has been issued.

 It's been a while, but the problem I found when comparing the NetBSD
 code was that there didn't appear to be a way to tell from within the
 FreeBSD driver whether it was a shutdown or reboot.

Register a shutdown event handler? The second argument can be tested
against RB_HALT to determine what is happening.
-Nathan
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Tijl Coosemans
On Thursday 16 September 2010 16:10:22 Oliver Fromme wrote:
 Tijl Coosemans wrote:
 I would just spin down the disk in case of a halt. An unwanted spin
 down is harmless compared to an emergency shutdown and usually the
 intention is to power off rather than reboot.
 
 Is it?  When I intend to power-off, I use shutdown -p, not
 shutdown -h.  Quite often (but not always) when I halt a
 machine, I'm going to reboot to multi-user, not power off.

Hmm, I suppose support for power off is ubiquitous nowadays. It used to
be that halt meant: bring the system in a state where we can safely cut
the power. In that case it makes sense to let halt spin down the disks.
If you intend to reboot why not explicitly reboot rather than halt?
Also, to go from single to multi user mode you can just exit(1) the
shell.

 In that case I certainly wouldn't want to spin the drives
 down and have them spun up immediately afterwards.  I don't
 think that weartear caused by that procedure is completely
 insignificant (although it's certainly less of a problem
 than emergency unloads).
 
 For that reason I definitely want to have a way to disable
 the spindown function manually.

Ok, I'm soft on the sysctl really, it wouldn't hurt anyone. Although,
if the intention is to just override the default behaviour at the time
of shutdown you might as well just add an option to halt(8). A don't
spin down disks option would fit in with the other options there.


signature.asc
Description: This is a digitally signed message part.


Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Oliver Fromme

Tijl Coosemans wrote:
  On Thursday 16 September 2010 16:10:22 Oliver Fromme wrote:
   Tijl Coosemans wrote:
I would just spin down the disk in case of a halt. An unwanted spin
down is harmless compared to an emergency shutdown and usually the
intention is to power off rather than reboot.
   
   Is it?  When I intend to power-off, I use shutdown -p, not
   shutdown -h.  Quite often (but not always) when I halt a
   machine, I'm going to reboot to multi-user, not power off.
  
  Hmm, I suppose support for power off is ubiquitous nowadays. It used to
  be that halt meant: bring the system in a state where we can safely cut
  the power. In that case it makes sense to let halt spin down the disks.
  If you intend to reboot why not explicitly reboot rather than halt?

For example, I use shutdown -h in order to swap disks that
are not hot-swappable, or other kind of hardware work that
can be done while the machine is switched on.

Of course, in that particular case the disk which is about
to be swapped out should be spun down, while the others
should not.  But that's not a problem because I can use
atacontrol(8) and camcontrol(8) to spin down a specific
disk drive manually.

  Also, to go from single to multi user mode you can just exit(1) the
  shell.

Yes, of course, that's a different matter.

I've updated the patch for ada(4).  It includes a bug fix
(command1 vs. command2) and uses the howto flags passed to
the shutdown function.  Thanks again for pointing these out.

Best regards
   Oliver


--- ata_da.c.orig   2010-05-23 18:16:33.0 +0200
+++ ata_da.c2010-09-16 17:21:10.0 +0200
@@ -42,6 +42,7 @@
 #include sys/eventhandler.h
 #include sys/malloc.h
 #include sys/cons.h
+#include sys/reboot.h
 #include geom/geom_disk.h
 #endif /* _KERNEL */
 
@@ -79,7 +80,8 @@
ADA_FLAG_CAN_TRIM   = 0x080,
ADA_FLAG_OPEN   = 0x100,
ADA_FLAG_SCTX_INIT  = 0x200,
-   ADA_FLAG_CAN_CFA= 0x400
+   ADA_FLAG_CAN_CFA= 0x400,
+   ADA_FLAG_CAN_POWERMGT   = 0x800
 } ada_flags;
 
 typedef enum {
@@ -180,6 +182,10 @@
 #defineADA_DEFAULT_SEND_ORDERED1
 #endif
 
+#ifndefADA_DEFAULT_SPINDOWN_SHUTDOWN
+#defineADA_DEFAULT_SPINDOWN_SHUTDOWN   1
+#endif
+
 /*
  * Most platforms map firmware geometry to actual, but some don't.  If
  * not overridden, default to nothing.
@@ -191,6 +197,7 @@
 static int ada_retry_count = ADA_DEFAULT_RETRY;
 static int ada_default_timeout = ADA_DEFAULT_TIMEOUT;
 static int ada_send_ordered = ADA_DEFAULT_SEND_ORDERED;
+static int ada_spindown_shutdown = ADA_DEFAULT_SPINDOWN_SHUTDOWN;
 
 SYSCTL_NODE(_kern_cam, OID_AUTO, ada, CTLFLAG_RD, 0,
 CAM Direct Access Disk driver);
@@ -203,6 +210,9 @@
 SYSCTL_INT(_kern_cam_ada, OID_AUTO, ada_send_ordered, CTLFLAG_RW,
ada_send_ordered, 0, Send Ordered Tags);
 TUNABLE_INT(kern.cam.ada.ada_send_ordered, ada_send_ordered);
+SYSCTL_INT(_kern_cam_ada, OID_AUTO, spindown_shutdown, CTLFLAG_RW,
+   ada_spindown_shutdown, 0, Spin down upon shutdown);
+TUNABLE_INT(kern.cam.ada.spindown_shutdown, ada_spindown_shutdown);
 
 /*
  * ADA_ORDEREDTAG_INTERVAL determines how often, relative
@@ -665,6 +675,8 @@
softc-flags |= ADA_FLAG_CAN_48BIT;
if (cgd-ident_data.support.command2  ATA_SUPPORT_FLUSHCACHE)
softc-flags |= ADA_FLAG_CAN_FLUSHCACHE;
+   if (cgd-ident_data.support.command1  ATA_SUPPORT_POWERMGT)
+   softc-flags |= ADA_FLAG_CAN_POWERMGT;
if (cgd-ident_data.satacapabilities  ATA_SUPPORT_NCQ 
cgd-inq_flags  SID_CmdQue)
softc-flags |= ADA_FLAG_CAN_NCQ;
@@ -1222,6 +1234,58 @@
 /*getcount_only*/0);
cam_periph_unlock(periph);
}
+
+   if (ada_spindown_shutdown == 0 ||
+   (howto  (RB_HALT | RB_POWEROFF)) == 0)
+   return;
+
+   DELAY(50);
+
+   TAILQ_FOREACH(periph, adadriver.units, unit_links) {
+   union ccb ccb;
+
+   /* If we paniced with lock held - not recurse here. */
+   if (cam_periph_owned(periph))
+   continue;
+   cam_periph_lock(periph);
+   softc = (struct ada_softc *)periph-softc;
+   /*
+* We only spin-down the drive if it is capable of it..
+*/
+   if ((softc-flags  ADA_FLAG_CAN_POWERMGT) == 0) {
+   cam_periph_unlock(periph);
+   continue;
+   }
+
+   /* XXX Hide this behind bootverbose? */
+   xpt_print(periph-path, spin-down\n);
+
+   xpt_setup_ccb(ccb.ccb_h, periph-path, CAM_PRIORITY_NORMAL);
+
+   ccb.ccb_h.ccb_state = ADA_CCB_DUMP;
+   cam_fill_ataio(ccb.ataio,
+   1,
+   adadone,
+  

Re: odd issues with DDB vs GDB

2010-09-16 Thread Patrick Mahan



John Baldwin wrote:

On Wednesday, September 15, 2010 8:01:19 pm Patrick Mahan wrote:

All,

I am trying to debug a system hang occurring on my HP Proliant G6 running some 
of our
kernel software.  I am seeing that under certain test loads, the system will 
hang-up
complete, no keyboard, no console, etc.  I suspect it is some of the kernel 
code that
I have inherited that contains a lot of locking (lots of data structure, each 
having
their own mutex lock (sleepable)).


You need to use 'kgdb' rather than 'gdb' on kernel.debug.



Doh! *-(

I'm so used to gdb even though I use kgdb for looking at crash dumps.

Thanks,

Patrick
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Questions about mutex implementation in kern/kern_mutex.c

2010-09-16 Thread Andrey Simonenko
On Wed, Sep 15, 2010 at 08:46:00AM -0700, Matthew Fleming wrote:
 I'll take a stab at answering these...
 
 On Wed, Sep 15, 2010 at 6:44 AM, Andrey Simonenko
 si...@comsys.ntu-kpi.kiev.ua wrote:
  Hello,
 
  I have questions about mutex implementation in kern/kern_mutex.c
  and sys/mutex.h files (current versions of these files):
 
  1. Is the following statement correct for a volatile pointer or integer
   variable: if a volatile variable is updated by the compare-and-set
   instruction (e.g. atomic_cmpset_ptr(val, ...)), then the current
   value of such variable can be read without any special instruction
   (e.g. v = val)?
 
   I checked Assembler code for a function with v = val and val = v
   like statements generated for volatile variable and simple variable
   and found differences: on ia64 v = val was implemented by ld.acq and
   val = v was implemented by st.rel; on mips and sparc64 Assembler code
   can have different order of lines for volatile and simple variable
   (depends on the code of a function).
 
 I think this depends somewhat on the hardware and what you mean by
 current value.

Current value means that the value of a variable read by one thread
is equal to the value of this variable successfully updated by another
thread by the compare-and-set instruction.  As I understand from the kernel
source code, atomic_cmpset_ptr() allows to update a variable in a way that
all other CPUs will invalidate corresponding cache lines that contain
the value of this variable.

The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
special to compare the value of m-mtx_lock (volatile) with current thread
pointer, all other functions that update m-mtx_lock of unowned mutex use
compare-and-set instruction.  Also I cannot find anything special in
generated Assembler code for volatile variables (except for ia64 where
acquire loads and release stores are used).

 
 If you want a value that is not in-flux, then something like
 atomic_cmpset_ptr() setting to the current value is needed, so that
 you force any other atomic_cmpset to fail.  However, since there is no
 explicit lock involved, there is no strong meaning for current value
 and a read that does not rely on a value cached in a register is
 likely sufficient.  While the volatile keyword in C has no explicit
 hardware meaning, it often means that a load from memory (or,
 presumably, L1-L3 cache) is required.

The volatile keyword here and all questions are related to the base C
compiler, current version and currently supported architectures in FreeBSD.
Yes, here under volatile I want to say that the value of a variable is
not cached in a register and it is referenced by its address in all
commands.

There are some places in the kernel where a variable is updated in
something like do { v = value; } while (!atomic_cmpset_int(value, ...));
and that variable is not volatile, but the compiler generates correct
Assembler code.  So volatile is not a requirement for all cases.

 
  2. Let there is a default (sleep) mutex and adaptive mutexes is enabled.
   A thread tries to obtain lock quickly and fails, _mtx_lock_sleep()
   is called, it gets the address of the current mutex's owner thread
   and checks whether that owner thread is running (on another CPU).
   How does _mtx_lock_sleep() know that that thread still exists
   (lines 311-337 in kern_mutex.c)?
 
   When adaptive mutexes was implemented there was explicit locking
   around adaptive mutexes code. When turnstile in mutex code was
   implemented that's locking logic was changed.
 
 It appears that it's possible for the thread pointer to be recycled
 between fetching the value of owner and looking at TD_IS_RUNNING.  On
 actual hardware, this race is unlikely to occur due to the time it
 takes for a thread to release a lock and perform all of thread exit
 code before the struct thread is returned to the uma zone.  However,
 even once returned to the uma zone on many FreeBSD implementations the
 access is safe as the address of the thread is still dereferenceable,
 due to the implementation of uma zones.

I checked exactly this scenario, that's why asked this question to verify
my understanding.

 
  3. Why there is no any memory barrier in mtx_init()? If another thread
   (on another CPU) finds that mutex is initialized using mtx_initialized()
   then it can mtx_lock() it and mtx_lock() it second time, as a result
   mtx_recurse field will be increased, but its value still can be
   uninitialized on architecture with relaxed memory ordering model.
 
 It seems to me that it's generally a programming error to rely on the
 return of mtx_initialized(), as there is no serialization with e.g. a
 thread calling mtx_destroy().  A fully correct serialization model
 would require that a single thread initialize the mtx and then create
 any worker threads that will use the mtx.

I agree that this should not happen in practice.  Another thread can get
a pointer to just initialized mutex and 

Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread RW
On Thu, 16 Sep 2010 09:17:52 +0200
Dag-Erling Smørgrav d...@des.no wrote:

 Garrett Cooper gcoo...@freebsd.org writes:
  Agreed. Spinning down at reboot isn't smart and seems like a good
  way to kill a disk quicker.
 
 *not* spinning down at halt is far worse.  Most modern disks are rated
 for hundreds of thousands of load-unload cycles, but far fewer
 emergency unloads (which is what happens when the drive loses power
 while still spinning).

As I understand it wear from spinning-down used to come from the head
actually scraping the disk surface as it lost lift, parking placed the
head on a disposable area, but modern drives take the head off the disk
altogether.

When Hitachi was specifying 300,000 unloads, they said that in testing
the drives were still working at 1,000,000, someone quoted 600,000 as
the current spec. At these levels you can be spinning the drives
down and up  ever few minutes for the normal lifetime of the drive.

Even on very old drives I doubt reboot are much of a problem, they're
rare on servers. On laptops and desktops they're rare compared to
shutdowns and suspends.  
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Questions about mutex implementation in kern/kern_mutex.c

2010-09-16 Thread John Baldwin
On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:
 On Wed, Sep 15, 2010 at 08:46:00AM -0700, Matthew Fleming wrote:
  I'll take a stab at answering these...
  
  On Wed, Sep 15, 2010 at 6:44 AM, Andrey Simonenko
  si...@comsys.ntu-kpi.kiev.ua wrote:
   Hello,
  
   I have questions about mutex implementation in kern/kern_mutex.c
   and sys/mutex.h files (current versions of these files):
  
   1. Is the following statement correct for a volatile pointer or integer
variable: if a volatile variable is updated by the compare-and-set
instruction (e.g. atomic_cmpset_ptr(val, ...)), then the current
value of such variable can be read without any special instruction
(e.g. v = val)?
  
I checked Assembler code for a function with v = val and val = v
like statements generated for volatile variable and simple variable
and found differences: on ia64 v = val was implemented by ld.acq and
val = v was implemented by st.rel; on mips and sparc64 Assembler code
can have different order of lines for volatile and simple variable
(depends on the code of a function).
  
  I think this depends somewhat on the hardware and what you mean by
  current value.
 
 Current value means that the value of a variable read by one thread
 is equal to the value of this variable successfully updated by another
 thread by the compare-and-set instruction.  As I understand from the kernel
 source code, atomic_cmpset_ptr() allows to update a variable in a way that
 all other CPUs will invalidate corresponding cache lines that contain
 the value of this variable.

That is not true.  It is likely true on x86, but it is certainly not true on
other architectures such as sparc64 where a write may be held in a store 
buffer for an indeterminate amount of time (and note that some lock releases 
are simple stores with a rel memory barrier).  All that we require is that 
if the value is stale, the atomic_cmpset() that attempts to set MTX_CONTESTED 
will fail.

 The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
 special to compare the value of m-mtx_lock (volatile) with current thread
 pointer, all other functions that update m-mtx_lock of unowned mutex use
 compare-and-set instruction.  Also I cannot find anything special in
 generated Assembler code for volatile variables (except for ia64 where
 acquire loads and release stores are used).

No, mtx_owned() is just not harmed by the races it loses.  You can certainly 
read a stale value of mtx_lock in mtx_owned() if some other thread owns the 
lock or has just released the lock.  However, we don't care, because in both 
of those cases, mtx_owned() returns false.  What does matter is that 
mtx_owned() can only return true if we currently hold the mutex.  This works 
because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the 
same time, and 2) even CPUs that hold writes in store buffers will snoop their 
store buffer for local reads on that CPU.  That is, a given CPU will never 
read a stale value of a memory word that is older than a write it has 
performed to that word.

  If you want a value that is not in-flux, then something like
  atomic_cmpset_ptr() setting to the current value is needed, so that
  you force any other atomic_cmpset to fail.  However, since there is no
  explicit lock involved, there is no strong meaning for current value
  and a read that does not rely on a value cached in a register is
  likely sufficient.  While the volatile keyword in C has no explicit
  hardware meaning, it often means that a load from memory (or,
  presumably, L1-L3 cache) is required.
 
 The volatile keyword here and all questions are related to the base C
 compiler, current version and currently supported architectures in FreeBSD.
 Yes, here under volatile I want to say that the value of a variable is
 not cached in a register and it is referenced by its address in all
 commands.
 
 There are some places in the kernel where a variable is updated in
 something like do { v = value; } while (!atomic_cmpset_int(value, ...));
 and that variable is not volatile, but the compiler generates correct
 Assembler code.  So volatile is not a requirement for all cases.

Hmm, I suspect that many of those places actually do use volatile.  The 
various lock cookies (mtx_lock, etc.) are declared volatile in the structure.  
Otherwise the compiler would be free to conclude that 'v = value;' is a loop 
invariant and move it out of the loop which would break.  Given that, the 
construct you referred to does in fact require 'value' to be volatile.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


race conditions for destroying and opening a dev

2010-09-16 Thread Matthew Jacob


Has anyone seen this scenario before? I am seeing it in RELENG_7, but 
the code in question exists through to head.


Thread 1:

(kgdb) where
#0  sched_switch (td=0xff003a04ea80, newtd=0xff00210b4000, 
flags=Variable flags is not available.

) at ../../../kern/sched_ule.c:1944
#1  0x803b6091 in mi_switch (flags=1, newtd=0x0) at 
../../../kern/kern_synch.c:450
#2  0x80402399 in sleepq_switch (wchan=0xff8413b50b60) at 
../../../kern/subr_sleepqueue.c:497
#3  0x80402e8c in sleepq_timedwait (wchan=0xff8413b50b60) at 
../../../kern/subr_sleepqueue.c:615
#4  0x803b682d in _sleep (ident=0xff8413b50b60, 
lock=0x80b0ee00, priority=76, wmesg=0x806583bb devdrn, 
timo=100) at ../../../kern/kern_synch.c:228
#5  0x8037640c in destroy_devl (dev=0xff003aaf) at 
../../../kern/kern_conf.c:874
#6  0x80376759 in destroy_dev (dev=0xff003aaf) at 
../../../kern/kern_conf.c:916
#7  0x8034c939 in g_dev_orphan (cp=0xff003a544800) at 
../../../geom/geom_dev.c:438

#8  0x803506a0 in g_run_events () at ../../../geom/geom_event.c:164
#9  0x80351f1c in g_event_procbody () at 
../../../geom/geom_kern.c:141
#10 0x8038a73a in fork_exit (callout=0x80351eb0 
g_event_procbody at ../../../geom/geom_kern.c:132, arg=0x0, 
frame=0xff8413b50c80) at ../../../kern/kern_fork.c:829
#11 0x805a747e in fork_trampoline () at 
../../../amd64/amd64/exception.S:564

#12 0x in ?? ()

This thread is waiting on the threadcount to go away- i.e., the last 
close of the device to occur (da16 in this case).


Thread 2:

(kgdb) where
#0  sched_switch (td=0xff009bb4ca80, newtd=0xff003af43380, 
flags=Variable flags is not available.

) at ../../../kern/sched_ule.c:1944
#1  0x803b6091 in mi_switch (flags=1, newtd=0x0) at 
../../../kern/kern_synch.c:450
#2  0x80402399 in sleepq_switch (wchan=0x80b0e040) at 
../../../kern/subr_sleepqueue.c:497
#3  0x80402f84 in sleepq_wait (wchan=0x80b0e040) at 
../../../kern/subr_sleepqueue.c:580
#4  0x803b5385 in _sx_xlock_hard (sx=0x80b0e040, 
tid=18446742976810240640, opts=Variable opts is not available.

) at ../../../kern/kern_sx.c:562
#5  0x803b5731 in _sx_xlock (sx=0x80b0e040, opts=0, 
file=0x80652d27 ../../../geom/geom_dev.c, line=196) at sx.h:154
#6  0x8034d1bc in g_dev_open (dev=0xff003aaf, flags=1, 
fmt=Variable fmt is not available.

) at ../../../geom/geom_dev.c:196
#7  0x80333741 in devfs_open (ap=0xff841dea88b0) at 
../../../fs/devfs/devfs_vnops.c:902
#8  0x80601daf in VOP_OPEN_APV (vop=0x8089fb80, 
a=0xff841dea88b0) at vnode_if.c:371
#9  0x80467246 in vn_open_cred (ndp=0xff841dea8a00, 
flagp=0xff841dea894c, cmode=Variable cmode is not available.

) at vnode_if.h:199
#10 0x80463770 in kern_open (td=0xff009bb4ca80, 
path=0x5114a0 Address 0x5114a0 out of bounds, pathseg=Variable 
pathseg is not available.

) at ../../../kern/vfs_syscalls.c:1054
#11 0x805c599e in syscall (frame=0xff841dea8c80) at 
../../../amd64/amd64/trap.c:911
#12 0x805a723b in Xfast_syscall () at 
../../../amd64/amd64/exception.S:349

#13 0x0008009a219c in ?? ()

This thread was opening the device, bumped the refcount, but then wedged 
on the geom topology lock .


the refcount field is protected under devmtx

Anyone seen this?

I'm half inclined to either add in CDP_SCHED_DTR when one calls 
destroy_dev, or make dev_refthread look at CDP_ACTIVE, leaning more 
toward the latter.


Any thoughts on this?



___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: race conditions for destroying and opening a dev

2010-09-16 Thread Kostik Belousov
On Thu, Sep 16, 2010 at 12:00:29PM -0700, Matthew Jacob wrote:
 
 Has anyone seen this scenario before? I am seeing it in RELENG_7, but 
 the code in question exists through to head.
 
 Thread 1:
 
 (kgdb) where
 #0  sched_switch (td=0xff003a04ea80, newtd=0xff00210b4000, 
 flags=Variable flags is not available.
 ) at ../../../kern/sched_ule.c:1944
 #1  0x803b6091 in mi_switch (flags=1, newtd=0x0) at 
 ../../../kern/kern_synch.c:450
 #2  0x80402399 in sleepq_switch (wchan=0xff8413b50b60) at 
 ../../../kern/subr_sleepqueue.c:497
 #3  0x80402e8c in sleepq_timedwait (wchan=0xff8413b50b60) at 
 ../../../kern/subr_sleepqueue.c:615
 #4  0x803b682d in _sleep (ident=0xff8413b50b60, 
 lock=0x80b0ee00, priority=76, wmesg=0x806583bb devdrn, 
 timo=100) at ../../../kern/kern_synch.c:228
 #5  0x8037640c in destroy_devl (dev=0xff003aaf) at 
 ../../../kern/kern_conf.c:874
 #6  0x80376759 in destroy_dev (dev=0xff003aaf) at 
 ../../../kern/kern_conf.c:916
 #7  0x8034c939 in g_dev_orphan (cp=0xff003a544800) at 
 ../../../geom/geom_dev.c:438
 #8  0x803506a0 in g_run_events () at ../../../geom/geom_event.c:164
 #9  0x80351f1c in g_event_procbody () at 
 ../../../geom/geom_kern.c:141
 #10 0x8038a73a in fork_exit (callout=0x80351eb0 
 g_event_procbody at ../../../geom/geom_kern.c:132, arg=0x0, 
 frame=0xff8413b50c80) at ../../../kern/kern_fork.c:829
 #11 0x805a747e in fork_trampoline () at 
 ../../../amd64/amd64/exception.S:564
 #12 0x in ?? ()
 
 This thread is waiting on the threadcount to go away- i.e., the last 
 close of the device to occur (da16 in this case).
 
 Thread 2:
 
 (kgdb) where
 #0  sched_switch (td=0xff009bb4ca80, newtd=0xff003af43380, 
 flags=Variable flags is not available.
 ) at ../../../kern/sched_ule.c:1944
 #1  0x803b6091 in mi_switch (flags=1, newtd=0x0) at 
 ../../../kern/kern_synch.c:450
 #2  0x80402399 in sleepq_switch (wchan=0x80b0e040) at 
 ../../../kern/subr_sleepqueue.c:497
 #3  0x80402f84 in sleepq_wait (wchan=0x80b0e040) at 
 ../../../kern/subr_sleepqueue.c:580
 #4  0x803b5385 in _sx_xlock_hard (sx=0x80b0e040, 
 tid=18446742976810240640, opts=Variable opts is not available.
 ) at ../../../kern/kern_sx.c:562
 #5  0x803b5731 in _sx_xlock (sx=0x80b0e040, opts=0, 
 file=0x80652d27 ../../../geom/geom_dev.c, line=196) at sx.h:154
 #6  0x8034d1bc in g_dev_open (dev=0xff003aaf, flags=1, 
 fmt=Variable fmt is not available.
 ) at ../../../geom/geom_dev.c:196
 #7  0x80333741 in devfs_open (ap=0xff841dea88b0) at 
 ../../../fs/devfs/devfs_vnops.c:902
 #8  0x80601daf in VOP_OPEN_APV (vop=0x8089fb80, 
 a=0xff841dea88b0) at vnode_if.c:371
 #9  0x80467246 in vn_open_cred (ndp=0xff841dea8a00, 
 flagp=0xff841dea894c, cmode=Variable cmode is not available.
 ) at vnode_if.h:199
 #10 0x80463770 in kern_open (td=0xff009bb4ca80, 
 path=0x5114a0 Address 0x5114a0 out of bounds, pathseg=Variable 
 pathseg is not available.
 ) at ../../../kern/vfs_syscalls.c:1054
 #11 0x805c599e in syscall (frame=0xff841dea8c80) at 
 ../../../amd64/amd64/trap.c:911
 #12 0x805a723b in Xfast_syscall () at 
 ../../../amd64/amd64/exception.S:349
 #13 0x0008009a219c in ?? ()
 
 This thread was opening the device, bumped the refcount, but then wedged 
 on the geom topology lock .
 
 the refcount field is protected under devmtx
 
 Anyone seen this?
 
 I'm half inclined to either add in CDP_SCHED_DTR when one calls 
 destroy_dev, or make dev_refthread look at CDP_ACTIVE, leaning more 
 toward the latter.
 
 Any thoughts on this?

And who owns the topology lock ? Is it thread 1 ?

Destroy_devl() clears si_devsw for departing cdev, and *refthread()
checks si_devsw against NULL as an indicator of device destruction in
progress.

I think that this situation is what destroy_dev_sched(9) was created
for.


pgpQbeqwkz8X0.pgp
Description: PGP signature


Re: race conditions for destroying and opening a dev

2010-09-16 Thread Matthew Fleming
On Thu, Sep 16, 2010 at 12:00 PM, Matthew Jacob m...@feral.com wrote:

 Has anyone seen this scenario before? I am seeing it in RELENG_7, but the
 code in question exists through to head.

 Thread 1:

 (kgdb) where
 #0  sched_switch (td=0xff003a04ea80, newtd=0xff00210b4000,
 flags=Variable flags is not available.
 ) at ../../../kern/sched_ule.c:1944
 #1  0x803b6091 in mi_switch (flags=1, newtd=0x0) at
 ../../../kern/kern_synch.c:450
 #2  0x80402399 in sleepq_switch (wchan=0xff8413b50b60) at
 ../../../kern/subr_sleepqueue.c:497
 #3  0x80402e8c in sleepq_timedwait (wchan=0xff8413b50b60) at
 ../../../kern/subr_sleepqueue.c:615
 #4  0x803b682d in _sleep (ident=0xff8413b50b60,
 lock=0x80b0ee00, priority=76, wmesg=0x806583bb devdrn,
 timo=100) at ../../../kern/kern_synch.c:228
 #5  0x8037640c in destroy_devl (dev=0xff003aaf) at
 ../../../kern/kern_conf.c:874
 #6  0x80376759 in destroy_dev (dev=0xff003aaf) at
 ../../../kern/kern_conf.c:916
 #7  0x8034c939 in g_dev_orphan (cp=0xff003a544800) at
 ../../../geom/geom_dev.c:438
 #8  0x803506a0 in g_run_events () at ../../../geom/geom_event.c:164
 #9  0x80351f1c in g_event_procbody () at
 ../../../geom/geom_kern.c:141
 #10 0x8038a73a in fork_exit (callout=0x80351eb0
 g_event_procbody at ../../../geom/geom_kern.c:132, arg=0x0,
 frame=0xff8413b50c80) at ../../../kern/kern_fork.c:829
 #11 0x805a747e in fork_trampoline () at
 ../../../amd64/amd64/exception.S:564
 #12 0x in ?? ()

 This thread is waiting on the threadcount to go away- i.e., the last close
 of the device to occur (da16 in this case).

 Thread 2:

 (kgdb) where
 #0  sched_switch (td=0xff009bb4ca80, newtd=0xff003af43380,
 flags=Variable flags is not available.
 ) at ../../../kern/sched_ule.c:1944
 #1  0x803b6091 in mi_switch (flags=1, newtd=0x0) at
 ../../../kern/kern_synch.c:450
 #2  0x80402399 in sleepq_switch (wchan=0x80b0e040) at
 ../../../kern/subr_sleepqueue.c:497
 #3  0x80402f84 in sleepq_wait (wchan=0x80b0e040) at
 ../../../kern/subr_sleepqueue.c:580
 #4  0x803b5385 in _sx_xlock_hard (sx=0x80b0e040,
 tid=18446742976810240640, opts=Variable opts is not available.
 ) at ../../../kern/kern_sx.c:562
 #5  0x803b5731 in _sx_xlock (sx=0x80b0e040, opts=0,
 file=0x80652d27 ../../../geom/geom_dev.c, line=196) at sx.h:154
 #6  0x8034d1bc in g_dev_open (dev=0xff003aaf, flags=1,
 fmt=Variable fmt is not available.
 ) at ../../../geom/geom_dev.c:196
 #7  0x80333741 in devfs_open (ap=0xff841dea88b0) at
 ../../../fs/devfs/devfs_vnops.c:902
 #8  0x80601daf in VOP_OPEN_APV (vop=0x8089fb80,
 a=0xff841dea88b0) at vnode_if.c:371
 #9  0x80467246 in vn_open_cred (ndp=0xff841dea8a00,
 flagp=0xff841dea894c, cmode=Variable cmode is not available.
 ) at vnode_if.h:199
 #10 0x80463770 in kern_open (td=0xff009bb4ca80, path=0x5114a0
 Address 0x5114a0 out of bounds, pathseg=Variable pathseg is not
 available.
 ) at ../../../kern/vfs_syscalls.c:1054
 #11 0x805c599e in syscall (frame=0xff841dea8c80) at
 ../../../amd64/amd64/trap.c:911
 #12 0x805a723b in Xfast_syscall () at
 ../../../amd64/amd64/exception.S:349
 #13 0x0008009a219c in ?? ()

 This thread was opening the device, bumped the refcount, but then wedged on
 the geom topology lock .

 the refcount field is protected under devmtx

 Anyone seen this?

 I'm half inclined to either add in CDP_SCHED_DTR when one calls destroy_dev,
 or make dev_refthread look at CDP_ACTIVE, leaning more toward the latter.

 Any thoughts on this?

We had a similar bug at Isilon, but in our case it was in
cam/scsi/scsi_pass.c::passcleanup() calling destroy_dev().  We
switched it to destroy_dev_sched() to fix the si_threadcount deadlock.

Cheers,
matthew
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: race conditions for destroying and opening a dev

2010-09-16 Thread Matthew Jacob

 kostik, matthew- thanks mucho!


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Summary: Re: Spin down HDD after disk sync or before power off

2010-09-16 Thread Warren Block

On Thu, 16 Sep 2010, Oliver Fromme wrote:


I've updated the patch for ada(4).  It includes a bug fix
(command1 vs. command2) and uses the howto flags passed to
the shutdown function.  Thanks again for pointing these out.


Works perfectly on a system here.  Thanks!
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Questions about mutex implementation in kern/kern_mutex.c

2010-09-16 Thread Benjamin Kaduk

On Thu, 16 Sep 2010, John Baldwin wrote:


On Thursday, September 16, 2010 1:33:07 pm Andrey Simonenko wrote:


The mtx_owned(9) macro uses this property, mtx_owned() does not use anything
special to compare the value of m-mtx_lock (volatile) with current thread
pointer, all other functions that update m-mtx_lock of unowned mutex use
compare-and-set instruction.  Also I cannot find anything special in
generated Assembler code for volatile variables (except for ia64 where
acquire loads and release stores are used).


No, mtx_owned() is just not harmed by the races it loses.  You can certainly
read a stale value of mtx_lock in mtx_owned() if some other thread owns the
lock or has just released the lock.  However, we don't care, because in both
of those cases, mtx_owned() returns false.  What does matter is that
mtx_owned() can only return true if we currently hold the mutex.  This works
because 1) the same thread cannot call mtx_unlock() and mtx_owned() at the
same time, and 2) even CPUs that hold writes in store buffers will snoop their
store buffer for local reads on that CPU.  That is, a given CPU will never
read a stale value of a memory word that is older than a write it has
performed to that word.


Sorry for the naive question, but would you mind expounding a bit on what 
keeps the thread from migrating to a different CPU and getting a stale 
value there?  (I can imagine a couple possible mechanisms, but don't know 
enough to know which one(s) are the real ones.)


Thanks,

Ben Kaduk
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org