Cannot build ports on fresh 10.0

2013-06-04 Thread mdf
I installed a new VM with 10.0 today from this .iso:

ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/amd64/10.0/FreeBSD-10.0-CURRENT-amd64-20130601-r251213-release.iso

And I'm behind a firewall and I'm not sure I can use pkg(1); my one attempt
failed:

# pkg install m4
Updating repository catalogue
Repository catalogue is up-to-date, no need to fetch fresh copy
pkg: Package 'm4' was not found in the repositories

So I'm building from source.  But the configure step of various ports fails
like so:

checking whether make sets $(MAKE)... yes
checking build system type... Invalid configuration
`amd64-portbld-freebsd10.0': machine `amd64-portbld' not recognized
configure: error: /bin/sh libltdl/config/config.sub
amd64-portbld-freebsd10.0 failed
===  Script configure failed unexpectedly.
Please report the problem to autoto...@freebsd.org [maintainer] and attach
the /usr/ports/devel/libtool/work/libtool-2.4.2/config.log including the
output of the failure of your make command. Also, it might be a good idea to
provide an overview of all packages installed on your system (e.g. a
/usr/local/sbin/pkg-static info -g -Ea).
*** Error code 1

Any ideas what could be wrong?  It's a fresh install with default options,
so it seems hard to believe I managed to screw something up already.
Posting on -current since this happens to many of the ports when I try to
build them.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Cannot build ports on fresh 10.0

2013-06-04 Thread mdf
On Tue, Jun 4, 2013 at 3:07 PM, hiren panchasara hi...@freebsd.org wrote:

 On Tue, Jun 4, 2013 at 2:42 PM,  m...@freebsd.org wrote:
  I installed a new VM with 10.0 today from this .iso:
 
 
 ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/amd64/10.0/FreeBSD-10.0-CURRENT-amd64-20130601-r251213-release.iso
 
  And I'm behind a firewall and I'm not sure I can use pkg(1); my one
 attempt
  failed:
 
  # pkg install m4
  Updating repository catalogue
  Repository catalogue is up-to-date, no need to fetch fresh copy
  pkg: Package 'm4' was not found in the repositories
 
  So I'm building from source.  But the configure step of various ports
 fails
  like so:
 
  checking whether make sets $(MAKE)... yes
  checking build system type... Invalid configuration
  `amd64-portbld-freebsd10.0': machine `amd64-portbld' not recognized
  configure: error: /bin/sh libltdl/config/config.sub
  amd64-portbld-freebsd10.0 failed
  ===  Script configure failed unexpectedly.
  Please report the problem to autoto...@freebsd.org [maintainer] and
 attach
  the /usr/ports/devel/libtool/work/libtool-2.4.2/config.log including
 the
  output of the failure of your make command. Also, it might be a good
 idea to
  provide an overview of all packages installed on your system (e.g. a
  /usr/local/sbin/pkg-static info -g -Ea).
  *** Error code 1

 _Probably_ similar to this?
 http://lists.freebsd.org/pipermail/freebsd-ports/2013-June/084040.html


Looks very likely.  I'm testing the patch now.  My rudimentary search of
-ports wasn't good enough, I guess.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


[RFC] use a shared lock for VOP_GETEXTATTR

2013-03-27 Thread mdf
VOP_GETEXTATTR is currently called with an exclusive lock, which seems
like overkill for what is essentially a read operation.  I had a look
over the various in-tree filesystems and it didn't look like any of
them will have a problem if a shared-mode lock is used for
vop_getextattr.

Does anyone know otherwise?  Is someone using extended attributes
regularly who can test this?

[sorry if this is a duplicate; I first sent from my non-FreeBSD mail]

Thanks,
matthew


0001-Use-a-shared-lock-for-VOP_GETEXTATTR-as-it-is-a-read.patch
Description: Binary data
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: [RFC] use a shared lock for VOP_GETEXTATTR

2013-03-27 Thread mdf
On Wed, Mar 27, 2013 at 10:32 PM, Konstantin Belousov
kostik...@gmail.com wrote:
 On Wed, Mar 27, 2013 at 06:37:51PM -0700, m...@freebsd.org wrote:
 VOP_GETEXTATTR is currently called with an exclusive lock, which seems
 like overkill for what is essentially a read operation.  I had a look
 over the various in-tree filesystems and it didn't look like any of
 them will have a problem if a shared-mode lock is used for
 vop_getextattr.

 Does anyone know otherwise?  Is someone using extended attributes
 regularly who can test this?

 I think this change should be fine. At least it seems to for UFS.

 What other filesystems did you audited ?

I looked over zfs, pseudofs, unionfs, ffs/ufs.  None seemed to have
any asserts on the lock type nor anything that looked like it would
try to modify anything.  zfs, I think it was, even used VOP_ISLOCKED
to get the lock type in one path (but I think that was after a
lookup(), so it may have been on a different vnode).

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: kmem_map auto-sizing and size dependencies

2013-01-18 Thread mdf
On Fri, Jan 18, 2013 at 7:29 AM, Andre Oppermann an...@freebsd.org wrote:
 The (inital?) size of the kmem_map is determined by some voodoo magic,
 a sprinkle of nmbclusters * PAGE_SIZE incrementor and lots of tunables.
 However it seems to work out to an effective kmem_map_size of about 58MB
 on my 16GB AMD64 dev machine:

 vm.kvm_size: 549755809792
 vm.kvm_free: 530233421824
 vm.kmem_size: 16,594,300,928
 vm.kmem_size_min: 0
 vm.kmem_size_max: 329,853,485,875
 vm.kmem_size_scale: 1
 vm.kmem_map_size: 59,518,976
 vm.kmem_map_free: 16,534,777,856

 The kmem_map serves kernel malloc (via UMA), contigmalloc and everthing
 else that uses UMA for memory allocation.

 Mbuf memory too is managed by UMA which obtains the backing kernel memory
 from the kmem_map.  The limits of the various mbuf memory types have
 been considerably raised recently and may make use of 50-75% of all
 physically
 present memory, or available KVM space, whichever is smaller.

 Now my questions/comments are:

  Does the kmem_map automatically extend itself if more memory is requested?

Not that I recall.

  Should it be set to a larger initial value based on min(physical,KVM) space
  available?

It needs to be smaller than the physical space, because the only limit
on the kernel's use of (pinned) memory is the size of the map.  So if
it is too large there is nothing to stop the kernel from consuming all
available memory.  The lowmem handler is called when running out of
virtual space only (i.e. a failure to allocate a range in the map).

  The naming and output of the various vm.kmem_* and vm.kvm_* sysctls is
  confusing and not easy to reconcile.  Either we need some more detailing
  more aspects or less.  Plus perhaps sysctl subtrees to better describe the
  hierarchy of the maps.

  Why are separate kmem submaps being used?  Is it to limit memory usage of
  certain subsystems?  Are those limits actually enforced?

I mostly know about memguard, since I added memguard_fudge().  IIRC
some of the submaps are used.  The memguard_map specifically is used
to know whether an allocation is guarded or not, so at free(9) it can
be handled as normal malloc() or as memguard.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Memory modified after free - by whom?

2012-12-10 Thread mdf
On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd adr...@freebsd.org wrote:
 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
 after it's finalised/freed.

 I have a similar bug showing up on ath(4) RX. :(

Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
vm.memguard.desc to the name of the UMA zone used for the 9216 byte
allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
is touched after free.

Cheers,
matthew

 On 10 December 2012 14:22, Garrett Cooper yaneg...@gmail.com wrote:
 I noticed this while checking the logs on one of my test boxes
 after restarting the network. Any idea where I should start looking
 into this (has IPv6 enabled but wasn't using it, em/cxgbe/ixgbe
 interfaces with the ixgbe interfaces lagged previously, but now not)?
 It looks suspiciously like the same size as a jumbo frame, but I'm not
 100% sure if that's the real problem.
 Thanks,
 -Garrett

 Dec 10 14:03:12 wf158 kernel: em0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:13 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:13 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:15 wf158 kernel: em0: link state changed to UP
 Dec 10 14:03:15 wf158 dhclient: New IP Address (em0): 10.7.169.89
 Dec 10 14:03:15 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0
 Dec 10 14:03:15 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255
 Dec 10 14:03:15 wf158 dhclient: New Routers (em0): 10.7.160.1
 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:16 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:31 wf158 kernel: in6_purgeaddr: err=65, destination
 address delete failed
 Dec 10 14:03:31 wf158 dhclient[4539]: My address (10.7.169.89) was
 deleted, dhclient exiting
 Dec 10 14:03:32 wf158 dhclient[4521]: short write: wanted 20 got 0 bytes
 Dec 10 14:03:32 wf158 dhclient[4521]: exiting.
 Dec 10 14:03:33 wf158 kernel: em0: link state changed to DOWN
 Dec 10 14:03:33 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix0: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix0: link state changed to UP
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to DOWN
 Dec 10 14:03:34 wf158 kernel: ix1: link state changed to UP
 Dec 10 14:03:36 wf158 kernel: em0: link state changed to UP
 Dec 10 14:03:36 wf158 dhclient: New IP Address (em0): 10.7.169.89
 Dec 10 14:03:36 wf158 dhclient: New Subnet Mask (em0): 255.255.240.0
 Dec 10 14:03:36 wf158 dhclient: New Broadcast Address (em0): 10.7.175.255
 Dec 10 14:03:36 wf158 dhclient: New Routers (em0): 10.7.160.1
 Dec 10 14:05:34 wf158 kernel: Memory modified after free
 0xff81c016d000(9216) val= @ 0xff81c016d000
 Dec 10 14:05:35 wf158 kernel: Memory modified after free
 0xff81b5cdc000(9216) val= @ 0xff81b5cdc000

 # uname -a
 FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1
 r+2760369-dirty: Mon Dec 10 08:04:46 PST 2012
 r...@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC  amd64
 ___
 freebsd-...@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-net
 To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: panic: sbuf_trim makes no sense on sbuf 0xffffff82434d8898 with drain

2012-11-25 Thread mdf
On Sun, Nov 25, 2012 at 2:26 PM, Niclas Zeising
zeising+free...@daemonic.se wrote:
 Hi!
 I consistently get this panic while trying to boot a kernel build from
 r243530.  It happens when the entropy harvesting rc.d script starts. r243380
 worked fine, I haven't tested any revisions in between. Attached is the
 backtrace from the kernel, as gotten by kgdb.  The machine uses zfs as a
 root pool, and there have been churn in this area.  To my untrained eyes,
 however, the issue seem related to hdaa.c. Please let me know if I can
 provide any more information.

r243530 added the new sysctl that is causing panic.  I'm not sure why
there's an sbuf_trim() call; there shouldn't be more than a few \n at
the end.  IMO the sbuf_trim() can be eliminated.

Alternately, the panic check can be removed and we could allow
sbuf_trim() to remove any un-emitted whitespace for an sbuf with
drain.

CC'ing mav@ who introduced the code.  (I introduced sbuf drains).

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: FILE's _file can only hold a short

2012-11-02 Thread mdf
On Thu, Nov 1, 2012 at 7:41 PM, Eitan Adler li...@eitanadler.com wrote:
 On 1 November 2012 10:40, Ian Lepore free...@damnhippie.dyndns.org wrote:
 On Wed, 2012-10-31 at 11:12 -0700, m...@freebsd.org wrote:
 I seem to recall a thread earlier on this limitation, but looking at
 actual libc/stdio sources, the 4 year old check for open(2)'s fd being
 less than SHRT_MAX is still there.  I thought I saw a patch to change
 this to an int, but it's not in the tree.  Was this in a PR or a
 mailing list thread or am I just imagining things?

 We've run into this limitation at work, where some processes have
 around 32k open file descriptors and then try to use the libc FILE
 interface.  Since we control ABI we can just change this to int, but I
 had been hoping there was a FreeBSD revision we could pull instead of
 having another diff.

 FWIW, I also remember some discussion recently (this year) on some
 mailing list about this, but I can't find it now.  I thought it was
 somehow related to in-lib versus external uses of the funopen()
 function, but I may be conflating two unrelated discusssions in my head.

 Perhaps 
 http://freebsd.1045724.n5.nabble.com/stdio-and-short-file-descriptors-revisited-td5747703.html
 ?

Yes, that was it exactly.  Thanks!  My (quick) search had not been fruitful.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


FILE's _file can only hold a short

2012-10-31 Thread mdf
I seem to recall a thread earlier on this limitation, but looking at
actual libc/stdio sources, the 4 year old check for open(2)'s fd being
less than SHRT_MAX is still there.  I thought I saw a patch to change
this to an int, but it's not in the tree.  Was this in a PR or a
mailing list thread or am I just imagining things?

We've run into this limitation at work, where some processes have
around 32k open file descriptors and then try to use the libc FILE
interface.  Since we control ABI we can just change this to int, but I
had been hoping there was a FreeBSD revision we could pull instead of
having another diff.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


make tinderbox failures

2012-10-26 Thread mdf
Running make tinderbox locally I see failures that aren't being
reported by the automated tinderbox builds.  I'm curious what's
different about the environment.  Failures are in the following:

arm ARMADAXP kernel failed, check _.arm.ARMADAXP for details
mips SENTRY5 kernel failed, check _.mips.SENTRY5 for details
mips XLPN32 kernel failed, check _.mips.XLPN32 for details
mips XLR kernel failed, check _.mips.XLR for details
mips XLRN32 kernel failed, check _.mips.XLRN32 for details

_.arm.ARMADAXP:
/data/sb/bsd.git/sys/arm/mv/armadaxp/armadaxp_mp.c: In function
'platform_mp_start_ap':
/data/sb/bsd.git/sys/arm/mv/armadaxp/armadaxp_mp.c:181: warning:
passing argument 1 of 'pmap_kextract' makes integer from pointer
without a cast

_.mips.SENTRY5:
config: Error: device siba is unknown

_.mips.XLPN32 and _.mips.XLRN32:
linking kernel.debug
stack_machdep.o: In function `stack_capture':
/data/sb/bsd.git/sys/mips/mips/stack_machdep.c:(.text+0x34): undefined
reference to `stack_zero'
/data/sb/bsd.git/sys/mips/mips/stack_machdep.c:(.text+0x34):
relocation truncated to fit: R_MIPS_26 against `stack_zero'
/data/sb/bsd.git/sys/mips/mips/stack_machdep.c:(.text+0x13c):
undefined reference to `stack_put'
/data/sb/bsd.git/sys/mips/mips/stack_machdep.c:(.text+0x13c):
relocation truncated to fit: R_MIPS_26 against `stack_put'

_.mips.XLR:
linking kernel.debug
board.o: In function `xlr_board_info_setup':
/data/sb/bsd.git/sys/mips/rmi/board.c:(.text+0x3a4): undefined
reference to `__ucmpdi2'
/data/sb/bsd.git/sys/mips/rmi/board.c:(.text+0x3a4): relocation
truncated to fit: R_MIPS_26 against `__ucmpdi2'
/data/sb/bsd.git/sys/mips/rmi/board.c:(.text+0x3e4): undefined
reference to `__ucmpdi2'
/data/sb/bsd.git/sys/mips/rmi/board.c:(.text+0x3e4): relocation
truncated to fit: R_MIPS_26 against `__ucmpdi2'

Since I don't work on arm or mips I generally ignore these.  But it
makes it harder to have confidence in a global change when make
tinderbox throws errors locally.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: make tinderbox failures

2012-10-26 Thread mdf
On Fri, Oct 26, 2012 at 3:24 PM, Warner Losh i...@bsdimp.com wrote:

 On Oct 26, 2012, at 1:08 PM, Garrett Cooper wrote:

 On Fri, Oct 26, 2012 at 10:26 AM,  m...@freebsd.org wrote:
 Running make tinderbox locally I see failures that aren't being
 reported by the automated tinderbox builds.  I'm curious what's
 different about the environment.  Failures are in the following:

 arm ARMADAXP kernel failed, check _.arm.ARMADAXP for details
 mips SENTRY5 kernel failed, check _.mips.SENTRY5 for details
 mips XLPN32 kernel failed, check _.mips.XLPN32 for details
 mips XLR kernel failed, check _.mips.XLR for details
 mips XLRN32 kernel failed, check _.mips.XLRN32 for details

 _.arm.ARMADAXP:
 /data/sb/bsd.git/sys/arm/mv/armadaxp/armadaxp_mp.c: In function
 'platform_mp_start_ap':
 /data/sb/bsd.git/sys/arm/mv/armadaxp/armadaxp_mp.c:181: warning:
 passing argument 1 of 'pmap_kextract' makes integer from pointer
 without a cast

 This still needs some love...

 _.mips.SENTRY5:
 config: Error: device siba is unknown

 OK.  Traced this down and fixed it.  I'd planned on moving siba to files, and 
 it looks like I never finished that.  My bad.

 _.mips.XLPN32 and _.mips.XLRN32:
 linking kernel.debug
 stack_machdep.o: In function `stack_capture':
 /data/sb/bsd.git/sys/mips/mips/stack_machdep.c:(.text+0x34): undefined
 reference to `stack_zero'
 /data/sb/bsd.git/sys/mips/mips/stack_machdep.c:(.text+0x34):
 relocation truncated to fit: R_MIPS_26 against `stack_zero'
 /data/sb/bsd.git/sys/mips/mips/stack_machdep.c:(.text+0x13c):
 undefined reference to `stack_put'
 /data/sb/bsd.git/sys/mips/mips/stack_machdep.c:(.text+0x13c):
 relocation truncated to fit: R_MIPS_26 against `stack_put'

 stack_machdep needs to depend on options stack or ddb.  Fixed.  Unsure why 
 other kernels didn't uncover this.

 _.mips.XLR:
 linking kernel.debug
 board.o: In function `xlr_board_info_setup':
 /data/sb/bsd.git/sys/mips/rmi/board.c:(.text+0x3a4): undefined
 reference to `__ucmpdi2'
 /data/sb/bsd.git/sys/mips/rmi/board.c:(.text+0x3a4): relocation
 truncated to fit: R_MIPS_26 against `__ucmpdi2'
 /data/sb/bsd.git/sys/mips/rmi/board.c:(.text+0x3e4): undefined
 reference to `__ucmpdi2'
 /data/sb/bsd.git/sys/mips/rmi/board.c:(.text+0x3e4): relocation
 truncated to fit: R_MIPS_26 against `__ucmpdi2'

 Fixed earlier this week.

 Since I don't work on arm or mips I generally ignore these.  But it
 makes it harder to have confidence in a global change when make
 tinderbox throws errors locally.

 These are probably due to the changes that Warner made recently to the
 mips conf files.

 Some of them...

Thanks for fixes!  Is the reason the public tinderbox didn't see these
because it has explicit lists of arch/CONF kernels to build, while
make tinderbox on my machine just builds all the capitalized files
as configuration files?

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: panic with DEBUG_MEMGUARD on PowerPC

2012-07-15 Thread mdf
On Sat, Jul 14, 2012 at 8:39 AM, Justin Hibbits chmeeed...@gmail.com wrote:
 On Jul 13, 2012, at 12:20 AM, m...@freebsd.org wrote:

 On Thu, Jul 12, 2012 at 6:33 PM, Justin Hibbits chmeeed...@gmail.com
 wrote:

 On Jul 12, 2012, at 9:11 PM, m...@freebsd.org wrote:

 On Thu, Jul 12, 2012 at 4:43 PM, Justin Hibbits chmeeed...@gmail.com
 wrote:


 When tracking down a panic exposed by INVARIANTS, I tried setting
 DEBUG_MEMGUARD, so I could find the culprit that's trashing freed
 memory.
 However, this causes a panic at bootup.  It shows up right after the
 first
 WARNING: WITNESS message, with the following:

 Tracing, and printf() debugging, I see arguments to vm_map_findspace():
 start: 0xD000, length: 4246446080, and map-max_offset =
 4026531839.

 Beyond that, I'm lost with tracking this down.  Machine is a dual
 processor
 PowerPC G4, with 2GB RAM.



 The length is 0xFD1BA000 which is almost 4GB.  Asking for 4GB of
 virtual space for 2GB of RAM sounds about right (it's been a while
 since I was in this code), unless this is a 32-bit kernel, in which
 case it'd be too much since there isn't that much virtual space
 available.

 So, is the kernel 32-bit?  What are the values used and returned by
 memguard_fudge()?  The intent of that routine is to get kmeminit() to
 allocate a larger map so memguard can use part of it for private
 virtual addresses.  But it shouldn't be asking for too much; i.e.
 the intent was to check both physical and virtual space available and
 be greedy, but not too greedy.

 There were some issues with that code for some platforms that e.g.
 didn't define a VM_KMEM_SIZE_MAX, but alc@ fixed that in r216425.


 It is a 32-bit kernel, on 32-bit hardware.  The values for memguard_fudge
 are (defaults):

 tmp: 4246446080, vm_kmem_size: 117440512, vm_kmem_size_max: 0

 When setting vm.kmem_size/vm.kmem_size_max to 2GB they are:

 tmp: 2147483648, vm_kmem_size: 214793648, vm_kmem_sizee_max: 2147483648
 (all
 2GB).

 But the start and map-max_offset remain the same on all runs I make.


 memguard_fudge is still broken for 32-bit architectures with no
 vm_kmem_max.  In the absence of a km_max to limit the value, we
 essentially use twice the physical memory for the virtual limit.  But
 with 2GB on a 32-bit machine, this requires 4GB of virtual space.

 Setting vm_kmem_size_max to 2GB should work; I'd expect to see
 tmp=about 200MB, which is much larger than the input 112MB but the
 allocation should work.  But I don't really know what else PowerPC has
 need of for virtual space, so that still could be too large.

 You can try smaller values of vm_kmem_size_max, like 1GB or 512MB.
 You shouldn't need to set vm_kmem_size at all.  At some point the
 added space for the memguard_map will be small enough that the
 kmem_suballoc will work.

 Hmm, what is the min_offset and max_offset of kernel_map when the call
 to memguard_fudge is made?

 Thanks,
 matthew



 Without setting vm.kmem_size/vm.kmem_size_max, I see the following:

 map: 0x100, min_offset: 0xD000, max_offset: 0xEFFF

 It does boot when I set vm.kmem_size=256M/vm.kmem_size_max=512M.

 When I tried 512M/1024M, it panicked at the same place -- kmem_suballoc from
 kmeminit.  So it looks like I have to set vm.kmem_size/vm.kmem_size_max way
 back in order for it to boot with memguard(9).

Please try the attached patch (or at
http://people.freebsd.org/~mdf/memguard.diff).

Thanks,
matthew


memguard.diff
Description: Binary data
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: panic with DEBUG_MEMGUARD on PowerPC

2012-07-14 Thread mdf
On Sat, Jul 14, 2012 at 8:39 AM, Justin Hibbits chmeeed...@gmail.com wrote:
 On Jul 13, 2012, at 12:20 AM, m...@freebsd.org wrote:

 On Thu, Jul 12, 2012 at 6:33 PM, Justin Hibbits chmeeed...@gmail.com
 wrote:

 On Jul 12, 2012, at 9:11 PM, m...@freebsd.org wrote:

 On Thu, Jul 12, 2012 at 4:43 PM, Justin Hibbits chmeeed...@gmail.com
 wrote:


 When tracking down a panic exposed by INVARIANTS, I tried setting
 DEBUG_MEMGUARD, so I could find the culprit that's trashing freed
 memory.
 However, this causes a panic at bootup.  It shows up right after the
 first
 WARNING: WITNESS message, with the following:

 Tracing, and printf() debugging, I see arguments to vm_map_findspace():
 start: 0xD000, length: 4246446080, and map-max_offset =
 4026531839.

 Beyond that, I'm lost with tracking this down.  Machine is a dual
 processor
 PowerPC G4, with 2GB RAM.



 The length is 0xFD1BA000 which is almost 4GB.  Asking for 4GB of
 virtual space for 2GB of RAM sounds about right (it's been a while
 since I was in this code), unless this is a 32-bit kernel, in which
 case it'd be too much since there isn't that much virtual space
 available.

 So, is the kernel 32-bit?  What are the values used and returned by
 memguard_fudge()?  The intent of that routine is to get kmeminit() to
 allocate a larger map so memguard can use part of it for private
 virtual addresses.  But it shouldn't be asking for too much; i.e.
 the intent was to check both physical and virtual space available and
 be greedy, but not too greedy.

 There were some issues with that code for some platforms that e.g.
 didn't define a VM_KMEM_SIZE_MAX, but alc@ fixed that in r216425.


 It is a 32-bit kernel, on 32-bit hardware.  The values for memguard_fudge
 are (defaults):

 tmp: 4246446080, vm_kmem_size: 117440512, vm_kmem_size_max: 0

 When setting vm.kmem_size/vm.kmem_size_max to 2GB they are:

 tmp: 2147483648, vm_kmem_size: 214793648, vm_kmem_sizee_max: 2147483648
 (all
 2GB).

 But the start and map-max_offset remain the same on all runs I make.


 memguard_fudge is still broken for 32-bit architectures with no
 vm_kmem_max.  In the absence of a km_max to limit the value, we
 essentially use twice the physical memory for the virtual limit.  But
 with 2GB on a 32-bit machine, this requires 4GB of virtual space.

 Setting vm_kmem_size_max to 2GB should work; I'd expect to see
 tmp=about 200MB, which is much larger than the input 112MB but the
 allocation should work.  But I don't really know what else PowerPC has
 need of for virtual space, so that still could be too large.

 You can try smaller values of vm_kmem_size_max, like 1GB or 512MB.
 You shouldn't need to set vm_kmem_size at all.  At some point the
 added space for the memguard_map will be small enough that the
 kmem_suballoc will work.

 Hmm, what is the min_offset and max_offset of kernel_map when the call
 to memguard_fudge is made?

 Thanks,
 matthew



 Without setting vm.kmem_size/vm.kmem_size_max, I see the following:

 map: 0x100, min_offset: 0xD000, max_offset: 0xEFFF

 It does boot when I set vm.kmem_size=256M/vm.kmem_size_max=512M.

 When I tried 512M/1024M, it panicked at the same place -- kmem_suballoc from
 kmeminit.  So it looks like I have to set vm.kmem_size/vm.kmem_size_max way
 back in order for it to boot with memguard(9).

I'll see about crafting a patch when I can get access to a machine
that works (my current woes with my laptop, VM image, etc. are quite
frustrating).  The gist is that, in the absence of a kmem_max, the
routine for determining the size of the kmem map should use the
kernel_map's size as a limiting factor.  In this case it looks like
the map's size is 512MB.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: panic with DEBUG_MEMGUARD on PowerPC

2012-07-12 Thread mdf
On Thu, Jul 12, 2012 at 4:43 PM, Justin Hibbits chmeeed...@gmail.com wrote:
 When tracking down a panic exposed by INVARIANTS, I tried setting
 DEBUG_MEMGUARD, so I could find the culprit that's trashing freed memory.
 However, this causes a panic at bootup.  It shows up right after the first
 WARNING: WITNESS message, with the following:

 panic: kmem_suballoc: bad status return of 3
 cpuid = 0
 KDB: stack backtrace:
 0xd0004ad0: at kdb_backtrace+0x4c
 0xd0004b40: at panic+0x224
 0xd0004ba0: at kmem_suballoc+0x8c
 0xd0004bd0: at kmeminit+0x1ac
 0xd0004c20: at mi_startup+0x13c
 0xd0004c50: at btext+0xc0

 Tracing, and printf() debugging, I see arguments to vm_map_findspace():
 start: 0xD000, length: 4246446080, and map-max_offset = 4026531839.

 Beyond that, I'm lost with tracking this down.  Machine is a dual processor
 PowerPC G4, with 2GB RAM.

The length is 0xFD1BA000 which is almost 4GB.  Asking for 4GB of
virtual space for 2GB of RAM sounds about right (it's been a while
since I was in this code), unless this is a 32-bit kernel, in which
case it'd be too much since there isn't that much virtual space
available.

So, is the kernel 32-bit?  What are the values used and returned by
memguard_fudge()?  The intent of that routine is to get kmeminit() to
allocate a larger map so memguard can use part of it for private
virtual addresses.  But it shouldn't be asking for too much; i.e.
the intent was to check both physical and virtual space available and
be greedy, but not too greedy.

There were some issues with that code for some platforms that e.g.
didn't define a VM_KMEM_SIZE_MAX, but alc@ fixed that in r216425.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: panic with DEBUG_MEMGUARD on PowerPC

2012-07-12 Thread mdf
On Thu, Jul 12, 2012 at 6:33 PM, Justin Hibbits chmeeed...@gmail.com wrote:
 On Jul 12, 2012, at 9:11 PM, m...@freebsd.org wrote:

 On Thu, Jul 12, 2012 at 4:43 PM, Justin Hibbits chmeeed...@gmail.com
 wrote:

 When tracking down a panic exposed by INVARIANTS, I tried setting
 DEBUG_MEMGUARD, so I could find the culprit that's trashing freed memory.
 However, this causes a panic at bootup.  It shows up right after the
 first
 WARNING: WITNESS message, with the following:

 Tracing, and printf() debugging, I see arguments to vm_map_findspace():
 start: 0xD000, length: 4246446080, and map-max_offset = 4026531839.

 Beyond that, I'm lost with tracking this down.  Machine is a dual
 processor
 PowerPC G4, with 2GB RAM.


 The length is 0xFD1BA000 which is almost 4GB.  Asking for 4GB of
 virtual space for 2GB of RAM sounds about right (it's been a while
 since I was in this code), unless this is a 32-bit kernel, in which
 case it'd be too much since there isn't that much virtual space
 available.

 So, is the kernel 32-bit?  What are the values used and returned by
 memguard_fudge()?  The intent of that routine is to get kmeminit() to
 allocate a larger map so memguard can use part of it for private
 virtual addresses.  But it shouldn't be asking for too much; i.e.
 the intent was to check both physical and virtual space available and
 be greedy, but not too greedy.

 There were some issues with that code for some platforms that e.g.
 didn't define a VM_KMEM_SIZE_MAX, but alc@ fixed that in r216425.

 It is a 32-bit kernel, on 32-bit hardware.  The values for memguard_fudge
 are (defaults):

 tmp: 4246446080, vm_kmem_size: 117440512, vm_kmem_size_max: 0

 When setting vm.kmem_size/vm.kmem_size_max to 2GB they are:

 tmp: 2147483648, vm_kmem_size: 214793648, vm_kmem_sizee_max: 2147483648 (all
 2GB).

 But the start and map-max_offset remain the same on all runs I make.

memguard_fudge is still broken for 32-bit architectures with no
vm_kmem_max.  In the absence of a km_max to limit the value, we
essentially use twice the physical memory for the virtual limit.  But
with 2GB on a 32-bit machine, this requires 4GB of virtual space.

Setting vm_kmem_size_max to 2GB should work; I'd expect to see
tmp=about 200MB, which is much larger than the input 112MB but the
allocation should work.  But I don't really know what else PowerPC has
need of for virtual space, so that still could be too large.

You can try smaller values of vm_kmem_size_max, like 1GB or 512MB.
You shouldn't need to set vm_kmem_size at all.  At some point the
added space for the memguard_map will be small enough that the
kmem_suballoc will work.

Hmm, what is the min_offset and max_offset of kernel_map when the call
to memguard_fudge is made?

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sysctl filesystem ?

2012-06-26 Thread mdf
On Tue, Jun 26, 2012 at 1:59 AM, Robert Watson rwat...@freebsd.org wrote:
 On Tue, 26 Jun 2012, Chris Rees wrote:

 as well as we don't depend of /proc for normal operation we shouldn't for

 say /proc/sysctl


 improvements are welcome, better documentation is welcome, changes to

 what is OK - isn't.

 /proc/sysctl might be useful.  Just because Linux uses it doesn't make it
 a bad idea.


 One of the problems we've encounted with synthetic file systems is that
 off-the-shelf file system tools (e.g., cp, dd, cat) make simplistic (but not
 unreasonable) assumptions about the statistic content of files.  This comes
 up frequently with procfs-like systems where the size of, say, memory map
 data can be considerably larger than the perhaps 128-byte, 256-byte, or even
 8k buffers that might exist in a stock file access tool.  Unless we change
 all of those tools to use buffers much bigger than they currently do, which
 even suggets changing the C library buffer to defaults for FILE *, that
 places an onus on the file system to provide persisting snapshots of data
 until it's sure that a user process is done -- e.g., over many system calls.

 sysctl is not immune to the requirement of atomicity, but it has explicit
 control over it: sysctl is a single system call, rather than an unbounded
 open-read-seek-repeat-etc cycle, and has been carefully crafted to provide
 this and other MIB-like properties, such as a basic data type model so that
 command line tools know how to render content rather than having to guess
 and/or get it wrong.  sysctl has some file-system like properties, but on
 the whole, it's not a file system -- it's much more like an SNMP MIB.

 While you can map anything into anything (including Turing machines), I
 think the sysctl command line tool and API, despite its limitations, is a
 better match for accessing this sort of monitoring and control data than the
 POSIX file API, and would recommend against trying to move to a sysctl file
 system.

While I understand the problems you allude to, the sysctl(8) binary
can protect itself from them.  IMO the biggest problem with sysctls
not being files is that it makes no sense from the core UNIX
philosophy that everything is a file.  Sockets and pipes and character
devices and even unseekable things like stdout are files; why aren't
these other objects that allow read, write, and have their own
namespace?

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: new panic in cpu_reset() with WITNESS

2012-01-17 Thread mdf
2012/1/17 Gleb Smirnoff gleb...@freebsd.org:
  New panic has been introduced somewhere between
 r229851 and r229932, that happens on shutdown if
 kernel has WITNESS and doesn't have WITNESS_SKIPSPIN.

 Uptime: 1h0m17s
 Rebooting...
 panic: mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx @ 
 /usr/src/head/sys/kern/kern_cons.c:500
 cpuid = 0
 KDB: enter: panic
 [ thread pid 1 tid 11 ]
 Stopped at      kdb_enter+0x3b: movq    $0,0x514d32(%rip)
 db
 db bt
 Tracing pid 1 tid 11 td 0xfe0001d5e000
 kdb_enter() at kdb_enter+0x3b
 panic() at panic+0x1c7
 _mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x10f
 cnputs() at cnputs+0x7a
 putchar() at putchar+0x11f
 kvprintf() at kvprintf+0x83
 vprintf() at vprintf+0x85
 printf() at printf+0x67
 witness_checkorder() at witness_checkorder+0x773
 _mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x99
 uart_cnputc() at uart_cnputc+0x3e
 cnputc() at cnputc+0x4c
 cnputs() at cnputs+0x26
 putchar() at putchar+0x11f
 kvprintf() at kvprintf+0x83
 vprintf() at vprintf+0x85
 printf() at printf+0x67
 cpu_reset() at cpu_reset+0x81
 kern_reboot() at kern_reboot+0x3a5
 --More--^M        ^Msys_reboot() at sys_reboot+0x42
 amd64_syscall() at amd64_syscall+0x39e
 Xfast_syscall() at Xfast_syscall+0xf7
 --- syscall (55, FreeBSD ELF64, sys_reboot), rip = 0x40ea3c, rsp = 
 0x7fffd6d8, rbp = 0x49 ---
 db
 db show locks
 exclusive sleep mutex Giant (Giant) r = 0 (0x809bc560) locked @ 
 /usr/src/head/sys/kern/kern_module.c:101
 exclusive spin mutex smp rendezvous (smp rendezvous) r = 0 
 (0x80a08840) locked @ /usr/src/head/sys/kern/kern_shutdown.c:542
 db

 So the problem is that we are holding smp rendezvous mutex during the 
 cpu_reset().
 No mutexes should be obtained after it. However, since cpu_reset() does 
 priting
 we obtain cnputs_mtx, and later obtain uart_hwmtx. The latter is hardcoded in
 the subr_witness.c as mutex to obtain before smp rendezvous, this triggers
 yet another printf from witness, that finally panics due to recursing on
 cnputs_mtx.

At $WORK we explicitly marked cnputs_mtx as NO_WITNESS since it didn't
seem possible to fit it into the heirarchy in any sane way, since a
print can come from basically anywhere.

If anyone has a better fix, that'd be great, but I haven't been able
to think of one.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: SU+J systems do not fsck themselves

2011-12-28 Thread mdf
On Wed, Dec 28, 2011 at 8:54 AM, Maxim Khitrov m...@mxcrypt.com wrote:
 On Wed, Dec 28, 2011 at 11:42 AM, Matthias Andree
 matthias.and...@gmx.de wrote:
 Am 27.12.2011 22:53, schrieb David Thiel:
 I've had multiple machines now (9.0-RC3, amd64, i386 and earlier
 9-CURRENT on ppc) running SU+J that have had unexplained panics and
 crashes start happening relating to disk I/O. When I end up running a
 full fsck, it keeps turning out that the disk is dirty and corrupted,
 but no mechanism is in place with SU+J to detect and fix this. A bgfsck
 never happens, but a manual fsck in single-user does indeed fix the
 crashing and weird behavior. Others have tested their SU+J volumes and
 found them to have errors as well. This makes me super nervous.

 The one thing I figured is that in the light of power outages, or
 crashing virtualization hosts, you really really really need to disable
 disk write caches, and this affects softupdates, journalling, asynch
 file systems, just about everything.

 The fact that makes matters worse is that journalling or softupdates
 allow you to mount a silently-corrupted file system, whereas the
 traditional UFS/UFS2 sync/asynch mounts will fsck themselves in the
 foreground, so they get fixed before the FS panics.

 So can you be sure that:

 - your driver, chip set and hard disk execute ordered writes in order,

If they don't, it's a bug.  Not that there isn't buggy firmware out
there, but each layer of software does need to rely on the one below
actually doing what it's promised.

 - your driver, chip set and hard disk actually write data to permanent
 storage BEFORE acknowledging a successful write?

Not required by SU as they use an explicit BIO_FLUSH which should be
handled by the driver.

 Whenever I fixed these issues, I had no more corruptions.

 For ata and sata, there are loader tunables you will want to set,
 hw.ata.wc=0 and kern.cam.ada.write_cache=0.

This should not be necessary if the driver and firmware are not buggy.

 If your drives are under ada, ad, or ahci related control, try these
 settings.  For SCSI, use camcontrol to turn the write cache off.
 softupdates is supposed to rectify most of the performance penalties
 incurred.

 Note also that you needed to set ahci_load=YES and atapicam_load=YES in
 8.X, I've never bothered to check 7.X or 9.X WRT these settings.

 This is a bit off-topic, but I'm curious what the effect of NCQ is on
 softupdates? Since that too has the ability to reorder writes to disk,
 should it be disabled in addition to cache?

SU doesn't care about write ordering, as long as everything before a
BIO_FLUSH is really flushed by the time the BIO_FLUSH is acknowledged.

Cheers,
matthew

 Also, I would say that if you are using a hardware raid controller
 with a BBU, then allowing the use of controller's cache and write-back
 policy should be safe for use with softupdates. Any caching mechanism,
 for that matter, that has a separate power supply source should be ok.
 For example, the Intel 320 SSDs have a few on-board capacitors that
 are used to flush the cache in the event of a power loss.

 - Max
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread mdf
On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin j...@freebsd.org wrote:
 On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
 On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev kab...@gmail.com wrote:
  On Sun, 18 Dec 2011 01:09:00 +0100
  O. Hartmann ohart...@zedat.fu-berlin.de wrote:
 
  Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
  panic: sleeping thread
  cpuid = 0
 
  PID 16 is always USB on my box.
 
  You really need to give us a backtrace when you quote panics. It is
  impossible to make any sense of the above panic message without more
  context.

 In the case of this panic, the stack of the thread which panics is
 useless; it's someone trying to propagate priority that discovered it.
  A backtrace on tid 100033 would be useful.

 With WITNESS enabled, it's possible to have this panic display the
 stack of the incorrectly sleeping thread at the time it acquired the
 lock, as well, but this code isn't in CURRENT or any release.  I have
 a patch at $WORK I can dig up on Monday.

 Huh?  The stock kernel dumps a stack trace of the offending thread if you have
 DDB enabled:

                /*
                 * If the thread is asleep, then we are probably about
                 * to deadlock.  To make debugging this easier, just
                 * panic and tell the user which thread misbehaved so
                 * they can hopefully get a stack trace from the truly
                 * misbehaving thread.
                 */
                if (TD_IS_SLEEPING(td)) {
                        printf(
                Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n,
                            td-td_tid, td-td_proc-p_pid);
 #ifdef DDB
                        db_trace_thread(td, -1);
 #endif
                        panic(sleeping thread);
                }

Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
added code to print *which* lock it holds (using WITNESS data).  I do
recall that this panic alone was often not sufficient to debug the
problem.

Thanks,
matthew


 It may be that we can make use of the STACK API here instead to output this
 trace even when DDB isn't enabled.  The patch below tries to do that
 (untested).  It does some odd thigns though since it is effectively running
 from a panic context already, so it uses a statically allocated 'struct stack'
 rather than using stack_create() and uses stack_print_ddb() since it is
 holding spin locks and can't possibly grab an sx lock:

 Index: subr_turnstile.c
 ===
 --- subr_turnstile.c    (revision 228534)
 +++ subr_turnstile.c    (working copy)
 @@ -72,6 +72,7 @@ __FBSDID($FreeBSD$);
  #include sys/proc.h
  #include sys/queue.h
  #include sys/sched.h
 +#include sys/stack.h
  #include sys/sysctl.h
  #include sys/turnstile.h

 @@ -175,6 +176,7 @@ static void turnstile_fini(void *mem, int size);
  static void
  propagate_priority(struct thread *td)
  {
 +       static struct stack st;
        struct turnstile *ts;
        int pri;

 @@ -217,8 +219,10 @@ propagate_priority(struct thread *td)
                        printf(
                Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n,
                            td-td_tid, td-td_proc-p_pid);
 -#ifdef DDB
 -                       db_trace_thread(td, -1);
 +#ifdef STACK
 +                       stack_zero(st);
 +                       stack_save_td(st, td);
 +                       stack_print_ddb(st);
  #endif
                        panic(sleeping thread);
                }

 --
 John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-20 Thread mdf
On Tue, Dec 20, 2011 at 6:32 AM, John Baldwin j...@freebsd.org wrote:
 On Tuesday, December 20, 2011 9:22:48 am m...@freebsd.org wrote:
 On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin j...@freebsd.org wrote:
  On Saturday, December 17, 2011 10:41:15 pm m...@freebsd.org wrote:
  On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev kab...@gmail.com 
  wrote:
   On Sun, 18 Dec 2011 01:09:00 +0100
   O. Hartmann ohart...@zedat.fu-berlin.de wrote:
  
   Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
   panic: sleeping thread
   cpuid = 0
  
   PID 16 is always USB on my box.
  
   You really need to give us a backtrace when you quote panics. It is
   impossible to make any sense of the above panic message without more
   context.
 
  In the case of this panic, the stack of the thread which panics is
  useless; it's someone trying to propagate priority that discovered it.
   A backtrace on tid 100033 would be useful.
 
  With WITNESS enabled, it's possible to have this panic display the
  stack of the incorrectly sleeping thread at the time it acquired the
  lock, as well, but this code isn't in CURRENT or any release.  I have
  a patch at $WORK I can dig up on Monday.
 
  Huh?  The stock kernel dumps a stack trace of the offending thread if you 
  have
  DDB enabled:
 
                 /*
                  * If the thread is asleep, then we are probably about
                  * to deadlock.  To make debugging this easier, just
                  * panic and tell the user which thread misbehaved so
                  * they can hopefully get a stack trace from the truly
                  * misbehaving thread.
                  */
                 if (TD_IS_SLEEPING(td)) {
                         printf(
                 Sleeping thread (tid %d, pid %d) owns a non-sleepable 
  lock\n,
                             td-td_tid, td-td_proc-p_pid);
  #ifdef DDB
                         db_trace_thread(td, -1);
  #endif
                         panic(sleeping thread);
                 }

 Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
 added code to print *which* lock it holds (using WITNESS data).  I do
 recall that this panic alone was often not sufficient to debug the
 problem.

 I think the db_trace_thread() has been around for a while (since 5 or 6),
 but it is true that we don't tell you which lock is held even with this.
 That might be a useful thing to output before the panic.


This patch isn't quite right since I had to hand-edit it.  There's a
small chance I can commit this in the near future, but of someone else
wants to take it, feel free.  Style isn't yet fixed up to be FreeBSD
standard either.


--- /data/sb/bsd.git/sys/kern/subr_turnstile.c  2011-12-12
10:23:12.542196632 -0800
+++ kern/subr_turnstile.c   2011-12-09 10:59:29.882643558 -0800
@@ -165,10 +165,43 @@
 static voidturnstile_dtor(void *mem, int size, void *arg);
 #endif
 static int turnstile_init(void *mem, int size, int flags);
 static voidturnstile_fini(void *mem, int size);

+#ifdef INVARIANTS
+static void
+sleeping_thread_owns_a_nonsleepable_lock(struct thread *td)
+{
+   printf(Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n,
+   td-td_tid, td-td_proc-p_pid);
+#ifdef DDB
+   db_trace_thread(td, -1);
+#endif
+#ifdef WITNESS
+   struct lock_list_entry *lock_list, *lle;
+   int i;
+
+   lock_list = td-td_sleeplocks;
+   if (lock_list == NULL || lock_list-ll_count == 0) {
+   printf(Thread does not appear to hold any mutexes!\n);
+   return;
+   }
+
+   for (lle = lock_list; lle != NULL; lle = lle-ll_next) {
+   for (i = lle-ll_count - 1; i = 0; i--) {
+   struct lock_instance *li = lle-ll_children[i];
+
+   printf(Lock %s acquired at %s:%d\n,
+   li-li_lock-lo_name, li-li_file, li-li_line);
+   }
+   }
+#endif /* WITNESS */
+}
+#else
+#define sleeping_thread_owns_a_nonsleepable_lock(td) do { } while (0)
+#endif /* INVARIANTS */
+
 /*
  * Walks the chain of turnstiles and their owners to propagate the priority
  * of the thread being blocked to all the threads holding locks that have to
  * release their locks before this thread can run again.
  */
@@ -210,19 +243,31 @@
 * If the thread is asleep, then we are probably about
 * to deadlock.  To make debugging this easier, just
 * panic and tell the user which thread misbehaved so
 * they can hopefully get a stack trace from the truly
 * misbehaving thread.
 */
if (TD_IS_SLEEPING(td)) {
-   printf(
-   Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n,
-   td-td_tid, td-td_proc-p_pid);
-#ifdef DDB
-   db_trace_thread(td, -1);
-#endif
-   panic(sleeping thread);
+   

Re: extattr_set_*() return type

2011-12-20 Thread mdf
On Tue, Dec 20, 2011 at 1:49 PM, John Baldwin j...@freebsd.org wrote:
 Hmm, if these functions are expected to operate like 'write(2)' and are
 supposed to return the number of bytes written, shouldn't their return value
 be 'ssize_t' instead of 'int'?  It looks like the system calls themselves
 already do the right thing in setting td_retval[] (they assign a ssize_t to it
 and td_retval[0] can hold a ssize_t on all of our current platforms).  It
 would seem that the only change would be to the header and probably
 syscalls.master.  I guess this would require a symver bump to fix though.

An extended attribute larger than 2GB is a programming abuse, though.
Technically int may not be 32 bits but it is on all supported
platforms now.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD 10.0-CURRENT/amd64 r228662

2011-12-17 Thread mdf
On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev kab...@gmail.com wrote:
 On Sun, 18 Dec 2011 01:09:00 +0100
 O. Hartmann ohart...@zedat.fu-berlin.de wrote:

 Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
 panic: sleeping thread
 cpuid = 0

 PID 16 is always USB on my box.

 You really need to give us a backtrace when you quote panics. It is
 impossible to make any sense of the above panic message without more
 context.

In the case of this panic, the stack of the thread which panics is
useless; it's someone trying to propagate priority that discovered it.
 A backtrace on tid 100033 would be useful.

With WITNESS enabled, it's possible to have this panic display the
stack of the incorrectly sleeping thread at the time it acquired the
lock, as well, but this code isn't in CURRENT or any release.  I have
a patch at $WORK I can dig up on Monday.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: SCHED_ULE should not be the default

2011-12-13 Thread mdf
On Tue, Dec 13, 2011 at 3:39 PM, Ivan Klymenko fi...@ukr.net wrote:
 В Wed, 14 Dec 2011 00:04:42 +0100
 Jilles Tjoelker jil...@stack.nl пишет:

 On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:
  If the algorithm ULE does not contain problems - it means the
  problem has Core2Duo, or in a piece of code that uses the ULE
  scheduler. I already wrote in a mailing list that specifically in
  my case (Core2Duo) partially helps the following patch:
  --- sched_ule.c.orig        2011-11-24 18:11:48.0 +0200
  +++ sched_ule.c     2011-12-10 22:47:08.0 +0200
  @@ -794,7 +794,8 @@
       * 1.5 * balance_interval.
       */
      balance_ticks = max(balance_interval / 2, 1);
  -   balance_ticks += random() % balance_interval;
  +// balance_ticks += random() % balance_interval;
  +   balance_ticks += ((int)random()) % balance_interval;
      if (smp_started == 0 || rebalance == 0)
              return;
      tdq = TDQ_SELF();

 This avoids a 64-bit division on 64-bit platforms but seems to have no
 effect otherwise. Because this function is not called very often, the
 change seems unlikely to help.

 Yes, this section does not apply to this problem :)
 Just I posted the latest patch which i using now...


  @@ -2118,13 +2119,21 @@
      struct td_sched *ts;
 
      THREAD_LOCK_ASSERT(td, MA_OWNED);
  +   if (td-td_pri_class  PRI_FIFO_BIT)
  +           return;
  +   ts = td-td_sched;
  +   /*
  +    * We used up one time slice.
  +    */
  +   if (--ts-ts_slice  0)
  +           return;

 This skips most of the periodic functionality (long term load
 balancer, saving switch count (?), insert index (?), interactivity
 score update for long running thread) if the thread is not going to
 be rescheduled right now.

 It looks wrong but it is a data point if it helps your workload.

 Yes, I did it for as long as possible to delay the execution of the code in 
 section:
 ...
 #ifdef SMP
        /*
         * We run the long term load balancer infrequently on the first cpu.
         */
        if (balance_tdq == tdq) {
                if (balance_ticks  --balance_ticks == 0)
                        sched_balance();
        }
 #endif
 ...


      tdq = TDQ_SELF();
   #ifdef SMP
      /*
       * We run the long term load balancer infrequently on the
  first cpu. */
  -   if (balance_tdq == tdq) {
  -           if (balance_ticks  --balance_ticks == 0)
  +   if (balance_ticks  --balance_ticks == 0) {
  +           if (balance_tdq == tdq)
                      sched_balance();
      }
   #endif

 The main effect of this appears to be to disable the long term load
 balancer completely after some time. At some point, a CPU other than
 the first CPU (which uses balance_tdq) will set balance_ticks = 0, and
 sched_balance() will never be called again.


 That is, for the same reason as above in the text...

 It also introduces a hypothetical race condition because the access to
 balance_ticks is no longer restricted to one CPU under a spinlock.

 If the long term load balancer may be causing trouble, try setting
 kern.sched.balance_interval to a higher value with unpatched code.

 I checked it in the first place - but it did not help fix the situation...

 The impression of malfunction rebalancing...
 It seems that the thread is passed on to the same core that is loaded and 
 so...
 Perhaps this is a consequence of an incorrect definition of the topology CPU?


  @@ -2144,9 +2153,6 @@
              if
  (TAILQ_EMPTY(tdq-tdq_timeshare.rq_queues[tdq-tdq_ridx]))
  tdq-tdq_ridx = tdq-tdq_idx; }
  -   ts = td-td_sched;
  -   if (td-td_pri_class  PRI_FIFO_BIT)
  -           return;
      if (PRI_BASE(td-td_pri_class) == PRI_TIMESHARE) {
              /*
               * We used a tick; charge it to the thread so
  @@ -2157,11 +2163,6 @@
              sched_priority(td);
      }
      /*
  -    * We used up one time slice.
  -    */
  -   if (--ts-ts_slice  0)
  -           return;
  -   /*
       * We're out of time, force a requeue at userret().
       */
      ts-ts_slice = sched_slice;

  and refusal to use options FULL_PREEMPTION
  But no one has unsubscribed to my letter, my patch helps or not in
  the case of Core2Duo...
  There is a suspicion that the problems stem from the sections of
  code associated with the SMP...
  Maybe I'm in something wrong, but I want to help in solving this
  problem ...


Has anyone experiencing problems tried to set sysctl kern.sched.steal_thresh=1 ?

I don't remember what our specific problem at $WORK was, perhaps it
was just interrupt threads not getting serviced fast enough, but we've
hard-coded this to 1 and removed the code that sets it in
sched_initticks().  The same effect should be had by setting the
sysctl after a box is up.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: SCHED_ULE should not be the default

2011-12-12 Thread mdf
On Mon, Dec 12, 2011 at 7:32 AM, Gary Jennejohn
gljennj...@googlemail.com wrote:
 On Mon, 12 Dec 2011 15:13:00 +
 Vincent Hoffman vi...@unsane.co.uk wrote:


 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 12/12/2011 13:47, O. Hartmann wrote:
 
  Not fully right, boinc defaults to run on idprio 31 so this isn't an
  issue. And yes, there are cases where SCHED_ULE shows much better
  performance then SCHED_4BSD. [...]
 
  Do we have any proof at hand for such cases where SCHED_ULE performs
  much better than SCHED_4BSD? Whenever the subject comes up, it is
  mentioned, that SCHED_ULE has better performance on boxes with a ncpu 
  2. But in the end I see here contradictionary statements. People
  complain about poor performance (especially in scientific environments),
  and other give contra not being the case.
 It all a little old now but some if the stuff in
 http://people.freebsd.org/~kris/scaling/
 covers improvements that were seen.

 http://jeffr-tech.livejournal.com/5705.html
 shows a little too, reading though Jeffs blog is worth it as it has some
 interesting stuff on SHED_ULE.

 I thought there were some more benchmarks floating round but cant find
 any with a quick google.


 Vince

 
  Within our department, we developed a highly scalable code for planetary
  science purposes on imagery. It utilizes present GPUs via OpenCL if
  present. Otherwise it grabs as many cores as it can.
  By the end of this year I'll get a new desktop box based on Intels new
  Sandy Bridge-E architecture with plenty of memory. If the colleague who
  developed the code is willing performing some benchmarks on the same
  hardware platform, we'll benchmark bot FreeBSD 9.0/10.0 and the most
  recent Suse. For FreeBSD I intent also to look for performance with both
  different schedulers available.
 

 These observations are not scientific, but I have a CPU from AMD with
 6 cores (AMD Phenom(tm) II X6 1090T Processor).

 My simple test was ``make buildkernel'' while watching the core usage with
 gkrellm.

 With SCHED_4BSD all 6 cores are loaded to 97% during the build phase.
 I've never seen any value above 97% with gkrellm.

 With SCHED_ULE I never saw all 6 cores loaded this heavily.  Usually
 2 or more cores were at or below 90%.  Not really that significant, but
 still a noticeable difference in apparent scheduling behavior.  Whether
 the observed difference is due to some change in data from the kernel to
 gkrellm is beyond me.

SCHED_ULE is much sloppier about calculating which thread used a
timeslice -- unless the timeslice went 100% to a thread, the fraction
it used may get attributed elsewhere.  So top's reporting of thread
usage is not a useful metric.  Total buildworld time is, potentially.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Stop scheduler on panic

2011-12-02 Thread mdf
On Fri, Dec 2, 2011 at 2:05 AM, Andriy Gapon a...@freebsd.org wrote:
 on 02/12/2011 06:36 John Baldwin said the following:
 Ah, ok (I had thought SCHEDULER_STOPPED was going to always be true when kdb 
 was
 active).  But I think these two changes should cover critical_exit() ok.


 I attempted to start a discussion about this a few times already :-)
 Should we treat kdb context the same as SCHEDULER_STOPPED context (in the
 current definition) ?  That is, skip all locks in the same fashion?
 There are pros and contras.

Does kdb pause all CPUs with an interrupt (NMI or regular interrupt, I
can no longer remember...) when it enters?  If so, then I'd say
whether it enters via sysctl or panic doesn't matter.  It's in a
special environment where nothing else is running, which is what is
needed for proper exploration of the machine (via breakpoint, for
debugging a hang, etc).

Maybe the question is, why wouldn't SCHEDULER_STOPPED be true
regardless of how kdb is entered?

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Stop scheduler on panic

2011-11-17 Thread mdf
On Thu, Nov 17, 2011 at 12:54 PM, Attilio Rao atti...@freebsd.org wrote:
 2011/11/17 Andriy Gapon a...@freebsd.org:
 BTW, it is my opinion that we really should not let the debugger code call
 mi_switch for any reason.

 Yes, I agree with this, this is why the sched_bind() in boot() is
 broken (immagine calling things like doadump from KDB. KDB right now
 can be thought as a first cut of this patch because it does disable
 the CPUs when entering the context, thus, the bug here is that if you
 stop all CPUs including CPU0 and later on you want bind on it you are
 death).

Another patch related to this area we have at $WORK:

 #if defined(SMP)
-   /*
-* Bind us to CPU 0 so that all shutdown code runs there.  Some
-* systems don't shutdown properly (i.e., ACPI power off) if we
-* run on another processor.
-*/
-   thread_lock(curthread);
-   sched_bind(curthread, 0);
-   thread_unlock(curthread);
-   KASSERT(PCPU_GET(cpuid) == 0, (%s: not running on cpu 0, __func__));
+   /*
+* sched_bind can't be done reliably inside of panic.  cpu_reset() will
+* rebind us in any case, more reliably.
+*/
+   if (panicstr == NULL) {
+   /*
+* Bind us to CPU 0 so that all shutdown code runs there.  Some
+* systems don't shutdown properly (i.e., ACPI power off) if we
+* run on another processor.
+*/
+   thread_lock(curthread);
+   sched_bind(curthread, 0);
+   thread_unlock(curthread);
+   KASSERT(PCPU_GET(cpuid) == 0, (boot: not running on cpu 0));
+   }
 #endif
/* We're in the process of rebooting. */
rebooting = 1;

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Stop scheduler on panic

2011-11-17 Thread mdf
On Thu, Nov 17, 2011 at 1:08 PM, Andriy Gapon a...@freebsd.org wrote:
 on 17/11/2011 23:02 m...@freebsd.org said the following:
 Another patch related to this area we have at $WORK:

  #if defined(SMP)
 -       /*
 -        * Bind us to CPU 0 so that all shutdown code runs there.  Some
 -        * systems don't shutdown properly (i.e., ACPI power off) if we
 -        * run on another processor.
 -        */
 -       thread_lock(curthread);
 -       sched_bind(curthread, 0);
 -       thread_unlock(curthread);
 -       KASSERT(PCPU_GET(cpuid) == 0, (%s: not running on cpu 0, 
 __func__));
 +       /*
 +        * sched_bind can't be done reliably inside of panic.  cpu_reset() 
 will
 +        * rebind us in any case, more reliably.
 +        */
 +       if (panicstr == NULL) {
 +               /*
 +                * Bind us to CPU 0 so that all shutdown code runs there.  
 Some
 +                * systems don't shutdown properly (i.e., ACPI power off) if 
 we
 +                * run on another processor.
 +                */
 +               thread_lock(curthread);
 +               sched_bind(curthread, 0);
 +               thread_unlock(curthread);
 +               KASSERT(PCPU_GET(cpuid) == 0, (boot: not running on cpu 
 0));
 +       }
  #endif
         /* We're in the process of rebooting. */
         rebooting = 1;

 Yes, you have contributed this patch before and it is a part of the bigger
 stop-scheduler-on-panic patches.  Have you had a chance to review those? :)

It's been so long I don't remember what I've sent where.  I did look
over the patch but had no additional comments.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]

2011-11-15 Thread mdf
On Tue, Nov 15, 2011 at 10:15 AM, Attilio Rao atti...@freebsd.org wrote:
 2011/11/7 Kostik Belousov kostik...@gmail.com:
 On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
 Ok.  I'll offer one final suggestion.  Please consider an alternative
 suffix to func.  Perhaps, kbi or KBI.  In other words, something
 that hints at the function's reason for existing.

 Sure. Below is the extraction of only vm_page_lock() bits, together
 with the suggested rename. When Attilio provides the promised simplification
 of the mutex KPI, this can be reduced.

 My tentative patch is here:
 http://www.freebsd.org/~attilio/mutexfileline.patch

 I need to make more compile testing later, but it already compiles
 GENERIC + modules fine on HEAD.

 The patch provides a common entrypoint, option independent, for both
 fast case and debug/compat case.
 Additively, it almost entirely fixes the standard violation of the
 reserved namespace, as you described (the notable exception being the
 macro used in the fast path, that I want to fix as well, but in a
 separate commit).

 Now the file/line couplet can be passed to the _ suffix variant of
 the flag functions.

 eadler@ reviewed the mutex.h comment.

 Please let me know what you think about it, as long as we agree on the
 patch I'll commit it.

Out of curiosity, why are function names explicitly spelled out in
panic and log messages, instead of using %s and __func__?  I've seen
this all around FreeBSD, and if there's no reason otherwise, I'd just
as soon change to a version that doesn't need updating when the
function names change.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]

2011-11-07 Thread mdf
On Mon, Nov 7, 2011 at 11:35 AM, Kostik Belousov kostik...@gmail.com wrote:
 On Mon, Nov 07, 2011 at 11:45:38AM -0600, Alan Cox wrote:
 Ok.  I'll offer one final suggestion.  Please consider an alternative
 suffix to func.  Perhaps, kbi or KBI.  In other words, something
 that hints at the function's reason for existing.

 Sure. Below is the extraction of only vm_page_lock() bits, together
 with the suggested rename. When Attilio provides the promised simplification
 of the mutex KPI, this can be reduced.

 diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
 index 389aea5..ea4ea34 100644
 --- a/sys/vm/vm_page.c
 +++ b/sys/vm/vm_page.c
 @@ -2677,6 +2677,44 @@ vm_page_test_dirty(vm_page_t m)
                vm_page_dirty(m);
  }

 +void
 +vm_page_lock_KBI(vm_page_t m, const char *file, int line)
 +{
 +
 +#if LOCK_DEBUG  0 || defined(MUTEX_NOINLINE)
 +       _mtx_lock_flags(vm_page_lockptr(m), 0, file, line);
 +#else
 +       __mtx_lock(vm_page_lockptr(m), 0, file, line);
 +#endif
 +}
 +
 +void
 +vm_page_unlock_KBI(vm_page_t m, const char *file, int line)
 +{
 +
 +#if LOCK_DEBUG  0 || defined(MUTEX_NOINLINE)
 +       _mtx_unlock_flags(vm_page_lockptr(m), 0, file, line);
 +#else
 +       __mtx_unlock(vm_page_lockptr(m), curthread, 0, file, line);
 +#endif
 +}
 +
 +int
 +vm_page_trylock_KBI(vm_page_t m, const char *file, int line)
 +{
 +
 +       return (_mtx_trylock(vm_page_lockptr(m), 0, file, line));
 +}
 +
 +void
 +vm_page_lock_assert_KBI(vm_page_t m, int a, const char *file, int line)
 +{
 +
 +#ifdef INVARIANTS
 +       _mtx_assert(vm_page_lockptr(m), a, file, line);
 +#endif
 +}
 +
  int so_zerocp_fullpage = 0;

  /*
 diff --git a/sys/vm/vm_page.h b/sys/vm/vm_page.h
 index 7099b70..a5604b7 100644
 --- a/sys/vm/vm_page.h
 +++ b/sys/vm/vm_page.h
 @@ -218,11 +218,23 @@ extern struct vpglocks pa_lock[];

  #define        PA_LOCK_ASSERT(pa, a)   mtx_assert(PA_LOCKPTR(pa), (a))

 +#ifdef KLD_MODULE
 +#define        vm_page_lock(m)         vm_page_lock_KBI((m), LOCK_FILE, 
 LOCK_LINE)
 +#define        vm_page_unlock(m)       vm_page_unlock_KBI((m), LOCK_FILE, 
 LOCK_LINE)
 +#define        vm_page_trylock(m)      vm_page_trylock_KBI((m), LOCK_FILE, 
 LOCK_LINE)
 +#ifdef INVARIANTS
 +#define        vm_page_lock_assert(m, a)       \
 +    vm_page_lock_assert_KBI((m), (a), LOCK_FILE, LOCK_LINE)
 +#else
 +#define        vm_page_lock_assert(m, a)
 +#endif
 +#else  /* KLD_MODULE */
  #define        vm_page_lockptr(m)      (PA_LOCKPTR(VM_PAGE_TO_PHYS((m

Is it not possible to have vm_page_lockptr() be a function for the
KLD_MODULE case?  Because then the vm_page_lock() functions and
friends would all just use mtx_lock, etc., in both the KLD and non-KLD
case.

Or am I missing something?

Thanks,
matthew

  #define        vm_page_lock(m)         mtx_lock(vm_page_lockptr((m)))
  #define        vm_page_unlock(m)       mtx_unlock(vm_page_lockptr((m)))
  #define        vm_page_trylock(m)      mtx_trylock(vm_page_lockptr((m)))
  #define        vm_page_lock_assert(m, a)       
 mtx_assert(vm_page_lockptr((m)), (a))
 +#endif

  #define        vm_page_queue_free_mtx  vm_page_queue_free_lock.data
  /*
 @@ -403,6 +415,11 @@ void vm_page_cowfault (vm_page_t);
  int vm_page_cowsetup(vm_page_t);
  void vm_page_cowclear (vm_page_t);

 +void vm_page_lock_KBI(vm_page_t m, const char *file, int line);
 +void vm_page_unlock_KBI(vm_page_t m, const char *file, int line);
 +int vm_page_trylock_KBI(vm_page_t m, const char *file, int line);
 +void vm_page_lock_assert_KBI(vm_page_t m, int a, const char *file, int line);
 +
  #ifdef INVARIANTS
  void vm_page_object_lock_assert(vm_page_t m);
  #define        VM_PAGE_OBJECT_LOCK_ASSERT(m)   vm_page_object_lock_assert(m)

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]

2011-11-06 Thread mdf
On Sun, Nov 6, 2011 at 4:43 AM, Kostik Belousov kostik...@gmail.com wrote:
 Regarding the _vm_page_lock() vs. vm_page_lock_func(), the mutex.h has
 a lot of violations in regard of the namespaces, IMO. The __* namespace
 is reserved for the language implementation, so our freestanding program
 (kernel) ignores the requirements of the C standard with the names like
 __mtx_lock_spin(). Using the name _vm_page_lock() is valid, but makes
 it not unreasonable for other developers to introduce reserved names.
 So I decided to use the suffixes. vm_map.h locking is free of these
 violations.

I'm pretty sure that when the C standard says, the implementation,
they're referring to the compiler and OS it runs on.  Which makes the
FreeBSD kernel part of the implementation, which is precisely why so
many headers have defines that start with __ and then, if certain
posix defines are set, also uses non-__ versions of the name.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: vm_page_t related KBI [Was: Re: panic at vm_page_wire with FreeBSD 9.0 Beta 3]

2011-11-05 Thread mdf
On Sat, Nov 5, 2011 at 7:13 AM, Kostik Belousov kostik...@gmail.com wrote:
 On Fri, Nov 04, 2011 at 06:03:39PM +0200, Kostik Belousov wrote:

 Below is the KBI patch after vm_page_bits_t merge is done.
 Again, I did not spent time converting all in-tree consumers
 from the (potentially) loadable modules to the new KPI until it
 is agreed upon.

I like my bikeshed orange...

I would think a more canonical name would be get/set rather than
read/write, especially since these operations aren't reading and
writing the page (neither are they getting the page, but at least set
is pretty unambiguous).

Thanks,
matthew


 diff --git a/sys/nfsclient/nfs_bio.c b/sys/nfsclient/nfs_bio.c
 index 305c189..7264cd1 100644
 --- a/sys/nfsclient/nfs_bio.c
 +++ b/sys/nfsclient/nfs_bio.c
 @@ -128,7 +128,7 @@ nfs_getpages(struct vop_getpages_args *ap)
         * can only occur at the file EOF.
         */
        VM_OBJECT_LOCK(object);
 -       if (pages[ap-a_reqpage]-valid != 0) {
 +       if (vm_page_read_valid(pages[ap-a_reqpage]) != 0) {
                for (i = 0; i  npages; ++i) {
                        if (i != ap-a_reqpage) {
                                vm_page_lock(pages[i]);
 @@ -198,16 +198,16 @@ nfs_getpages(struct vop_getpages_args *ap)
                        /*
                         * Read operation filled an entire page
                         */
 -                       m-valid = VM_PAGE_BITS_ALL;
 -                       KASSERT(m-dirty == 0,
 +                       vm_page_write_valid(m, VM_PAGE_BITS_ALL);
 +                       KASSERT(vm_page_read_dirty(m) == 0,
                            (nfs_getpages: page %p is dirty, m));
                } else if (size  toff) {
                        /*
                         * Read operation filled a partial page.
                         */
 -                       m-valid = 0;
 +                       vm_page_write_valid(m, 0);
                        vm_page_set_valid(m, 0, size - toff);
 -                       KASSERT(m-dirty == 0,
 +                       KASSERT(vm_page_read_dirty(m) == 0,
                            (nfs_getpages: page %p is dirty, m));
                } else {
                        /*
 diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
 index 389aea5..2f41e70 100644
 --- a/sys/vm/vm_page.c
 +++ b/sys/vm/vm_page.c
 @@ -2677,6 +2677,66 @@ vm_page_test_dirty(vm_page_t m)
                vm_page_dirty(m);
  }

 +void
 +vm_page_lock_func(vm_page_t m, const char *file, int line)
 +{
 +
 +#if LOCK_DEBUG  0 || defined(MUTEX_NOINLINE)
 +       _mtx_lock_flags(vm_page_lockptr(m), 0, file, line);
 +#else
 +       __mtx_lock(vm_page_lockptr(m), 0, file, line);
 +#endif
 +}
 +
 +void
 +vm_page_unlock_func(vm_page_t m, const char *file, int line)
 +{
 +
 +#if LOCK_DEBUG  0 || defined(MUTEX_NOINLINE)
 +       _mtx_unlock_flags(vm_page_lockptr(m), 0, file, line);
 +#else
 +       __mtx_unlock(vm_page_lockptr(m), curthread, 0, file, line);
 +#endif
 +}
 +
 +int
 +vm_page_trylock_func(vm_page_t m, const char *file, int line)
 +{
 +
 +       return (_mtx_trylock(vm_page_lockptr(m), 0, file, line));
 +}
 +
 +void
 +vm_page_lock_assert_func(vm_page_t m, int a, const char *file, int line)
 +{
 +
 +#ifdef INVARIANTS
 +       _mtx_assert(vm_page_lockptr(m), a, file, line);
 +#endif
 +}
 +
 +vm_page_bits_t
 +vm_page_read_dirty_func(vm_page_t m)
 +{
 +
 +       return (m-dirty);
 +}
 +
 +vm_page_bits_t
 +vm_page_read_valid_func(vm_page_t m)
 +{
 +
 +       return (m-valid);
 +}
 +
 +void
 +vm_page_write_valid_func(vm_page_t m, vm_page_bits_t v)
 +{
 +
 +       m-valid = v;
 +}
 +
 +
  int so_zerocp_fullpage = 0;

  /*
 diff --git a/sys/vm/vm_page.h b/sys/vm/vm_page.h
 index 7099b70..4f8f71e 100644
 --- a/sys/vm/vm_page.h
 +++ b/sys/vm/vm_page.h
 @@ -218,12 +218,50 @@ extern struct vpglocks pa_lock[];

  #define        PA_LOCK_ASSERT(pa, a)   mtx_assert(PA_LOCKPTR(pa), (a))

 +#ifdef KLD_MODULE
 +#define        vm_page_lock(m)         vm_page_lock_func((m), LOCK_FILE, 
 LOCK_LINE)
 +#define        vm_page_unlock(m)       vm_page_unlock_func((m), LOCK_FILE, 
 LOCK_LINE)
 +#define        vm_page_trylock(m)      vm_page_trylock_func((m), LOCK_FILE, 
 LOCK_LINE)
 +#ifdef INVARIANTS
 +#define        vm_page_lock_assert(m, a)       \
 +    vm_page_lock_assert_func((m), (a), LOCK_FILE, LOCK_LINE)
 +#else
 +#define        vm_page_lock_assert(m, a)
 +#endif
 +
 +#define        vm_page_read_dirty(m)   vm_page_read_dirty_func((m))
 +#define        vm_page_read_valid(m)   vm_page_read_valid_func((m))
 +#define        vm_page_write_valid(m, v)       vm_page_write_valid_func((m), 
 (v))
 +
 +#else  /* KLD_MODULE */
  #define        vm_page_lockptr(m)      (PA_LOCKPTR(VM_PAGE_TO_PHYS((m
  #define        vm_page_lock(m)         mtx_lock(vm_page_lockptr((m)))
  #define        vm_page_unlock(m)       mtx_unlock(vm_page_lockptr((m)))
  #define        vm_page_trylock(m)      mtx_trylock(vm_page_lockptr((m)))
  #define        vm_page_lock_assert(m, a)       
 

Re: smp_rendezvous runs with interrupts and preemption enabled on unicore systems

2011-10-28 Thread mdf
On Fri, Oct 28, 2011 at 8:37 AM, Ryan Stone ryst...@gmail.com wrote:
 I'm seeing issues on a unicore systems running a derivative of FreeBSD
 8.2-RELEASE if something calls mem_range_attr_set.  It turns out that
 the root cause is a bug in smp_rendezvous_cpus.  The first part of
 smp_rendezvous_cpus attempts to short-circuit the non-SMP case(note
 that smp_started is never set to 1 on a unicore system):

        if (!smp_started) {
                if (setup_func != NULL)
                        setup_func(arg);
                if (action_func != NULL)
                        action_func(arg);
                if (teardown_func != NULL)
                        teardown_func(arg);
                return;
        }

 The problem is that this runs with interrupts enabled, outside of a
 critical section.  My system runs with device_polling enabled with hz
 set to 2500, so its quite easy to wedge the system by having a thread
 run mem_range_attr_set.  That has to do a smp_rendezvous, and if a
 timer interrupt happens to go off half-way through the action_func and
 preempt this thread, the system ends up deadlocked(although once it's
 wedged, typing at the serial console stands a good chance of unwedging
 the system.  Go figure).

 I know that smp_rendezvous was reworked substantially on HEAD, but by
 inspection it looks like the bug is still present, as the
 short-circuit behaviour is still there.

 I am not entirely sure of the best way to fix this.  Is it as simple
 as doing a spinlock_enter before setup_func and a spinlock_exit after
 teardown_func?  It seems to boot fine, but I'm not at all confident
 that I understand the nuances of smp_rendezvous to be sure that there
 aren't side effects that I don't know about.

Looks like Max didn't have time to upstream this fix:

 struct mtx smp_ipi_mtx;
+MTX_SYSINIT(smp_ipi_mtx, smp_ipi_mtx, smp rendezvous, MTX_SPIN);

...

 static void
 mp_start(void *dummy)
 {

-   mtx_init(smp_ipi_mtx, smp rendezvous, NULL, MTX_SPIN);

...

if (!smp_started) {
+   mtx_lock_spin(smp_ipi_mtx);
if (setup_func != NULL)
setup_func(arg);
if (action_func != NULL)
action_func(arg);
if (teardown_func != NULL)
teardown_func(arg);
+   mtx_unlock_spin(smp_ipi_mtx);
return;
}

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Deterministic panic due to non-sleepable lock with if_alc when reconfiguring interfaces

2011-08-18 Thread mdf
On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper yaneg...@gmail.com wrote:
    When loading if_alc as a module on my netbook and running
 /etc/rc.d/netif restart, I can deterministically panic my netbook with
 the following message:

 ) at _bus_dmamap_sync+0x51
 alc_stop(c3dbb000,0,c0c51844,93a,80206910,...) at alc_stop+0x24e
 alc_ioctl(c3d07400,80206910,c40423c0,c06a7935,c0914e3c,...) at alc_ioctl+0x22e
 ifioctl(c45029c0,80206910,c40423c0,c40505c0,c4528c00,...) at ifioctl+0xc98
 soo_ioctl(c4574e00,80206910,c40423c0,c413e680,c40505c0,...) at soo_ioctl+0x401
 kern_ioctl(c40505c0,3,80206910,c40423c0,c40423c0,...) at kern_ioctl+0x1d7
 ioctl(c40505c0,e6ca3cec,e6ca3d28,c08e929d,0,...) at ioctl+0x118
 syscallenter(c40505c0,e6ca3ce4,e6ca3ce4,0,0,...) at syscallenter+0x23f
 syscall(e6ca3d28) at syscall+0x2e
 Xint0x80_syscall() at Xint0x80_syscall+0x21
 --- syscall (54kernel trap 12 with interrupts disabled
 Kernel page fault with the following non-sleepable locks held:
 exclusive sleep mutex alc0 (network driver) r = 0 (0xc3dbc608) locked
 @ /usr/src/sys/modules/alc/../../dev/alc/if_alc.c:2362
 KDB: stack backtrace:
 db_trace_self_wrapper(c08e727a,80,6e726500,74206c65,20706172,...) at
 db_trace_self_wrapper+0x26
 kdb_backtrace(93a,0,,c0ad6114,e6ca323c,...) at kdb_backtrace+0x2a
 _witness_debugger(c08e9f67,e6ca3250,4,1,0,...) at _witness_debugger+0x1e
 witness_warn(5,0,c0924fe1,c097df50,c3e42b00,...) at witness_warn+0x1f1
 trap(e6ca32dc) at trap+0x15a
 calltrap() at calltrap+0x6

    I tried to track down what the exact issue was, but I got lost
 (the locking sort of looks ok to me, but I'm still not an expert with
 mutex(9)).
    I still have the vmcore and can provide more helpful details when 
 requested.

The locking itself is almost certainly fine.  The error message is not
very helpful, but what went wrong was the page fault.  You just happen
to panic on a witness warning before vm_fault can panic due to a bad
address.

The alc(4) maintainer would probably like info on the trap (line of
code and where the bad pointer came from).

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: variable init

2011-08-01 Thread mdf
On Mon, Aug 1, 2011 at 12:09 AM, Andrew Thompson a...@fud.org.nz wrote:
 Hi,

 Looking at,

 http://www.freebsd.org/cgi/query-pr.cgi?pr=159345

 The lock is a global variable, declared as

 static struct mtx       lagg_list_mtx;

 I would expect this to be zeroed memory, is this guaranteed?

It depends. :-)  The C standard mandates that this storage be zero.
However, seeing as we're the operating system, it's our job to do it.
For an application program the zero'd storage is zeroed by the loader.
 For the operating system, exactly when it's initialized is an
architectural decision.  I believe that the zeroed storage of FreeBSD
is cleared by the boot loader before it transfers control to the
operating system.  (For reference, on AIX the storage is cleared by
the virtual memory management code during bringup of the VMM leading
to interesting bugs when some variables used in early boot weren't
initialized).

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sgl (or uio) in struct buf / bio

2011-07-18 Thread mdf
On Mon, Jul 18, 2011 at 3:15 PM, Joel Jacobson jjacob...@panasas.com wrote:
 a few years ago (2008), i asked about the prospect of getting an sgl passed 
 through struct buf / bio, and was referred to the jhb_bio branch.  i never 
 did figure out how to view it, though, and it fell off my radar for a while.

 did anything ever come of it, and if so/not, are there any plans for 
 switching to something like using a uio instead of a pointer/length pair?


To the best of my knowledge, this is still something many of us would
like to see done, and no one has had any time to work on.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Heavy I/O blocks FreeBSD box for several seconds

2011-07-11 Thread mdf
On Mon, Jul 11, 2011 at 4:00 PM, Ali Mashtizadeh mashtiza...@gmail.com wrote:
 Maybe someone can setup something like reviewboard [1] for developers
 to use. This may also help folks who want to keep abreast of the
 current work in a particular subsystem or get involved into the
 development process more. At my company we use reviews and it seems to
 help the catch some bugs and help new engineers ramp up faster.

 [1] http://www.reviewboard.org/

FreeBSD development is completely open; anyone can sign up for the
svn-src-* mailing list they are interested in, including
svn-src-head@.  Code reviews are plenty as well; just check the list
archives for discussion of bugs, poor design choices and unintended
effects.  But most reviews are silent and after-the-fact by looking at
the list mail.  It's a system that seems to be working just fine for
the FreeBSD project so far.  This isn't a job for most anyone; it's a
volunteer project and so anything that raises the barrier to getting
work done for the project should be looked at with skepticism.

Is there a specific deficit that you want to address?

Thanks,
matthew

 On Mon, Jul 11, 2011 at 2:48 PM, Arnaud Lacombe lacom...@gmail.com wrote:
 Hi,

 On Mon, Jul 11, 2011 at 5:14 PM, Andriy Gapon a...@freebsd.org wrote:
 on 11/07/2011 23:33 Arnaud Lacombe said the following:
 For the record, I would like to see enforced public review for _every_
 patch *before* it is checked in, as a strong rule. gcc system is
 particularly interesting. But it is not likely to happen in FreeBSD
 where FreeBSD committers are clearly more free than other at
 checking-in un-publicly-reviewed stuff (especially _bad_ stuff).

 This would of course apply even to long-time committers, no matter how
 it hurt their ego (which I definitively do not care about).

 Have you just volunteered to review all of the patches that I would like to
 commit?  And are you prepared to take responsibility for quality of your 
 reviews?
 I am sure that other developers will gladly accept your offer too.

 _No-one_ can do all the reviews, especially not me (on a purely
 technical level). ACK must come from subsystem maintainers. Having
 public review would allow the community review, which is now just not
 possible today. As about patches from the maintainer, they might be
 committed without his approval, but still sent for review. If a
 maintainer goes outside his area, he has to get approval from the
 other subsystem maintainer.

  - Arnaud
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: kern.sync_on_panic

2011-06-27 Thread mdf
On Mon, Jun 27, 2011 at 3:08 AM, Andriy Gapon a...@freebsd.org wrote:
 on 26/06/2011 08:51 Warner Losh said the following:

 On Jun 25, 2011, at 8:49 AM, Andriy Gapon wrote:
 Does anybody actually use kern.sync_on_panic tunable/sysctl? If yes, then
 in what circumstances do you need it? That is, why any other alternative
 doesn't work for you? Like: 1. remounting filesystems R/O before panic if
 you knowingly provoke it for testing 2. using netboot for your test system
 3. using su+j, gjournal or a different filesystem altogether 4. using fsck
 after reboot

 It seems to me that syncing filesystems in panic context is an adventure.
 And it may become even more of an adventure if we introduce code that
 completely stops scheduler in and after panic.

 I've used it in the past when I was developing a device driver that was in
 the late stages of maturing.  Since all the panics in the system were when
 the driver dereferenced NULL in that driver, sync was safe because all the
 data structures were sane except the aforementioned driver.

 (1) It was a production system, and everything that could be was already
 mounted r/w.  However, some small, but every critical, amount of data was
 still r/w and it was very important to not lose this data.  Production here
 likely should be in quotes, because it was in the late stages of
 testing/validation.  The problem was without this sometimes the saved state
 of the GPS receiver and other hardware would wind up being zero, which meant
 that we'd have to do a cold start which cost us a few hours of time.  At the
 time I was doing this, we saw zero files a couple times a day without this
 turned on. (2) netbooting wasn't an option since we were qualifying a
 non-netbooting system. (3) these weren't available at the time, but the goal
 was to prevent data loss, not to necessarily have to avoid fsck on boot. (4)
 Data loss without it.

 Now, I'll be the first to admit this has been a few years, and I haven't done
 a fresh evaluation to see if things are still safe.  I'll also be the first
 to admit that this was a useful debugging setting late in development, and
 not in production.  I'm also the first to admit this isn't what I'd call a
 very wide-spread case.  But it did come in very handy when chasing a few bugs
 to be able to do 10 panic/reboot cycles an hour rather than 2 a day.

 A fine enough use-case for me.  I guess the problem ultimately boiled down to
 peculiarities of UFS behavior, but still...
 However, please be aware that sync_on_panic might get broken when/if we start
 stopping scheduler in panic.

The entirety of the sync code should be a subroutine in vfs_bio.c so
the 'buf' variable is static to the file.  At that point it would be
reasonable to explicitly call it at the beginning of panic(9) for the
sync-on-panic case, either before IPIing the other CPUs, or at least
before entering the critical section that prevents the scheduler from
running.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [head tinderbox] failure on ia64/ia64

2011-06-20 Thread mdf
On Mon, Jun 20, 2011 at 11:40 AM, Garrett Cooper yaneg...@gmail.com wrote:
 On Mon, Jun 20, 2011 at 11:37 AM, FreeBSD Tinderbox
 tinder...@freebsd.org wrote:
 TB --- 2011-06-20 17:09:28 - tinderbox 2.7 running on 
 freebsd-current.sentex.ca
 TB --- 2011-06-20 17:09:28 - starting HEAD tinderbox run for ia64/ia64
 TB --- 2011-06-20 17:09:28 - cleaning the object tree
 TB --- 2011-06-20 17:09:40 - cvsupping the source tree
 TB --- 2011-06-20 17:09:40 - /usr/bin/csup -z -r 3 -g -L 1 -h 
 cvsup.sentex.ca /tinderbox/HEAD/ia64/ia64/supfile
 TB --- 2011-06-20 17:10:27 - building world
 TB --- 2011-06-20 17:10:27 - MAKEOBJDIRPREFIX=/obj
 TB --- 2011-06-20 17:10:27 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
 TB --- 2011-06-20 17:10:27 - TARGET=ia64
 TB --- 2011-06-20 17:10:27 - TARGET_ARCH=ia64
 TB --- 2011-06-20 17:10:27 - TZ=UTC
 TB --- 2011-06-20 17:10:27 - __MAKE_CONF=/dev/null
 TB --- 2011-06-20 17:10:27 - cd /src
 TB --- 2011-06-20 17:10:27 - /usr/bin/make -B buildworld
 World build started on Mon Jun 20 17:10:28 UTC 2011
 Rebuilding the temporary build tree
 stage 1.1: legacy release compatibility shims
 stage 1.2: bootstrap tools
 stage 2.1: cleaning up the object tree
 stage 2.2: rebuilding the object tree
 stage 2.3: build tools
 stage 3: cross tools
 stage 4.1: building includes
 stage 4.2: building libraries
 stage 4.3: make dependencies
 stage 4.4: building everything
 World build completed on Mon Jun 20 18:35:55 UTC 2011
 TB --- 2011-06-20 18:35:56 - generating LINT kernel config
 TB --- 2011-06-20 18:35:56 - cd /src/sys/ia64/conf
 TB --- 2011-06-20 18:35:56 - /usr/bin/make -B LINT
 TB --- 2011-06-20 18:35:56 - building LINT kernel
 TB --- 2011-06-20 18:35:56 - MAKEOBJDIRPREFIX=/obj
 TB --- 2011-06-20 18:35:56 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
 TB --- 2011-06-20 18:35:56 - TARGET=ia64
 TB --- 2011-06-20 18:35:56 - TARGET_ARCH=ia64
 TB --- 2011-06-20 18:35:56 - TZ=UTC
 TB --- 2011-06-20 18:35:56 - __MAKE_CONF=/dev/null
 TB --- 2011-06-20 18:35:56 - cd /src
 TB --- 2011-06-20 18:35:56 - /usr/bin/make -B buildkernel KERNCONF=LINT
 Kernel build for LINT started on Mon Jun 20 18:35:56 UTC 2011
 stage 1: configuring the kernel
 stage 2.1: cleaning up the object tree
 stage 2.2: rebuilding the object tree
 stage 2.3: build tools
 stage 3.1: making dependencies
 [...]
 awk -f /src/sys/tools/makeobjops.awk /src/sys/kgssapi/kgss_if.m -h
 awk -f /src/sys/tools/makeobjops.awk /src/sys/libkern/iconv_converter_if.m -h
 awk -f /src/sys/tools/makeobjops.awk /src/sys/opencrypto/cryptodev_if.m -h
 awk -f /src/sys/tools/makeobjops.awk /src/sys/dev/acpica/acpi_if.m -h
 rm -f .newdep
 /usr/bin/make -V CFILES -V SYSTEM_CFILES -V GEN_CFILES |  MKDEP_CPP=cc -E 
 CC=cc xargs mkdep -a -f .newdep -O2 -pipe -fno-strict-aliasing  -std=c99  
 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  
 -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef 
 -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs 
 -fdiagnostics-show-option -nostdinc  -I. -I/src/sys -I/src/sys/contrib/altq 
 -I/src/sys/contrib/ipfilter -I/src/sys/contrib/pf -I/src/sys/dev/ath 
 -I/src/sys/dev/ath/ath_hal -I/src/sys/contrib/ngatm -I/src/sys/dev/twa 
 -I/src/sys/gnu/fs/xfs/FreeBSD -I/src/sys/gnu/fs/xfs/FreeBSD/support 
 -I/src/sys/gnu/fs/xfs -I/src/sys/dev/cxgb -I/src/sys/dev/cxgbe 
 -I/src/sys/contrib/ia64/libuwx/src -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS 
 -include opt_global.h -fno-common -finline-limit=15000 --param 
 inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin 
 -mconstant-gp -ffixed-r13 -mfixed-range=f32-f127 -fpic -ffreestanding
 /src/sys/vm/vm_page.c:2356:2: error: #error PAGE_SIZE is not supported.
 mkdep: compile failed
 *** Error code 1

 Stop in /obj/ia64.ia64/src/sys/LINT.
 *** Error code 1

 Stop in /src.
 *** Error code 1

    r223307 broke tinderbox on ia64. What should the PAGE_SIZE be for
 that architecture?

The LINT configuration file uses 115, or 32768.  It should just be a
matter of adding a case and using atomic_clear_long.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: best way to do FreeBSD kernel/userland development on OSX ?

2011-06-20 Thread mdf
On Mon, Jun 20, 2011 at 1:53 PM, Luigi Rizzo ri...@iet.unipi.it wrote:
 i recently replaced my laptop with a mac, which opens the problem on
 how do i do FreeBSD development (including kernel work) on it.

 the option i am trying now is running a freebsd VM in virtualbox,
 but it is slightly slow and probably energy hungry.

 I'd like to keep the svn checkut outside the disk image but
 unfortunately the default filesystem on osx is case-insensitive
 so i can't do that -- unless someone shows me how to change
 the filesystem conventions.

 suggestions anyone ?

With the help of google I did something for this a few months ago.
You can create a new disk image, and use case-sensitive for this
image.  I created mine as something like 8GB but thinly provisioned so
it didn't take up all the space until used.

I use VMWare for testing code changes ($WORK paid for a license for
me), but it's a CPU/battery hog, so I'm not wild about it, but it does
work.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: `hw.acpi.thermal.tz0.temperature' disappeared

2011-04-19 Thread mdf
On Tue, Apr 19, 2011 at 7:48 AM, Taku YAMAMOTO t...@tackymt.homeip.net wrote:
 This patch works.

 FreeBSD 9.0-CURRENT #1: Tue Apr 19 10:52:58 MSD 2011

 # sysctl hw.acpi.thermal
 hw.acpi.thermal.min_runtime: 0
 hw.acpi.thermal.polling_rate: 10
 hw.acpi.thermal.user_override: 0
 hw.acpi.thermal.tz0.temperature: 67.5C
 hw.acpi.thermal.tz0.active: -1
 hw.acpi.thermal.tz0.passive_cooling: 0
 hw.acpi.thermal.tz0.thermal_flags: 0
 hw.acpi.thermal.tz0._PSV: -1
 hw.acpi.thermal.tz0._HOT: -1
 hw.acpi.thermal.tz0._CRT: 90.0C
 hw.acpi.thermal.tz0._ACx: -1
 hw.acpi.thermal.tz0._TC1: -1
 hw.acpi.thermal.tz0._TC2: -1
 hw.acpi.thermal.tz0._TSP: -1

 We should have 10 _ACx values like this:
 hw.acpi.thermal.tz0._ACx: -1 -1 -1 -1 -1 -1 -1 -1 -1 -1

D'oh, I didn't read the original source carefully enough.

Can someone test this patch?  As an aside, what kind of h/w do I need
for hw.acpi.thermal to show up?  I don't see it on my Dell desktop...

Thanks,
matthew
diff --git a/sys/dev/acpica/acpi_thermal.c b/sys/dev/acpica/acpi_thermal.c
index b64d8df..7226b6c 100644
--- a/sys/dev/acpica/acpi_thermal.c
+++ b/sys/dev/acpica/acpi_thermal.c
@@ -288,7 +288,8 @@ acpi_tz_attach(device_t dev)
 		critical temp setpoint (shutdown now));
 SYSCTL_ADD_PROC(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
 		OID_AUTO, _ACx, CTLTYPE_INT | CTLFLAG_RD,
-		sc-tz_zone.ac, 0, sysctl_handle_int, IK, );
+		sc-tz_zone.ac, sizeof(sc-tz_zone.ac),
+		sysctl_handle_opaque, IK, );
 SYSCTL_ADD_PROC(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
 		OID_AUTO, _TC1, CTLTYPE_INT | CTLFLAG_RW,
 		sc, offsetof(struct acpi_tz_softc, tz_zone.tc1),
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: `hw.acpi.thermal.tz0.temperature' disappeared

2011-04-19 Thread mdf
On Tue, Apr 19, 2011 at 11:02 AM, Doug Barton do...@freebsd.org wrote:
 On 4/19/2011 9:44 AM, m...@freebsd.org wrote:

 As an aside, what kind of h/w do I need
 for hw.acpi.thermal to show up?  I don't see it on my Dell desktop...

 The hardware is likely to be there for any reasonably modern Dell desktop.
 Do you have coretemp loaded?

I didn't (I had assumed since the relevant sysctls are defined in
acpi_thermal.c that having acpi was sufficient), so I just tried that,
but still no hw.acpi.thermal node.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: `hw.acpi.thermal.tz0.temperature' disappeared

2011-04-18 Thread mdf
On Mon, Apr 18, 2011 at 5:05 AM, John Baldwin j...@freebsd.org wrote:
 On Saturday, April 16, 2011 11:51:22 am Nick Ulen wrote:
 FreeBSD was successfully upgraded.

 uname -v
 FreeBSD 9.0-CURRENT #0: Mon Apr 11 18:14:36 MSD 2011
 root@test:/usr/obj/usr/src/sys/GENERIC

 Everything seems to be working well except
 `hw.acpi.thermal.tz0.temperature' disappeared from the list of available
 sysctl variables.

 sysctl hw.acpi.thermal.

 hw.acpi.thermal.min_runtime: 0
 hw.acpi.thermal.polling_rate: 10
 hw.acpi.thermal.user_override: 0
 hw.acpi.thermal.tz0.active: -1
 hw.acpi.thermal.tz0.passive_cooling: 0
 hw.acpi.thermal.tz0.thermal_flags: 0
 hw.acpi.thermal.tz0._PSV: -1
 hw.acpi.thermal.tz0._HOT: -1
 hw.acpi.thermal.tz0._CRT: 90.0C
 hw.acpi.thermal.tz0._TC1: -1
 hw.acpi.thermal.tz0._TC2: -1
 hw.acpi.thermal.tz0._TSP: -1

 output from:
  sysctl -a |grep acpi
 is here: https://privatepaste.com/ca08d4658b

 I suspect it is still there, but sysctl doesn't know how to display it
 anymore.  This is probably due to the changes with formatting of sysctl
 information.  mdf@ is probably responsible in that case.

    SYSCTL_ADD_OPAQUE(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
                      OID_AUTO, temperature, CTLFLAG_RD, sc-tz_temperature,
                      sizeof(sc-tz_temperature), IK,
                      current thermal zone temperature);


I don't seem to have a hw.acpi.thermal sysctl node on my box.  Can
someone please try this patch?

Thanks,
matthew
diff --git a/sys/dev/acpica/acpi_thermal.c b/sys/dev/acpica/acpi_thermal.c
index 515a742..b64d8df 100644
--- a/sys/dev/acpica/acpi_thermal.c
+++ b/sys/dev/acpica/acpi_thermal.c
@@ -257,10 +257,10 @@ acpi_tz_attach(device_t dev)
 sc-tz_sysctl_tree = SYSCTL_ADD_NODE(sc-tz_sysctl_ctx,
 	 SYSCTL_CHILDREN(acpi_tz_sysctl_tree),
 	 OID_AUTO, oidname, CTLFLAG_RD, 0, );
-SYSCTL_ADD_OPAQUE(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
-		  OID_AUTO, temperature, CTLFLAG_RD, sc-tz_temperature,
-		  sizeof(sc-tz_temperature), IK,
-		  current thermal zone temperature);
+SYSCTL_ADD_PROC(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
+		OID_AUTO, temperature, CTLTYPE_INT | CTLFLAG_RD,
+		sc-tz_temperature, 0, sysctl_handle_int,
+		IK, current thermal zone temperature);
 SYSCTL_ADD_PROC(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
 		OID_AUTO, active, CTLTYPE_INT | CTLFLAG_RW,
 		sc, 0, acpi_tz_active_sysctl, I, cooling is active);
@@ -286,9 +286,9 @@ acpi_tz_attach(device_t dev)
 		sc, offsetof(struct acpi_tz_softc, tz_zone.crt),
 		acpi_tz_temp_sysctl, IK,
 		critical temp setpoint (shutdown now));
-SYSCTL_ADD_OPAQUE(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
-		  OID_AUTO, _ACx, CTLFLAG_RD, sc-tz_zone.ac,
-		  sizeof(sc-tz_zone.ac), IK, );
+SYSCTL_ADD_PROC(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
+		OID_AUTO, _ACx, CTLTYPE_INT | CTLFLAG_RD,
+		sc-tz_zone.ac, 0, sysctl_handle_int, IK, );
 SYSCTL_ADD_PROC(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
 		OID_AUTO, _TC1, CTLTYPE_INT | CTLFLAG_RW,
 		sc, offsetof(struct acpi_tz_softc, tz_zone.tc1),
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: `hw.acpi.thermal.tz0.temperature' disappeared

2011-04-18 Thread mdf
On Mon, Apr 18, 2011 at 5:05 AM, John Baldwin j...@freebsd.org wrote:
 On Saturday, April 16, 2011 11:51:22 am Nick Ulen wrote:
 FreeBSD was successfully upgraded.

 uname -v
 FreeBSD 9.0-CURRENT #0: Mon Apr 11 18:14:36 MSD 2011
 root@test:/usr/obj/usr/src/sys/GENERIC

 Everything seems to be working well except
 `hw.acpi.thermal.tz0.temperature' disappeared from the list of available
 sysctl variables.

 sysctl hw.acpi.thermal.

 hw.acpi.thermal.min_runtime: 0
 hw.acpi.thermal.polling_rate: 10
 hw.acpi.thermal.user_override: 0
 hw.acpi.thermal.tz0.active: -1
 hw.acpi.thermal.tz0.passive_cooling: 0
 hw.acpi.thermal.tz0.thermal_flags: 0
 hw.acpi.thermal.tz0._PSV: -1
 hw.acpi.thermal.tz0._HOT: -1
 hw.acpi.thermal.tz0._CRT: 90.0C
 hw.acpi.thermal.tz0._TC1: -1
 hw.acpi.thermal.tz0._TC2: -1
 hw.acpi.thermal.tz0._TSP: -1

 output from:
  sysctl -a |grep acpi
 is here: https://privatepaste.com/ca08d4658b

 I suspect it is still there, but sysctl doesn't know how to display it
 anymore.  This is probably due to the changes with formatting of sysctl
 information.  mdf@ is probably responsible in that case.

    SYSCTL_ADD_OPAQUE(sc-tz_sysctl_ctx, SYSCTL_CHILDREN(sc-tz_sysctl_tree),
                      OID_AUTO, temperature, CTLFLAG_RD, sc-tz_temperature,
                      sizeof(sc-tz_temperature), IK,
                      current thermal zone temperature);

Oops, yes.  The change in r217586 required the type to be set to
CTLTYPE_INT to print as format IK.  My grep of the source tree shows
that acpi_thermal.c is the only affected source file that was using
OPAQUE.  I'm testing out the fix now.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Panic with mca trap

2011-02-03 Thread mdf
On Thu, Feb 3, 2011 at 7:09 AM, John Baldwin j...@freebsd.org wrote:
 On Thursday, February 03, 2011 8:05:31 am John Baldwin wrote:
 On Tuesday, February 01, 2011 11:58:12 am m...@freebsd.org wrote:
  On a piece of hardware trying to verify basic build tests, we got an
  MCA exception that then panic'd the kernel due to WITNESS/INVARIANTS
  interaction.
 
  panic @ time 1296563157.510, thread 0xff000554: blockable
  sleep lock (sleep mutex) 128 @ /build/mnt/src/sys/vm/uma_core.c:1872
 
  Stack: --
  kernel:witness_checkorder+0x7a2
  kernel:_mtx_lock_flags+0x81
  kernel:uma_zalloc_arg+0x256
  kernel:malloc+0xc5
  kernel:mca_record_entry+0x30
  kernel:mca_scan+0xc9
  kernel:mca_intr+0x79
  kernel:trap+0x30b
  kernel:witness_checkorder+0x66
  kernel:_mtx_lock_spin_flags+0xa4
  kernel:witness_checkorder+0x2a8
  kernel:_mtx_lock_spin_flags+0xa4
  kernel:tdq_idled+0xe8
  kernel:sched_idletd+0x5b
  kernel:fork_exit+0x9b
 
  That's this bit of code in uma_zalloc_arg():
 
  #ifdef INVARIANTS
                          ZONE_LOCK(zone);
                          uma_dbg_alloc(zone, NULL, item);
                          ZONE_UNLOCK(zone);
  #endif
 
 
  I don't know uma(9) well enough to know the best workaround.  Clearly
  there are times we can be in uma_zalloc_arg() and taking a regular
  mutex is not acceptable.  But what to do for the uma_dbg_free() call
  so it's happy, and whether to guard taking the ZONE lock with M_NOWAIT
  or td_critnest  0 or both is outside my current knowledge.
 
  I don't expect we'll see this panic again any time soon, but it would
  be nice to fix the story for WITNESS of when an M_NOWAIT allocation
  can be done.

 Actually, this is more my fault.  The machine check happened while the
 interrupted thread was already in a critical section (hence the WITNESS
 complaint).  However, it really isn't correct to be calling malloc() from an
 arbitrary exception handler, especially one like MC# which can fire pretty
 much anywhere.  I think instead that we should use malloc() when polling the
 machine check banks, but keep a pre-allocated pool of structures for use with
 MC# exceptions and CMC interrupts and replenish the pool via an asynchronous
 task.

 This is what I'm thinking:

 Index: mca.c
 ===
 --- mca.c       (revision 218221)
 +++ mca.c       (working copy)
 @@ -85,6 +85,7 @@
  static MALLOC_DEFINE(M_MCA, MCA, Machine Check Architecture);

  static int mca_count;          /* Number of records stored. */
 +static int mca_banks;          /* Number of per-CPU register banks. */

  SYSCTL_NODE(_hw, OID_AUTO, mca, CTLFLAG_RD, NULL, Machine Check 
 Architecture);

 @@ -102,6 +103,8 @@
  SYSCTL_INT(_hw_mca, OID_AUTO, erratum383, CTLFLAG_RD, 
 workaround_erratum383, 0,
     Is the workaround for Erratum 383 on AMD Family 10h processors 
 enabled?);

 +static STAILQ_HEAD(, mca_internal) mca_freelist;
 +static int mca_freecount;
  static STAILQ_HEAD(, mca_internal) mca_records;
  static struct callout mca_timer;
  static int mca_ticks = 3600;   /* Check hourly by default. */
 @@ -111,7 +114,6 @@

  #ifdef DEV_APIC
  static struct cmc_state **cmc_state;   /* Indexed by cpuid, bank */
 -static int cmc_banks;
  static int cmc_throttle = 60;  /* Time in seconds to throttle CMCI. */
  #endif

 @@ -415,21 +417,51 @@
        return (1);
  }

 -static void __nonnull(1)
 -mca_record_entry(const struct mca_record *record)
 +static void
 +mca_fill_freelist(void)
  {
        struct mca_internal *rec;
 +       int desired;

 -       rec = malloc(sizeof(*rec), M_MCA, M_NOWAIT);
 -       if (rec == NULL) {
 -               printf(MCA: Unable to allocate space for an event.\n);
 -               mca_log(record);
 -               return;
 +       /*
 +        * Ensure we have at least one record for each bank and one
 +        * record per CPU.
 +        */
 +       desired = imax(mp_ncpus, mca_banks);
 +       mtx_lock_spin(mca_lock);
 +       while (desired  mca_freecount) {
 +               mtx_unlock_spin(mca_lock);
 +               rec = malloc(sizeof(*rec), M_MCA, M_WAITOK);
 +               mtx_lock_spin(mca_lock);
 +               STAILQ_INSERT_TAIL(mca_freelist, rec, link);
 +               mca_freecount++;
        }
 +       mtx_unlock_spin(mca_lock);
 +}

 +static void __nonnull(2)
 +mca_record_entry(enum scan_mode mode, const struct mca_record *record)
 +{
 +       struct mca_internal *rec;
 +
 +       if (mode == POLLED) {
 +               rec = malloc(sizeof(*rec), M_MCA, M_WAITOK);
 +               mtx_lock_spin(mca_lock);
 +       } else {
 +               mtx_lock_spin(mca_lock);
 +               rec = STAILQ_FIRST(mca_freelist);
 +               if (rec == NULL) {
 +                       mtx_unlock_spin(mca_lock);
 +                       printf(MCA: Unable to allocate space for an 
 event.\n);
 +                       mca_log(record);
 +                       return;
 +    

Panic with mca trap

2011-02-01 Thread mdf
On a piece of hardware trying to verify basic build tests, we got an
MCA exception that then panic'd the kernel due to WITNESS/INVARIANTS
interaction.

panic @ time 1296563157.510, thread 0xff000554: blockable
sleep lock (sleep mutex) 128 @ /build/mnt/src/sys/vm/uma_core.c:1872

Stack: --
kernel:witness_checkorder+0x7a2
kernel:_mtx_lock_flags+0x81
kernel:uma_zalloc_arg+0x256
kernel:malloc+0xc5
kernel:mca_record_entry+0x30
kernel:mca_scan+0xc9
kernel:mca_intr+0x79
kernel:trap+0x30b
kernel:witness_checkorder+0x66
kernel:_mtx_lock_spin_flags+0xa4
kernel:witness_checkorder+0x2a8
kernel:_mtx_lock_spin_flags+0xa4
kernel:tdq_idled+0xe8
kernel:sched_idletd+0x5b
kernel:fork_exit+0x9b

That's this bit of code in uma_zalloc_arg():

#ifdef INVARIANTS
ZONE_LOCK(zone);
uma_dbg_alloc(zone, NULL, item);
ZONE_UNLOCK(zone);
#endif


I don't know uma(9) well enough to know the best workaround.  Clearly
there are times we can be in uma_zalloc_arg() and taking a regular
mutex is not acceptable.  But what to do for the uma_dbg_free() call
so it's happy, and whether to guard taking the ZONE lock with M_NOWAIT
or td_critnest  0 or both is outside my current knowledge.

I don't expect we'll see this panic again any time soon, but it would
be nice to fix the story for WITNESS of when an M_NOWAIT allocation
can be done.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sleep bug in taskqueue(9)

2010-11-12 Thread mdf
On Fri, Nov 12, 2010 at 12:56 AM, Hans Petter Selasky hsela...@c2i.net wrote:
 On Thursday 29 April 2010 01:59:58 Matthew Fleming wrote:
 It looks to me like taskqueue_drain(taskqueue_thread, foo) will not
 correctly detect whether or not a task is currently running.  The check
 is against a field in the taskqueue struct, but for the taskqueue_thread
 queue with more than one thread, multiple threads can simultaneously be
 running a task, thus stomping over the tq_running field.

 I have not seen any problem with the code as-is in actual use, so this
 is purely an inspection bug.

 The following patch should fix the problem.  Because it changes the size
 of struct task I'm not sure if it would be suitable for MFC.


 1) The u_char is going to leave a hole in that structure on ARM platforms for
 example.

 2) The existing taskqueue implementation also has a missing check for the
 pending count wrapping to zero. I.E. it should stick at 0x and not wrap to
 0.

This commit mail is rather old, and this fix was incorrect, because
the task cannot be referenced after it has been run.  Some task
handlers will free the task as part of the handler.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sleep bug in taskqueue(9)

2010-11-12 Thread mdf
On Fri, Nov 12, 2010 at 6:23 AM, Hans Petter Selasky hsela...@c2i.net wrote:
 On Friday 12 November 2010 15:18:46 m...@freebsd.org wrote:
 On Fri, Nov 12, 2010 at 12:56 AM, Hans Petter Selasky hsela...@c2i.net
 wrote:
  On Thursday 29 April 2010 01:59:58 Matthew Fleming wrote:
  It looks to me like taskqueue_drain(taskqueue_thread, foo) will not
  correctly detect whether or not a task is currently running.  The check
  is against a field in the taskqueue struct, but for the taskqueue_thread
  queue with more than one thread, multiple threads can simultaneously be
  running a task, thus stomping over the tq_running field.
 
  I have not seen any problem with the code as-is in actual use, so this
  is purely an inspection bug.
 
  The following patch should fix the problem.  Because it changes the size
  of struct task I'm not sure if it would be suitable for MFC.
 
  1) The u_char is going to leave a hole in that structure on ARM platforms
  for example.
 
  2) The existing taskqueue implementation also has a missing check for the
  pending count wrapping to zero. I.E. it should stick at 0x and not
  wrap to 0.

 This commit mail is rather old, and this fix was incorrect, because
 the task cannot be referenced after it has been run.  Some task
 handlers will free the task as part of the handler.

 Ok, maybe the e-mail got stuck somewhere. Have you fixed the above mentioned
 issues in a newer patch?

If you look at the file history for subr_taskqueue.c:

http://svn.freebsd.org/viewvc/base/head/sys/kern/subr_taskqueue.c

You will see quite a few commits by me.  The most recent relating to
detecting if a task is running is being MFC'd today:


Revision 213813 - (view) (annotate) - [select for diffs]
Modified Wed Oct 13 22:59:04 2010 UTC (4 weeks, 1 day ago) by mdf
File length: 10831 byte(s)
Diff to previous 213739
Use a safer mechanism for determining if a task is currently running,
that does not rely on the lifetime of pointers being the same. This also
restores the task KBI.

Suggested by:   jhb
MFC after:  1 month

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sleep bug in taskqueue(9)

2010-11-12 Thread mdf
On Fri, Nov 12, 2010 at 12:25 PM, Hans Petter Selasky hsela...@c2i.net wrote:
 On Friday 12 November 2010 17:38:38 m...@freebsd.org wrote:
 On Fri, Nov 12, 2010 at 6:23 AM, Hans Petter Selasky hsela...@c2i.net
 wrote:
  On Friday 12 November 2010 15:18:46 m...@freebsd.org wrote:
  On Fri, Nov 12, 2010 at 12:56 AM, Hans Petter Selasky hsela...@c2i.net
 
  wrote:
   On Thursday 29 April 2010 01:59:58 Matthew Fleming wrote:
   It looks to me like taskqueue_drain(taskqueue_thread, foo) will not
   correctly detect whether or not a task is currently running.  The
   check is against a field in the taskqueue struct, but for the
   taskqueue_thread queue with more than one thread, multiple threads
   can simultaneously be running a task, thus stomping over the
   tq_running field.
  
   I have not seen any problem with the code as-is in actual use, so
   this is purely an inspection bug.
  
   The following patch should fix the problem.  Because it changes the
   size of struct task I'm not sure if it would be suitable for MFC.
  
   1) The u_char is going to leave a hole in that structure on ARM
   platforms for example.
  
   2) The existing taskqueue implementation also has a missing check for
   the pending count wrapping to zero. I.E. it should stick at 0x
   and not wrap to 0.
 
  This commit mail is rather old, and this fix was incorrect, because
  the task cannot be referenced after it has been run.  Some task
  handlers will free the task as part of the handler.
 
  Ok, maybe the e-mail got stuck somewhere. Have you fixed the above
  mentioned issues in a newer patch?

 If you look at the file history for subr_taskqueue.c:

 http://svn.freebsd.org/viewvc/base/head/sys/kern/subr_taskqueue.c

 You will see quite a few commits by me.  The most recent relating to
 detecting if a task is running is being MFC'd today:

 Yes, and I see that this code needs an overflow check, which is one of the
 issues still not fixed:

You keep bringing this up.  It is not a new issue.  It is not a bug in
any of the patches.  It is extremely unlikely that a task will be
queued 65536 times before execution.  It is more worthy of an assert
rather than a check, because if a task is enqueued that many times
without being run then there's likely a stuck task in the queue.

The patch you posted will lie as well, so I would not consider it
sufficient if someone wanted to address the issue.

Thanks,
matthew



 Before:

        /*
         * Count multiple enqueues.
         */
        if (task-ta_pending) {
                task-ta_pending++;
                TQ_UNLOCK(queue);
                return 0;
        }


 After:

        /*
         * Count multiple enqueues.
         */
        if (task-ta_pending) {
                if (task-ta_pending != 0x)
                        task-ta_pending++;
                TQ_UNLOCK(queue);
                return 0;
        }

 Else the ta_pending can wrap to zero and the code will not do what it
 announces it does.

 --HPS

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


MTX_DEF versus MTX_SPIN

2010-11-03 Thread mdf
It's not clear to me from the man pages (perhaps I didn't look at the
right one?) in which environments I need a spinlock.  For example, I
wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
(i.e one that was registered with BUS_SETUP_INTR), but I see some code
lying around here that does it and nothing I'm aware of has broken.

Perhaps this comes to me still not understanding exactly how
interrupts work on FreeBSD.  If I capture the stack in a hard
interrupt, it looks something like:

#0 0xff87b07f43b5 at rnv_hard_intr+0x35
#1 0x8026e7ce at ithread_execute_handlers+0x9e
#2 0x8026ead0 at ithread_loop+0x70
#3 0x8026b84c at fork_exit+0x9c
#4 0x804a7f7e at fork_trampoline+0xe

And there the stack ends.  From my perspective, this doesn't look like
anything was actually interrupted.

By way of explaining what I mean, on AIX we defined 10 levels of
software interrupt, and we trained the kernel debugger to understand
the save frames that were acquired on interrupt to print a full stack.
 So a full stack dump might show something like a few frames from one
interrupt handler, then a save area, then a frames from a lower
priority interrupt, then another save area, then the base-level stack.
 In this kind of programming environment, it was important to know at
what interrupt level your handler would execute, so that locks
acquired to synchronize between the top-half and bottom-half were
acquired with interupts disabled to the same level.

So, back to my question.  Is it safe to take a MTX_DEF in a hard
interrupt?  What about a soft interrupt?  I have to assume it's okay
in a soft-interrupt context (swi_sched, callout, etc.), since
softclock() will acquire a MTX_DEF on behalf of a callout.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: MTX_DEF versus MTX_SPIN

2010-11-03 Thread mdf
On Wed, Nov 3, 2010 at 9:42 AM, Andriy Gapon a...@icyb.net.ua wrote:
 on 03/11/2010 18:27 m...@freebsd.org said the following:
 It's not clear to me from the man pages (perhaps I didn't look at the
 right one?) in which environments I need a spinlock.  For example, I
 wouldn't think it's safe to use a MTX_DEF in a hard interrupt handler
 (i.e one that was registered with BUS_SETUP_INTR), but I see some code
 lying around here that does it and nothing I'm aware of has broken.

 Such a handler runs in an interrupt thread.
 The harder interrupt handler is called interrupt filter in FreeBSD 
 terminology.
  I think that it was formerly known as fast interrupt.

So a MTX_DEF is okay in that environment?

What would best practices be considered for what code should be run
in the interrupt handler versus a soft interrupt?  In this case the
kinds of things we have to do at some level of interrupt are:

 - handle a heartbeat interrupt from firmware a few times a second
 - get a DMA completion interrupt (completely handling this requires
calling biodone on all the associated bios)
 - receive an ECC interrupt (this requires reading registers off the
card for details)

At the moment we're on stable/7, but we will be migrating the code
base to something more recent in another year or so, if that affects
the answer.

Is there any documentation on best practices for writing a FreeBSD driver?

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


kldload of compressed modules

2010-10-20 Thread mdf
I'm trying to add support for kldload of compressed modules; they work
at boot time but not at runtime.

I'm having trouble uncompressing the module in the kernel to try
processing it.  The difficulty is several-fold:

1) gzip(1) doesn't seem to produce a compressed image that net/zlib.c
understands.  It gets hung up on the first byte, since it's not 0x8
indicating Z_DEFLATED.  I tried setting next_in to the input buffer +
2 since that is the Z_DEFLATED marker, but it dies one byte later with
the incorrect header check failure.

2) I was using inflate in kern/inflate.c wrong earlier (not skipping
the first 10 bytes + comment), but even so that code and the net/zlib
code aren't compatible, since they both define an inflate() function
that takes differing numbers of arguments, etc.  And link_elf.c
already can include net/zlib.h when DDB_CTF is set, so that would
conflict with sys/inflate.h.

It's likely that kern/inflate.c is what I want, since it's the code
the boot loader uses, but I'm annoyed that it's only available with
device gzip (I can change this, of course) and that it's incompatible
with what seems to be the somewhat standard compress/decompress code
in net/zlib.c.

All of this is on stable/7 at the moment, though if I had something
that worked I want to commit to CURRENT and MFC.

So, what's the basic problem?  Should net/zlib support this gzipped
file?  Should I have compressed with some other user-space utility?
Should I add a new link_elf_compress.c file to support compressed
images (and use sys/inflate.h instead of net/zlib.h) instead of trying
to add it to the existing link_elf_load_file() functions?

For reference, here's the beginning of the compressed module (on amd64):

# od -h mnvf.ko.gz | less
000  8b1f0808fd2a4cbe03006e6d66766b2e
020  006f7ded740fd554f7b5484df8426613
040  03404401182e28200484120834254ca3
060  d198440901213a51924c190932496663
100  04261a2a8c9a9c65e6c6d6d9fb5a3cb4
120  5adb9ffa5fb60979362d4b586b42a529
140  54a852ad72ada0c7c5467f886f207def
160  99ce73b947336b7d7df5adeb7d6fb8b3
200  7b9c677e7d9fdef69ce7cf7d9b9fef73
220  3ab652ca525328c637fc594aa4a95840
240  8b304bc5de3979fff625552b0396a95d
260  ad92224c74dafe4546fb626528a8fa66
300  f31fc06597b57be3cd2b6109264197c3
320  26674be6f9e56ff6e8624dfe84e64d50
340  1251914205dd7d57f912c23606b525ae
360  d570b5c2092ab54d9e21d70b7e22628f
400  80cc8eabf3c7bae1df8206df6ad76cb8
420  2d706b865c1dc39745b5abc7774292bf
440  c28765d434babe61e15b90da46440f2b

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


uma_zfree(NULL) is broken

2010-10-18 Thread mdf
There's explicit protection for free(NULL, M_FOO), but uma_zfree(zone,
NULL) will put NULL in the local bucket and then probably return it
later from a uma_zalloc call.  Obviously it's not a good idea to call
uma_zfree(9) on NULL, but in this case it's an easy mistake to make
when e.g. converting a set of malloc(9)/free(9) uses into uma(9).

So is the right thing to allow a uma_zfree(NULL) and silently
succeed, like for free(9)?  That would be my guess, but I'm open to
alternatives.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: MAXCPU preparations

2010-09-27 Thread mdf
On Mon, Sep 27, 2010 at 9:21 AM, Sean Bruno sean...@yahoo-inc.com wrote:
 On Mon, 2010-09-27 at 08:53 -0700, Julian Elischer wrote:
 On 9/27/10 8:26 AM, Sean Bruno wrote:
  Does this look like an appropriate modification to libmemstat?
 
  Sean
 
 
   //depot/yahoo/ybsd_7/src/lib/libmemstat/memstat.h#4
  - /home/seanbru/ybsd_7/src/lib/libmemstat/memstat.h 
  @@ -28,12 +28,13 @@
 
    #ifndef _MEMSTAT_H_
    #define        _MEMSTAT_H_
  +#includesys/param.h
 
    /*
     * Number of CPU slots in library-internal data structures.  This
  should be
     * at least the value of MAXCPU from param.h.
     */
  -#define        MEMSTAT_MAXCPU  64
  +#define        MEMSTAT_MAXCPU  MAXCPU /* defined in
  sys/${ARCH}/include/param.h */
 


 wouldn't it be better to do a sysctlbyname() and use the real value
 for the system?


 That was my initial thought (as prodded by scottl and peter).

 If it is made dynamic, could this be opening a race condition where the
 call to sysctlbyname() returns a count of CPUS that is in turn changed
 by the offlining of a CPU?  Or am I thinking to much about this?

The maximum number of CPUs supported by a running instance will not
change.  Only, potentially, the current number of CPUs.  So a sysctl
to fetch the kernel's compiled-in MAXCPU is safe.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


sys/conf/files aicasm

2010-09-23 Thread mdf
I can't say I understand much about the syntax of the top part of
sys/conf/files, but this line:


aicasm  optional ahc | ahd \
dependency  $S/dev/aic7xxx/aicasm/*.[chyl]   \
compile-withCC='${CC}' ${MAKE} -f $S/dev/aic7xxx/aicasm/Makefile
MAKESRCPATH=$S/dev/aic7xxx/aicasm \
no-obj no-implicit-rule\
clean   aicasm* y.tab.h

looks to me like aicasm should only be built if I have device ahc or
device ahd in my kernel configuration file.

But even if I make a config file without those devices (e.g. by doing
include GENERIC then nodevices ahc, ahd) the file is still built.  Am
I missing something about how these lines in sys/conf/files work?

As a side question, why does the Makefile for dev/aic7xxx/aicasm have
-I/usr/include in CFLAGS instead of using some marcos to get to the
source tree I'm actually building from?

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: g_event spinning 100% when doing 'gpart add'

2010-09-11 Thread mdf
On Fri, Sep 10, 2010 at 6:57 PM, Yuri Pankov yuri.pan...@gmail.com wrote:
 Nevermind me. That's what I thought why I was getting the same gpart behavior
 switching between kernels, with and without DEBUG_LOCKS. Sorry about that.

 Same here, gpart hangs on:
 3826 gpart    CALL  __sysctl(0x7fffa250,0x3,0,0x7fffa268,0,0)
 3826 gpart    SCTL  kern.geom.confxml

When I was doing my sbuf changes I temporarily had a patch to change
these sysctls to use the new drains.  I couldn't get any output from
the three kern.geom.conf{txt,dot,xml} sysctls (they didn't hang but
the output was empty) so I reverted those parts before committing.  I
doubt my changes affected this, but I will check on Monday when I'm
back at work to see if the lack of output is due to my changes or
predates me.

Cheers,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


deprecating sprintf(9)

2010-09-08 Thread mdf
Looking at the uses of kvprintf(9), only [v]sprintf(9) doesn't have a
callback function.  It seems a little sketchy to me to be doing unsafe
sprintf in the kernel anyways.

Should we (and by we, I mean me) deprecate sprintf(9) and convert the
existing 1200+ uses to strcpy(9) for fixed strings (also potentially
bad, but a different beast), or snprintf(9) where the size of the
buffer is known?

It seems like a large project, but OTOH sprintf(9) is mighty unsafe in
the kernel.  It's disapproved of for user-space as being unsafe for
security reasons as well, but the potential downsides aren't the same,
and we'll never clean up ports anyways. :-)

Thoughts?
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: deprecating sprintf(9)

2010-09-08 Thread mdf
On Wed, Sep 8, 2010 at 9:15 AM, Rink Springer r...@freebsd.org wrote:
 Hi,

 On Wed, Sep 08, 2010 at 08:51:57AM -0700, m...@freebsd.org wrote:
 It seems like a large project, but OTOH sprintf(9) is mighty unsafe in
 the kernel.  It's disapproved of for user-space as being unsafe for
 security reasons as well, but the potential downsides aren't the same,
 and we'll never clean up ports anyways. :-)

 Deprecating it may be usable, yet I don't believe we can easily enforce
 such a policy [1].

If the kernel sources don't use it then the prototype can be removed.

 Have you looked at how many (potentially) unsecure
 uses there are in the kernel, to give an idea how useful such an effort
 would be?

I presume all the kernel uses are safe at the moment, but it's an
error prone construction.

As of this morning grep found 1277 occurrences of sprintf(9) in sys/
and 23 occurrences of vsprintf(9) in sys/.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_pin() bug in SCHED_ULE

2010-09-01 Thread mdf
On Wed, Sep 1, 2010 at 6:49 AM, John Baldwin j...@freebsd.org wrote:
 On Tuesday, August 31, 2010 2:53:12 pm m...@freebsd.org wrote:
 On Tue, Aug 31, 2010 at 10:16 AM,  m...@freebsd.org wrote:
  I recorded the stack any time ts-ts_cpu was set and when a thread was
  migrated by sched_switch() I printed out the recorded info.  Here's
  what I found:
 
 
  XXX bug 67957: moving 0xff003ff9b800 from 3 to 1
  [1]: pin 0 state 4 move 3 - 1 done by 0xff000cc44000:
  #0 0x802b36b4 at bug67957+0x84
  #1 0x802b5dd4 at sched_affinity+0xd4
  #2 0x8024a707 at cpuset_setthread+0x137
  #3 0x8024aeae at cpuset_setaffinity+0x21e
  #4 0x804a82df at freebsd32_cpuset_setaffinity+0x4f
  #5 0x80295f49 at isi_syscall+0x99
  #6 0x804a630e at ia32_syscall+0x1ce
  #7 0x8046dc60 at Xint0x80_syscall+0x60
  [0]: pin 0 state 2 move 0 - 3 done by 0xff000cc44000:
  #0 0x802b36b4 at bug67957+0x84
  #1 0x802b4ad8 at sched_add+0xe8
  #2 0x8029b96a at create_thread+0x34a
  #3 0x8029badc at kern_thr_new+0x8c
  #4 0x804a8912 at freebsd32_thr_new+0x122
  #5 0x80295f49 at isi_syscall+0x99
  #6 0x804a630e at ia32_syscall+0x1ce
  #7 0x8046dc60 at Xint0x80_syscall+0x60
 
  So one thread in the process called cpuset_setaffinity(2), and another
  thread in the process was forcibly migrated by the IPI while returning
  from a syscall, while it had td_pinned set.
 
  Given this path, it seems reasonable to me to skip the migrate if we
  notice THREAD_CAN_MIGRATE is false.
 
  Opinions?  My debug code is below.  I'll try to write a short testcase
  that exhibits this bug.

 Just a few more thoughts on this.  The check in sched_affinity for
 THREAD_CAN_MIGRATE is racy.  Since witness uses sched_pin, it's not
 simple to take the THREAD lock around an increment of td_pinned.  So
 I'm looking for suggestions on the best way to fix this issue.  My
 thoughts:

 1) add a check in sched_switch() for THREAD_CAN_MIGRATE
 2) have WITNESS not use sched_pin, and take the THREAD lock when
 modifying td_pinned
 3) have the IPI_PREEMPT handler notice the thread is pinned (and not
 trying to bind) and postpone the mi_switch, just like it postpones
 when a thread is in a critical section.

 Except for the potential complexity of implementation, I think (2) is
 the best solution.

 I don't like 2) only because you'd really need to fix all the other places
 that use sched_pin() to grab the thread lock.  That would be a bit expensive.
 I think the problem is code that checks THREAD_CAN_MIGRATE() for running
 threads that aren't curthread.

 The sched_affinity() bug is probably my fault FWIW.  I think something along
 the lines of 1) is the best approach.  Maybe something like the patch below.
 It moves the CPU assignment out of sched_affinity() and moves it into
 sched_switch() instead so that it takes effect on the next context switch for
 this thread.  That is the point at which the CPU binding actually takes effect
 anyway since sched_affinity() doesn't force an immediate context switch.  This
 patch is relative to 7.  The patch for 9 is nearly identical (just a fixup for
 ipi_cpu() vs ipi_selected()).

 Index: sched_ule.c
 ===
 --- sched_ule.c (revision 212065)
 +++ sched_ule.c (working copy)
 @@ -1885,6 +1885,8 @@
                srqflag = (flags  SW_PREEMPT) ?
                    SRQ_OURSELF|SRQ_YIELDING|SRQ_PREEMPTED :
                    SRQ_OURSELF|SRQ_YIELDING;
 +               if (THREAD_CAN_MIGRATE(td)  !THREAD_CAN_SCHED(td, 
 ts-tscpu))
 +                       ts-ts_cpu = sched_pickcpu(td, 0);
                if (ts-ts_cpu == cpuid)
                        tdq_add(tdq, td, srqflag);
                else
 @@ -2536,7 +2538,6 @@
  {
  #ifdef SMP
        struct td_sched *ts;
 -       int cpu;

        THREAD_LOCK_ASSERT(td, MA_OWNED);
        ts = td-td_sched;
 @@ -2550,17 +2551,13 @@
        if (!TD_IS_RUNNING(td))
                return;
        td-td_flags |= TDF_NEEDRESCHED;
 -       if (!THREAD_CAN_MIGRATE(td))
 -               return;
        /*
 -        * Assign the new cpu and force a switch before returning to
 -        * userspace.  If the target thread is not running locally send
 -        * an ipi to force the issue.
 +        * Force a switch before returning to userspace.  If the
 +        * target thread is not running locally send an ipi to force
 +        * the issue.
         */
 -       cpu = ts-ts_cpu;
 -       ts-ts_cpu = sched_pickcpu(td, 0);
 -       if (cpu != PCPU_GET(cpuid))
 -               ipi_selected(1  cpu, IPI_PREEMPT);
 +       if (td != curthread)
 +               ipi_selected(1  ts-ts_cpu, IPI_PREEMPT);
  #endif
  }

I will test this patch out; thanks for the help!

Two questions:

1) How does a thread get moved between CPUs when it's not running?  I
see that we change the runqueue for non-running threads that are on a

Re: sched_pin() bug in SCHED_ULE

2010-08-31 Thread mdf
I recorded the stack any time ts-ts_cpu was set and when a thread was
migrated by sched_switch() I printed out the recorded info.  Here's
what I found:


XXX bug 67957: moving 0xff003ff9b800 from 3 to 1
[1]: pin 0 state 4 move 3 - 1 done by 0xff000cc44000:
#0 0x802b36b4 at bug67957+0x84
#1 0x802b5dd4 at sched_affinity+0xd4
#2 0x8024a707 at cpuset_setthread+0x137
#3 0x8024aeae at cpuset_setaffinity+0x21e
#4 0x804a82df at freebsd32_cpuset_setaffinity+0x4f
#5 0x80295f49 at isi_syscall+0x99
#6 0x804a630e at ia32_syscall+0x1ce
#7 0x8046dc60 at Xint0x80_syscall+0x60
[0]: pin 0 state 2 move 0 - 3 done by 0xff000cc44000:
#0 0x802b36b4 at bug67957+0x84
#1 0x802b4ad8 at sched_add+0xe8
#2 0x8029b96a at create_thread+0x34a
#3 0x8029badc at kern_thr_new+0x8c
#4 0x804a8912 at freebsd32_thr_new+0x122
#5 0x80295f49 at isi_syscall+0x99
#6 0x804a630e at ia32_syscall+0x1ce
#7 0x8046dc60 at Xint0x80_syscall+0x60

So one thread in the process called cpuset_setaffinity(2), and another
thread in the process was forcibly migrated by the IPI while returning
from a syscall, while it had td_pinned set.

Given this path, it seems reasonable to me to skip the migrate if we
notice THREAD_CAN_MIGRATE is false.

Opinions?  My debug code is below.  I'll try to write a short testcase
that exhibits this bug.

Thanks,
matthew


Index: kern/sched_ule.c
===
--- kern/sched_ule.c(revision 158580)
+++ kern/sched_ule.c(working copy)
@@ -697,6 +697,41 @@
return;
 }

+static void
+bug67957(struct thread *td)
+{
+   int idx;
+
+   THREAD_LOCK_ASSERT(td, MA_OWNED);
+   idx = (td-xxx_idx++ % 5);
+   stack_save(td-xxx[idx].td_preempt);
+   td-xxx[idx].td_moveto = td-td_sched-ts_cpu;
+   td-xxx[idx].td_movefrom = (td-td_oncpu == NOCPU) ? td-td_lastcpu
: td-td_oncpu;
+   td-xxx[idx].td_statewas = td-td_state;
+   td-xxx[idx].td_pinned = td-td_pinned;
+   td-xxx[idx].td_by = curthread;
+}
+
+static void
+pr_bug67957(struct thread *td, int idx)
+{
+   int idx, i;
+
+   printf(XXX bug 67957: moving %p from %d to %d\n,
+   td, td-td_lastcpu, td-td_sched-ts_cpu);
+   for (i = 0, idx = td-xxx_idx - 1;
+   i  5  idx = 0;
+   i++, idx--) {
+   printf([%d]: pin %d state %d move %d - %d done by %p:\n,
+   idx, td-xxx[idx % 5].td_pinned,
+   td-xxx[idx % 5].td_statewas,
+   td-xxx[idx % 5].td_movefrom,
+   td-xxx[idx % 5].td_moveto,
+   td-xxx[idx % 5].td_by);
+   stack_print_ddb(td-xxx[idx % 5].td_preempt);
+   }
+}
+
 /*
  * Move a thread from one thread queue to another.
  */
@@ -739,6 +774,7 @@
TDQ_UNLOCK(from);
sched_rem(td);
ts-ts_cpu = cpu;
+   bug67957(td);
td-td_lock = TDQ_LOCKPTR(to);
tdq_add(to, td, SRQ_YIELDING);
 }
@@ -971,6 +1007,7 @@
tdq = TDQ_CPU(cpu);
td = ts-ts_thread;
ts-ts_cpu = cpu;
+   bug67957(td);

/* If the lock matches just return the queue. */
if (td-td_lock == TDQ_LOCKPTR(tdq))
@@ -1890,8 +1964,15 @@
SRQ_OURSELF|SRQ_YIELDING;
if (ts-ts_cpu == cpuid)
tdq_add(tdq, td, srqflag);
-   else
+   else {
+   if (!THREAD_CAN_MIGRATE(td) 
+   (ts-ts_flags  TSF_BOUND) == 0) {
+   pr_bug67957(td, idx);
+   panic(XXX);
+   }
mtx = sched_switch_migrate(tdq, td, srqflag);
+   }
+   td-xxx_idx = 0;
} else {
/* This thread must be going to sleep. */
TDQ_LOCK(tdq);
@@ -2479,8 +2560,10 @@
 * target cpu.
 */
if (td-td_priority = PRI_MAX_ITHD  THREAD_CAN_MIGRATE(td) 
-   curthread-td_intr_nesting_level)
+   curthread-td_intr_nesting_level) {
ts-ts_cpu = cpuid;
+   bug67957(td);
+   }
if (!THREAD_CAN_MIGRATE(td))
cpu = ts-ts_cpu;
else
@@ -2590,6 +2673,7 @@
 */
cpu = ts-ts_cpu;
ts-ts_cpu = sched_pickcpu(td, 0);
+   bug67957(td);
if (cpu != PCPU_GET(cpuid))
ipi_selected(1  cpu, IPI_PREEMPT);
 #endif
@@ -2613,6 +2697,7 @@
if (PCPU_GET(cpuid) == cpu)
return;
ts-ts_cpu = cpu;
+   bug67957(td);
/* When we return from mi_switch we'll be on the correct cpu. */
mi_switch(SW_VOL, NULL);
 #endif
Index: sys/proc.h
===
--- sys/proc.h  (revision 158580)
+++ sys/proc.h  (working copy)
@@ -68,6 +68,8 @@
 #include sys/isi_mountroot.h
 #include 

Re: sched_pin() bug in SCHED_ULE

2010-08-31 Thread mdf
On Tue, Aug 31, 2010 at 10:16 AM,  m...@freebsd.org wrote:
 I recorded the stack any time ts-ts_cpu was set and when a thread was
 migrated by sched_switch() I printed out the recorded info.  Here's
 what I found:


 XXX bug 67957: moving 0xff003ff9b800 from 3 to 1
 [1]: pin 0 state 4 move 3 - 1 done by 0xff000cc44000:
 #0 0x802b36b4 at bug67957+0x84
 #1 0x802b5dd4 at sched_affinity+0xd4
 #2 0x8024a707 at cpuset_setthread+0x137
 #3 0x8024aeae at cpuset_setaffinity+0x21e
 #4 0x804a82df at freebsd32_cpuset_setaffinity+0x4f
 #5 0x80295f49 at isi_syscall+0x99
 #6 0x804a630e at ia32_syscall+0x1ce
 #7 0x8046dc60 at Xint0x80_syscall+0x60
 [0]: pin 0 state 2 move 0 - 3 done by 0xff000cc44000:
 #0 0x802b36b4 at bug67957+0x84
 #1 0x802b4ad8 at sched_add+0xe8
 #2 0x8029b96a at create_thread+0x34a
 #3 0x8029badc at kern_thr_new+0x8c
 #4 0x804a8912 at freebsd32_thr_new+0x122
 #5 0x80295f49 at isi_syscall+0x99
 #6 0x804a630e at ia32_syscall+0x1ce
 #7 0x8046dc60 at Xint0x80_syscall+0x60

 So one thread in the process called cpuset_setaffinity(2), and another
 thread in the process was forcibly migrated by the IPI while returning
 from a syscall, while it had td_pinned set.

 Given this path, it seems reasonable to me to skip the migrate if we
 notice THREAD_CAN_MIGRATE is false.

 Opinions?  My debug code is below.  I'll try to write a short testcase
 that exhibits this bug.

Just a few more thoughts on this.  The check in sched_affinity for
THREAD_CAN_MIGRATE is racy.  Since witness uses sched_pin, it's not
simple to take the THREAD lock around an increment of td_pinned.  So
I'm looking for suggestions on the best way to fix this issue.  My
thoughts:

1) add a check in sched_switch() for THREAD_CAN_MIGRATE
2) have WITNESS not use sched_pin, and take the THREAD lock when
modifying td_pinned
3) have the IPI_PREEMPT handler notice the thread is pinned (and not
trying to bind) and postpone the mi_switch, just like it postpones
when a thread is in a critical section.

Except for the potential complexity of implementation, I think (2) is
the best solution.

For those who want to play at home, I have a small test program that
exhibits this behavior at
http://people.freebsd.org/~mdf/cpu_affinity_test.c.  It seems to
require 4 or more CPUs to hit the assert.  You'll also need to patch
the kernel to assert when migrating a pinned thread:

Index: kern/sched_ule.c
===
--- kern/sched_ule.c(revision 158580)
+++ kern/sched_ule.c(working copy)
@@ -1888,11 +1889,26 @@ sched_switch(struct thread *td, struct t
srqflag = (flags  SW_PREEMPT) ?
SRQ_OURSELF|SRQ_YIELDING|SRQ_PREEMPTED :
SRQ_OURSELF|SRQ_YIELDING;
if (ts-ts_cpu == cpuid)
tdq_add(tdq, td, srqflag);
-   else
+   else {
+   KASSERT(THREAD_CAN_MIGRATE(td) ||
+   (ts-ts_flags  TSF_BOUND) != 0,
+   (Thread %p shouldn't migrate!, td));
mtx = sched_switch_migrate(tdq, td, srqflag);
+   }
} else {
/* This thread must be going to sleep. */
TDQ_LOCK(tdq);
mtx = thread_lock_block(td);

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


sched_pin() bug in SCHED_ULE

2010-08-26 Thread mdf
Back at the beginning of August I posted about issues with sched_pin()
and witness_warn():

http://lists.freebsd.org/pipermail/freebsd-hackers/2010-August/032553.html

After a lot of debugging I think I've basically found the issue.  I
found this bug on stable/7, but looking at the commit log for CURRENT
I don't see any reason the bug doesn't exist there.  This analysis is
specific to amd64/i386 but the problem is likely to exist on most
arches.

The basic problem is that in both tdq_move() and sched_setcpu() can
manipulate the ts-ts_cpu variable of another process, which may be
running.  This means that a process running on CPU N may be set to run
on CPU M the next context switch.

Then, that process may call sched_pin(), then grab a PCPU variable.
An IPI_PREEMPT can come in, causing the thread to call mi_switch() in
ipi_bitmap_handler().  sched_switch() will then notice that ts-ts_cpu
is not PCPU_GET(cpuid), and call sched_switch_migrate(), migrating the
thread to the intended CPU.  Thus after sched_pin() and potentially
before using any PCPU variable the process can get migrated to another
CPU, and now it is using a PCPU variable for the wrong processor.

Given that ts_cpu can be set by other threads, it doesn't seem worth
checking at sched_pin time whether ts_cpu is not the same as td_oncpu,
because to make the check would require taking the thread_lock.
Instead, I propose adding a check for !THREAD_CAN_MIGRATE() before
calling sched_switch_migrate().  However, sched_pin() is also used by
sched_bind(9) to force the thread to stay on the intended cpu.  I
would handle this by adding a TSF_BINDING state that is different from
TSF_BOUND.

The thing that has me wondering whether this is all correct is that I
don't see any checks in tdq_move() or sched_setcpu() for either
TSF_BOUND or THREAD_CAN_MIGRATE().  Perhaps that logic is managed in
the various calling functions; in that case I would propose adding
asserts that the thread is THREAD_CAN_MIGRATE() unless it's marked
TSF_BINDING.

Does this analysis seem correct?

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: sched_pin() bug in SCHED_ULE

2010-08-26 Thread mdf
On Thu, Aug 26, 2010 at 1:49 PM, John Baldwin j...@freebsd.org wrote:
 On Thursday, August 26, 2010 4:03:38 pm m...@freebsd.org wrote:
 Back at the beginning of August I posted about issues with sched_pin()
 and witness_warn():

 http://lists.freebsd.org/pipermail/freebsd-hackers/2010-August/032553.html

 After a lot of debugging I think I've basically found the issue.  I
 found this bug on stable/7, but looking at the commit log for CURRENT
 I don't see any reason the bug doesn't exist there.  This analysis is
 specific to amd64/i386 but the problem is likely to exist on most
 arches.

 The basic problem is that in both tdq_move() and sched_setcpu() can
 manipulate the ts-ts_cpu variable of another process, which may be
 running.  This means that a process running on CPU N may be set to run
 on CPU M the next context switch.

 Then, that process may call sched_pin(), then grab a PCPU variable.
 An IPI_PREEMPT can come in, causing the thread to call mi_switch() in
 ipi_bitmap_handler().  sched_switch() will then notice that ts-ts_cpu
 is not PCPU_GET(cpuid), and call sched_switch_migrate(), migrating the
 thread to the intended CPU.  Thus after sched_pin() and potentially
 before using any PCPU variable the process can get migrated to another
 CPU, and now it is using a PCPU variable for the wrong processor.

 Given that ts_cpu can be set by other threads, it doesn't seem worth
 checking at sched_pin time whether ts_cpu is not the same as td_oncpu,
 because to make the check would require taking the thread_lock.
 Instead, I propose adding a check for !THREAD_CAN_MIGRATE() before
 calling sched_switch_migrate().  However, sched_pin() is also used by
 sched_bind(9) to force the thread to stay on the intended cpu.  I
 would handle this by adding a TSF_BINDING state that is different from
 TSF_BOUND.

 The thing that has me wondering whether this is all correct is that I
 don't see any checks in tdq_move() or sched_setcpu() for either
 TSF_BOUND or THREAD_CAN_MIGRATE().  Perhaps that logic is managed in
 the various calling functions; in that case I would propose adding
 asserts that the thread is THREAD_CAN_MIGRATE() unless it's marked
 TSF_BINDING.

 Does this analysis seem correct?

 Calling code does check THREAD_CAN_MIGRATE(), but the problem is that it is
 not safe to call that on anything but curthread as td_pinned is not locked.
 It is assumed that only curthread would ever check td_pinned, not other
 threads.  I suspect what is happening is that another CPU is seeing a stale
 value of td_pinned.  You could fix this by grabbing the thread lock in
 sched_pin()/unpin() for ULE, but that is a bit expensive perhaps.

I tried making sched_pin() a real function which used
intr_disable/intr_restore around saving off td-td_oncpu,
td-td_lastcpu and ts-ts_cpu, and the stack at the time of call.  In
sched_switch when I saw an unexpected migration I printed all that
out.  I found that on my boxes, at sched_pin() time ts_cpu was already
different from td-td_oncpu, so the specific problem I was having was
that while another thread can change ts_cpu it has no way to force
that thread to immediately migrate.

I believe the below patch fixes the issue, though I'm open to other methods:


Index: kern/sched_ule.c
===
--- kern/sched_ule.c(.../head/src/sys)  (revision 158279)
+++ kern/sched_ule.c(.../branches/BR_BUG_67957/src/sys) (revision 
158279)
@@ -112,6 +112,7 @@
 /* flags kept in ts_flags */
 #defineTSF_BOUND   0x0001  /* Thread can not migrate. */
 #defineTSF_XFERABLE0x0002  /* Thread was added as 
transferable. */
+#defineTSF_BINDING 0x0004  /* Thread is being bound. */

 static struct td_sched td_sched0;

@@ -1859,6 +1858,7 @@
struct mtx *mtx;
int srqflag;
int cpuid;
+   int do_switch;

THREAD_LOCK_ASSERT(td, MA_OWNED);

@@ -1888,10 +1888,21 @@
srqflag = (flags  SW_PREEMPT) ?
SRQ_OURSELF|SRQ_YIELDING|SRQ_PREEMPTED :
SRQ_OURSELF|SRQ_YIELDING;
-   if (ts-ts_cpu == cpuid)
+   /*
+* Allow the switch to another processor as requested
+* only if the thread can migrate or we are in the
+* middle of binding for sched_bind(9).  This keeps
+* sched_pin() quick, since other threads can
+* manipulate ts_cpu.
+*/
+   do_switch = (ts-ts_cpu != cpuid);
+   if (!THREAD_CAN_MIGRATE(td) 
+   (ts-ts_flags  TSF_BINDING) == 0)
+   do_switch = 0;
+   if (do_switch)
+   mtx = sched_switch_migrate(tdq, td, srqflag);
+   else
tdq_add(tdq, td, srqflag);
-   else
-   mtx = sched_switch_migrate(tdq, td, srqflag);
} else {

Re: sched_pin() bug in SCHED_ULE

2010-08-26 Thread mdf
On Thu, Aug 26, 2010 at 3:10 PM, Andriy Gapon a...@icyb.net.ua wrote:
 on 27/08/2010 00:20 m...@freebsd.org said the following:

 I tried making sched_pin() a real function which used
 intr_disable/intr_restore around saving off td-td_oncpu,
 td-td_lastcpu and ts-ts_cpu, and the stack at the time of call.  In
 sched_switch when I saw an unexpected migration I printed all that
 out.  I found that on my boxes, at sched_pin() time ts_cpu was already
 different from td-td_oncpu, so the specific problem I was having was
 that while another thread can change ts_cpu it has no way to force
 that thread to immediately migrate.

 Like e.g. in sched_affinity where ts_cpu is first changed and then the old cpu
 is ipi-ed?

That could do it.  I didn't follow the stack of the places that were
touching ts_cpu for non-curthread.

 I believe the below patch fixes the issue, though I'm open to other methods:


 Index: kern/sched_ule.c
 ===
 --- kern/sched_ule.c  (.../head/src/sys)      (revision 158279)
 +++ kern/sched_ule.c  (.../branches/BR_BUG_67957/src/sys)     (revision 
 158279)
 @@ -112,6 +112,7 @@
  /* flags kept in ts_flags */
  #define      TSF_BOUND       0x0001          /* Thread can not migrate. */
  #define      TSF_XFERABLE    0x0002          /* Thread was added as 
 transferable. */
 +#define      TSF_BINDING     0x0004          /* Thread is being bound. */

 I don't really follow why TSF_BINDING is needed, i.e. why TSF_BOUND is not
 sufficient in this case?

I wanted to tighten up the asserts, so that the only time it was okay
for a td_pinned thread to migrate was the one time it was moving to
the target cpu.  Having a separate flag allows that.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [head tinderbox] failure on amd64/amd64

2010-08-12 Thread mdf
The tinderbox break is my bad.  I will have it fixed by the end of the day.

:-(
matthew

On Thu, Aug 12, 2010 at 3:25 AM, FreeBSD Tinderbox
tinder...@freebsd.org wrote:
 TB --- 2010-08-12 01:30:00 - tinderbox 2.6 running on 
 freebsd-current.sentex.ca
 TB --- 2010-08-12 01:30:00 - starting HEAD tinderbox run for amd64/amd64
 TB --- 2010-08-12 01:30:00 - cleaning the object tree
 TB --- 2010-08-12 01:30:27 - cvsupping the source tree
 TB --- 2010-08-12 01:30:27 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca 
 /tinderbox/HEAD/amd64/amd64/supfile
 TB --- 2010-08-12 01:46:12 - building world
 TB --- 2010-08-12 01:46:12 - MAKEOBJDIRPREFIX=/obj
 TB --- 2010-08-12 01:46:12 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
 TB --- 2010-08-12 01:46:12 - TARGET=amd64
 TB --- 2010-08-12 01:46:12 - TARGET_ARCH=amd64
 TB --- 2010-08-12 01:46:12 - TZ=UTC
 TB --- 2010-08-12 01:46:12 - __MAKE_CONF=/dev/null
 TB --- 2010-08-12 01:46:12 - cd /src
 TB --- 2010-08-12 01:46:12 - /usr/bin/make -B buildworld
 World build started on Thu Aug 12 01:46:13 UTC 2010
 Rebuilding the temporary build tree
 stage 1.1: legacy release compatibility shims
 stage 1.2: bootstrap tools
 stage 2.1: cleaning up the object tree
 stage 2.2: rebuilding the object tree
 stage 2.3: build tools
 stage 3: cross tools
 stage 4.1: building includes
 stage 4.2: building libraries
 stage 4.3: make dependencies
 stage 4.4: building everything
 [...]
 cc -O2 -pipe  -DACPI_EXEC_APP -fno-strict-aliasing 
 -I/src/usr.sbin/acpi/acpidb/../../../sys -std=gnu99 -fstack-protector 
 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
 -Wno-pointer-sign -c 
 /src/usr.sbin/acpi/acpidb/../../../sys/contrib/dev/acpica/utilities/utxface.c
 cc -O2 -pipe  -DACPI_EXEC_APP -fno-strict-aliasing 
 -I/src/usr.sbin/acpi/acpidb/../../../sys -std=gnu99 -fstack-protector 
 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
 -Wno-pointer-sign  -o acpidb acpidb.o osunixxf.o dbcmds.o dbdisply.o dbexec.o 
 dbfileio.o dbhistry.o dbinput.o dbstats.o dbutils.o dbxface.o dmbuffer.o 
 dmnames.o dmobject.o dmopcode.o dmresrc.o dmresrcl.o dmresrcs.o dmutils.o 
 dmwalk.o evevent.o evgpe.o evgpeblk.o evgpeinit.o evgpeutil.o evmisc.o 
 evregion.o evrgnini.o evsci.o evxface.o evxfevnt.o evxfregn.o hwacpi.o 
 hwgpe.o hwregs.o hwsleep.o hwvalid.o hwxface.o dsfield.o dsinit.o dsmethod.o 
 dsmthdat.o dsobject.o dsopcode.o dsutils.o dswexec.o dswload.o dswscope.o 
 dswstate.o exconfig.o exconvrt.o excreate.o exdebug.o exdump.o exfield.o 
 exfldio.o exmisc.o exmutex.o exnames.o exoparg1.o exoparg2.o exoparg3.o 
 exoparg6.o exprep.o exregion.o exresnte.o exresolv.o exresop.o exstore.o 
 exstoren.o exstorob.o exsystem.o exutils.o psargs.o psloop.o psopc!
  ode.o psparse.o psscope.o pstree.o psutils.o pswalk.o psxface.o nsaccess.o 
 nsalloc.o nsdump.o nseval.o nsinit.o nsload.o nsnames.o nsobject.o nsparse.o 
 nspredef.o nsrepair.o nsrepair2.o nssearch.o nsutils.o nswalk.o nsxfeval.o 
 nsxfname.o nsxfobj.o rsaddr.o rscalc.o rscreate.o rsdump.o rsinfo.o rsio.o 
 rsirq.o rslist.o rsmemory.o rsmisc.o rsutils.o rsxface.o tbfadt.o tbfind.o 
 tbinstal.o tbutils.o tbxface.o tbxfroot.o utalloc.o utcache.o utcopy.o 
 utdebug.o utdelete.o uteval.o utglobal.o utids.o utinit.o utlock.o utmath.o 
 utmisc.o utmutex.o utobject.o utosi.o utresrc.o utstate.o uttrack.o utxface.o 
 -lpthread
 gzip -cn /src/usr.sbin/acpi/acpidb/acpidb.8  acpidb.8.gz
 === usr.sbin/acpi/acpidump (all)
 cc -O2 -pipe  -I/src/usr.sbin/acpi/acpidump/../../../sys -std=gnu99 
 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W 
 -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes 
 -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow 
 -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs 
 -Wredundant-decls -Wold-style-definition -Wno-pointer-sign -c 
 /src/usr.sbin/acpi/acpidump/acpi.c
 cc1: warnings being treated as errors
 /src/usr.sbin/acpi/acpidump/acpi.c: In function 'acpi_handle_tcpa':
 /src/usr.sbin/acpi/acpidump/acpi.c:650: warning: format '%lld' expects type 
 'long long int', but argument 4 has type 'u_int64_t'
 *** Error code 1

 Stop in /src/usr.sbin/acpi/acpidump.
 *** Error code 1

 Stop in /src/usr.sbin/acpi.
 *** Error code 1

 Stop in /src/usr.sbin.
 *** Error code 1

 Stop in /src.
 *** Error code 1

 Stop in /src.
 *** Error code 1

 Stop in /src.
 TB --- 2010-08-12 03:25:04 - WARNING: /usr/bin/make returned exit code  1
 TB --- 2010-08-12 03:25:04 - ERROR: failed to build world
 TB --- 2010-08-12 03:25:04 - 4691.49 user 830.94 system 6903.97 real


 http://tinderbox.freebsd.org/tinderbox-head-HEAD-amd64-amd64.full
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

___
freebsd-current@freebsd.org mailing list

Re: [head tinderbox] failure on amd64/amd64

2010-08-12 Thread mdf
On Thu, Aug 12, 2010 at 1:51 PM,  m...@freebsd.org wrote:
 The tinderbox break is my bad.  I will have it fixed by the end of the day.

Actually, the amd64 break is not me, because it broke before it got to
my mistake.  i386 and presumably all the other 32-bit builds are due
to my memguard(9) patch.

 :-(
 matthew

 On Thu, Aug 12, 2010 at 3:25 AM, FreeBSD Tinderbox
 tinder...@freebsd.org wrote:
 TB --- 2010-08-12 01:30:00 - tinderbox 2.6 running on 
 freebsd-current.sentex.ca
 TB --- 2010-08-12 01:30:00 - starting HEAD tinderbox run for amd64/amd64
 TB --- 2010-08-12 01:30:00 - cleaning the object tree
 TB --- 2010-08-12 01:30:27 - cvsupping the source tree
 TB --- 2010-08-12 01:30:27 - /usr/bin/csup -z -r 3 -g -L 1 -h 
 cvsup.sentex.ca /tinderbox/HEAD/amd64/amd64/supfile
 TB --- 2010-08-12 01:46:12 - building world
 TB --- 2010-08-12 01:46:12 - MAKEOBJDIRPREFIX=/obj
 TB --- 2010-08-12 01:46:12 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
 TB --- 2010-08-12 01:46:12 - TARGET=amd64
 TB --- 2010-08-12 01:46:12 - TARGET_ARCH=amd64
 TB --- 2010-08-12 01:46:12 - TZ=UTC
 TB --- 2010-08-12 01:46:12 - __MAKE_CONF=/dev/null
 TB --- 2010-08-12 01:46:12 - cd /src
 TB --- 2010-08-12 01:46:12 - /usr/bin/make -B buildworld
 World build started on Thu Aug 12 01:46:13 UTC 2010
 Rebuilding the temporary build tree
 stage 1.1: legacy release compatibility shims
 stage 1.2: bootstrap tools
 stage 2.1: cleaning up the object tree
 stage 2.2: rebuilding the object tree
 stage 2.3: build tools
 stage 3: cross tools
 stage 4.1: building includes
 stage 4.2: building libraries
 stage 4.3: make dependencies
 stage 4.4: building everything
 [...]
 cc -O2 -pipe  -DACPI_EXEC_APP -fno-strict-aliasing 
 -I/src/usr.sbin/acpi/acpidb/../../../sys -std=gnu99 -fstack-protector 
 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
 -Wno-pointer-sign -c 
 /src/usr.sbin/acpi/acpidb/../../../sys/contrib/dev/acpica/utilities/utxface.c
 cc -O2 -pipe  -DACPI_EXEC_APP -fno-strict-aliasing 
 -I/src/usr.sbin/acpi/acpidb/../../../sys -std=gnu99 -fstack-protector 
 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
 -Wno-pointer-sign  -o acpidb acpidb.o osunixxf.o dbcmds.o dbdisply.o 
 dbexec.o dbfileio.o dbhistry.o dbinput.o dbstats.o dbutils.o dbxface.o 
 dmbuffer.o dmnames.o dmobject.o dmopcode.o dmresrc.o dmresrcl.o dmresrcs.o 
 dmutils.o dmwalk.o evevent.o evgpe.o evgpeblk.o evgpeinit.o evgpeutil.o 
 evmisc.o evregion.o evrgnini.o evsci.o evxface.o evxfevnt.o evxfregn.o 
 hwacpi.o hwgpe.o hwregs.o hwsleep.o hwvalid.o hwxface.o dsfield.o dsinit.o 
 dsmethod.o dsmthdat.o dsobject.o dsopcode.o dsutils.o dswexec.o dswload.o 
 dswscope.o dswstate.o exconfig.o exconvrt.o excreate.o exdebug.o exdump.o 
 exfield.o exfldio.o exmisc.o exmutex.o exnames.o exoparg1.o exoparg2.o 
 exoparg3.o exoparg6.o exprep.o exregion.o exresnte.o exresolv.o exresop.o 
 exstore.o exstoren.o exstorob.o exsystem.o exutils.o psargs.o psloop.o psopc!
  ode.o psparse.o psscope.o pstree.o psutils.o pswalk.o psxface.o nsaccess.o 
 nsalloc.o nsdump.o nseval.o nsinit.o nsload.o nsnames.o nsobject.o nsparse.o 
 nspredef.o nsrepair.o nsrepair2.o nssearch.o nsutils.o nswalk.o nsxfeval.o 
 nsxfname.o nsxfobj.o rsaddr.o rscalc.o rscreate.o rsdump.o rsinfo.o rsio.o 
 rsirq.o rslist.o rsmemory.o rsmisc.o rsutils.o rsxface.o tbfadt.o tbfind.o 
 tbinstal.o tbutils.o tbxface.o tbxfroot.o utalloc.o utcache.o utcopy.o 
 utdebug.o utdelete.o uteval.o utglobal.o utids.o utinit.o utlock.o utmath.o 
 utmisc.o utmutex.o utobject.o utosi.o utresrc.o utstate.o uttrack.o 
 utxface.o -lpthread
 gzip -cn /src/usr.sbin/acpi/acpidb/acpidb.8  acpidb.8.gz
 === usr.sbin/acpi/acpidump (all)
 cc -O2 -pipe  -I/src/usr.sbin/acpi/acpidump/../../../sys -std=gnu99 
 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -W 
 -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes 
 -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow 
 -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs 
 -Wredundant-decls -Wold-style-definition -Wno-pointer-sign -c 
 /src/usr.sbin/acpi/acpidump/acpi.c
 cc1: warnings being treated as errors
 /src/usr.sbin/acpi/acpidump/acpi.c: In function 'acpi_handle_tcpa':
 /src/usr.sbin/acpi/acpidump/acpi.c:650: warning: format '%lld' expects type 
 'long long int', but argument 4 has type 'u_int64_t'
 *** Error code 1

 Stop in /src/usr.sbin/acpi/acpidump.
 *** Error code 1

 Stop in /src/usr.sbin/acpi.
 *** Error code 1

 Stop in /src/usr.sbin.
 *** Error code 1

 Stop in /src.
 *** Error code 1

 Stop in /src.
 *** Error code 1

 Stop in /src.
 TB --- 2010-08-12 03:25:04 - WARNING: /usr/bin/make returned exit code  1
 TB --- 2010-08-12 03:25:04 - ERROR: failed to build world
 TB --- 2010-08-12 03:25:04 - 4691.49 user 830.94 system 6903.97 real


 http://tinderbox.freebsd.org/tinderbox-head-HEAD-amd64-amd64.full
 ___
 freebsd-current@freebsd.org mailing list
 

Re: Panic booting vmware i386 after SRAT update

2010-07-29 Thread mdf
On Thu, Jul 29, 2010 at 7:18 AM, John Baldwin j...@freebsd.org wrote:
 On Wednesday, July 28, 2010 1:37:42 pm m...@freebsd.org wrote:
 I have a 2 cpu virtual image of FreeBSD current.  It panics during
 boot after building in the NUMA support.

 I'll transcribe the SRAT bootverbose messages and panic message as best I 
 can.

 Table 'SRAT' at 0xfef07f6
 SRAT: Found table at 0xfef07f6
 SRAT: Found memory domain 0 addr 0 len a: enabled
 SRAT: Found memory domain 0 addr 10 len ff0: enabled

 then some MADT: messages about finding cpu 0 and 1

  cpu0 (BSP): APIC ID:  0
  cpu1 (AP): APIC ID:  1
 panic: SRAT: CPU with APIC ID 0 is not known

 I'm playing around now with trying to figure out what went missing,
 but I thought I'd send this out now.

 Hmm, check_domains() in srat.c should reject the SRAT table in this case.
 Oh, I see the problem, try this:

 Index: srat.c
 ===
 --- srat.c      (revision 210552)
 +++ srat.c      (working copy)
 @@ -150,7 +150,8 @@
        for (i = 0; i  num_mem; i++) {
                found = 0;
                for (j = 0; j = MAX_APIC_ID; j++)
 -                       if (cpus[j].domain == mem_info[i].domain) {
 +                       if (cpus[j].enabled 
 +                           cpus[j].domain == mem_info[i].domain) {
                                cpus[j].has_memory = 1;
                                found++;
                        }

This appears to fix the issue for me.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Panic booting vmware i386 after SRAT update

2010-07-28 Thread mdf
I have a 2 cpu virtual image of FreeBSD current.  It panics during
boot after building in the NUMA support.

I'll transcribe the SRAT bootverbose messages and panic message as best I can.

Table 'SRAT' at 0xfef07f6
SRAT: Found table at 0xfef07f6
SRAT: Found memory domain 0 addr 0 len a: enabled
SRAT: Found memory domain 0 addr 10 len ff0: enabled

then some MADT: messages about finding cpu 0 and 1

 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
panic: SRAT: CPU with APIC ID 0 is not known

I'm playing around now with trying to figure out what went missing,
but I thought I'd send this out now.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Panic booting vmware i386 after SRAT update

2010-07-28 Thread mdf
On Wed, Jul 28, 2010 at 10:37 AM,  m...@freebsd.org wrote:
 I have a 2 cpu virtual image of FreeBSD current.  It panics during
 boot after building in the NUMA support.

 I'll transcribe the SRAT bootverbose messages and panic message as best I can.

 Table 'SRAT' at 0xfef07f6
 SRAT: Found table at 0xfef07f6
 SRAT: Found memory domain 0 addr 0 len a: enabled
 SRAT: Found memory domain 0 addr 10 len ff0: enabled

 then some MADT: messages about finding cpu 0 and 1

  cpu0 (BSP): APIC ID:  0
  cpu1 (AP): APIC ID:  1
 panic: SRAT: CPU with APIC ID 0 is not known

 I'm playing around now with trying to figure out what went missing,
 but I thought I'd send this out now.

Okay, apparently VMWare is providing two entries of type
ACPI_SRAT_TYPE_MEMORY_AFFINITY but no entries of type
ACPI_SRAT_TYPE_CPU_AFFINITY.  This leads to the assert since no CPUs
are enabled; that is there's no affinity information for them.  This
is probably a VMWare bug.

Setting hint.srat.0.disabled=1 in /boot/device.hints works around the issue.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Panic booting vmware i386 after SRAT update

2010-07-28 Thread mdf
On Wed, Jul 28, 2010 at 11:19 AM, David Cornejo d...@dogwood.com wrote:
 On Wed, Jul 28, 2010 at 7:37 AM, m...@freebsd.org wrote:

 I have a 2 cpu virtual image of FreeBSD current.  It panics during
 boot after building in the NUMA support.

 I'll transcribe the SRAT bootverbose messages and panic message as best I
 can.

 Table 'SRAT' at 0xfef07f6
 SRAT: Found table at 0xfef07f6
 SRAT: Found memory domain 0 addr 0 len a: enabled
 SRAT: Found memory domain 0 addr 10 len ff0: enabled

 then some MADT: messages about finding cpu 0 and 1

  cpu0 (BSP): APIC ID:  0
  cpu1 (AP): APIC ID:  1
 panic: SRAT: CPU with APIC ID 0 is not known

 I'm playing around now with trying to figure out what went missing,
 but I thought I'd send this out now.

 I have been seeing this since yesterday with amd64 custom kernels - just
 compiled a GENERIC kernel a few minutes ago and it shows the same symptom.

Is this on VMWare or physical hardware?  I isolated my issue to VMWare
reporting memory domains properties but not CPU affinity.  Setting
hint.srat.0.disabled=1 in /boot/device.hints fixes the issue for me.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: strange scsi/CAM related dmesg output

2010-06-18 Thread mdf
On Fri, Jun 18, 2010 at 8:11 AM, Alexander Best
alexbes...@uni-muenster.de wrote:
 On Mon, Jun 7, 2010 at 3:57 PM, John Baldwin j...@freebsd.org wrote:
 It can happen because the print buffer size thing is not line-buffered, it is
 printf-invocation buffered.

 hmmm...can this somehow be fixed? i'm not sure this is specific to
 scsi/cam. the other day i bootd my system and almost all of the dmesg
 output was displayed incorrectly. would increasing PRINTF_BUFR_SIZE
 from 128 to lets say 512 or 1024 solve the issue?

I think what jhb meant was that we could look for the '\n' and flush
to console there, instead of waiting for the end of the buffer.  This
would perhaps have more interleaved full lines, but likely fewer
interleaved partial lines.

Thanks,
matthew
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org