buildworld is broken ?

2013-01-18 Thread Sergey V. Dyatko
Hi,

subj
head, amd64 Revision: 245588


protector -Wsystem-headers -Werror -Wall -Wno-format-y2k
-Wno-uninitialized -Wno-pointer-sign
-c 
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmopcode.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I.
-I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector
-Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized
-Wno-pointer-sign
-c 
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /usr/src/usr.sbin/acpi/iasl.
*** [all] Error code 1

Stop in /usr/src/usr.sbin/acpi.
*** [all] Error code 1

Stop in /usr/src/usr.sbin.
*** [usr.sbin.all__D] Error code 1

Stop in /usr/src.
*** [everything] Error code 1

Stop in /usr/src.
*** [buildworld] Error code 1
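
For readers who have not met this warning before: gcc emits it when a pointer to one type is cast to a pointer to an unrelated type and then dereferenced, and -Werror turns it into a build failure. I have not checked what dmresrc.c:419 actually does, so the following is only a minimal sketch of the usual way to avoid the pattern. The helper name read_u16() is hypothetical and is not ACPICA code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from ACPICA): read a 16-bit value out of a
 * byte buffer without casting the buffer pointer to uint16_t *.
 */
static uint16_t
read_u16(const uint8_t *buf)
{
	uint16_t value;

	assert(buf != NULL);
	/* memcpy copies the bytes; no type-punned pointer is dereferenced. */
	memcpy(&value, buf, sizeof(value));
	return (value);
}
```

The result is in host byte order, exactly as the cast-and-dereference version would have been. Building with -fno-strict-aliasing is the other common workaround, at the cost of disabling that optimization for the file.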

-- 
wbr, tiger
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: panic after r244584

2013-01-18 Thread Gleb Smirnoff
On Fri, Jan 18, 2013 at 09:36:00AM +0200, Vitalij Satanivskij wrote:
V Hello.
V 
V After upgrading server from old hardware/software to freebsd current (## SVN 
## Exported commit - http://svnweb.freebsd.org/changeset/base/245479),
V system hung's with message - 
V  panic: make_dev_alias_v: bad si_name (error=22 
si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)

EINVAL (22) is caused by space character in the si_name:

si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7
                                                      ^

I think Alexander (in Cc) has an idea of why that happened and how
it should be fixed.
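
To make the failure mode concrete, here is a minimal user-space sketch of a name check that rejects spaces with EINVAL. The function name check_devname() is hypothetical, and the real check added in r244584 may well test for more than spaces; this only illustrates why the name above is refused.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for a device-name check.  It is an
 * assumption that rejecting ' ' is enough to model the panic above.
 */
static int
check_devname(const char *name)
{
	assert(name != NULL);
	/* Reject empty names and names containing a space. */
	if (name[0] == '\0' || strchr(name, ' ') != NULL)
		return (EINVAL);
	return (0);
}
```

With this sketch, the si_name from the panic message returns EINVAL, while a plain name such as "pass7" returns 0.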

-- 
Totus tuus, Glebius.


Re: panic after r244584

2013-01-18 Thread Vitalij Satanivskij
Alexander Motin wrote:
AM On 18.01.2013 11:44, Gleb Smirnoff wrote:
AM  On Fri, Jan 18, 2013 at 09:36:00AM +0200, Vitalij Satanivskij wrote:
AM  V After upgrading server from old hardware/software to freebsd current 
(## SVN ## Exported commit - http://svnweb.freebsd.org/changeset/base/245479),
AM  V system hung's with message - 
AM  V  panic: make_dev_alias_v: bad si_name (error=22 
si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
AM  
AM  EINVAL (22) is caused by space character in the si_name:
AM  
AM  si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7
AM  
AM  I think Alexander (in Cc) has idea on why did that happen and how
AM  should that be fixed.
AM 
AM The panic is triggered by the check added by the recent r244584 change.
AM The space in device name came from the enclosure device, and I guess it
AM may be quite often situation. Using human readable name supposed to help
AM system administrators, but with spaces banned that may be a problem.
AM 

That was not created by a human; it was generated (I think) by the system.

Maybe the problem is not in r244584 at all but in incorrect generation of the
si_name?

More info:

The drive (actually drives, all 36 have the same problem) is inserted in the
backplane of a Supermicro chassis with an LSI CORP SAS2X36 0417 on board.

All of them are attached to an LSI SAS 9211-4i controller in HBA mode.



Re: panic after r244584

2013-01-18 Thread Alexander Motin
On 18.01.2013 13:39, Vitalij Satanivskij wrote:
 Alexander Motin wrote:
 AM On 18.01.2013 11:44, Gleb Smirnoff wrote:
 AM  On Fri, Jan 18, 2013 at 09:36:00AM +0200, Vitalij Satanivskij wrote:
 AM  V After upgrading server from old hardware/software to freebsd current 
 (## SVN ## Exported commit - http://svnweb.freebsd.org/changeset/base/245479),
 AM  V system hung's with message - 
 AM  V  panic: make_dev_alias_v: bad si_name (error=22 
 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
 AM  
 AM  EINVAL (22) is caused by space character in the si_name:
 AM  
 AM  si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7
 AM  
 AM  I think Alexander (in Cc) has idea on why did that happen and how
 AM  should that be fixed.
 AM 
 AM The panic is triggered by the check added by the recent r244584 change.
 AM The space in device name came from the enclosure device, and I guess it
 AM may be quite often situation. Using human readable name supposed to help
 AM system administrators, but with spaces banned that may be a problem.
 
 That's was not created by human, it was generated (I think so) by system. 

These strings are flashed into the enclosure firmware by the manufacturer.

 May be problem not in r244584 at all but in incorect generation of the 
 si_name ? 

Maybe. But before r244584 it didn't cause panics and most people
were happy, except devctl consumers, who couldn't parse these events properly.

 More info 
 
 drive (actualy drives, all 36 have same problem) inserted in backplane on 
 supermicro chasis with LSI CORP SAS2X36 0417 on board.
 
 All of them attached to lsi sas 9211-4i controler in HBA mode.

-- 
Alexander Motin


Re: buildworld is broken ?

2013-01-18 Thread Gleb Smirnoff
On Fri, Jan 18, 2013 at 11:30:17AM +0300, Sergey V. Dyatko wrote:
S subj
S head, amd64 Revision: 245588

Works for me:

Revision: 245593
Last Changed Rev: 245584
Last Changed Date: 2013-01-18 06:36:06 +0400 (Fri, 18 Jan 2013)

Also, there are no tinderbox complaints on the mailing list.

-- 
Totus tuus, Glebius.

Re: panic after r244584

2013-01-18 Thread Alexander Motin
On 18.01.2013 11:44, Gleb Smirnoff wrote:
 On Fri, Jan 18, 2013 at 09:36:00AM +0200, Vitalij Satanivskij wrote:
 V After upgrading server from old hardware/software to freebsd current (## 
 SVN ## Exported commit - http://svnweb.freebsd.org/changeset/base/245479),
 V system hung's with message - 
 V  panic: make_dev_alias_v: bad si_name (error=22 
 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
 
 EINVAL (22) is caused by space character in the si_name:
 
 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7
 
 I think Alexander (in Cc) has idea on why did that happen and how
 should that be fixed.

The panic is triggered by the check added by the recent r244584 change.
The space in the device name came from the enclosure device, and I guess that
may be quite a common situation. Using a human-readable name is supposed to
help system administrators, but with spaces banned that may be a problem.

-- 
Alexander Motin


[head tinderbox] failure on ia64/ia64

2013-01-18 Thread FreeBSD Tinderbox
TB --- 2013-01-18 10:27:26 - tinderbox 2.10 running on freebsd-current.sentex.ca
TB --- 2013-01-18 10:27:26 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE 
FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 
d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC  amd64
TB --- 2013-01-18 10:27:26 - starting HEAD tinderbox run for ia64/ia64
TB --- 2013-01-18 10:27:26 - cleaning the object tree
TB --- 2013-01-18 10:27:26 - /usr/local/bin/svn stat /src
TB --- 2013-01-18 10:27:29 - At svn revision 245589
TB --- 2013-01-18 10:27:30 - building world
TB --- 2013-01-18 10:27:30 - CROSS_BUILD_TESTING=YES
TB --- 2013-01-18 10:27:30 - MAKEOBJDIRPREFIX=/obj
TB --- 2013-01-18 10:27:30 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2013-01-18 10:27:30 - SRCCONF=/dev/null
TB --- 2013-01-18 10:27:30 - TARGET=ia64
TB --- 2013-01-18 10:27:30 - TARGET_ARCH=ia64
TB --- 2013-01-18 10:27:30 - TZ=UTC
TB --- 2013-01-18 10:27:30 - __MAKE_CONF=/dev/null
TB --- 2013-01-18 10:27:30 - cd /src
TB --- 2013-01-18 10:27:30 - /usr/bin/make -B buildworld
 Building an up-to-date make(1)
 World build started on Fri Jan 18 10:27:34 UTC 2013
 Rebuilding the temporary build tree
 stage 1.1: legacy release compatibility shims
 stage 1.2: bootstrap tools
 stage 2.1: cleaning up the object tree
 stage 2.2: rebuilding the object tree
 stage 2.3: build tools
 stage 3: cross tools
 stage 4.1: building includes
 stage 4.2: building libraries
 stage 4.3: make dependencies
 stage 4.4: building everything
[...]
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmbuffer.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmdeferred.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmnames.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmopcode.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:
 In function 'AcpiDmIsResourceTemplate':
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419:
 warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /src/usr.sbin/acpi/iasl.
*** [all] Error code 1

Stop in /src/usr.sbin/acpi.
*** [all] Error code 1

Stop in /src/usr.sbin.
*** [usr.sbin.all__D] Error code 1

Stop in /src.
*** [everything] Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2013-01-18 11:55:37 - WARNING: /usr/bin/make returned exit code  1 
TB --- 2013-01-18 11:55:37 - ERROR: failed to build world
TB --- 2013-01-18 11:55:37 - 4029.16 user 947.42 system 5290.92 real


http://tinderbox.freebsd.org/tinderbox-head-ss-build-HEAD-ia64-ia64.full


Re: buildworld is broken ?

2013-01-18 Thread Sergey V. Dyatko
On Fri, 18 Jan 2013 15:47:13 +0400
Gleb Smirnoff gleb...@freebsd.org wrote:

 On Fri, Jan 18, 2013 at 11:30:17AM +0300, Sergey V. Dyatko wrote:
 S subj
 S head, amd64 Revision: 245588
 
 Works for me:
 
 Revision: 245593
 Last Changed Rev: 245584
 Last Changed Date: 2013-01-18 06:36:06 +0400 (Fri, 18 Jan 2013)
 
strange :(
laptop# cd /usr/src/usr.sbin/acpi/iasl
laptop# make 
cc -O2 -pipe  -DACPI_ASL_COMPILER -I.
-I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector
-Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized
-Wno-pointer-sign
-c 
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /usr/src/usr.sbin/acpi/iasl.

 Also, there is not tinderbox complaints on the mailing list.
 

Subject: [head tinderbox] failure on ia64/ia64
Date: Fri, 18 Jan 2013 11:55:37 GMT


-- 
wbr, tiger

Re: [RFC/RFT] calloutng

2013-01-18 Thread Bruce Evans

On Thu, 17 Jan 2013, Ian Lepore wrote:


On Mon, 2013-01-14 at 11:38 +1100, Bruce Evans wrote:



Er, timecounters are called with a spin mutex held in existing code:
though it is dangerous to do so, timecounters are called from fast
interrupt handlers for very timekeeping-critical purposes:
- to implement the TIOCTIMESTAMP ioctl (except this is broken in
   -current).  This was a primitive version of pps timestamping.
- for pps timestamping.  The interrupt handler (which should be a fast
   interrupt handler to minimize latency) calls pps_capture() which
   calls tc_get_timecount() and does other lock-free accesses to the
   timecounter state.  This still works in -current (at least there is
   still code for it).


Unfortunately, calling pps_capture() in the primary interrupt context is
no longer an option with the stock pps driver.  Ever since the ppbus
rewrite all ppbus children must use threaded handlers.  I tried to fix
that a couple different ways, and both ended up with crazy-complex code


Hmm, I didn't notice that ppc supported pps (I try not to look at it since
it is ugly :-), and don't know of any version of it that uses non-threaded
handlers (except in FreeBSD-4 and before, where normal interrupt handlers
were non-threaded, so ppc had their high latency but not the even higher
latency and overheads of threaded handlers).

OTOH, my x86 RTC interrupt handler is threaded and supports pps, and
I haven't noticed any latency problems with this.  It just can't
possibly give the ~1 usec jitter that FreeBSD-[3-4] could give ~15
years ago using a fast interrupt handler (there must be only 1 device
using a fast interrupt handler, with this dedicated to pps, else the
multiple fast interrupt handlers will give latency much larger than
~1 usec to each other).  I don't actually use this for anything except
testing whether the RTC can be used for a poor man's pps.


scattered around the ppbus family just to support the rarely-used pps
capture.  It would have been easier to do if filter and threaded
interrupt handlers had the same function signature.

I ended up writing a separate driver that can be used instead of ppc +
ppbus + pps, since anyone who cares about precise pps capture is
unlikely to be sharing the port with a printer or plip device or some
such.


Probably all pps handlers should be special.  On x86 with reasonable
timecounter hardware, say a TSC, it takes about 10 instructions for
an entire pps interrupt handler:

XintrN:
        pushl   %eax
        pushl   %edx
        rdtsc
        # Need some ugliness for EOI here or later.
        ss:movl %eax,ppscap     # Hopefully lock-free via time-domain locking.
        ss:movl %edx,ppscap+4
        popl    %edx
        popl    %eax
        iret

After capturing the timecounter hardware value here, you convert it
to a pps event at leisure.  But since this only happens once per second,
it wouldn't be very inefficient to turn the interrupt handler into a
slow high-latency one, even a threaded one, to handle the pps event
and/or other devices attached to the interrupt.


   OTOH, all drivers that call pps_capture() from their interrupt handler
   then immediately call pps_event().  This has always been very broken,
   and became even more broken with SMPng.  pps_event() does many more
   timecounter and pps accesses whose locking is unclear at best, and
   in some configurations it calls hardpps(), which is only locked by
   Giant, despite comments in kern_ntptime.c still saying that it (and
   many other functions in kern_ntptime.c) must be called at splclock()
   or higher.  splclock() is of course now null, but the locking
   requirements in kern_ntptime.c haven't changed much.  kern_ntptime.c
   always needed to be locked by the equivalent of a spin mutex, which
   is stronger locking than was given by splclock().  pps_event() would
   have to acquire the spin mutex before calling hardpps(), although
   this is bad for fast interrupt handlers.  The correct implementation
   is probably to only do the capture part from fast interrupt handlers.


In my rewritten dedicated pps driver I call pps_capture() from the
filter handler and pps_event() from the threaded handler.  I never found


That seems right.


any good documentation on the low-level details of this stuff, and there
isn't enough good example code to work from.  My hazy memory is that I


There seem to be no good examples.


ended up studying the pps_capture() and pps_event() code enough to infer
that their design intent seems to be to allow you to capture with no
locking and do the event processing later in some sort of deferred or
threaded context.


That seems to be the design, but there are no examples of separating
the event from the capture.

I think the correct locking is:
- capture in a fast interrupt handler, into a per-device state that
  is locked by whatever locks all of the state accessed by the fast
  interrupt handler
- switch to a less critical context later:
  - lock this step 

Re: panic after r244584

2013-01-18 Thread Jaakko Heinonen
On 2013-01-18, Alexander Motin wrote:
  AM  V  panic: make_dev_alias_v: bad si_name (error=22 
  si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)

  AM The panic is triggered by the check added by the recent r244584 change.
  AM The space in device name came from the enclosure device, and I guess it
  AM may be quite often situation. Using human readable name supposed to help
  AM system administrators, but with spaces banned that may be a problem.
  
  That's was not created by human, it was generated (I think so) by system. 
 
 These strings are flashed into enclosure firmware by manufacturer.

You can't rely on any string being safe to use as a device name,
even if spaces were allowed. Consider, for example, duplicate names and
"../".

Where are these names generated? The original report didn't contain a
backtrace.

-- 
Jaakko


Re: panic after r244584

2013-01-18 Thread Alexander Motin
On 18.01.2013 15:19, Jaakko Heinonen wrote:
 On 2013-01-18, Alexander Motin wrote:
 AM  V  panic: make_dev_alias_v: bad si_name (error=22 
 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
 
 AM The panic is triggered by the check added by the recent r244584 change.
 AM The space in device name came from the enclosure device, and I guess it
 AM may be quite often situation. Using human readable name supposed to help
 AM system administrators, but with spaces banned that may be a problem.

 That's was not created by human, it was generated (I think so) by system. 

 These strings are flashed into enclosure firmware by manufacturer.
 
 You can't rely on that any string can be safely used as a device name
 even if spaces were allowed. Consider for example duplicate names and
 ../.
 
 Where these names are generated? The original report didn't contain a
 backtrace.

At cam/scsi/ses_set_physpath.c ses_set_physpath(). Duplicate names are
impossible there, as the previous name components are unique. I haven't seen
special characters yet, but I think they are theoretically possible.
It would be interesting to know what Solaris does in such cases: does it
mangle them somehow or remove them completely?

-- 
Alexander Motin


Re: panic after r244584

2013-01-18 Thread Vitalij Satanivskij
Jaakko Heinonen wrote:
JH On 2013-01-18, Alexander Motin wrote:
JH   AM  V  panic: make_dev_alias_v: bad si_name (error=22 
si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
JH 
JH   AM The panic is triggered by the check added by the recent r244584 
change.
JH   AM The space in device name came from the enclosure device, and I 
guess it
JH   AM may be quite often situation. Using human readable name supposed to 
help
JH   AM system administrators, but with spaces banned that may be a problem.
JH   
JH   That's was not created by human, it was generated (I think so) by 
system. 
JH  
JH  These strings are flashed into enclosure firmware by manufacturer.
JH 
JH You can't rely on that any string can be safely used as a device name
JH even if spaces were allowed. Consider for example duplicate names and
JH ../.
JH 
JH Where these names are generated? The original report didn't contain a
JH backtrace.

Yes. There is no backtrace, because all debugging was switched off in the kernel.

For now I can't use that server for testing, but there are other servers
waiting for an upgrade.

I will try to reproduce the problem with the kernel debugger enabled.




Re: panic after r244584

2013-01-18 Thread Vitalij Satanivskij

Maybe just sanitize elmpriv->descr?

Something like changing whitespace to _ or just deleting it?
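
A minimal sketch of the first option, replacing whitespace with an underscore in place. The function name sanitize_descr() is hypothetical, and this is plain user-space C rather than a patch against the ses code; it assumes the descriptor is a NUL-terminated, writable string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: replace every whitespace character in a
 * NUL-terminated descriptor string with '_', modifying it in place.
 */
static void
sanitize_descr(char *descr)
{
	assert(descr != NULL);
	for (char *p = descr; *p != '\0'; p++) {
		/* Cast avoids undefined behaviour for negative char values. */
		if (isspace((unsigned char)*p))
			*p = '_';
	}
}
```

Replacing rather than deleting keeps the string length unchanged, but two descriptors that differ only in a space versus an underscore would end up with the same name, so uniqueness would still have to come from the other path components.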


Vitalij Satanivskij wrote:
VS Jaakko Heinonen wrote:
VS JH On 2013-01-18, Alexander Motin wrote:
VS JH   AM  V  panic: make_dev_alias_v: bad si_name (error=22 
si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
VS JH 
VS JH   AM The panic is triggered by the check added by the recent r244584 
change.
VS JH   AM The space in device name came from the enclosure device, and I 
guess it
VS JH   AM may be quite often situation. Using human readable name 
supposed to help
VS JH   AM system administrators, but with spaces banned that may be a 
problem.
VS JH   
VS JH   That's was not created by human, it was generated (I think so) by 
system. 
VS JH  
VS JH  These strings are flashed into enclosure firmware by manufacturer.
VS JH 
VS JH You can't rely on that any string can be safely used as a device name
VS JH even if spaces were allowed. Consider for example duplicate names and
VS JH ../.
VS JH 
VS JH Where these names are generated? The original report didn't contain a
VS JH backtrace.
VS 
VS Yes. No backtrace, because of switching off all debuging in kernel. 
VS 
VS For now I can't use that's server for testing, but there are another 
servers waiting for upgrade.
VS 
VS I will try to reproduce problem with  kernel debuger enabled.
VS 
VS 


Re: My panic in amd64/pmap

2013-01-18 Thread Andriy Gapon
on 17/01/2013 21:50 Larry Rosenman said the following:
 I've now seen this panic:
  pmap_insert_pt_page: pindex already inserted
 
 on 9.1-RELEASE, 9.1-STABLE, and 10.0-CURRENT
 
 I've got vmcore's from the 9.1-STABLE and 10.0-CURRENT VM's available
 as well as sources.
 
 I have the core.txt.* files available at:
 http://www.lerctr.org/~ler/core.txt.0  (10.0)
 http://www.lerctr.org/~ler/core.txt.2  (9.1-S)
 
 I'm not sure what other debug info you need.
 
 I can provide SSH access to both VM's as well as the host.
 
 These are all in VirtualBox 4.2.6 VM's
 
 Any help would be appreciated.

Hmm, I wonder if VirtualBox is hitting the same popcnt bug that was fixed in 
qemu...

Could you please try a patch from here
http://thread.gmane.org/gmane.comp.emulators.qemu/174532/focus=174567 ?

It should be applied to src/recompiler/target-i386/translate.c, make sure that
it goes to a section marked as 'case 0x1b8: /* SSE4.2 popcnt */'.

-- 
Andriy Gapon


Re: My panic in amd64/pmap

2013-01-18 Thread Larry Rosenman

On 2013-01-18 08:17, Andriy Gapon wrote:

on 17/01/2013 21:50 Larry Rosenman said the following:

I've now seen this panic:
 pmap_insert_pt_page: pindex already inserted

on 9.1-RELEASE, 9.1-STABLE, and 10.0-CURRENT

I've got vmcore's from the 9.1-STABLE and 10.0-CURRENT VM's 
available

as well as sources.

I have the core.txt.* files available at:
http://www.lerctr.org/~ler/core.txt.0  (10.0)
http://www.lerctr.org/~ler/core.txt.2  (9.1-S)

I'm not sure what other debug info you need.

I can provide SSH access to both VM's as well as the host.

These are all in VirtualBox 4.2.6 VM's

Any help would be appreciated.


Hmm, I wonder if VirtualBox is hitting the same popcnt bug that was
fixed in qemu...

Could you please try a patch from here
http://thread.gmane.org/gmane.comp.emulators.qemu/174532/focus=174567 
?


It should be applied to src/recompiler/target-i386/translate.c, make
sure that it
goes to a section marked as 'case 0x1b8: /* SSE4.2 popcnt */'.


Should this be on the host or the guest?


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 (c) E-Mail: l...@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893


Re: panic after r244584

2013-01-18 Thread Alexander Motin
On 18.01.2013 15:49, Vitalij Satanivskij wrote:
 May be just do sanitizing for elmpriv-descr?
 
 something like change whitespace to _ or just delete it?

Yes, that is not difficult. The only question is how to stay consistent,
compatible, user-readable.

 Vitalij Satanivskij wrote:
 VS Jaakko Heinonen wrote:
 VS JH On 2013-01-18, Alexander Motin wrote:
 VS JH   AM  V  panic: make_dev_alias_v: bad si_name (error=22 
 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
 VS JH 
 VS JH   AM The panic is triggered by the check added by the recent 
 r244584 change.
 VS JH   AM The space in device name came from the enclosure device, and 
 I guess it
 VS JH   AM may be quite often situation. Using human readable name 
 supposed to help
 VS JH   AM system administrators, but with spaces banned that may be a 
 problem.
 VS JH   
 VS JH   That's was not created by human, it was generated (I think so) by 
 system. 
 VS JH  
 VS JH  These strings are flashed into enclosure firmware by manufacturer.
 VS JH 
 VS JH You can't rely on that any string can be safely used as a device name
 VS JH even if spaces were allowed. Consider for example duplicate names and
 VS JH ../.
 VS JH 
 VS JH Where these names are generated? The original report didn't contain a
 VS JH backtrace.
 VS 
 VS Yes. No backtrace, because of switching off all debuging in kernel. 
 VS 
 VS For now I can't use that's server for testing, but there are another 
 servers waiting for upgrade.
 VS 
 VS I will try to reproduce problem with  kernel debuger enabled.
 VS 
 VS 
 


-- 
Alexander Motin


Re: panic after r244584

2013-01-18 Thread Vitalij Satanivskij
Alexander Motin wrote:
AM On 18.01.2013 15:49, Vitalij Satanivskij wrote:
AM  May be just do sanitizing for elmpriv-descr?
AM  
AM  something like change whitespace to _ or just delete it?
AM 
AM Yes, that is not difficult. The only question is how to stay consistent,
AM compatible, user-readable.
AM 

OK, now I have a kernel dump:

kgdb /boot/kernel/kernel vmcore.0 
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...

Unread portion of the kernel message buffer:
da0 at mps0 bus 0 scbus7 target 8 lun 0
da0: ATA ST3500630NS G Fixed Direct Access SCSI-6 device 
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 476940MB (976773168 512 byte sectors: 255H 63S/T 60801C)
panic: make_dev_alias_v: bad si_name (error=22, 
si_name=enc@n5003048000baa87d/type@0/slot@a/elmdesc@Slot 10/pass7)
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff9b9ec84760
kdb_backtrace() at kdb_backtrace+0x39/frame 0xff9b9ec84810
vpanic() at vpanic+0x127/frame 0xff9b9ec84850
panic() at panic+0x43/frame 0xff9b9ec848b0
make_dev_alias_v() at make_dev_alias_v+0x1d0/frame 0xff9b9ec84900
make_dev_alias_p() at make_dev_alias_p+0x37/frame 0xff9b9ec84960
make_dev_physpath_alias() at make_dev_physpath_alias+0x14a/frame 
0xff9b9ec849c0
pass_add_physpath() at pass_add_physpath+0xbd/frame 0xff9b9ec849f0
taskqueue_run_locked() at taskqueue_run_locked+0xf0/frame 0xff9b9ec84a40
taskqueue_thread_loop() at taskqueue_thread_loop+0x6c/frame 0xff9b9ec84a70
fork_exit() at fork_exit+0x84/frame 0xff9b9ec84ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xff9b9ec84ab0
--- trap 0, rip = 0, rsp = 0xff9b9ec84b70, rbp = 0 ---
KDB: enter: panic




Re: panic after r244584

2013-01-18 Thread Vitalij Satanivskij
Vitalij Satanivskij wrote:
VS Alexander Motin wrote:
VS AM On 18.01.2013 15:49, Vitalij Satanivskij wrote:
VS AM  May be just do sanitizing for elmpriv-descr?
VS AM  
VS AM  something like change whitespace to _ or just delete it?
VS AM 
VS AM Yes, that is not difficult. The only question is how to stay consistent,
VS AM compatible, user-readable.
VS AM 
VS 
VS Ok, now I have kernel dump 
VS 
VS kgdb /boot/kernel/kernel vmcore.0 
VS GNU gdb 6.1.1 [FreeBSD]
VS Copyright 2004 Free Software Foundation, Inc.
VS GDB is free software, covered by the GNU General Public License, and you are
VS welcome to change it and/or distribute copies of it under certain 
conditions.
VS Type show copying to see the conditions.
VS There is absolutely no warranty for GDB.  Type show warranty for details.
VS This GDB was configured as amd64-marcel-freebsd...
VS 
VS Unread portion of the kernel message buffer:
VS da0 at mps0 bus 0 scbus7 target 8 lun 0
VS da0: ATA ST3500630NS G Fixed Direct Access SCSI-6 device 
VS da0: 300.000MB/s transfers
VS da0: Command Queueing enabled
VS da0: 476940MB (976773168 512 byte sectors: 255H 63S/T 60801C)
VS panic: make_dev_alias_v: bad si_name (error=22, 
si_name=enc@n5003048000baa87d/type@0/slot@a/elmdesc@Slot 10/pass7)
VS cpuid = 0
VS KDB: stack backtrace:
VS db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 
0xff9b9ec84760
VS kdb_backtrace() at kdb_backtrace+0x39/frame 0xff9b9ec84810
VS vpanic() at vpanic+0x127/frame 0xff9b9ec84850
VS panic() at panic+0x43/frame 0xff9b9ec848b0
VS make_dev_alias_v() at make_dev_alias_v+0x1d0/frame 0xff9b9ec84900
VS make_dev_alias_p() at make_dev_alias_p+0x37/frame 0xff9b9ec84960
VS make_dev_physpath_alias() at make_dev_physpath_alias+0x14a/frame 
0xff9b9ec849c0
VS pass_add_physpath() at pass_add_physpath+0xbd/frame 0xff9b9ec849f0
VS taskqueue_run_locked() at taskqueue_run_locked+0xf0/frame 0xff9b9ec84a40
VS taskqueue_thread_loop() at taskqueue_thread_loop+0x6c/frame 
0xff9b9ec84a70
VS fork_exit() at fork_exit+0x84/frame 0xff9b9ec84ab0
VS fork_trampoline() at fork_trampoline+0xe/frame 0xff9b9ec84ab0
VS --- trap 0, rip = 0, rsp = 0xff9b9ec84b70, rbp = 0 ---
VS KDB: enter: panic
VS 
VS 

And of course:

(kgdb) bt
#0  doadump (textdump=0) at pcpu.h:229
#1  0x8034002e in db_dump (dummy=value optimized out, dummy2=0, 
dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:543
#2  0x8033fada in db_command (last_cmdp=value optimized out, 
cmd_table=value optimized out, dopager=1) at /usr/src/sys/ddb/db_command.c:449
#3  0x8033f892 in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:502
#4  0x80342240 in db_trap (type=value optimized out, code=0) at 
/usr/src/sys/ddb/db_main.c:231
#5  0x808b9753 in kdb_trap (type=3, code=0, tf=value optimized out) 
at /usr/src/sys/kern/subr_kdb.c:654
#6  0x80c0d3b8 in trap (frame=0xff9b9ec84740) at 
/usr/src/sys/amd64/amd64/trap.c:579
#7  0x80bf6512 in calltrap () at exception.S:228
#8  0x808b8f3e in kdb_enter (why=0x80e7adb1 panic, msg=value 
optimized out) at cpufunc.h:63
#9  0x80885a47 in vpanic (fmt=value optimized out, ap=value 
optimized out) at /usr/src/sys/kern/kern_shutdown.c:746
#10 0x80885ab3 in panic (fmt=value optimized out) at 
/usr/src/sys/kern/kern_shutdown.c:682
#11 0x8083add0 in make_dev_alias_v (flags=value optimized out, 
cdev=0xfe0031b78cd0, pdev=value optimized out, fmt=value optimized out, 
ap=0xff9b9ec84940) at /usr/src/sys/kern/kern_conf.c:925
#12 0x8083ae27 in make_dev_alias_p (flags=-1631041792, cdev=0x80, 
pdev=0x80e72a0a, fmt=0x80 Address 0x80 out of bounds)
at /usr/src/sys/kern/kern_conf.c:968
#13 0x8083af7a in make_dev_physpath_alias (flags=8, 
cdev=0xfe0031b78cd0, pdev=0xfe042bb8f000, old_alias=0x0, 
physpath=value optimized out)
at /usr/src/sys/kern/kern_conf.c:1025
#14 0x80308b7d in pass_add_physpath (context=0xfe04fe563a00, 
pending=value optimized out) at /usr/src/sys/cam/scsi/scsi_pass.c:258
#15 0x808c8050 in taskqueue_run_locked (queue=0xfe002fddf800) at 
/usr/src/sys/kern/subr_taskqueue.c:312
#16 0x808c87ec in taskqueue_thread_loop (arg=value optimized out) at 
/usr/src/sys/kern/subr_taskqueue.c:501
#17 0x80855444 in fork_exit (callout=0x808c8780 
taskqueue_thread_loop, arg=0x81502690, frame=0xff9b9ec84ac0)
at /usr/src/sys/kern/kern_fork.c:991
#18 0x80bf6a4e in fork_trampoline () at exception.S:602
#19 0x in ?? ()
Current language:  auto; currently minimal
(kgdb)

What can I do next to investigate the problem?


Re: My panic in amd64/pmap

2013-01-18 Thread Larry Rosenman

On 2013-01-18 09:09, Larry Rosenman wrote:

On 2013-01-18 08:17, Andriy Gapon wrote:

on 17/01/2013 21:50 Larry Rosenman said the following:

I've now seen this panic:
 pmap_insert_pt_page: pindex already inserted

on 9.1-RELEASE, 9.1-STABLE, and 10.0-CURRENT

I've got vmcore's from the 9.1-STABLE and 10.0-CURRENT VM's 
available

as well as sources.

I have the core.txt.* files available at:
http://www.lerctr.org/~ler/core.txt.0  (10.0)
http://www.lerctr.org/~ler/core.txt.2  (9.1-S)

I'm not sure what other debug info you need.

I can provide SSH access to both VM's as well as the host.

These are all in VirtualBox 4.2.6 VM's

Any help would be appreciated.


Hmm, I wonder if VirtualBox is hitting the same popcnt bug that was
fixed in qemu...

Could you please try a patch from here

http://thread.gmane.org/gmane.comp.emulators.qemu/174532/focus=174567 
?


It should be applied to src/recompiler/target-i386/translate.c, make
sure that it
goes to a section marked as 'case 0x1b8: /* SSE4.2 popcnt */'.


Should this be on the host or the guest?
Never mind, it's in VirtualBox itself.  The case is at about line 8020 in
the same file.  I've patched it and am recompiling VirtualBox.

If I don't see the panic for a few days, I'll submit a PR.


--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 (c) E-Mail: l...@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893


kmem_map auto-sizing and size dependencies

2013-01-18 Thread Andre Oppermann

The autotuning work is reaching into many places of the kernel and
while trying to tie up all loose ends I've got stuck in the kmem_map
and how it works or what its limitations are.

During startup the VM is initialized and an initial kernel virtual
memory map is setup in kmem_init() covering the entire KVM address
range.  Only the kernel itself is actually allocated within that
map.  A bit later on a number of other submaps are allocated (clean_map,
buffer_map, pager_map, exec_map).  Also in kmeminit() (in kern_malloc.c,
different from kmem_init) the kmem_map is allocated.

The (initial?) size of the kmem_map is determined by some voodoo magic,
a sprinkle of nmbclusters * PAGE_SIZE incrementor and lots of tunables.
However it seems to work out to an effective kmem_map_size of about 58MB
on my 16GB AMD64 dev machine:

vm.kvm_size: 549755809792
vm.kvm_free: 530233421824
vm.kmem_size: 16,594,300,928
vm.kmem_size_min: 0
vm.kmem_size_max: 329,853,485,875
vm.kmem_size_scale: 1
vm.kmem_map_size: 59,518,976
vm.kmem_map_free: 16,534,777,856

The kmem_map serves kernel malloc (via UMA), contigmalloc and everything
else that uses UMA for memory allocation.

Mbuf memory too is managed by UMA which obtains the backing kernel memory
from the kmem_map.  The limits of the various mbuf memory types have
been considerably raised recently and may make use of 50-75% of all physically
present memory, or available KVM space, whichever is smaller.

Now my questions/comments are:

 Does the kmem_map automatically extend itself if more memory is requested?

 Should it be set to a larger initial value based on min(physical,KVM) space
 available?

 The use of nmbclusters for the initial kmem_map size calculation isn't
 appropriate anymore because nmbclusters is set up later and isn't the
 only relevant mbuf type.  We make significant use of page-sized mbuf
 clusters too.

 The naming and output of the various vm.kmem_* and vm.kvm_* sysctls is
 confusing and not easy to reconcile.  Either we need some more, detailing
 more aspects, or fewer.  Plus perhaps sysctl subtrees to better describe the
 hierarchy of the maps.

 Why are separate kmem submaps being used?  Is it to limit memory usage of
 certain subsystems?  Are those limits actually enforced?

--
Andre


Re: kmem_map auto-sizing and size dependencies

2013-01-18 Thread Alan Cox
I'll follow up with detailed answers to your questions over the weekend.
For now, I will, however, point out that you've misinterpreted the
tunables.  In fact, they say that your kmem map can hold up to 16GB and the
current used space is about 58MB.  Like other things, the kmem map is
auto-sized based on the available physical memory and capped so as not to
consume too much of the overall kernel address space.

Regards,
Alan

On Fri, Jan 18, 2013 at 9:29 AM, Andre Oppermann an...@freebsd.org wrote:

 The autotuning work is reaching into many places of the kernel and
 while trying to tie up all loose ends I've got stuck in the kmem_map
 and how it works or what its limitations are.

 During startup the VM is initialized and an initial kernel virtual
 memory map is setup in kmem_init() covering the entire KVM address
 range.  Only the kernel itself is actually allocated within that
 map.  A bit later on a number of other submaps are allocated (clean_map,
 buffer_map, pager_map, exec_map).  Also in kmeminit() (in kern_malloc.c,
 different from kmem_init) the kmem_map is allocated.

 The (initial?) size of the kmem_map is determined by some voodoo magic,
 a sprinkle of nmbclusters * PAGE_SIZE incrementor and lots of tunables.
 However it seems to work out to an effective kmem_map_size of about 58MB
 on my 16GB AMD64 dev machine:

 vm.kvm_size: 549755809792
 vm.kvm_free: 530233421824
 vm.kmem_size: 16,594,300,928
 vm.kmem_size_min: 0
 vm.kmem_size_max: 329,853,485,875
 vm.kmem_size_scale: 1
 vm.kmem_map_size: 59,518,976
 vm.kmem_map_free: 16,534,777,856

 The kmem_map serves kernel malloc (via UMA), contigmalloc and everything
 else that uses UMA for memory allocation.

 Mbuf memory too is managed by UMA which obtains the backing kernel memory
 from the kmem_map.  The limits of the various mbuf memory types have
 been considerably raised recently and may make use of 50-75% of all
 physically
 present memory, or available KVM space, whichever is smaller.

 Now my questions/comments are:

  Does the kmem_map automatically extend itself if more memory is requested?

  Should it be set to a larger initial value based on min(physical,KVM)
 space
  available?

  The use of nmbclusters for the initial kmem_map size calculation isn't
  appropriate anymore due to it being set up later and nmbclusters isn't the
  only mbuf relevant mbuf type.  We make significant use of page sized mbuf
  clusters too.

  The naming and output of the various vm.kmem_* and vm.kvm_* sysctls is
  confusing and not easy to reconcile.  Either we need some more detailing
  more aspects or less.  Plus perhaps sysctl subtrees to better describe the
  hierarchy of the maps.

  Why are separate kmem submaps being used?  Is it to limit memory usage of
  certain subsystems?  Are those limits actually enforced?

 --
 Andre



Re: kmem_map auto-sizing and size dependencies

2013-01-18 Thread mdf
On Fri, Jan 18, 2013 at 7:29 AM, Andre Oppermann an...@freebsd.org wrote:
 The (initial?) size of the kmem_map is determined by some voodoo magic,
 a sprinkle of nmbclusters * PAGE_SIZE incrementor and lots of tunables.
 However it seems to work out to an effective kmem_map_size of about 58MB
 on my 16GB AMD64 dev machine:

 vm.kvm_size: 549755809792
 vm.kvm_free: 530233421824
 vm.kmem_size: 16,594,300,928
 vm.kmem_size_min: 0
 vm.kmem_size_max: 329,853,485,875
 vm.kmem_size_scale: 1
 vm.kmem_map_size: 59,518,976
 vm.kmem_map_free: 16,534,777,856

 The kmem_map serves kernel malloc (via UMA), contigmalloc and everything
 else that uses UMA for memory allocation.

 Mbuf memory too is managed by UMA which obtains the backing kernel memory
 from the kmem_map.  The limits of the various mbuf memory types have
 been considerably raised recently and may make use of 50-75% of all
 physically
 present memory, or available KVM space, whichever is smaller.

 Now my questions/comments are:

  Does the kmem_map automatically extend itself if more memory is requested?

Not that I recall.

  Should it be set to a larger initial value based on min(physical,KVM) space
  available?

It needs to be smaller than the physical space, because the only limit
on the kernel's use of (pinned) memory is the size of the map.  So if
it is too large there is nothing to stop the kernel from consuming all
available memory.  The lowmem handler is called when running out of
virtual space only (i.e. a failure to allocate a range in the map).

  The naming and output of the various vm.kmem_* and vm.kvm_* sysctls is
  confusing and not easy to reconcile.  Either we need some more detailing
  more aspects or less.  Plus perhaps sysctl subtrees to better describe the
  hierarchy of the maps.

  Why are separate kmem submaps being used?  Is it to limit memory usage of
  certain subsystems?  Are those limits actually enforced?

I mostly know about memguard, since I added memguard_fudge().  IIRC
some of the submaps are used.  The memguard_map specifically is used
to know whether an allocation is guarded or not, so at free(9) it can
be handled as normal malloc() or as memguard.

Cheers,
matthew


ULE can leak TDQ_LOCK() if statclock() called outside of critical_enter()

2013-01-18 Thread Ryan Stone
I have been experiencing occasional deadlocks on FreeBSD 8.2 systems using
the ULE scheduler.  The root cause in every case has been that ULE's
TDQ_LOCK for cpu 0 is owned by a thread that is not running.  I have been
investigating the issue, and I believe that I see the issue.  The problem
occurs if the interrupt that drives statclock does not call
critical_enter() upon calling into statclock().  The lapic timer does use
critical_enter(), so default configurations would not see this.  I have
local patches to use the RTC to drive statclock, and from a quick reading
of the eventtimer code in -CURRENT the same thing is possible there.  The
RTC code does not call statclock within a critical section.  So here's the
bug:

1) A thread with interrupts enabled, running on CPU 0, with td_owepreempt=1
and td_critnest=1 calls critical_exit():

void
critical_exit(void)
{
        // ...
        if (td->td_critnest == 1) {
                td->td_critnest = 0;
                if (td->td_owepreempt && !kdb_active) {
                        // Irrelevant bits snipped

2) td_critnest is set to 0, and then the RTC interrupt fires.

3) rtcintr calls into statclock (8.2) or statclock_cnt(head) with
td_critnest still 0 (on head it goes through the eventtimer code, but it
ends up in statclock eventually).

4) statclock takes the thread_lock on curthread, then calls sched_clock().
sched_clock calls sched_balance();

static void
sched_balance(void)
{
        // snip...
        tdq = TDQ_SELF();
        TDQ_UNLOCK(tdq);
        sched_balance_group(cpu_top);
        TDQ_LOCK(tdq);
}

TDQ_UNLOCK does a spinlock_exit which does a critical_exit.  td_critnest
will be decremented back to 0 and td_owepreempt is still 1, so this
triggers a preemption.  Note that this TDQ_UNLOCK is (intentionally)
unlocking the thread_lock done by statclock.

5) thread migrates to any other CPU, call it CPU n.  tdq is now stale.
TDQ_LOCK takes the lock for CPU 0 (but really it's intending to re-take the
thread_lock, although a thread_lock() here would be equally incorrect --
sched_balance's caller is going to be mucking around with the TDQ when
sched_balance returns).

6) The thread returns to statclock.  statclock does a thread_unlock().  The
td_lock is TDQ_LOCK(n), which we don't hold.  We mangle the state of
TDQ_LOCK(n) and leave TDQ_LOCK(0) held.


The simplest solution would be to do a critical_enter() in sched_balance,
although that would be superfluous in the normal case where the lapic timer
is driving statclock.  I'm not sure if there's other code in the
eventtimers infrastructure that's assuming that a preemption or migration
is impossible while handling an event.  A quick look at kern_clocksource.c
turns up worrying comments like "Handle all events for specified time on
this CPU" and uses of curcpu, so there may well be other issues lurking
here.

It looks to me that the safest thing to do would be to push the
critical_enter() into the eventtimer code or even all the way back to the
interrupt handlers (mirroring what the lapic code already does).


-current broken in acpi/iasl

2013-01-18 Thread Andrey Chernov
=== usr.sbin/acpi/iasl (all)
cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I.
-I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector
-Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized
-Wno-pointer-sign -c
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:
In function 'AcpiDmIsResourceTemplate':
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419:
warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /usr/src/usr.sbin/acpi/iasl.
*** [all] Error code 1

-- 
http://ache.vniz.net/


Re: -current broken in acpi/iasl

2013-01-18 Thread Sergey V. Dyatko
On Fri, 18 Jan 2013 22:17:27 +0400
Andrey Chernov a...@freebsd.org wrote:

 === usr.sbin/acpi/iasl (all)
 cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I.
 -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99
 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k
 -Wno-uninitialized -Wno-pointer-sign -c
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
 cc1: warnings being treated as errors
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:
 In function 'AcpiDmIsResourceTemplate':
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419:
 warning: dereferencing type-punned pointer will break strict-aliasing
 rules *** [dmresrc.o] Error code 1
 
 Stop in /usr/src/usr.sbin/acpi/iasl.
 *** [all] Error code 1
 

According to rumors, buildworld completes successfully with clang, but I
didn't test it. Look at the "buildworld is broken ?" thread.


-- 
wbr, tiger


Re: -current broken in acpi/iasl

2013-01-18 Thread David Wolfskill
On Fri, Jan 18, 2013 at 09:34:26PM +0300, Sergey V. Dyatko wrote:
 On Fri, 18 Jan 2013 22:17:27 +0400
 Andrey Chernov a...@freebsd.org wrote:
 
  === usr.sbin/acpi/iasl (all)
  cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I.
  -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99
  -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k
  -Wno-uninitialized -Wno-pointer-sign -c
  /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
  cc1: warnings being treated as errors
  /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:
  In function 'AcpiDmIsResourceTemplate':
  /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419:
  warning: dereferencing type-punned pointer will break strict-aliasing
  rules *** [dmresrc.o] Error code 1
  
  Stop in /usr/src/usr.sbin/acpi/iasl.
  *** [all] Error code 1
  
 
 according to rumors buildworld done successfully with the clang. But I
 didn't test it. Look at buildworld is broken ? thread 
 ...

My head (10.0-CURRENT) builds (on my laptop & build machine) were
uneventful; e.g.:

FreeBSD g1-227.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #796  
r245584M/245600: Fri Jan 18 06:07:23 PST 2013 
r...@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY  i386

And yes, I use clang.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: -current broken in acpi/iasl

2013-01-18 Thread John-Mark Gurney
Sergey V. Dyatko wrote this message on Fri, Jan 18, 2013 at 21:34 +0300:
 On Fri, 18 Jan 2013 22:17:27 +0400
 Andrey Chernov a...@freebsd.org wrote:
 
  === usr.sbin/acpi/iasl (all)
  cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I.
  -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99
  -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k
  -Wno-uninitialized -Wno-pointer-sign -c
  /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
  cc1: warnings being treated as errors
  /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:
  In function 'AcpiDmIsResourceTemplate':
  /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419:
  warning: dereferencing type-punned pointer will break strict-aliasing
  rules *** [dmresrc.o] Error code 1
  
  Stop in /usr/src/usr.sbin/acpi/iasl.
  *** [all] Error code 1
  
 
 according to rumors buildworld done successfully with the clang. But I
 didn't test it. Look at buildworld is broken ? thread 

Looks like this broke when jkim imported the latest ACPICA code base:
https://svnweb.freebsd.org/base?view=revision&revision=245582

I've forwarded the tinderbox failure to him, so hopefully he'll fix it
shortly...

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: -current broken in acpi/iasl

2013-01-18 Thread Andrey Chernov
On 18.01.2013 22:37, David Wolfskill wrote:
 On Fri, Jan 18, 2013 at 09:34:26PM +0300, Sergey V. Dyatko wrote:
 On Fri, 18 Jan 2013 22:17:27 +0400
 Andrey Chernov a...@freebsd.org wrote:

 === usr.sbin/acpi/iasl (all)
 cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I.
 -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99
 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k
 -Wno-uninitialized -Wno-pointer-sign -c
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
 cc1: warnings being treated as errors
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:
 In function 'AcpiDmIsResourceTemplate':
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419:
 warning: dereferencing type-punned pointer will break strict-aliasing
 rules *** [dmresrc.o] Error code 1

 Stop in /usr/src/usr.sbin/acpi/iasl.
 *** [all] Error code 1


 according to rumors buildworld done successfully with the clang. But I
 didn't test it. Look at buildworld is broken ? thread 
 ...
 
 My head (10.0-CURRENT) builds (on my laptop  build machine) were
 uneventful; e.g.:
 
 FreeBSD g1-227.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #796  
 r245584M/245600: Fri Jan 18 06:07:23 PST 2013 
 r...@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY  i386
 
 And yes, I use clang.
 
 Peace,
 david
 

I have no clang bloat, it happens with good old gcc.






make buildworld failures with NO_KERBEROS=

2013-01-18 Thread Fabian Keil
Recently make buildworld started failing for me:

8
=== include/xlocale (installincludes)
sh /usr/src/tools/install.sh -C -o root -g wheel -m 444  _ctype.h _inttypes.h 
_langinfo.h _locale.h _monetary.h _stdio.h _stdlib.h _string.h _time.h _wchar.h 
/usr/obj/usr/src/tmp/usr/include/xlocale
=== kerberos5 (includes)
set -e; cd /usr/src/kerberos5; /usr/obj/usr/src/make.amd64/make buildincludes; 
/usr/obj/usr/src/make.amd64/make installincludes
=== kerberos5/doc (buildincludes)
=== kerberos5/lib (buildincludes)
=== kerberos5/lib/libasn1 (buildincludes)
compile_et 
/usr/src/kerberos5/lib/libasn1/../../../crypto/heimdal/lib/asn1/asn1_err.et
compile_et: No such file or directory
*** [asn1_err.h] Error code 1

Stop in /usr/src/kerberos5/lib/libasn1.
*** [buildincludes] Error code 1

Stop in /usr/src/kerberos5/lib.
*** [buildincludes] Error code 1

Stop in /usr/src/kerberos5.
*** [includes] Error code 1

Stop in /usr/src/kerberos5.
*** [kerberos5.includes__D] Error code 1

Stop in /usr/src.
*** [_includes] Error code 1

Stop in /usr/src.
*** [buildworld] Error code 1

Stop in /usr/src.
8

I was still using the recently de-supported NO_KERBEROS= and
changing it to WITHOUT_KERBEROS= got it working again.

I'm still wondering if this is the expected behaviour, though.

Shouldn't buildworld create a usable compile_et instead of
relying on compile_et's existence in /usr/bin?

Fabian




Re: make buildworld failures with NO_KERBEROS=

2013-01-18 Thread Steve Kargl
On Fri, Jan 18, 2013 at 08:54:03PM +0100, Fabian Keil wrote:
 Recently make buildworld started failing for me:
 
 8
 === include/xlocale (installincludes)
 sh /usr/src/tools/install.sh -C -o root -g wheel -m 444  _ctype.h _inttypes.h 
 _langinfo.h _locale.h _monetary.h _stdio.h _stdlib.h _string.h _time.h 
 _wchar.h /usr/obj/usr/src/tmp/usr/include/xlocale
 === kerberos5 (includes)
 set -e; cd /usr/src/kerberos5; /usr/obj/usr/src/make.amd64/make 
 buildincludes; /usr/obj/usr/src/make.amd64/make installincludes
 === kerberos5/doc (buildincludes)
 === kerberos5/lib (buildincludes)
 === kerberos5/lib/libasn1 (buildincludes)
 compile_et 
 /usr/src/kerberos5/lib/libasn1/../../../crypto/heimdal/lib/asn1/asn1_err.et
 compile_et: No such file or directory
 *** [asn1_err.h] Error code 1
 

See the thread started here:
http://lists.freebsd.org/pipermail/freebsd-current/2013-January/039083.html

AFAICT, it is an ordering issue.  usr.bin/compile_et needs to
be a bootstrap tool, but it depends on some bits from kerberos5
and building kerberos5 needs compile_et.

-- 
Steve


Re: sysctl -a causes kernel trap 12

2013-01-18 Thread Brandon Gooch
On Thu, Jan 10, 2013 at 4:25 PM, Xin Li delp...@delphij.net wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA256

 To all: this became more and more hard to replicate lately.  I've
 tried these options and the most important progress is that it's
 possible to get a crashdump when debug.debugger_on_panic=0 and I
 managed to get a backtrace which indicates the panic occur when trying
 to do mtx_lock(&Giant) -> __mtx_lock_sleep -> turnstile_wait ->
 propagate_priority, but after I've added some instruments to the
 surrounding code and enabled INVARIANT and/or WITNESS, it mysteriously
 went away.

 Reverting my instruments code and update to latest svn makes the issue
 disappear for one day.  I've hit it again today but unfortunately
 didn't get a successful dump and after reboot I can't reproduce it
 again :(

 Still trying...


Any updates Xin?

I was actually hitting what I believe to be exactly the same issue as you
on one of my systems, and, as you've seen, adding any extra debugging or
diagnostics seemed to eliminate the issue.

I was able to generate quite a few vmcores and still have these sitting
around in my filesystem (along with the kernels that helped produce them).

I can recreate this crash on my system by compiling the NVIDIA driver with
clang at -O1 and above. Although it's been noted that this issue has been
seen in scenarios without an NVIDIA driver in the mix, whatever is
happening in the kernel to cause the panic is somehow triggered by this, at
least on my system.

-Brandon


Re: sysctl -a causes kernel trap 12

2013-01-18 Thread Xin Li

On 01/18/13 12:50, Brandon Gooch wrote:
 On Thu, Jan 10, 2013 at 4:25 PM, Xin Li delp...@delphij.net 
 mailto:delp...@delphij.net wrote:
 
 -BEGIN PGP SIGNED MESSAGE- Hash: SHA256
 
 To all: this became more and more hard to replicate lately.  I've 
 tried these options and the most important progress is that it's 
 possible to get a crashdump when debug.debugger_on_panic=0 and I 
 managed to get a backtrace which indicates the panic occur when
 trying to do mtx_lock(&Giant) -> __mtx_lock_sleep -> turnstile_wait
 -> propagate_priority, but after I've added some instruments to
 the surrounding code and enabled INVARIANT and/or WITNESS, it
 mysteriously went away.
 
 Reverting my instruments code and update to latest svn makes the
 issue disappear for one day.  I've hit it again today but
 unfortunately didn't get a successful dump and after reboot I can't
 reproduce it again :(
 
 Still trying...
 
 
 Any updates Xin?

No, it has mysteriously disappeared for now.  As far as I can tell from
recent svn commits, nobody has committed anything that fixes it, but I can
no longer panic my system, with or without debugging code :(

 I was actually hitting what I believe to be exactly the same issue
 as you on one of my systems, and, as you've seen, adding any extra 
 debugging or diagnostics seemed to eliminate the issue.
 
 I was able to generate quite a few vmcores and still have these
 sitting around in my filesystem (along with the kernels that helped
 produce them).
 
 I can recreate this crash on my system by compiling the NVIDIA
 driver with clang at -O1 and above. Although it's been noted that
 this issue has been seen in scenarios without an NVIDIA driver in
 the mix, whatever is happening in the kernel to cause the panic is
 somehow triggered by this, at least on my system.

I'm not sure if this is the same problem.  Could you please try using
gcc to compile the NVIDIA driver and see if that fixes the problem?

Cheers,
-- 
Xin LI delp...@delphij.net    https://www.delphij.net/
FreeBSD - The Power to Serve!   Live free or die


[head tinderbox] failure on ia64/ia64

2013-01-18 Thread FreeBSD Tinderbox
TB --- 2013-01-18 20:22:39 - tinderbox 2.10 running on freebsd-current.sentex.ca
TB --- 2013-01-18 20:22:39 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE 
FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 
d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC  amd64
TB --- 2013-01-18 20:22:39 - starting HEAD tinderbox run for ia64/ia64
TB --- 2013-01-18 20:22:39 - cleaning the object tree
TB --- 2013-01-18 20:23:50 - /usr/local/bin/svn stat /src
TB --- 2013-01-18 20:23:54 - At svn revision 245609
TB --- 2013-01-18 20:23:55 - building world
TB --- 2013-01-18 20:23:55 - CROSS_BUILD_TESTING=YES
TB --- 2013-01-18 20:23:55 - MAKEOBJDIRPREFIX=/obj
TB --- 2013-01-18 20:23:55 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2013-01-18 20:23:55 - SRCCONF=/dev/null
TB --- 2013-01-18 20:23:55 - TARGET=ia64
TB --- 2013-01-18 20:23:55 - TARGET_ARCH=ia64
TB --- 2013-01-18 20:23:55 - TZ=UTC
TB --- 2013-01-18 20:23:55 - __MAKE_CONF=/dev/null
TB --- 2013-01-18 20:23:55 - cd /src
TB --- 2013-01-18 20:23:55 - /usr/bin/make -B buildworld
 Building an up-to-date make(1)
 World build started on Fri Jan 18 20:23:59 UTC 2013
 Rebuilding the temporary build tree
 stage 1.1: legacy release compatibility shims
 stage 1.2: bootstrap tools
 stage 2.1: cleaning up the object tree
 stage 2.2: rebuilding the object tree
 stage 2.3: build tools
 stage 3: cross tools
 stage 4.1: building includes
 stage 4.2: building libraries
 stage 4.3: make dependencies
 stage 4.4: building everything
[...]
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmbuffer.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmdeferred.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmnames.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmopcode.c
cc -O2 -pipe  -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys 
-std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized 
-Wno-pointer-sign -c 
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:
 In function 'AcpiDmIsResourceTemplate':
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419:
 warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /src/usr.sbin/acpi/iasl.
*** [all] Error code 1

Stop in /src/usr.sbin/acpi.
*** [all] Error code 1

Stop in /src/usr.sbin.
*** [usr.sbin.all__D] Error code 1

Stop in /src.
*** [everything] Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2013-01-18 21:51:26 - WARNING: /usr/bin/make returned exit code  1 
TB --- 2013-01-18 21:51:26 - ERROR: failed to build world
TB --- 2013-01-18 21:51:26 - 4028.99 user 952.67 system 5326.66 real


http://tinderbox.freebsd.org/tinderbox-head-ss-build-HEAD-ia64-ia64.full


Re: -current broken in acpi/iasl

2013-01-18 Thread Jung-uk Kim
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 2013-01-18 13:39:01 -0500, John-Mark Gurney wrote:
 Sergey V. Dyatko wrote this message on Fri, Jan 18, 2013 at 21:34
 +0300:
 On Fri, 18 Jan 2013 22:17:27 +0400 Andrey Chernov
 a...@freebsd.org wrote:
 
 ===> usr.sbin/acpi/iasl (all)
 cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I.
 -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99
 -fstack-protector -Wsystem-headers -Werror -Wall
 -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
 cc1: warnings being treated as errors
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:
 In function 'AcpiDmIsResourceTemplate':
 /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419:
 warning: dereferencing type-punned pointer will break strict-aliasing rules
 *** [dmresrc.o] Error code 1
 
 Stop in /usr/src/usr.sbin/acpi/iasl.
 *** [all] Error code 1
 
 
 According to rumors, buildworld completes successfully with clang,
 but I didn't test it. See the "buildworld is broken ?" thread.
 
 Looks like this was broken when jkim imported the latest ACPICA code
 base: 
 https://svnweb.freebsd.org/base?view=revision&revision=245582
 
 I've forwarded the tinderbox failure to him, so hopefully he'll fix
 it shortly...

It should be fixed now (r245636).  Sorry for the breakage.

Jung-uk Kim
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.19 (FreeBSD)

iQEcBAEBAgAGBQJQ+esDAAoJECXpabHZMqHOypgIAILl0S2cvEdTQWXJ4PWase07
yKA+DPHYAUx09JHbnLfEeA+KLFUz2jnX7dYR9ohSMcsnkI1/AH/z8dkFc3NLPUQw
TXh1edQyXaYr0WK+3sW81Tl5thka5VwjznoJj1r/Og8Nrx/xYUYCEtpPsjDU1hW0
8T897m6MqOSZokWs4dyOt1ZWoncGRTHgC5tCzjcmAuiOTIkZ7hdLNXKu1nm+cgcy
LNEvJf/d1bz6UzQ9xxCxG+HttZhi4YL8uAAYMHZtydM+Zp5yZskajyNmDkThSMhu
LrUohDfMLk84DkyoAfzojr90o8tk6TujfHR+osF3oj9NkDi6o6VK0AVs1yKPg5c=
=poDO
-END PGP SIGNATURE-


Re: sysctl -a causes kernel trap 12

2013-01-18 Thread Brandon Gooch
On Fri, Jan 18, 2013 at 2:56 PM, Xin Li delp...@delphij.net wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA512

 On 01/18/13 12:50, Brandon Gooch wrote:
  On Thu, Jan 10, 2013 at 4:25 PM, Xin Li delp...@delphij.net
  mailto:delp...@delphij.net wrote:
 
  -BEGIN PGP SIGNED MESSAGE- Hash: SHA256
 
  To all: this has become harder and harder to replicate lately.  I've
  tried these options, and the most important progress is that it's
  possible to get a crashdump when debug.debugger_on_panic=0.  I
  managed to get a backtrace which indicates that the panic occurs when
  trying to do mtx_lock(Giant) -> __mtx_lock_sleep -> turnstile_wait
  -> propagate_priority, but after I added some instrumentation to
  the surrounding code and enabled INVARIANTS and/or WITNESS, it
  mysteriously went away.
 
  Reverting my instrumentation code and updating to the latest svn made
  the issue disappear for one day.  I hit it again today but
  unfortunately didn't get a successful dump, and after a reboot I can't
  reproduce it again :(
 
  Still trying...
 
 
  Any updates Xin?

 No, it has mysteriously disappeared for now.  As far as I understand
 the recent svn commits, nobody has committed anything that fixes it,
 but I can no longer panic my system, with or without debugging
 code :(

  I was actually hitting what I believe to be exactly the same issue
  as you on one of my systems, and, as you've seen, adding any extra
  debugging or diagnostics seemed to eliminate the issue.
 
  I was able to generate quite a few vmcores and still have these
  sitting around in my filesystem (along with the kernels that helped
  produce them).
 
  I can recreate this crash on my system by compiling the NVIDIA
  driver with clang at -O1 and above. Although it's been noted that
  this issue has been seen in scenarios without an NVIDIA driver in
  the mix, whatever is happening in the kernel to cause the panic is
  somehow triggered by this, at least on my system.

 I'm not sure if this is the same problem.  Could you please try using
 gcc to compile the NVIDIA driver and see if that fixes the problem?

 Cheers,
 - --
 Xin LI delp...@delphij.net    https://www.delphij.net/
 FreeBSD - The Power to Serve!   Live free or die


Indeed, a gcc-compiled NVIDIA module eliminates the issue; sorry if I
hadn't mentioned this earlier.

What was happening to me at first was that my system would just hang while
booting. I was able to figure out that it was during /etc/rc.d/initrandom.
I actually got to a point where I removed the call to sysctl -a from
'better_than_nothing()' in /etc/rc.d/initrandom to have a booting system. I
finally had a situation where I could get a panic by adding SW_WATCHDOG to
my kernel and running watchdogd(8).

For me, this panic would come and go seemingly at random as well, and I
couldn't fumble my way around in the debugger to learn much of anything
when I first started seeing it. I just started a process of modularizing
everything I could in my kernel config, then loading modules one by one and
booting over and over until I finally found what appeared to be the
problem, which was the NVIDIA module compiled with clang.

Oh, another thing: at times it seemed as though it was the number of
modules loaded, as I could get the hang with 41 modules loaded, but not 40
or 42?! I admit, when I was seeing that behavior, I hadn't eliminated the
NVIDIA driver from my loaded modules. I need to revisit the panic situation
to confirm this particular strangeness.

Here's the last panic I had:

Unread portion of the kernel message buffer:
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 1175 (sysctl)

(kgdb) bt
#0  doadump (textdump=1694704112) at pcpu.h:229
#1  0x802fab82 in db_fncall (dummy1=<value optimized out>,
dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value
optimized out>) at /usr/src/sys/ddb/db_command.c:578
#2  0x802fa85a in db_command (last_cmdp=<value optimized out>,
cmd_table=<value optimized out>, dopager=1) at
/usr/src/sys/ddb/db_command.c:449
#3  0x802fa612 in db_command_loop () at
/usr/src/sys/ddb/db_command.c:502
#4  0x802fcf60 in db_trap (type=<value optimized out>, code=0) at
/usr/src/sys/ddb/db_main.c:231
#5  0x804a7b93 in kdb_trap (type=12, code=0, tf=<value optimized
out>) at /usr/src/sys/kern/subr_kdb.c:654
#6  0x807157c5 in trap_fatal (frame=0xff8865032670, eva=<value
optimized out>) at /usr/src/sys/amd64/amd64/trap.c:867
#7  0x80715adb in trap_pfault (frame=0x0, usermode=0) at
/usr/src/sys/amd64/amd64/trap.c:698
#8  0x8071529b in trap (frame=0xff8865032670) at
/usr/src/sys/amd64/amd64/trap.c:463
#9  0x806ff382 in calltrap () at exception.S:228
#10 0x8047bd50 in sysctl_sysctl_next_ls (lsp=<value optimized out>,
name=0xff8865032a80, namelen=<value optimized out>,
next=0xff8865032898, len=0xff8865032904,