Re: requesting vinum help

2003-11-26 Thread Joel M. Baldwin
Geeze!  Not only aren't there any emails, but I've started a full blown thread!

--On Tuesday, November 25, 2003 8:39 AM -0800 "Joel M. Baldwin" <[EMAIL PROTECTED]> wrote:
Could a vinum guru please contact me via email?

I've lost 2 vinum volumes as a result of the latest fiasco and naturally
am eager to figure out what's going on and recover the data.


--On Tuesday, November 25, 2003 10:48 AM -0600 Eric Anderson <[EMAIL PROTECTED]> wrote:
This isn't necessarily directed at you - I'm just using this email as a footstep to send this general comment -

I am kind of under the assumption that -current is more of a test bed, and anything can
happen at any time, which is why it's bad to run -current on a machine you care deeply
about (at least its data).  I think I've seen at least 5 mails in the past few weeks about
people getting jammed into a corner with (what sounds like) production type boxes, or at
least important boxes (or they wouldn't need a vinum?).. It seems odd to me that they
wouldn't give it a whirl first before attempting to use it on a box they seem so protective of.
Anyway, I'm just stating that running -current is for testing and developing, not really
for production - at least I'm fairly certain.
Please wrap your messages rather than typing your entire message in one LONG line.

How can you test if you don't use?  I'm not in 'production', I'm just running a box at home
and I'm sure that there are lots of others doing the same.  I've been running -current
for something like 5 years, have rarely had any major problems, and have on occasion reported
my problems to the list.  That's what it's all about.
The worst you could say is that I should have been more careful and that I should read
-current ( which I archive but don't read because there's rarely anything of interest
in the list ( with the occasional notable exception like now ) )
Feel free to delete this mail and ignore me..

Eric

--
--
Eric Anderson  Systems Administrator  Centaur Technology
All generalizations are false, including this one.
--


--On Wednesday, November 26, 2003 10:25 AM +1030 Greg 'groggy' Lehey <[EMAIL PROTECTED]> wrote:

On Tuesday, 25 November 2003 at 10:48:44 -0600, Eric Anderson wrote:

I am kind of under the assumption that -current is more of a test bed,
and anything can happen at any time, which is why it's bad to run
-current on a machine you care deeply about (at least its data).
Correct.  More to the point, though, it requires you to rely more on
yourself.  At the very least, this means RTFM, which in this case
includes a number of things to submit if you have problems.  It's at
the end of vinum(4) or at
http://www.vinumvm.org/vinum/how-to-debug.html.
Does someone answer email at [EMAIL PROTECTED]?  I never received a response when I
sent a query there and that's the email in the above URL.
I've RTFM and I don't find it very useful in explaining why what should be
a good vinum volume is giving me a 'fsck: Could not determine filesystem type'.
I also don't know what operations would cause data damage so I'm asking for
assistance rather than running around in the dark.
Greg
--
See complete headers for address and phone numbers.


--On Wednesday, November 26, 2003 1:05 AM +0100 Max Laier <[EMAIL PROTECTED]> wrote:
The not so good point about the original request is that he is looking
for private assistance, while the problem and - even more - the
solution of it might be very interesting to all of us (more than much
of the other ongoing threads, for sure).
I was trying to use some restraint and not rant and rave in public like
I wanted to do.  I'm rather miffed that nothing appeared in UPDATING.
Rather than an unproductive public RANT I thought I'd ask for private assistance.
I can post a summary afterwards if you like, or even better write a better
FAQ/tutorial on vinum.
--
Best regards,
 Max  mailto:[EMAIL PROTECTED]


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


requesting vinum help

2003-11-25 Thread Joel M. Baldwin
Could a vinum guru please contact me via email?

I've lost 2 vinum volumes as a result of the latest fiasco and naturally
am eager to figure out what's going on and recover the data.


Re: Server locking hard -- A LOT!!!

2003-02-05 Thread Joel M. Baldwin

Check your RAM.

Check your BIOS settings.  Try running the system with the
failsafe settings if your BIOS has that.

If all else fails put the debug options into the kernel,
add a serial console, and see if you can break into ddb.

--On Tuesday, February 04, 2003 10:24 PM -0500 Muhannad Asfour 
<[EMAIL PROTECTED]> wrote:

Hello.  I've recently faced a rather odd issue that I've never seen
before.  I bought a new server (specs below), and I loaded it up with
FreeBSD 5.0-RELEASE (I know, I know, not for a production environment,
but this is a personal server).  Now, whenever I get about 30
simultaneous connections, the box just locks hard.  I tested all the
hardware components (CPU, Memory, HD, NIC, etc.) and even bought new
ones just to make sure and all end up with the same result.  I can
never keep a decent uptime (never went past 2 days so far).  As soon
as I get a mediocre http load (30 simul. connections), the box just
locks hard.  I built a debug kernel, I tried everything imaginable,
and I have not found a solution whatsoever.  Everyone seems to be
stumped by this.  I tried FreeBSD 4.7-RELEASE, 4.7-STABLE,
5.0-RELEASE and 5.0-CURRENT on this box.  All give me the same
result.  I checked everywhere for relevant logs to explain what is
occurring, but had no such luck.  This is truly the million dollar
question for me right now, because I have no idea why it would lock
under such a petty load.  I'm not sure what to do to fix this issue,
I've tried many different areas for support and haven't come up with
anything as I stated earlier.  I'm not overclocking or anything if
that's what you're wondering.  If anyone could assist me in any way
shape or form to get this working, I would appreciate it very very
much.  Also, if you e-mail back, I'm not subscribed to the lists, so
could you please CC it to me.  In the past 4 hours, it has locked
hard about 3 times because of a 28-34 connection load.

Machine Specs:
Single Pentium IV 2.4 GHz processor
ASUS P4S533 motherboard
512MB DDR333 RAM (will be 1GB next week)
120GB Maxtor 7200rpm (ATA133) HDD
40GB Maxtor 7200 rpm (ATA66) HDD
Floppy Disk Drive
ATI Rage 128 (32 meg) AGP 4x graphics adapter
52x LG CD-ROM drive
3Com 3C905C-TX NIC
Currently running FreeBSD 5.0-CURRENT as of Sunday Feb. 2, 13:02:05
EST 2003.

Thank you very much



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message









Re: make release errors

2003-01-16 Thread Joel M. Baldwin


--On Thursday, January 16, 2003 6:12 AM -0800 "Joel M. Baldwin" 
<[EMAIL PROTECTED]> wrote:


I've been trying to do a 'make release' the last couple of weeks and
keep getting the following error.


geeze  time for a coffee.  sorry people. here is the error.

===> chinese/mozilla-tclp
===>   Creating README.html for zh-mozilla-tclp-1.U,1
Receiving mozilla-source-1.1.tar.gz (40817026 bytes): 100% (ETA 00:00) 

40817026 bytes transferred in 4917.1 seconds (8.11 kBps)
Receiving libart_lgpl-1.1.tar.gz (124497 bytes): 100%
124497 bytes transferred in 10.9 seconds (11.11 kBps)
Receiving libmng-1.0.4.tar.gz (568950 bytes): 100% (ETA 00:00)
568950 bytes transferred in 63.9 seconds (8.70 kBps)
In file included from libmng.h:307,
                 from libmng_callback_xs.c:48:
libmng_types.h:158:23: lcms/lcms.h: No such file or directory
In file included from libmng.h:307,
                 from libmng_callback_xs.c:48:
libmng_types.h:366: syntax error before "mng_cmsprof"
libmng_types.h:366: warning: data definition has no type or storage class
libmng_types.h:367: syntax error before "mng_cmstrans"
libmng_types.h:367: warning: data definition has no type or storage class
libmng_types.h:368: syntax error before "mng_CIExyY"
libmng_types.h:368: warning: data definition has no type or storage class
libmng_types.h:369: syntax error before "mng_CIExyYTRIPLE"
libmng_types.h:369: warning: data definition has no type or storage class
libmng_types.h:370: syntax error before "mng_gammatabp"
libmng_types.h:370: warning: data definition has no type or storage class
In file included from libmng_callback_xs.c:49:
libmng_data.h:246: syntax error before "mng_cmsprof"
.xpi pretty-print-build-depends-list: not found
In file included from libmng.h:307,
                 from libmng_callback_xs.c:48:
libmng_types.h:158:23: lcms/lcms.h: No such file or directory
In file included from libmng.h:307,
                 from libmng_callback_xs.c:48:
libmng_types.h:366: syntax error before "mng_cmsprof"
libmng_types.h:366: warning: data definition has no type or storage class
libmng_types.h:367: syntax error before "mng_cmstrans"
libmng_types.h:367: warning: data definition has no type or storage class
libmng_types.h:368: syntax error before "mng_CIExyY"
libmng_types.h:368: warning: data definition has no type or storage class
libmng_types.h:369: syntax error before "mng_CIExyYTRIPLE"
libmng_types.h:369: warning: data definition has no type or storage class
libmng_types.h:370: syntax error before "mng_gammatabp"
libmng_types.h:370: warning: data definition has no type or storage class
In file included from libmng_callback_xs.c:49:
libmng_data.h:246: syntax error before "mng_cmsprof"
.xpi pretty-print-run-depends-list: not found
sed: 1: "s`%%BUILD_DEPENDS%%`=== ...": unescaped newline inside substitute pattern
*** Error code 1

Stop in /share/snap/usr/ports/chinese/mozilla-tclp.
*** Error code 1

Stop in /share/snap/usr/ports/chinese/mozilla-tclp.
*** Error code 1

Stop in /share/snap/usr/ports/chinese.
*** Error code 1

Stop in /share/snap/usr/ports.
*** Error code 1

Stop in /disk2/usr.src/release.
su-2.05b#

Is this something on my end?  I assume others aren't having problems
with it being this close to release date.

Any suggestions on the fix?
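
For anyone reading a log like the one above: the first diagnostic is the missing lcms/lcms.h, and the syntax errors that follow it look like they cascade from that one missing include. A minimal Python sketch that pulls the first missing-header line out of a build log is below. The function name `first_missing_header` is hypothetical, and the pattern only assumes the GCC-style "file:line[:col]: header: No such file or directory" format seen in this message.

```python
import re

# Matches GCC-style diagnostics such as
#   libmng_types.h:158:23: lcms/lcms.h: No such file or directory
MISSING = re.compile(r"^(\S+?):(\d+)(?::\d+)?: (\S+): No such file or directory$")


def first_missing_header(log_text):
    """Return (including_file, line_number, missing_header) for the first
    missing-include diagnostic in the log, or None if there is none."""
    for line in log_text.splitlines():
        m = MISSING.match(line.strip())
        if m:
            return m.group(1), int(m.group(2)), m.group(3)
    return None


# A few lines copied from the log in this message.
sample = """\
In file included from libmng.h:307,
from libmng_callback_xs.c:48:
libmng_types.h:158:23: lcms/lcms.h: No such file or directory
libmng_types.h:366: syntax error before "mng_cmsprof"
"""

print(first_missing_header(sample))
```

This only locates the first missing include; it does not say why the header was absent in the release build environment.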






make release errors

2003-01-16 Thread Joel M. Baldwin

I've been trying to do a 'make release' the last couple of weeks and
keep getting the following error.

Is this something on my end?  I assume others aren't having problems
with it being this close to release date.

Any suggestions on the fix?






bad ACPI asl's on motherboards

2003-01-16 Thread Joel M. Baldwin

I gather that there are quite a few Motherboards with bad ACPI asl's
on them.  I know that my Abit BP6 sure has problems.  As a result I
can't run ACPI.

What are those of us with these motherboards supposed to do?

I realize that we can use acpidump to get the asl, correct it,
and then recompile it using iasl from the acpicatools port, to
generate the aml that can be used during boot.

acpi_dsdt_load="YES"                  # DSDT Overriding
acpi_dsdt_name="/boot/abitBP6.aml"    # Override DSDT in BIOS by this file

But this doesn't help me much since I don't know what corrections to
make to the original asl file.  Nor does it help the other people
out there using BP6's.

Is someone going to create an asl errata web page?
How about something to help us correct our asl files like a FAQ?






Re: 5.0 Freezes under high load with SMP.

2003-01-16 Thread Joel M. Baldwin
--On Tuesday, January 14, 2003 8:15 PM +0100 Wilko Bulte 
<[EMAIL PROTECTED]> wrote:

On Tue, Jan 14, 2003 at 11:12:30AM -0800, Nate Lawson wrote:

On Tue, 14 Jan 2003, wade wrote:
> Although GENERIC functions perfectly, when I enable SMP in
> GENERIC, the box freezes during big build jobs ( i.e. buildworld.
> ).  This box functioned well in SMP mode under 4.7-STABLE.  The big
> problem is that freezing does not leave core files or any other
> debugging information.
>
> Hardware:
>   ASUS CUV4X-D
>   2 X PIII 600 MHz Coppermine.
>   512 MB 133MHz RAM.

Hit ctl-alt-esc on console to enter DDB (enable this kernel option
if you haven't already).  Type "trace" to find out what is hung.  If
this doesn't work, attach a serial console and do a break there and
"trace".


Interesting, as I have been trying the last 3 days to reproduce
freezes  that people reported on 5.0 on (specifically) the Abit BP6
mainboard.


I was one of the people having 'hard-lock' problems.

I have not had a single unexplained lock since I upgraded my BIOS
to the latest, version RU, and did another makeworld.  I can't say
which fixed the 'hard-locks', the BIOS upgrade, or changes
in the kernel.

ACPI does not appear to work on this motherboard.  I have to
disable it.  The attached asl file gives errors when I try
using iasl on it.

Anyone have suggestions for the correct fixes to it?


I ran continuous make -j16 buildworlds to see if I could break it,
but  no luck (or very good luck, depends on your perspective ;-)

--
|   / o / /_  _   		[EMAIL PROTECTED]
| /|/ / / /(  (_)  Bulte




Re: 5.0-RC2 won't install (zf_read & vm_fault) while 4.7 will

2003-01-11 Thread Joel M. Baldwin


--On Friday, January 10, 2003 5:36 PM -0800 "Gary W. Swearingen" 
<[EMAIL PROTECTED]> wrote:

I got an old P100 I'm preparing for NAT duty at a Linux meeting and I
tried to install 5.0-RC2 on it.  Near end of mfsroot.flp loading I
get:

zf_read: unexpected EOF

but it continues booting.  Just after normal msgs about fd0 & ppc, it
starts spewing unending messages so fast I can't read them well.
Something like this, over and over:

vm_fault: pager read error pid 1 error 5 [...]
spec_getpages: (md0) [...]

I bit-checked the CD and floppies and they're good.


Try making another floppy.  I've thought I had a good floppy
before and gotten that error.

Get another floppy, format it, copy the boot.flp image onto it,
and see if you're still having the errors.


Apparently I have an Intel chipset.  From boot msgs:

Intel 82371SB PCI to ISA bridge
Intel PIIX3 ATA controller
fdc0 NEC 72065B or clone

The 'puter installs Linux and FreeBSD 4.7 OK (except only Linux
will boot from the CD).

There were only a few msgs at groups.google.com with that error msg
and not much help, but then I don't need 5.0 and so don't need help.


BTW, that "md0" looked interesting to me because I just had this error
on my main 5.0 (08'jan'03) system when trying to fix my script which
uses (used) obsolete "vnconfig".

$ XXX=$(mdconfig -a -t vnode -u 0 /big/ftp/pub/5.0-RC2-i386-disc1.iso)
mdconfig: ioctl(/dev/mdctl): Bad address




Re: Hard Lock info

2002-11-23 Thread Joel M. Baldwin

Ack!  ^M sends the email.  continuing . . .

I came home last night and found my system Hard Locked.
I have a serial console hooked up and used CR~^B to break
into ddb and tried a few things.  The previous message
has a log of everything.

Drats!  I was hoping that the Hard Locks were gone.

1.) Does anything look out of place in the trace or ps?
2.) What additional commands should I do next time this happens?

This is on a SMP kernel less than a couple of days old.
The dmesg of the system is attached.

--On Saturday, November 23, 2002 12:07 PM -0800 "Joel M. Baldwin" 
<[EMAIL PROTECTED]> wrote:


I came home last night and found the system Hard Locked.
^

. . . snip . . .
Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #23: Fri Nov 22 07:28:07 PST 2002
[EMAIL PROTECTED]:/disk2/usr.src/sys/i386/compile/testGeneric.smp
Preloaded elf kernel "/boot/kernel.testgeneric.smp/kernel" at 0xc04db000.
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (467.73-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x665  Stepping = 5
  
Features=0x183fbff
real memory  = 268369920 (255 MB)
avail memory = 23536 (243 MB)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec0
Initializing GEOMetry subsystem
Pentium Pro MTRR support enabled
npx0:  on motherboard
npx0: INT 16 interface
Using $PIR table, 8 entries at 0xc00fd7e0
pcib0:  at pcibus 0 on motherboard
pci0:  on pcib0
IOAPIC #0 intpin 19 -> irq 2
IOAPIC #0 intpin 18 -> irq 5
IOAPIC #0 intpin 17 -> irq 9
IOAPIC #0 intpin 16 -> irq 10
agp0:  mem 0xd000-0xd3ff at device 
0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
isab0:  at device 7.0 on pci0
isa0:  on isab0
atapci0:  port 0xf000-0xf00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0:  port 0xc000-0xc01f irq 2 at device 
7.2 on pci0
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Timecounter "PIIX"  frequency 3579545 Hz
pci0:  at device 7.3 (no driver attached)
fxp0:  port 0xc400-0xc43f mem 
0xd900-0xd90f,0xd9101000-0xd9101fff irq 5 at device 11.0 on pci0
fxp0: Ethernet address 00:90:27:a5:f4:f2
inphy0:  on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc0:  port 0xc800-0xc8ff mem 0xd910-0xd9100fff 
irq 9 at device 13.0 on pci0
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
pci0:  at device 15.0 (no driver attached)
rl0:  port 0xcc00-0xccff mem 0xd9102000-0xd91020ff 
irq 2 at device 17.0 on pci0
rl0: Realtek 8139B detected. Warning, this may be unstable in autoselect mode
rl0: Ethernet address: 00:04:e2:20:a5:b2
miibus1:  on rl0
rlphy0:  on miibus1
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
atapci1:  port 
0xd800-0xd8ff,0xd400-0xd403,0xd000-0xd007 irq 5 at device 19.0 on pci0
ata2: at 0xd000 on atapci1
atapci2:  port 
0xe400-0xe4ff,0xe000-0xe003,0xdc00-0xdc07 irq 5 at device 19.1 on pci0
ata3: at 0xdc00 on atapci2
orm0:  at iomem 
0xef000-0xe,0xd-0xd17ff,0xc8000-0xc8fff,0xc-0xc7fff on isa0
atkbdc0:  at port 0x64,0x60 on isa0
atkbd0:  flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0:  irq 12 on atkbdc0
psm0: model Generic PS/2 mouse, device ID 0
fdc0:  at port 
0x3f7,0x3f0-0x3f5 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
fd1: <1200-KB 5.25" drive> on fdc0 drive 1
pmtimer0 on isa0
ppc0:  at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
lpt0:  on ppbus0
lpt0: Interrupt-driven port
ppi0:  on ppbus0
sc0:  at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x30 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown:  can't assign resources (port)
unknown:  can't assign resources (memory)
unknown:  can't assign resources (port)
unknown:  can't assign resources (irq)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
unknown:  can't assign resources (port)
Timecounters tick every 10.000 msec
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
ipfw2 initialized, divert enabled, rule-based forwarding enabled, default to deny, 
logging limited to 100 packets/entry b

Hard Lock info

2002-11-23 Thread Joel M. Baldwin

I came home last night and found the system Hard Locked.
^



Stopped at  siointr1+0xf4:  movl    $0,brk_state1.0
db>
db> show locks
db> ps
  pid   proc     addr    uid  ppid  pgrp  flag   stat  wmesg   wchan  cmd
71987 c3557000 d364b000 1006 71985 71985 0004000 norm[LOCK  Giant 
c03d01a0] formail
71985 c3242000 d29fa000 1006 71984 71985 0004100 norm[SLPQwait 
c3242000][SLP] procmail
71984 c33d9000 d30880000 1   495 100 norm[CVQ  select 
c03d38c4][SLP] sendmail
71978 c3040760 d2b090000 71976  4586 0004002 norm[SLPQ  piperd 
c31b7790][SLP] as
71977 c3202b10 d290c0000 71976  4586 0004002 norm[LOCK  Giant 
c03d01a0] cc1
71976 c340dce8 d329c0000 71975  4586 0004002 norm[SLPQwait 
c340dce8][SLP] cc
71975 c2c55000 d28f90000 71808  4586 0004002 norm[SLPQwait 
c2c55000][SLP] sh
71949 c31be3b0 d294d0000   495   495 100 norm[SLPQ  kqread 
c3fb8200][SLP] sendmail
71943 c365a1d8 d3acf0000 71941 71902 0004000 norm[SLPQ  piperd 
c298d2c0][SLP] egrep
71942 c365ace8 d3ad50000 71941 71902 0004000 norm[CVQ  select 
c03d38c4][SLP] ping
71941 c3626b10 d38d60000 71940 71902 000 norm[SLPQwait 
c3626b10][SLP] sh
71940 c36883b0 d3aa90000 71905 71902 0004000 norm[SLPQ  piperd 
c298d420][SLP] sh
71905 c33ba938 d2fa20000 71902 71902 0004000 norm[SLPQ  piperd 
c31b7d10][SLP] perl
71902 c3659000 d3ac60000 71896 71902 0004000 norm[SLPQwait 
c3659000][SLP] sh
71896 c324a938 d2a840000   751   751 000 norm[SLPQ  piperd 
c2b7fd10][SLP] cron
71808 c2c56760 d29050000 63998  4586 0004002 norm[SLPQwait 
c2c56760][SLP] make
63998 c31bb000 d290d0000 63996  4586 0004002 norm[SLPQwait 
c31bb000][SLP] sh
63996 c327a588 d2ac30000 26563  4586 0004002 norm[SLPQwait 
c327a588][SLP] make
26563 c3639588 d39950000 26556  4586 0004002 norm[SLPQwait 
c3639588][SLP] sh
26556 c2c553b0 d28fb0000 26555  4586 0004002 norm[SLPQwait 
c2c553b0][SLP] make
26555 c3573000 d36670000 32897  4586 0004002 norm[SLPQwait 
c3573000][SLP] sh
32897 c367cce8 d39ec0000 32896  4586 0004002 norm[SLPQwait 
c367cce8][SLP] make
32896 c35bc938 d38360000 32881  4586 0004002 norm[SLPQwait 
c35bc938][SLP] sh
32881 c2c561d8 d29020000 32877  4586 0004002 norm[SLPQwait 
c2c561d8][SLP] make
32877 c367d938 d39f20000  4586  4586 0004002 norm[SLPQwait 
c367d938][SLP] sh
10609 c355b588 d3656000 1001   735 10609 0004100 norm[SLPQ  sbwait 
c2c67964][SLP] imapd
10513 c340c938 d3292000 1001   735 10513 0004100 norm[SLPQ  sbwait 
c2992e64][SLP] imapd
4586 c3202760 d288c0000  1019  4586 0004002 norm[SLPQwait 
c3202760][SLP] make
2852 c34943b0 d33f   80   526   526 100 norm[SLPQ   lockf 
c30f9f00][SLP] httpd
2823 c34fdb10 d35ae000   80   526   526 100 norm[CVQ  select 
c03d38c4][SLP] httpd
1218 c2682588 d1d2a000   80   526   526 100 norm[SLPQ   lockf 
c2e1fe40][SLP] httpd
1049 c2b1d588 d27fe0000  1033  1049 0004002 norm[SLPQ   ttyin 
c2901440][SLP] bash
1048 c2b1b760 d27c40000  1031  1031 0004002 norm[SLPQ  nanslp 
c0408014][SLP] tail
1047 c2b561d8 d28050000  1031  1031 0004002 norm[SLPQ  kqread 
c269c300][SLP] tail
1046 c2b1b3b0 d27a90000  1031  1031 0004002 norm[SLPQ  kqread 
c2734400][SLP] tail
1045 c2b563b0 d280e0000  1031  1031 0004002 norm[SLPQ  kqread 
c2734700][SLP] tail
1044 c2b1dce8 d28030000  1031  1031 0004002 norm[SLPQ  kqread 
c269c500][SLP] tail
1043 c2b1b1d8 d27a80000  1031  1031 0004002 norm[SLPQ  kqread 
c269c400][SLP] tail
1042 c2b1db10 d28010000  1030  1030 0004002 norm[SLPQ  nanslp 
c0408014][SLP] tail
1041 c2b59588 d284d0000  1030  1030 0004002 norm[SLPQ  kqread 
c2734c00][SLP] tail
1040 c26f3b10 d269d0000  1030  1030 0004002 norm[SLPQ  kqread 
c269c900][SLP] tail
1039 c2912ce8 d270a0000  1030  1030 0004002 norm[SLPQ  kqread 
c269ca00][SLP] tail
1038 c29133b0 d270d0000  1030  1030 0004002 norm[SLPQ  kqread 
c2734b00][SLP] tail
1037 c26f31d8 d26980000  1030  1030 0004002 norm[SLPQ  kqread 
c2612600][SLP] tail
1036 c2b593b0 d284c0000  1030  1030 0004002 norm[SLPQ  kqread 
c2735000][SLP] tail
1035 c26823b0 d1d290000  1030  1030 0004002 norm[SLPQ  kqread 
c2735100][SLP] tail
1034 c2b1d760 d27ff0000  1030  1030 0004002 norm[SLPQ  kqread 
c2622e00][SLP] tail
1033 c2b56000 d2804000 1001  1018  1033 0004102 norm[SLPQwait 
c2b56000][SLP] su
1032 c2b59000 d284a0000  1030  1030 0004002 norm[SLPQ  kqread 
c2729e00][SLP] tail
1031 c2b59b10 d2850  1029  1031 0004002 norm[SLPQwait 
c2b59b10][SLP] sh
1030 c2b591d8 d284b0000  1025  1030 0004002 norm[SLPQwait 
c2b591d8][SLP] sh
1029 c2b1d3b0 d27fd0000 1  1020 0004002 norm[CVQ  select 
c03d38c4][SLP] xterm
1028 c2b56588 d280f0000  1023  1028 0004002 norm[CVQ  select 
c03d38c4][SLP] top
1027 c2b1d938 d2800  1021  1027 0004002 norm[SLPQ   ttyin 
c2900e40][SLP] systat
1025 c2b56760 d28460000 1  1020 0004002 nor
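
In the part of the listing that survives above, two processes (formail and cc1) are shown in LOCK state on Giant, while everything else is sleeping. A minimal Python sketch for picking such entries out of a captured ddb ps listing is below. It assumes the bracketed "[LOCK  name address] command" form seen in this capture, possibly wrapped across two lines; the function name `blocked_on_locks` is hypothetical.

```python
import re
from collections import defaultdict

# A blocked record in the capture looks like "[LOCK  Giant c03d01a0] formail",
# sometimes split across two lines by the mail client.
LOCK_RE = re.compile(r"\[LOCK\s+(\S+)\s+[0-9a-f]+\]\s+(\S+)")


def blocked_on_locks(ps_text):
    """Map each lock name to the commands shown blocked on it."""
    joined = " ".join(ps_text.split())  # undo the line wrapping
    waiters = defaultdict(list)
    for lock, cmd in LOCK_RE.findall(joined):
        waiters[lock].append(cmd)
    return dict(waiters)


# Three records copied from the listing in this message.
sample = """\
71987 c3557000 d364b000 1006 71985 71985 0004000 norm[LOCK  Giant
c03d01a0] formail
71985 c3242000 d29fa000 1006 71984 71985 0004100 norm[SLPQwait
c3242000][SLP] procmail
71977 c3202b10 d290c0000 71976  4586 0004002 norm[LOCK  Giant
c03d01a0] cc1
"""

print(blocked_on_locks(sample))
```

This shows who is waiting, not which thread holds the lock; the listing alone does not identify the owner of Giant.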

Re: more info - Re: panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Joel M. Baldwin
--On Wednesday, November 20, 2002 10:12 PM -0800 Steve Kargl 
<[EMAIL PROTECTED]> wrote:

On Wed, Nov 20, 2002 at 09:59:32PM -0800, Joel M. Baldwin wrote:


--On Wednesday, November 20, 2002 9:47 PM -0800 Nate Lawson
<[EMAIL PROTECTED]> wrote:

> Wow, I haven't seen so many panics from one person before.  Tried
> memtest86.com yet?
>

yes, LONG before I posted the first time.  I've even swapped
the memory from another system.  The problem ISN'T the memory.



Please don't top post.

Have you tested the power supply?

--
Steve


Yes, swapped the 250W that was in there with a 300W from
another system that was stable.






Re: more info - Re: panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Joel M. Baldwin

yes, LONG before I posted the first time.  I've even swapped
the memory from another system.  The problem ISN'T the memory.

--On Wednesday, November 20, 2002 9:47 PM -0800 Nate Lawson 
<[EMAIL PROTECTED]> wrote:

Wow, I haven't seen so many panics from one person before.  Tried
memtest86.com yet?

-Nate









more info - Re: panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Joel M. Baldwin
--On Wednesday, November 20, 2002 2:27 PM -0500 John Baldwin 
<[EMAIL PROTECTED]> wrote:
On 20-Nov-2002 Joel M. Baldwin wrote:

--On Wednesday, November 20, 2002 12:01 PM -0500 Robert Watson
<[EMAIL PROTECTED]> wrote:


On Wed, 20 Nov 2002, Robert Watson wrote:


Hmm.  Another thread has decided to sleep while holding an inpcb
mutex.  Any chance this can be reproduced while running WITNESS?
If so, you should get a panic earlier when the other thread sleeps
in the first place.  The easiest way to do that is if you can
reproduce the panic with WITNESS.  If you can't reproduce the
panic, you may be able to extract this from your system core using
gdb -- you want to figure out what the thread owner of the mutex
is doing -- in the context of the kassert()  below, td is the
pointer to the thread that owns the mutex.  I'm not sure how to
extract a stack trace from that information, unfortunately,
perhaps someone can give us pointers.  (note: td from the
priority_propagate() argument is shadowed, which is annoying).


Ack.  I mis-read.  You want the stack from thread td1 (the mutex
owner), not thread td.


The kernel that produced the core dump ALREADY HAS
WITNESS and WITNESS_SKIPSPIN! :(

I'll try to get more info from kgdb, but I doubt that I'll have
much luck since I've never tried using gdb before.


Erm.  Did you manage to look at dmesg then?  If so, you would have
seen warnings from WITNESS earlier about the locks messing up.  If
you can reproduce this and are letting it sit unattended, a better
plan might be to turn on witness_ddb (it's a kernel option, loader
tunable, and sysctl (debug.witness_ddb)) and then when the original
error occurs it will drop into the debugger with a very useful error
message.  You can also get a useful trace at that point from ddb.

--

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


ok, I've had 3 events now.

The first event Hard Locked the system and 'pa' printed out
on the serial console.  Like it was trying to panic and got
hosed in the middle of printing the message.

The second event rebooted the system with nothing printed out
on the console.

The third event panicked with a live console.  Note that the
'panic' command didn't produce a core dump.

Slab at 0xc2c39fd4, freei 3 = 0.
panic: Duplicate free of item 0xc2c39264 from zone 0xc0e90c00(VMSPACE)

cpuid = 0; lapic.id = 
Debugger("panic")
Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> where     t
Debugger(c0392fc9,0,c03a9554,cd2d9c4c,1) at Debugger+0x55
panic(c03a9554,c2c39264,c0e90c00,c03a7695,6a8) at panic+0x11f
uma_dbg_free(c0e90c00,0,c2c39264,6a8,0) at uma_dbg_free+0x122
uma_zfree_arg(c0e90c00,c2c39264,0,12e,c2fc6000) at uma_zfree_arg+0x124
vmspace_free(c2c39264,c03a7307,31d,31c,186a0) at vmspace_free+0xbe
swapout_procs(1,0,68,c03a8f12,0) at swapout_procs+0x387
vm_daemon(0,cd2d9d48,c0390767,355,0) at vm_daemon+0x6e
fork_exit(c031ab40,0,cd2d9d48) at fork_exit+0xa5
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xcd2d9d7c, ebp = 0 ---
db> show locks
exclusive sleep mutex VMSPACE (UMA zone) r = 0 (0xc0e90c24) locked @ ../../../vm/uma_core.c:1704
exclusive sleep mutex PCPU VMSPACE (UMA cpu) r = 0 (0xc0e90ce8) locked @ ../../../vm/uma_core.c:1686
shared sx allproc r = 0 (0xc03d0600) locked @ ../../../vm/vm_glue.c:657
exclusive sleep mutex Giant r = 0 (0xc03d) locked @ ../../../vm/vm_pageout.c:1501
db> panic
panic: from debugger
cpuid = 0; lapic.id = 
boot() called on cpu#0
Uptime: 22m57s
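
The "show locks" output above lists each lock held at the time of the panic together with the source location where it was taken. A minimal Python sketch that turns such a capture into (mode, lock, file, line) tuples is below. It assumes only the "mode description r = N (address) locked @ file:line" layout visible in this message, tolerates the record being wrapped before the file name, and uses a hypothetical function name `held_locks`.

```python
import re

# Each held lock is reported as
#   <exclusive|shared> <description> r = <n> (<address>) locked @ <file>:<line>
# and the mail client may wrap the record before the file name.
HELD_RE = re.compile(
    r"(exclusive|shared) (.+?) r = \d+ \(0x[0-9a-f]+\) locked @ (\S+):(\d+)"
)


def held_locks(show_locks_text):
    """Return a list of (mode, description, file, line) for each held lock."""
    joined = " ".join(show_locks_text.split())  # undo the line wrapping
    return [
        (m.group(1), m.group(2), m.group(3), int(m.group(4)))
        for m in HELD_RE.finditer(joined)
    ]


# Three records copied from the output in this message.
sample = """\
exclusive sleep mutex VMSPACE (UMA zone) r = 0 (0xc0e90c24) locked @
../../../vm/uma_core.c:1704
shared sx allproc r = 0 (0xc03d0600) locked @ ../../../vm/vm_glue.c:657
exclusive sleep mutex Giant r = 0 (0xc03d) locked @
../../../vm/vm_pageout.c:1501
"""

for entry in held_locks(sample):
    print(entry)
```

Sorting or grouping the result by file makes it easier to compare the held locks against the functions in the stack trace.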




Re: panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Joel M. Baldwin


--On Wednesday, November 20, 2002 2:27 PM -0500 John Baldwin 
<[EMAIL PROTECTED]> wrote:


On 20-Nov-2002 Joel M. Baldwin wrote:

--On Wednesday, November 20, 2002 12:01 PM -0500 Robert Watson
<[EMAIL PROTECTED]> wrote:


On Wed, 20 Nov 2002, Robert Watson wrote:


Hmm.  Another thread has decided to sleep while holding an inpcb
mutex.  Any chance this can be reproduced while running WITNESS?
If so, you should get a panic earlier when the other thread sleeps
in the first place.  The easiest way to do that is if you can
reproduce the panic with WITNESS.  If you can't reproduce the
panic, you may be able to extract this from your system core using
gdb -- you want to figure out what the thread owner of the mutex
is doing -- in the context of the kassert()  below, td is the
pointer to the thread that owns the mutex.  I'm not sure how to
extract a stack trace from that information, unfortunately,
perhaps someone can give us pointers.  (note: td from the
priority_propagate() argument is shadowed, which is annoying).


Ack.  I mis-read.  You want the stack from thread td1 (the mutex
owner), not thread td.


The kernel that produced the core dump ALREADY HAS
WITNESS and WITNESS_SKIPSPIN! :(

I'll try to get more info from kgdb, but I doubt that I'll have
much luck since I've never tried using gdb before.


Erm.  Did you manage to look at dmesg then?  If so, you would have
seen warnings from WITNESS earlier about the locks messing up.  If


I see NOTHING in the dmesg about locks.


you can reproduce this and are letting it sit unattended, a better
plan might be to turn on witness_ddb (it's a kernel option, loader
tunable, and sysctl (debug.witness_ddb)) and then when the original
error occurs it will drop into the debugger with a very useful error
message.  You can also get a useful trace at that point from ddb.


I have debug.witness_ddb=1 and will try to panic the system.  I'll
let you know what happens.


--

John Baldwin <[EMAIL PROTECTED]>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/








Re: panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Joel M. Baldwin
--On Wednesday, November 20, 2002 12:01 PM -0500 Robert Watson 
<[EMAIL PROTECTED]> wrote:

On Wed, 20 Nov 2002, Robert Watson wrote:


Hmm.  Another thread has decided to sleep while holding an inpcb
mutex.  Any chance this can be reproduced while running WITNESS?  If
so, you should get a panic earlier when the other thread sleeps in
the first place.  The easiest way to do that is if you can reproduce
the panic with WITNESS.  If you can't reproduce the panic, you may
be able to extract this from your system core using gdb -- you want
to figure out what the thread owner of the mutex is doing -- in the
context of the kassert()  below, td is the pointer to the thread
that owns the mutex.  I'm not sure how to extract a stack trace from
that information, unfortunately, perhaps someone can give us
pointers.  (note: td from the priority_propagate() argument is
shadowed, which is annoying).


Ack.  I mis-read.  You want the stack from thread td1 (the mutex
owner), not thread td.


The kernel that produced the core dump ALREADY HAS
WITNESS and WITNESS_SKIPSPIN! :(

I'll try to get more info from kgdb, but I doubt that I'll have
much luck since I've never tried using gdb before.

If someone else wants to give it a try the core dump is available
at 


Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories







Re: SMP stability ? [was Re: more info from panic from running

2002-11-20 Thread Joel M. Baldwin

I haven't had any Hard Locks since I upgraded the BIOS on my
BP6 from LP to RU and cvsup/buildworld/installworld again.

At the moment I'm thinking that my system is stable again, but
won't feel comfortable with that until I do some more stress
testing.  I've gotten a panic, but I think it's unrelated.
( that's made this harder, multiple issues causing problems )

So if people are still having their BP6 do Hard Locks I would
suggest they make sure they're running the RU version of the
BIOS and the latest version of -current.

Thanks for the tip however.  If I continue to have Hard Locks
I'll try slowing down the IDE drives on the system.

--On Wednesday, November 20, 2002 1:10 AM +0100 Thierry Herbelot 
<[EMAIL PROTECTED]> wrote:

Le Tuesday 19 November 2002 22:35, Nate Lawson a écrit :

I have a couple BP6's running -stable and was having hard lock
problems under heavy IO until I dropped back to ATA33 on the drives
(I moved them to the onboard Intel controller instead of the
HPT366).  sos@ informed me that the HPT366 has a buggy DMA
controller and that ATA66 on them wouldn't work.  After moving to
ATA33 in early 2001, I haven't had any more hard locks.  This was
under -stable, but you might want to check your ATA drive setup
before proceeding.


Hello,

I also had a lockup this morning, with both /usr/src and /usr/obj on
the local DMA33 IDE disk, hooked to the BX ATA channel (instead of the
HPT366).

The BP6 is on a serial console, but I still have to look into how to
get back to DDB when it is frozen (I run a plain vanilla GENERIC+SMP, so
I may have to add other specific options later).

	TfH





panic: sleeping thread owns a mutex - with debug traceback

2002-11-20 Thread Joel M. Baldwin

Under heavy system load and heavy swapping I had the following
panic occur.

-

Panic message from the serial console:

panic: sleeping thread owns a mutex
cpuid = 1; lapic.id = 0100
Debugger("panic")
Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> panic     t
Debugger(c0392fc9,100,c0392171,cd29ac18,1) at Debugger+0x55
panic(c0392171,1,c03920e8,6b,c039614a) at panic+0x11f
propagate_priority(c0eadc40,2,c03920e8,23b,c03cffa0) at 
propagate_priority+0x104
_mtx_lock_sleep(c3520a20,0,c039f4dd,182,c040af98) at 
_mtx_lock_sleep+0x219
_mtx_lock_flags(c3520a20,0,c039f4dd,182,c0393d1d) at 
_mtx_lock_flags+0x98
syncache_timer(1,0,c0393d1d,bf,220649) at syncache_timer+0xaf
softclock(0,0,c0390a04,230,c0eac700) at softclock+0x19c
ithread_loop(c0ea0600,cd29ad48,c0390767,355,0) at ithread_loop+0x182
fork_exit(c01fef40,c0ea0600,cd29ad48) at fork_exit+0xa5
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xcd29ad7c, ebp = 0 ---
db> panic
panic: from debugger
cpuid = 1; lapic.id = 0100
boot() called on cpu#1
Uptime: 6h11m40s
pfs_vncache_unload(): 1 entries remaining
Dumping 255 MB
16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset called on cpu#1
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 1



kgdb traceback information:

su-2.05b# gdb -k
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and 
you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for 
details.
This GDB was configured as "i386-undermydesk-freebsd".
(kgdb) symbol-file 
/usr/src/sys/i386/compile/testGeneric.smp/kernel.debug
Reading symbols from 
/usr/src/sys/i386/compile/testGeneric.smp/kernel.debug...done.
(kgdb) exec-file /boot/kernel.testgeneric.smp/kernel
(kgdb) core-file /var/crash/vmcore.30
panic: from debugger
panic messages:
---
panic: sleeping thread owns a mutex
cpuid = 1; lapic.id = 0100
panic: from debugger
cpuid = 1; lapic.id = 0100
boot() called on cpu#1
Uptime: 6h11m40s
pfs_vncache_unload(): 1 entries remaining
Dumping 255 MB
16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
---
#0  doadump () at ../../../kern/kern_shutdown.c:232
232 dumping++;
(kgdb) where
#0  doadump () at ../../../kern/kern_shutdown.c:232
#1  0xc02118bd in boot (howto=260) at ../../../kern/kern_shutdown.c:364
#2  0xc0211b77 in panic () at ../../../kern/kern_shutdown.c:517
#3  0xc014d252 in db_panic () at ../../../ddb/db_command.c:450
#4  0xc014d1d2 in db_command (last_cmdp=0xc03bf2a0, cmd_table=0x0, 
aux_cmd_tablep=0xc03b5ec8,
   aux_cmd_tablep_end=0xc03b5ee0) at ../../../ddb/db_command.c:346
#5  0xc014d2e6 in db_command_loop () at ../../../ddb/db_command.c:472
#6  0xc014ff7a in db_trap (type=3, code=0) at ../../../ddb/db_trap.c:72
#7  0xc033e2a0 in kdb_trap (type=3, code=0, regs=0xcd29ab90) at 
../../../i386/i386/db_interface.c:166
#8  0xc03572cf in trap (frame=
 {tf_fs = -1069547496, tf_es = 16, tf_ds = -1069744112, tf_edi = 
-1058350016, tf_esi = 256, tf_ebp = -852907044, tf_isp = -852907076, 
tf_ebx = 0, tf_edx = 0, tf_ecx = 1, tf_eax = 18, tf_trapno = 3, tf_err 
= 0, tf_eip = -1070340715, tf_cs = 8, tf_eflags = 662, tf_esp = 
-1069880826, tf_ss = -1069994039}) at ../../../i386/i386/trap.c:603
#9  0xc033fac8 in calltrap () at {standard input}:99
#10 0xc0211b5f in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:503
#11 0xc0207b54 in propagate_priority (td=0x0) at 
../../../kern/kern_mutex.c:125
#12 0xc0208329 in _mtx_lock_sleep (m=0xc3520a20, opts=0,
   file=0xc039f4dd "../../../netinet/tcp_syncache.c", line=386) at 
../../../kern/kern_mutex.c:624
#13 0xc0207db8 in _mtx_lock_flags (m=0xc3520a20, opts=0,
   file=0xc039f4dd "../../../netinet/tcp_syncache.c", line=386) at 
../../../kern/kern_mutex.c:325
#14 0xc029b9ef in syncache_timer (xslot=0x1) at 
../../../netinet/tcp_syncache.c:386
#15 0xc021f51c in softclock (dummy=0x0) at 
../../../kern/kern_timeout.c:195
#16 0xc01ff0c2 in ithread_loop (arg=0xc0ea0600) at 
../../../kern/kern_intr.c:535
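
The panic in the trace above fires inside propagate_priority(), when the code that lends a blocked thread's priority to the lock owner finds that the owner is asleep. The following is a toy userland model of that check, not the code in kern_mutex.c: the types toy_thread and toy_lock, the state names, and toy_propagate_priority() are all made up for illustration, and a lower priority number is assumed to mean "more urgent".

```c
#include <stddef.h>

/* Hypothetical thread states for the toy model. */
enum td_state { TD_RUNNABLE, TD_BLOCKED_ON_LOCK, TD_SLEEPING };

struct toy_lock;

struct toy_thread {
    enum td_state state;
    int priority;                 /* assumption: lower value = more urgent */
    struct toy_lock *blocked_on;  /* meaningful only when TD_BLOCKED_ON_LOCK */
};

struct toy_lock {
    struct toy_thread *owner;     /* NULL when nobody holds the lock */
};

/*
 * Lend the waiter's priority to each owner along the chain of locks.
 * Returns 0 on success and -1 if an owner is asleep, which is the
 * situation the kernel reports as "sleeping thread owns a mutex".
 */
static int toy_propagate_priority(struct toy_thread *waiter)
{
    int pri = waiter->priority;
    struct toy_lock *l = waiter->blocked_on;

    while (l != NULL && l->owner != NULL) {
        struct toy_thread *owner = l->owner;

        if (owner->state == TD_SLEEPING)
            return -1;            /* owner cannot make progress */
        if (owner->priority <= pri)
            return 0;             /* owner is already urgent enough */
        owner->priority = pri;    /* lend the priority */
        if (owner->state != TD_BLOCKED_ON_LOCK)
            return 0;             /* a runnable owner ends the chain */
        l = owner->blocked_on;    /* keep walking the chain */
    }
    return 0;
}
```

In this model the -1 return marks the spot where the real kernel panics. As Robert Watson notes in the replies, the interesting thread is the sleeping owner, not the thread that ran into the check.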




Re: SMP stability ? [was Re: more info from panic from running dnet on SMP kernel]

2002-11-17 Thread Joel M. Baldwin

ack, I keep forgetting there are TWO conf dirs now.  I didn't
even see those options.   I'll try this also.

--On Sunday, November 17, 2002 8:55 PM +0100 Thierry Herbelot 
<[EMAIL PROTECTED]> wrote:

Le Sunday 17 November 2002 20:46, Robert Watson a écrit :


I've seen several reports that using a serial break to get into ddb
is now quite a bit more reliable than a keyboard break.  If you're
not already using a serial console, you might want to give it a try
(make sure to turn on BREAK_TO_DEBUGGER and/or
ALT_BREAK_TO_DEBUGGER).


OK, I'll do so

	TfH

PS : I think one other BP6 user is Grog




Re: more info from panic from running dnet on SMP kernel ( lockorder reversal, recursed on non-recursive lock )

2002-11-17 Thread Joel M. Baldwin
--On Sunday, November 17, 2002 2:54 PM -0500 Robert Watson 
<[EMAIL PROTECTED]> wrote:

Hmm.  It looks like there is indeed a lock leak in the RFTHREAD code.
Maybe a change like the following might help:

PROC_LOCK(p2);
psignal(p2, SIGKILL);
PROC_UNLOCK(p2);
}

Change the } to:
		} else
			PROC_UNLOCK(p1->p_leader);

And see if that gets rid of the problem.  Any chance this is highly
reproducible, btw? :-)  And what app are you running that's using
RFTHREAD -- linux thread stuff?


What source file would this be in?

Yes, it is 100% reproducible.

dnet is the distributed.net client for their various distributed
computing projects.  <http://distributed.net/download/>



Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories

On Sun, 17 Nov 2002, Joel M. Baldwin wrote:



running dnet on a SMP kernel causes the kernel to panic.


lock order reversal
 1st 0xc2c803e8 process lock (process lock) @


. . . snip . . .
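
The lock leak Robert Watson suspects earlier in this message can be shown in userland with a POSIX mutex standing in for the process lock. This is only a sketch of the missing-unlock shape, not the kern_fork.c code: leader_lock, check_leader_leaky(), check_leader_fixed(), lock_is_free() and the leader_dying condition are hypothetical names, and the layout of the real branch is an assumption.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the leader's process lock (hypothetical). */
static pthread_mutex_t leader_lock = PTHREAD_MUTEX_INITIALIZER;
static int signals_sent;

/* Leaky shape: only the "if" branch drops the lock. */
static void check_leader_leaky(bool leader_dying)
{
    pthread_mutex_lock(&leader_lock);
    if (leader_dying) {
        pthread_mutex_unlock(&leader_lock);
        signals_sent++;        /* stands in for psignal(p2, SIGKILL) */
    }
    /* Bug: when leader_dying is false we return with the lock held. */
}

/* Fixed shape: the added "else" branch drops the lock as well. */
static void check_leader_fixed(bool leader_dying)
{
    pthread_mutex_lock(&leader_lock);
    if (leader_dying) {
        pthread_mutex_unlock(&leader_lock);
        signals_sent++;
    } else {
        pthread_mutex_unlock(&leader_lock);
    }
}

/* Returns true when nobody holds leader_lock at the moment. */
static bool lock_is_free(void)
{
    if (pthread_mutex_trylock(&leader_lock) != 0)
        return false;          /* EBUSY: the lock is still held */
    pthread_mutex_unlock(&leader_lock);
    return true;
}
```

After check_leader_leaky(false) the lock stays held, so the next attempt to take it in the same thread recurses on it. That would be consistent with the "first acquired @ kern_fork.c:571" line in the panic output.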






Re: SMP stability ? [was Re: more info from panic from running dnet on SMP kernel]

2002-11-17 Thread Joel M. Baldwin
--On Sunday, November 17, 2002 11:36 AM +0100 Thierry Herbelot 
<[EMAIL PROTECTED]> wrote:

Le Sunday 17 November 2002 10:50, Joel M. Baldwin a écrit :

running dnet on a SMP kernel causes the kernel to panic.




[Hijacking another thread ?]


No problem, let's compare notes.


I haven't been able to complete a full buildworld with an SMP kernel on
an Abit BP6 (dual-Celeron) board for two weeks (the kernel config is just
a full GENERIC with the SMP and APIC_IO options enabled).


I also am running a BP6.  IS ANYONE successfully running an ABIT
BP6 motherboard on a SMP kernel with -current?

What BIOS version are you running?  I'm one version behind I think,
so I'm going to upgrade and see if that makes any difference.


The same machine runs happily strings of make -j48 buildworld's when
running  with the straight GENERIC UP kernel, so I think the hardware
seems to be  working OK.


I haven't tried anything that drastic, but I have had NO crashes
or lockups that I couldn't explain in a NON SMP kernel.  The SMP
kernel on the other hand will Hard Lock after anywhere from
seconds to days.  It seems to be more likely the harder I push it.
I don't think it's the hardware.  I've pulled my hair out and tried
everything I can think of to eliminate the possibility of it
being hardware.  I'm hoping that dnet causing a panic is somehow
related.


Even make -j1 buildworld with the SMP kernel ends with a complete
freeze of  the machine (the kernel does not go to a panic where I
could try a backtrace)


Exactly!  Hard Lock, no panic, no keyboard, no choices other than reset.


The hardware config of the machine is pretty dull (see dmesg later).

One point that could be better is that the sources are NFS mounted
from a  4.7-Stable server, over an rl(4) board, which may be unstable
(/usr/obj is  local, on the Maxtor drive)


All my harddrives are local, 3 SCSI and 3 IDE.


. . . snip . . .







more info from panic from running dnet on SMP kernel ( lockorder reversal, recursed on non-recursive lock )

2002-11-17 Thread Joel M. Baldwin

running dnet on a SMP kernel causes the kernel to panic.


lock order reversal
1st 0xc2c803e8 process lock (process lock) @ 
../../../kern/kern_fork.c:571
2nd 0xc03cfce0 proctree (proctree) @ ../../../kern/kern_fork.c:596
recursed on non-recursive lock (sleep mutex) process lock @ 
../../../kern/kern_fork.c:599
first acquired @ ../../../kern/kern_fork.c:571
panic: recurse
cpuid = 1; lapic.id = 0100
Debugger("panic")
Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> t
Debugger(c03926fa,100,c0395ada,d26f5c08,1) at Debugger+0x55
panic(c0395ada,c038feab,23b,c038feab,257) at panic+0x11f
witness_lock(c2c803e8,8,c038feab,257,0) at witness_lock+0x3e6
_mtx_lock_flags(c2c803e8,0,c038feab,257,d26f5cb8) at 
_mtx_lock_flags+0xb2
fork1(c2773d00,6050,0,d26f5cd4,c2c803e8) at fork1+0xbfc
rfork(c2773d00,d26f5d10,c03b07a2,407,1) at rfork+0x65
syscall(2f,2f,2f,0,80ddf10) at syscall+0x28e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (251, FreeBSD ELF32, rfork), eip = 0x8087d14, esp = 
0xbfbff4a8, ebp = 0xbfbff524 ---
db> ps
 pid   proc addruid  ppid  pgrp  flag   stat  wmesgwchan 
cmd
6217 c2b98e00 d28a70000  6215  6216 000 newpanic: unknown 
thread state
cpuid = 1; lapic.id = 0100
boot() called on cpu#1
Uptime: 1h43m39s
pfs_vncache_unload(): 1 entries remaining
Dumping 255 MB
16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset called on cpu#1
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 1



And then when the system came back up and I took a closer
look at the core dump.


(kgdb) where
#0  doadump () at ../../../kern/kern_shutdown.c:232
#1  0xc02114ad in boot (howto=260) at ../../../kern/kern_shutdown.c:364
#2  0xc0211767 in panic () at ../../../kern/kern_shutdown.c:517
#3  0xc014f2bc in db_ps (dummy1=-1070342907, dummy2=0, dummy3=-1, 
dummy4=0xd26f5a24 "")
   at ../../../ddb/db_ps.c:169
#4  0xc014d142 in db_command (last_cmdp=0xc03be920, cmd_table=0x0, 
aux_cmd_tablep=0xc03b5540,
   aux_cmd_tablep_end=0xc03b5558) at ../../../ddb/db_command.c:346
#5  0xc014d256 in db_command_loop () at ../../../ddb/db_command.c:472
#6  0xc014feea in db_trap (type=3, code=0) at ../../../ddb/db_trap.c:72
#7  0xc033da10 in kdb_trap (type=3, code=0, regs=0xd26f5b80)
   at ../../../i386/i386/db_interface.c:166
#8  0xc0356a3f in trap (frame=
 {tf_fs = -1069481960, tf_es = 16, tf_ds = 16, tf_edi = 
-1032372992, tf_esi = 256, tf_ebp = -764453940, tf_isp = -764453972, 
tf_ebx = 0, tf_edx = 0, tf_ecx = 1, tf_eax = 18, tf_trapno = 3, tf_err 
= 0, tf_eip = -1070342907, tf_cs = 8, tf_eflags = 662, tf_esp = 
-1069883258, tf_ss = -1069996294}) at ../../../i386/i386/trap.c:603
#9  0xc033f238 in calltrap () at {standard input}:99
#10 0xc021174f in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:503
#11 0xc02333d6 in witness_lock (lock=0xc2c803e8, flags=8,
   file=0xc038feab "../../../kern/kern_fork.c", line=599) at 
../../../kern/subr_witness.c:609
#12 0xc02079c2 in _mtx_lock_flags (m=0xc03cf4c0, opts=0, 
file=0xc042cfd4 "è\003È«þ8À;\002",
   line=-1027079192) at ../../../kern/kern_mutex.c:328
#13 0xc01fd3ec in fork1 (td=0xc2773d00, flags=24656, pages=0, 
procp=0xd26f5cd4)
   at ../../../kern/kern_fork.c:599
#14 0xc01fc6c5 in rfork (td=0xc2773d00, uap=0xd26f5d10) at 
../../../kern/kern_fork.c:168
#15 0xc035739e in syscall (frame=
 {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = 
135126800, tf_ebp = -1077938908, tf_isp = -764453516, tf_ebx = 2, 
tf_edx = 135381248, tf_ecx = 135381248, tf_eax = 251, tf_trapno = 0, 
tf_err = 2, tf_eip = 134774036, tf_cs = 31, tf_eflags = 659, tf_esp = 
-1077939032, tf_ss = 47})
   at ../../../i386/i386/trap.c:1033
#16 0xc033f28d in Xint0x80_syscall () at {standard input}:141
---Can't read userspace from dump, or kernel process---








dnet causes panic ( was Hard Lock )

2002-11-12 Thread Joel M. Baldwin

Well I'm not getting Hard Locks when I try running dnet on a
SMP kernel.  Instead I've gotten the following panics.

It turns out I can run a single non threaded dnet process when
I invoke it thus:  'dnet -cpunum 0'

-

lock order reversal
1st 0xc2685e68 process lock (process lock) @ 
../../../kern/kern_fork.c:571
2nd 0xc03cd760 proctree (proctree) @ ../../../kern/kern_fork.c:596
recursed on non-recursive lock (sleep mutex) process lock @ 
../../../kern/kern_fork.c:599
first acquired @ ../../../kern/kern_fork.c:571
panic: recurse
cpuid = 1; lapic.id = 0100
Debugger("panic")
Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> t
Debugger(c039059a,100,c039397a,cdcebc08,1) at Debugger+0x55
panic(c039397a,c038dd4b,23b,c038dd4b,257) at panic+0x11f
witness_lock(c2685e68,8,c038dd4b,257,0) at witness_lock+0x3e6
_mtx_lock_flags(c2685e68,0,c038dd4b,257,cdcebcb8) at 
_mtx_lock_flags+0xb2
fork1(c25279c0,6050,0,cdcebcd4,c2685e68) at fork1+0xbfc
rfork(c25279c0,cdcebd10,c03ae516,407,1) at rfork+0x65
syscall(2f,2f,2f,0,80ddf10) at syscall+0x28e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (251, FreeBSD ELF32, rfork), eip = 0x8087d14, esp = 
0xbfbff82c, ebp = 0xbfbff8a8 ---

--

lock order reversal
1st 0xc285a228 process lock (process lock) @ 
../../../kern/kern_fork.c:571
2nd 0xc03cd760 proctree (proctree) @ ../../../kern/kern_fork.c:596
recursed on non-recursive lock (sleep mutex) process lock @ 
../../../kern/kern_fork.c:599
first acquired @ ../../../kern/kern_fork.c:571
panic: recurse
cpuid = 1; lapic.id = 0100
Debugger("panic")
Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
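
The "lock order reversal" lines in these reports mean that proctree was requested while a process lock was already held, against an order in which proctree comes first. A minimal sketch of that kind of check follows. It uses fixed ranks, which is a simplification and an assumption: the rank values, held_locks, note_acquire() and note_release() are hypothetical and do not reflect how WITNESS stores or learns lock orders.

```c
#include <stddef.h>

/* Hypothetical fixed ranks: a lower rank must be taken first. */
enum lock_rank { RANK_PROCTREE = 1, RANK_PROCESS = 2 };

#define MAX_HELD 8

struct held_locks {
    enum lock_rank ranks[MAX_HELD];
    size_t count;
};

/*
 * Record an acquisition.  Returns 0 if it respects the order, -1 on a
 * reversal (a lower-ranked lock requested while a higher-ranked one is
 * held), and -2 if the table is full.
 */
static int note_acquire(struct held_locks *h, enum lock_rank r)
{
    size_t i;

    if (h->count == MAX_HELD)
        return -2;
    for (i = 0; i < h->count; i++) {
        if (h->ranks[i] > r)
            return -1;         /* lock order reversal */
    }
    h->ranks[h->count++] = r;
    return 0;
}

/* Forget the most recent acquisition of rank r, if there is one. */
static void note_release(struct held_locks *h, enum lock_rank r)
{
    size_t i = h->count;

    while (i > 0) {
        i--;
        if (h->ranks[i] == r) {
            for (; i + 1 < h->count; i++)
                h->ranks[i] = h->ranks[i + 1];
            h->count--;
            return;
        }
    }
}
```

Taking RANK_PROCTREE and then RANK_PROCESS passes the check. Taking them the other way round returns -1, which corresponds to the "1st process lock, 2nd proctree" report.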





Re: Panics instead of Hard Locks

2002-11-10 Thread Joel M. Baldwin

If there's a race why hasn't it been fixed in the main tree?

A swap issue makes sense.  If I've been down long enough I get
swamped with email when I come back up.  A bug in the latest
procmail yields 130M processes that fill up swap and make
the system be swap bound.  The solution is the following
patch to procmail.



--On Sunday, November 10, 2002 7:03 PM + 
[EMAIL PROTECTED] wrote:


Since going from a SMP to nonSMP kernel the Hard Locks don't
seem to be happening.  However I'm getting panics.

I've gotten 4 'sleeping thread owns a mutex' panics and one each
of 'Assertion i != 0 failed at ../../../kern/subr_witness.c:669'
and 'Duplicate free of item 0xc3895cc0 from zone 0xc0ea63c0(VMSPACE)'


The 'Duplicate free' can be caused by a race between swapout_procs()
and kern_exit()+wait1().

The enclosed patch might help.

Disabling swapping (sysctl vm.swap_enabled=0) can also help.

- Tor Egge








Panics instead of Hard Locks

2002-11-09 Thread Joel M. Baldwin

Since going from a SMP to nonSMP kernel the Hard Locks don't
seem to be happening.  However I'm getting panics.

I've gotten 4 'sleeping thread owns a mutex' panics and one each
of 'Assertion i != 0 failed at ../../../kern/subr_witness.c:669'
and 'Duplicate free of item 0xc3895cc0 from zone 0xc0ea63c0(VMSPACE)'

More info follows.  I finally got a debugger kernel built so I'll
have even more info after the next panic.  Let me know what I can
do to help.

--


panic: Assertion i != 0 failed at ../../../kern/subr_witness.c:669
Debugger("panic")
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db> bt
No such command
db> x
Debugger+0x54:  29401d87
db> t
Debugger(c038693f,c03f6020,c0389a7b,cd2e2b00,1) at Debugger+0x54
panic(c0389a7b,c0389cee,c038993f,29d,cd2e2b64) at panic+0xab
witness_lock(c03c29a0,8,c0386f4e,134,c0385a95) at witness_lock+0x513
_mtx_lock_flags(c03c29a0,0,c0386f4e,134,cd2e2b90) at 
_mtx_lock_flags+0xb1
msleep(c265d504,c03f4974,44,c039acd9,0) at msleep+0x644
acquire(c265d504,100,600,e3,5) at acquire+0xa7
lockmgr(c265d504,2,0,c0eadea0,cd2e2c2c) at lockmgr+0x378
_vm_map_lock_read(c265d4c8,c039ad3c,153,c0385a95,c265c700) at 
_vm_map_lock_read+0x5b
vmspace_swap_count(c265d4c8,0,c039c4d2,493,0) at vmspace_swap_count+0x29
vm_pageout_scan(0,0,44,c039c590,1f4) at vm_pageout_scan+0xa48
vm_pageout(0,cd2e2d48,c0384133,354,0) at vm_pageout+0x262
fork_exit(c03163b0,0,cd2e2d48) at fork_exit+0xa5
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xcd2e2d7c, ebp = 0 ---



panic: Duplicate free of item 0xc3895cc0 from zone 0xc0ea63c0(VMSPACE)

Debugger("panic")
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db> t
Debugger(c038693f,c03f6020,c039cbea,cdccac4c,1) at Debugger+0x54
panic(c039cbea,c3895cc0,c0ea63c0,c039ad51,693) at panic+0xab
uma_dbg_free(c0ea63c0,0,c3895cc0,693,0) at uma_dbg_free+0x122
uma_zfree_arg(c0ea63c0,c3895cc0,0,12d,c2f75e00) at uma_zfree_arg+0xfa
vmspace_free(c3895cc0,c039a9c3,31d,31c,186a0) at vmspace_free+0xbe
swapout_procs(1,0,68,c039c590,0) at swapout_procs+0x387
vm_daemon(0,cdccad48,c0384133,354,0) at vm_daemon+0x6e
fork_exit(c03166b0,0,cdccad48) at fork_exit+0xa5
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xcdccad7c, ebp = 0 ---

--


panic: sleeping thread owns a mutex
Debugger("panic")
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db> t
Debugger(c038693f,c03f6020,c0385b1e,cd29ac14,1) at Debugger+0x54
panic(c0385b1e,1,c0385a95,6b,0) at panic+0xab
propagate_priority(c0eaca90,2,c0385a95,23b,c0eaca90) at 
propagate_priority+0x13c
_mtx_lock_sleep(c2c090b0,0,c0392b8d,182,c03f90f8) at 
_mtx_lock_sleep+0x219
_mtx_lock_flags(c2c090b0,0,c0392b8d,182,c038760b) at 
_mtx_lock_flags+0x97
syncache_timer(0,0,c038760b,bf,2edb5a) at syncache_timer+0xaf
softclock(0,0,c03843cc,230,c0eaba80) at softclock+0x19c
ithread_loop(c0e9fa00,cd29ad48,c0384133,354,0) at ithread_loop+0x182
fork_exit(c01fb6b0,c0e9fa00,cd29ad48) at fork_exit+0xa5
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xcd29ad7c, ebp = 0 ---

-

panic: sleeping thread owns a mutex
Debugger("panic")
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db> t
Debugger(c038693f,c03f6020,c0385b1e,cd29ac14,1) at Debugger+0x54
panic(c0385b1e,1,c0385a95,6b,0) at panic+0xab
propagate_priority(c0eaca90,2,c0385a95,23b,c0eaca90) at 
propagate_priority+0x13c
_mtx_lock_sleep(c2da9c7c,0,c0392b8d,182,c03f90f8) at 
_mtx_lock_sleep+0x219
_mtx_lock_flags(c2da9c7c,0,c0392b8d,182,c038760b) at 
_mtx_lock_flags+0x97
syncache_timer(0,0,c038760b,bf,20be24) at syncache_timer+0xaf
softclock(0,0,c03843cc,230,c0eaba80) at softclock+0x19c
ithread_loop(c0e9fa00,cd29ad48,c0384133,354,0) at ithread_loop+0x182
fork_exit(c01fb6b0,c0e9fa00,cd29ad48) at fork_exit+0xa5
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xcd29ad7c, ebp = 0 ---

-

panic: sleeping thread owns a mutex
Debugger("panic")
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db> t
Debugger(c038693f,c03f6020,c0385b1e,cd29ac14,1) at Debugger+0x54
panic(c0385b1e,1,c0385a95,6b,0) at panic+0xab
propagate_priority(c0eaca90,2,c0385a95,23b,c0eaca90) at 
propagate_priority+0x13c
_mtx_lock_sleep(c2f630b0,0,c0392b8d,182,c03f90f8) at 
_mtx_lock_sleep+0x219
_mtx_lock_flags(c2f630b0,0,c0392b8d,182,c038760b) at 
_mtx_lock_flags+0x97
syncache_timer(2,0,c038760b,bf,124b9a) at syncache_timer+0xaf
softclock(0,0,c03843cc,230,c0eaba80) at softclock+0x19c
ithread_loop(c0e9fa00,cd29ad48,c0384133,354,0) at ithread_loop+0x182
fork_exit(c01fb6b0,c0e9fa00,cd29ad48) at fork_exit+0xa5
fork_trampoline() at fork_trampoline+0x1a
--- trap 0x1, eip = 0, esp = 0xcd29ad7c, ebp = 0

Re: dnet causes Hard Lock on SMP kernel? ( was 'Why is my -current system Hard Locking?' )

2002-11-07 Thread Joel M. Baldwin
--On Friday, November 08, 2002 8:45 AM +0900 Jun Kuriyama 
<[EMAIL PROTECTED]> wrote:

At Thu, 7 Nov 2002 22:55:56 + (UTC),
Joel M. Baldwin <[EMAIL PROTECTED]> wrote:

I'm still pursuing the cause of the Hard Locks on my system.

1.) I have a serial console hooked up.  Nothing appears on
the console when a Hard Lock happens.  No panic.
2.) Shorting IOCHK to ground on the ISA connector doesn't work
on a ABIT BP6, so I can't force myself into ddb.

The BIG thing is that I now have a sure fire way of forcing
a Hard Lock.  Every time I run the distributed.net client
'dnet' the system hard locks.  BUT ONLY ON A SMP KERNEL!
I have a non SMP kernel running and things so far
seem stable.  Running the system with only 1 CPU is slow,
so it'll take a while for me to be sure.


I got the same result on my box.  The only solution I have is to
deinstall dnetc.  :-)

--
Jun Kuriyama <[EMAIL PROTECTED]> // IMG SRC, Inc.


What motherboard/CPU is this on?
Have you tried a nonSMP kernel?






Re: dnet causes Hard Lock on SMP kernel? ( was 'Why is my -current system Hard Locking?' )

2002-11-07 Thread Joel M. Baldwin

--On Thursday, November 07, 2002 6:07 PM -0500 Ray Kohler 
<[EMAIL PROTECTED]> wrote:

From [EMAIL PROTECTED] Thu Nov  7 18:00:10 2002
Date: Thu, 07 Nov 2002 14:47:03 -0800
From: "Joel M. Baldwin" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: dnet causes Hard Lock on SMP kernel? ( was 'Why is my
-current system Hard Locking?' )


The BIG thing is that I now have a sure fire way of forcing
a Hard Lock.  Every time I run the distributed.net client
'dnet' the system hard locks.  BUT ONLY ON A SMP KERNEL!
I have a non SMP kernel running and things so far
seem stable.  Running the system with only 1 CPU is slow,
so it'll take a while for me to be sure.

So the questions now are.
1.) What is dnet doing that is Hard Locking the system?
2.) Is the dnet problem the same as what has been causing
my Hard Locks all along?


Just a guess, but maybe your system locks up every time the
CPU usage gets very high under SMP? That's one thing dnetc
is guaranteed to do. Do other CPU-intensive operations do
it? (Maybe you said earlier, but I haven't been following
this thread.)

- @


I've run dnet reliably before.  I'm assuming the Hard Lock
problem is threads related, but at this point who knows.  A
more heavily loaded system seems to increase the chances of a
Hard Lock.  But it HAS happened with nothing going on.  Well,
relatively nothing, this system does real work so it's never
completely idle.

The important thing is I can now force a Hard Lock simply by
running a program.  This is something that shouldn't happen
and needs to be solved.  With any luck this will also
solve my stability problem.






dnet causes Hard Lock on SMP kernel? ( was 'Why is my -current system Hard Locking?' )

2002-11-07 Thread Joel M. Baldwin

I'm still pursuing the cause of the Hard Locks on my system.

1.) I have a serial console hooked up.  Nothing appears on
   the console when a Hard Lock happens.  No panic.
2.) Shorting IOCHK to ground on the ISA connector doesn't work
   on a ABIT BP6, so I can't force myself into ddb.

The BIG thing is that I now have a sure fire way of forcing
a Hard Lock.  Every time I run the distributed.net client
'dnet' the system hard locks.  BUT ONLY ON A SMP KERNEL!
I have a non SMP kernel running and things so far
seem stable.  Running the system with only 1 CPU is slow,
so it'll take a while for me to be sure.

So the questions now are.
1.) What is dnet doing that is Hard Locking the system?
2.) Is the dnet problem the same as what has been causing
   my Hard Locks all along?
3.) What is the fix?

ADVthaAnksNCE





duplicate lock

2002-11-07 Thread Joel M. Baldwin

FreeBSD outel.org 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Thu Nov  7 
05:13:19 PST 2002 
[EMAIL PROTECTED]:/disk2/usr.src/sys/i386/compile/testGeneric.nonsmp 
i386

acquiring duplicate lock of same type: "inp"
1st inp @ ../../../netinet/udp_usrreq.c:290
2nd inp @ ../../../netinet/udp_usrreq.c:290

this comes up once right after I boot up.

searching archive yields this thread.

'Subject: /usr/src/sys/netinet/udp_usrreq.c:290'

looks like I'm not the only one seeing this.





Re: Why is my -current system Hard Locking?

2002-11-06 Thread Joel M. Baldwin


--On Wednesday, November 06, 2002 9:23 PM -0600 Sean Kelly 
<[EMAIL PROTECTED]> wrote:

. . . snip . . .

I just came across this problem for the first time on my -CURRENT
from Wed Oct 30. I was in the middle of reading a message in mutt (my
MUA) on the console and the system just froze. I was also cvsup'ing
ports on another vty at the time. There was no way to get a dump or
backtrace. Nothing, including networking, was responding.

Here are a few things you might want to try:
1. Check your syslog to get an estimate of when the machine dies. Is
it during high disk activity? Mine did it during a ports cvsup.
Could it be doing it in relation to the /etc/security stuff? Does
it happen around the same times every day?


The harder I press the system, the greater the odds of the lockup.
A buildworld and other stuff will eventually crash.


2. Build a kernel with GDB_REMOTE_CHAT and try a remote gdb session
against the kernel? Not sure how much information this will
provide, since the machine will most likely stop responding to the
remote gdb.
3. Figure out when the problem started happening for the
first time and see what was commit'd around then.


I've hooked up a serial console and am going to try forcing a NMI
via the ISA IOCHK pin.


. . . snip . . .

--
Sean Kelly | PGP KeyID: 77042C7B
[EMAIL PROTECTED] | http://www.zombie.org








Re: Why is my -current system Hard Locking?

2002-11-06 Thread Joel M. Baldwin

No change, the system is still Hard Locking.  It did it
4 times yesterday.  The next steps for me are:

1.) Get a serial console working.  Maybe I'll get some
 indication of what's happening that way.
2.) When the serial console is working, see if I can
 break into ddb after a Hard Lock.

Is there a way in HARDWARE to FORCE a break into ddb?

Maybe I can tell if the system is running, where it is
running, and how it got there.





Re: Why is my -current system Hard Locking?

2002-11-05 Thread Joel M. Baldwin

There isn't any indication of a panic.  The system just
locks up and is non responsive.

--On Tuesday, November 05, 2002 9:25 PM -0800 Steve Kargl 
<[EMAIL PROTECTED]> wrote:


This sounds good, but that isn't it.  X doesn't have control of the
graphics and kbd on my system.  Only xdm and clients are running,
the server is on another system.  When panics happen ( and they have
) I get the message on the screen.

Also this problem has been happening both with X running and not.

Any other suggestions?



If it happens without X running, then you should
drop into the debugger and get a backtrace and
crash dump.  Do you have a known method of inducing
the panic?

--
Steve








Re: Why is my -current system Hard Locking?

2002-11-05 Thread Joel M. Baldwin

--On Tuesday, November 05, 2002 9:55 AM -0500 Andrew Gallatin 
<[EMAIL PROTECTED]> wrote:

Joel M. Baldwin writes:

<...>
 > don't think this is related to the X FP problem ( although I am
 > running X11 ).  There have been many times when I'd walk in the
<...>
 > options DDB #Enable the kernel debugger
<...>

It's likely you are panic'ing in X, and the system is waiting at the
ddb prompt.  But you can't see it or do anything, since X has control
of the graphics and kbd, so it looks like the machine is frozen.

Try adding options DDB_UNATTENDED  to your config.  Or just do
sysctl  debug.debugger_on_panic=0

Both of these have the same effect: they will prevent ddb from
stopping the panic.  Make sure you have a dump device setup to
capture a crashdump.

Drew


This sounds good, but isn't it.  X doesn't have control of the graphics
and kbd on my system.  Only xdm and clients are running, the server is
on another system.  When panics happen ( and they have ) I get the
message on the screen.

Also this problem has been happening both with X running and not.

Just in case I'll go ahead and try your suggestion but . . .

Any other suggestions?
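
For reference, Drew's suggestion quoted above could be applied at runtime
roughly as follows.  This is a minimal sketch, not a tested recipe: the
function name setup_crash_capture and the device /dev/da0s1b are made up
for illustration, and it assumes the stock sysctl(8) and dumpon(8) tools.

```sh
#!/bin/sh
# Sketch only; run as root on the FreeBSD machine itself.
# setup_crash_capture DEVICE
#   DEVICE is the dump device.  /dev/da0s1b in the usage line below is a
#   hypothetical swap partition; substitute your own.
setup_crash_capture() {
    dump_dev=$1

    # Runtime equivalent of "options DDB_UNATTENDED": a panic no longer
    # stops at the ddb prompt.
    sysctl debug.debugger_on_panic=0 || return 1

    # Tell the kernel where to write the crash dump.
    dumpon "$dump_dev"
}

# Usage (not executed here):
#   setup_crash_capture /dev/da0s1b
```

The sysctl setting does not survive a reboot, so a box that locks up
overnight would need it reapplied at boot or the kernel option instead.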





Why is my -current system Hard Locking?

2002-11-04 Thread Joel M. Baldwin

I'm getting quite frustrated with -current.  I've been running it
for years and had relatively few problems.  However for quite some
time now I've had a problem with Hard Locks.  By Hard Lock I mean
that the system doesn't respond to ether traffic, the keyboard
doesn't respond, the capslock, numlock, and scroll lock keys do
nothing, the corresponding LEDs don't light, and ctrl/alt/del
doesn't reboot the system.  The only out is to hit reset.  I
don't think this is related to the X FP problem ( although I am
running X11 ).  There have been many times when I'd walk in the
morning and find the system had been down for hours.  As I
understand the X FP problem, it is a transient that shouldn't take
the system down all night.

This leads to the following questions:

1.) how likely is this a hardware problem?  I've already ran
    and there doesn't seem to be
   a memory problem.  I've changed power supplies.  Memory was
   swapped from another system.  The entire system was upgraded
   such that the only components that are still the same are
   some of the hard drives.  I've even taken the SCSI bus speed
   down to 10Mhz.  All kinds of BIOS settings have been tried.
   The only thing left is to swap out hard drives.

2.) how can I identify the software issue if that is what it
   is?  With the keyboard dead how can I break into the system?


Attached are my boot messages and kernel configs.
The current MB/CPU is an Abit BP6 with dual Celeron 466s that
are NOT overclocked.

Suggestions? ( Help! )

ADVthaAnksNCE

Nov  4 10:54:40 outel syslogd: kernel boot file is /boot/kernel/kernel
Nov  4 10:54:40 outel kernel: Copyright (c) 1992-2002 The FreeBSD Project.
Nov  4 10:54:40 outel kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Nov  4 10:54:40 outel kernel: The Regents of the University of California. All rights reserved.
Nov  4 10:54:40 outel kernel: FreeBSD 5.0-CURRENT #0: Sat Nov  2 12:43:24 PST 2002
Nov  4 10:54:40 outel kernel: [EMAIL PROTECTED]:/disk2/usr.src/sys/i386/compile/citgate-smp
Nov  4 10:54:40 outel kernel: Preloaded elf kernel "/boot/kernel/kernel" at 0xc054a000.
Nov  4 10:54:40 outel kernel: Timecounter "i8254"  frequency 1193182 Hz
Nov  4 10:54:40 outel kernel: CPU: Pentium II/Pentium II Xeon/Celeron (467.73-MHz 686-class CPU)
Nov  4 10:54:40 outel kernel: Origin = "GenuineIntel"  Id = 0x665  Stepping = 5
Nov  4 10:54:40 outel kernel: Features=0x183fbff
Nov  4 10:54:40 outel kernel: real memory  = 268369920 (262080K bytes)
Nov  4 10:54:40 outel kernel: avail memory = 254664704 (248696K bytes)
Nov  4 10:54:40 outel kernel: Programming 24 pins in IOAPIC #0
Nov  4 10:54:40 outel kernel: IOAPIC #0 intpin 2 -> irq 0
Nov  4 10:54:40 outel kernel: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
Nov  4 10:54:40 outel kernel: cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
Nov  4 10:54:40 outel kernel: cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
Nov  4 10:54:40 outel kernel: io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec0
Nov  4 10:54:40 outel kernel: Initializing GEOMetry subsystem
Nov  4 10:54:40 outel kernel: Pentium Pro MTRR support enabled
Nov  4 10:54:40 outel kernel: netsmb_dev: loaded
Nov  4 10:54:40 outel kernel: npx0:  on motherboard
Nov  4 10:54:40 outel kernel: npx0: INT 16 interface
Nov  4 10:54:40 outel kernel: Using $PIR table, 8 entries at 0xc00fdef0
Nov  4 10:54:40 outel kernel: pcib0:  at pcibus 0 on motherboard
Nov  4 10:54:40 outel kernel: pci0:  on pcib0
Nov  4 10:54:40 outel kernel: IOAPIC #0 intpin 18 -> irq 2
Nov  4 10:54:40 outel kernel: IOAPIC #0 intpin 17 -> irq 16
Nov  4 10:54:40 outel kernel: IOAPIC #0 intpin 16 -> irq 17
Nov  4 10:54:40 outel kernel: IOAPIC #0 intpin 19 -> irq 18
Nov  4 10:54:40 outel kernel: pcib1:  at device 1.0 on pci0
Nov  4 10:54:40 outel kernel: pci1:  on pcib1
Nov  4 10:54:40 outel kernel: isab0:  at device 7.0 on pci0
Nov  4 10:54:40 outel kernel: isa0:  on isab0
Nov  4 10:54:40 outel kernel: atapci0:  port 0xf000-0xf00f at device 7.1 on pci0
Nov  4 10:54:40 outel kernel: ata0: at 0x1f0 irq 14 on atapci0
Nov  4 10:54:40 outel kernel: ata1: at 0x170 irq 15 on atapci0
Nov  4 10:54:40 outel kernel: uhci0:  port 0xa000-0xa01f irq 10 at device 7.2 on pci0
Nov  4 10:54:40 outel kernel: usb0:  on uhci0
Nov  4 10:54:40 outel kernel: usb0: USB revision 1.0
Nov  4 10:54:41 outel kernel: uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
Nov  4 10:54:41 outel kernel: uhub0: 2 ports with 2 removable, self powered
Nov  4 10:54:41 outel kernel: Timecounter "PIIX"  frequency 3579545 Hz
Nov  4 10:54:41 outel kernel: pci0:  at device 7.3 (no driver attached)
Nov  4 10:54:41 outel kernel: fxp0:  port 0xa400-0xa43f mem 0xed00-0xed0f,0xed10-0xed100fff irq 2 at device 11.0 on pci0
Nov  4 10:54:41 outel kernel: fxp0: Ethernet address 00:90:27:a5:f4:f2
Nov  4 10:54:41 outel kernel: inphy0:  on miibus0
Nov  4 10:54:41 outel kernel: inphy

Re: Do we still need portmap(8)?

2002-10-07 Thread Joel M. Baldwin


Shouldn't ALL of the files in /bin, /usr/bin, /usr/include, /usr/lib
etc be replaced during an installworld?

I've always looked for files older than the last installworld and
moved them aside thinking that they're obsolete.

( aside, not delete, just in case )
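
A minimal sh sketch of that timestamp approach, assuming a stamp file is
touched just before installworld starts.  The function name list_stale and
the stamp path are made up for illustration.  File dates can mislead (see
the quoted exchange below), so the output is a list of candidates to review,
not a delete list.

```sh
#!/bin/sh
# list_stale DIR REF
#   Print regular files under DIR whose modification time is not newer
#   than that of REF, i.e. files the last installworld did not rewrite.
list_stale() {
    dir=$1
    ref=$2
    find "$dir" -type f ! -newer "$ref" -print | sort
}

# Usage (not executed here; the stamp path is hypothetical):
#   touch /var/tmp/installworld.stamp
#   cd /usr/src && make installworld
#   list_stale /usr/bin /var/tmp/installworld.stamp
```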

--On Monday, October 07, 2002 8:51 AM +0200 Poul-Henning Kamp 
<[EMAIL PROTECTED]> wrote:

> In message <[EMAIL PROTECTED]>, "Greg
> 'groggy' Lehey" writes:
>> On Sunday,  6 October 2002 at 23:42:55 -0700, David O'Brien wrote:
>>> On Mon, Oct 07, 2002 at 04:02:51PM +0930, Greg 'groggy' Lehey wrote:
>>>> It's been a while since we've used portmap(8) on -CURRENT systems.
>>>> Is it still needed, or can it be removed completely?  At the very
>>>> least, the man page should stop claiming that it's necessary to
>>>> run NFS.
>>>
>>> Are you saying we've left behind an old manpage?
>>
>> No, I'm asking whether we have left behind both an old man page and
>> an old binary.
>>
>> On closer examination, though, it looks like this is the result of
>> installing a 4.7 system and immediately upgrading it to 5-CURRENT, so
>> that the dates of the files looked pretty much the same.  Sorry for
>> that confusion.  What's the recommended way of getting old binaries
>> off the system?
>
> I use:
>   cd /usr/src
>   make installworld DESTDIR=/some/where
>   diff -ur /some/where /
>   manual review.
>
> --
> Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
> [EMAIL PROTECTED] | TCP/IP since RFC 956
> FreeBSD committer   | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by
> incompetence.
>
> To Unsubscribe: send mail to [EMAIL PROTECTED]
> with "unsubscribe freebsd-current" in the body of the message
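
The staged-install comparison quoted above can be narrowed to a names-only
check.  This sketch is not phk's recipe itself (his diff -ur also compares
file contents); it only prints files that exist in the live tree but are
missing from the staged one.  The function name only_in_live and the paths
in the usage lines are assumptions.

```sh
#!/bin/sh
# only_in_live STAGED LIVE
#   Print, relative to LIVE, every regular file that exists under LIVE
#   but has no counterpart under STAGED.
only_in_live() {
    staged=$1
    live=$2
    ( cd "$live" && find . -type f -print ) | sort |
    while IFS= read -r rel; do
        [ -e "$staged/$rel" ] || printf '%s\n' "${rel#./}"
    done
}

# Usage (not executed here; /some/where is the DESTDIR from the quote):
#   cd /usr/src && make installworld DESTDIR=/some/where
#   only_in_live /some/where/usr/sbin /usr/sbin
```

Anything it prints still needs the manual review step: locally installed
files show up alongside leftovers from an older release.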







Re: natd core dumping with bus error

2002-07-08 Thread Joel M. Baldwin


I'll have to ditto that.

no ppp, just natd, and sysctl stuff is set as listed below.

Without the punch_fw directive in /etc/natd.conf, natd will core dump.
I just verified that without the directive it core dumps.  The problem
still exists.  It isn't an instant dump, it runs for a while.

--On Monday, July 08, 2002 9:18 AM -0500 "Richard Seaman, Jr." 
<[EMAIL PROTECTED]> wrote:

> On Mon, Jul 08, 2002 at 07:08:58AM -0700, David Xu wrote:
>> you have turned on "nat enable yes" in ppp.conf,
>> but you haven't turned ip_forward on in sysctl,
>> so core dumped.
>>
>> David Xu
>
> Well, I'm not running ppp, and never indicated I was.  I'm running
> natd.
>
> # sysctl -a | grep forward
> net.inet.ip.forwarding: 1
> net.inet.ip.fastforwarding: 0
> net.inet6.ip6.forwarding: 0
>
> Everything works fine with pre "new-ipfw", and has for years.  Same
> rules, same configuration, and with "new ipfw", core dump.
>
> --
> Richard Seaman, Jr.        email: [EMAIL PROTECTED]
> 5182 N. Maple Lane         phone: 262-367-5450
> Nashotah WI 53058          fax:   262-367-5852







Re: natd core dumping with bus error

2002-07-08 Thread Joel M. Baldwin


I started out without punch_fw.  natd was core dumping on me.  I
eventually figured out that if I added punch_fw in, natd no longer
core dumped.  I've left it in; things seem to work better anyway
with it in.

I've put a core dump file at <http://outel.org/natd.core>

Here is my original message:


> Something has messed up natd.  If I don't have the
> punch_fw option in the /etc/natd.conf file it eventually
> core dumps with a bus error.  I think this started JUST
> BEFORE the KSE commit.
>
>
>
> /etc/natd.conf: ( note that this works.  comment out the
>   punch_fw option and it core dumps)
> use_sockets         yes
> same_ports          yes
> unregistered_only   yes
> interface           rl0
> punch_fw            5000:50
>
>
>
> natd stuff in /etc/rc.conf:
> natd_enable="YES"
> natd_flags="-f /etc/natd.conf"
> natd_interface="rl0"  # rl0-external ifc : fxp0-internal ifc
>
>
>
> ipfw list: ( this is the SIMPLE firewall type rules with
>   the addition of rules 400 and 500  )
> 00100 allow ip from any to any via lo0
> 00200 deny ip from any to 127.0.0.0/8
> 00300 deny ip from 127.0.0.0/8 to any
> 00400 allow tcp from any to any via fxp0
> 00500 allow udp from any to any via fxp0
> 00600 deny ip from 192.168.1.0/24 to any in via rl0
> 00700 deny ip from 168.150.177.152 to any in via fxp0
> 00800 deny ip from any to 10.0.0.0/8 via rl0
> 00900 deny ip from any to 172.16.0.0/12 via rl0
> 01000 deny ip from any to 192.168.0.0/16 via rl0
> 01100 deny ip from any to 0.0.0.0/8 via rl0
> 01200 deny ip from any to 169.254.0.0/16 via rl0
> 01300 deny ip from any to 192.0.2.0/24 via rl0
> 01400 deny ip from any to 224.0.0.0/4 via rl0
> 01500 deny ip from any to 240.0.0.0/4 via rl0
> 01600 divert 8668 ip from any to any via rl0
> 01700 deny ip from 10.0.0.0/8 to any via rl0
> 01800 deny ip from 172.16.0.0/12 to any via rl0
> 01900 deny ip from 192.168.0.0/16 to any via rl0
> 02000 deny ip from 0.0.0.0/8 to any via rl0
> 02100 deny ip from 169.254.0.0/16 to any via rl0
> 02200 deny ip from 192.0.2.0/24 to any via rl0
> 02300 deny ip from 224.0.0.0/4 to any via rl0
> 02400 deny ip from 240.0.0.0/4 to any via rl0
> 02500 allow tcp from any to any established
> 02600 allow ip from any to any frag
> 02700 allow tcp from any to 168.150.177.152 25 setup
> 02800 allow tcp from any to 168.150.177.152 53 setup
> 02900 allow udp from any to 168.150.177.152 53
> 03000 allow udp from 168.150.177.152 53 to any
> 03100 allow tcp from any to 168.150.177.152 80 setup
> 03200 deny log  tcp from any to any in via rl0 setup
> 03300 allow tcp from any to any setup
> 03400 allow udp from 168.150.177.152 to any 53 keep-state
> 65535 deny ip from any to any
>
>
>
> gdb traceback:
> su-2.05# gdb -c natd.core /sbin/natd
> GNU gdb 5.2.0 (FreeBSD) 20020627
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and
> you are welcome to change it and/or distribute copies of it under
> certain conditions. Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for
> details. This GDB was configured as "i386-undermydesk-freebsd"...(no
> debugging symbols found)... Core was generated by `natd'.
> Program terminated with signal 10, Bus error.
> #0  0x08050c27 in ?? ()
> (gdb) bt
> #0  0x08050c27 in ?? ()
> #1  0x0804f0f0 in ?? ()
> #2  0x0804f0a6 in ?? ()
> #3  0x080503b5 in ?? ()
> #4  0x0804b489 in ?? ()
> #5  0x08048b38 in ?? ()
> #6  0x080487ee in ?? ()
> #7  0x08048131 in ?? ()
> (gdb)
>
>
> If you need something else to diagnose this let me know and I'll
> do whatever I can to help.



--On Monday, July 08, 2002 2:26 AM -0700 Luigi Rizzo 
<[EMAIL PROTECTED]> wrote:

> Could you clarify the problem ?
> I believe the problem appears when you _do_ use punch_fw,
> otherwise the modified code is never called.
>
>   cheers
>   luigi
>
> On Thu, Jul 04, 2002 at 09:20:38AM -0500, Richard Seaman, Jr. wrote:
>> On Tue, Jul 02, 2002 at 06:04:36PM -0700, Joel M. Baldwin wrote:
>> >
>> >
>> > Something has messed up natd.  If I don't have the
>> > punch_fw option in the /etc/natd.conf file it eventually
>> > core dumps with a bus error.  I think this started JUST
>> > BEFORE the KSE commit.
>>
>> Yes, I've seen the same thing on a pre-KSE kernel. The error
>> occurs in PunchFWHole in alias_db.c in libalias.  Reverting
>> the following commit seems to fix it (I haven't had a chan

natd core dumping with bus error

2002-07-02 Thread Joel M. Baldwin



Something has messed up natd.  If I don't have the
punch_fw option in the /etc/natd.conf file it eventually
core dumps with a bus error.  I think this started JUST
BEFORE the KSE commit.



/etc/natd.conf: ( note that this works.  comment out the
punch_fw option and it core dumps)
use_sockets yes
same_ports  yes
unregistered_only   yes
interface   rl0
punch_fw    5000:50



natd stuff in /etc/rc.conf:
natd_enable="YES"
natd_flags="-f /etc/natd.conf"
natd_interface="rl0"  # rl0-external ifc : fxp0-internal ifc



ipfw list: ( this is the SIMPLE firewall type rules with
the addition of rules 400 and 500  )
00100 allow ip from any to any via lo0
00200 deny ip from any to 127.0.0.0/8
00300 deny ip from 127.0.0.0/8 to any
00400 allow tcp from any to any via fxp0
00500 allow udp from any to any via fxp0
00600 deny ip from 192.168.1.0/24 to any in via rl0
00700 deny ip from 168.150.177.152 to any in via fxp0
00800 deny ip from any to 10.0.0.0/8 via rl0
00900 deny ip from any to 172.16.0.0/12 via rl0
01000 deny ip from any to 192.168.0.0/16 via rl0
01100 deny ip from any to 0.0.0.0/8 via rl0
01200 deny ip from any to 169.254.0.0/16 via rl0
01300 deny ip from any to 192.0.2.0/24 via rl0
01400 deny ip from any to 224.0.0.0/4 via rl0
01500 deny ip from any to 240.0.0.0/4 via rl0
01600 divert 8668 ip from any to any via rl0
01700 deny ip from 10.0.0.0/8 to any via rl0
01800 deny ip from 172.16.0.0/12 to any via rl0
01900 deny ip from 192.168.0.0/16 to any via rl0
02000 deny ip from 0.0.0.0/8 to any via rl0
02100 deny ip from 169.254.0.0/16 to any via rl0
02200 deny ip from 192.0.2.0/24 to any via rl0
02300 deny ip from 224.0.0.0/4 to any via rl0
02400 deny ip from 240.0.0.0/4 to any via rl0
02500 allow tcp from any to any established
02600 allow ip from any to any frag
02700 allow tcp from any to 168.150.177.152 25 setup
02800 allow tcp from any to 168.150.177.152 53 setup
02900 allow udp from any to 168.150.177.152 53
03000 allow udp from 168.150.177.152 53 to any
03100 allow tcp from any to 168.150.177.152 80 setup
03200 deny log  tcp from any to any in via rl0 setup
03300 allow tcp from any to any setup
03400 allow udp from 168.150.177.152 to any 53 keep-state
65535 deny ip from any to any



gdb traceback:
su-2.05# gdb -c natd.core /sbin/natd
GNU gdb 5.2.0 (FreeBSD) 20020627
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...(no debugging symbols found)...
Core was generated by `natd'.
Program terminated with signal 10, Bus error.
#0  0x08050c27 in ?? ()
(gdb) bt
#0  0x08050c27 in ?? ()
#1  0x0804f0f0 in ?? ()
#2  0x0804f0a6 in ?? ()
#3  0x080503b5 in ?? ()
#4  0x0804b489 in ?? ()
#5  0x08048b38 in ?? ()
#6  0x080487ee in ?? ()
#7  0x08048131 in ?? ()
(gdb)


If you need something else to diagnose this let me know and I'll
do whatever I can to help.





can't make release - don't know how to make maninstall in usr.bin/strip

2002-04-28 Thread Joel M. Baldwin


I've been trying to do a 'make release' for a week or two and I keep
getting the following error.  I've cvsuped, make buildworld, make
installworld, reboot, a number of times hoping for it to work, but no go.

Am I doing something wrong, or is there something that is yet to be
fixed?


su-2.05# make release CHROOTDIR=/disk2/r BUILDNAME=jmb-citrus-snap-20020428 CVSROOT=/home/ncvs

. . . snip . . .

===> usr.bin/size
install -c -s -o root -g wheel -m 555   size /disk2/r/usr/libexec/aout
===> usr.bin/smbutil
install -c -s -o root -g wheel -m 555   smbutil /disk2/r/usr/bin
===> usr.bin/strings
install -c -s -o root -g wheel -m 555   strings /disk2/r/usr/libexec/aout
===> usr.bin/strip
make: don't know how to make maninstall. Stop
*** Error code 2

Stop in /disk2/usr.src/usr.bin.
*** Error code 1

Stop in /disk2/usr.src.
*** Error code 1

Stop in /disk2/usr.src.
*** Error code 1

Stop in /disk2/usr.src.
*** Error code 1

Stop in /disk2/usr.src.
*** Error code 1

Stop in /disk2/usr.src/release.
su-2.05# cd /usr/src/sys/i386/conf
su-2.05# cat citgate-smp
machine         i386
ident           citgate-smp
maxusers        32
options         SMP                     # Symmetric MultiProcessor Kernel
options         APIC_IO                 # Symmetric (APIC) I/O
cpu             I686_CPU                # aka Pentium Pro(tm)
options         CPU_FASTER_5X86_FPU
options         NO_F00F_HACK
options         COMPAT_43
options         SYSVSHM
options         SYSVSEM
options         SYSVMSG
options         DIAGNOSTIC
options         PERFMON
options         INET                    #Internet communications protocols
options         INET6                   #IPv6 communications protocols
options         IPSEC                   #IP security
options         IPSEC_ESP               #IP security (crypto; define w/ IPSEC)
device          ether                   #Generic Ethernet
device          loop    1               #Network loopback device
device          bpf                     #Berkeley packet filter
device          tun                     #Tunnel driver (ppp(8), nos-tun(8))
device          gif     4               #IPv6 and IPv4 tunneling
device          faith   1               #for IPv6 and IPv4 translation
device          stf                     #6to4 IPv6 over IPv4 encapsulation
options         IPFIREWALL              #firewall
options         IPFIREWALL_VERBOSE      #print information about
                                        # dropped packets
options         IPFIREWALL_FORWARD      #enable transparent proxy support
options         IPFIREWALL_VERBOSE_LIMIT=100    #limit verbosity
options         IPV6FIREWALL            #firewall for IPv6
options         IPV6FIREWALL_VERBOSE
options         IPV6FIREWALL_VERBOSE_LIMIT=100
options         IPDIVERT                #divert sockets
options         IPSTEALTH               #support for stealth forwarding
options         FFS                     #Fast filesystem
options         NFSSERVER               #Network File System
options         NFSCLIENT               #Network File System
options         CD9660                  #ISO 9660 filesystem
options         MSDOSFS                 #MS DOS File System (FAT, FAT32)
options         PROCFS                  #Process filesystem
options         PSEUDOFS                #Pseudo-filesystem framework
options         SMBFS                   #SMB/CIFS filesystem
options         SOFTUPDATES
device          random
options         P1003_1B
options         _KPOSIX_PRIORITY_SCHEDULING
options         _KPOSIX_VERSION=199309L
device          scbus                   #base SCSI code
device          da                      #SCSI direct access devices (aka disks)
device          pass                    #CAM passthrough driver
options         SCSI_DELAY=3000         # Be pessimistic about Joe SCSI device
device          pty                     #Pseudo ttys
device          speaker                 #Play IBM BASIC-style noises out your speaker
device          gzip                    #Exec gzipped a.out's
device          md                      #Memory/malloc disk
device          snp                     #Snoop device - to look at pty/vty/etc..
device          isa
options         AUTO_EOI_1
options         AUTO_EOI_2
device          pci
device          atkbdc  1
device          atkbd
device          psm
device          vga
device          splash
device          sc      1
options         SC_HISTORY_SIZE=500     # number of history buffer lines
device          npx
options         ACPI_DEBUG
device          ahc
options         AHC_ALLOW_MEMIO
device          ata
device          atadisk                 # ATA disk drives
device          atapicd
device          fdc
device          miibus                  # MII bus support
device          fxp                     # Intel EtherExpress PRO/100B (82557, 82558)
# USB support
# UHCI controller
device          uhci
options         UHCI_DEBUG
# General USB code (mandatory for USB)
device          usb
options         USB_DEBUG
#
# Generic USB device driver
device          ugen
options UGEN

dev/usb/uhci.c is messed up again

2002-03-18 Thread Joel M. Baldwin


/usr/src/sys/dev/usb/uhci.c is messed up again for those
of us that have the kernel compiled with UHCI_DEBUG.  The
following patch appears to get it to at least compile.

problem summary:

1.) uhci_dump_ii is defined twice,
 starting at lines 805 and 920.

2.) a missing ; at the end of line 2764.



805,841d804
< void
< uhci_dump_ii(uhci_intr_info_t *ii)
< {
<   usbd_pipe_handle pipe;
<   usb_endpoint_descriptor_t *ed;
<   usbd_device_handle dev;
<
< #ifdef DIAGNOSTIC
< #define DONE ii->isdone
< #else
< #define DONE 0
< #endif
<   if (ii == NULL) {
<   printf("ii NULL\n");
<   return;
<   }
<   if (ii->xfer == NULL) {
<   printf("ii %p: done=%d xfer=NULL\n",
<  ii, DONE);
<   return;
<   }
<   pipe = ii->xfer->pipe;
<   if (pipe == NULL) {
<   printf("ii %p: done=%d xfer=%p pipe=NULL\n",
<  ii, DONE, ii->xfer);
<   return;
<   }
<   ed = pipe->endpoint->edesc;
<   dev = pipe->device;
<   printf("ii %p: done=%d xfer=%p dev=%p vid=0x%04x pid=0x%04x addr=%d pipe=%p ep=0x%02x attr=0x%02x\n",
<  ii, DONE, ii->xfer, dev,
<   UGETW(dev->ddesc.idVendor),
<   UGETW(dev->ddesc.idProduct),
<  dev->address, pipe,
<  ed->bEndpointAddress, ed->bmAttributes);
< #undef DONE
< }
2764c2727
<   splx(s)
---
>   splx(s);





Re: size of /usr/src

2002-01-16 Thread Joel M. Baldwin


My current /usr/src is 524M.  That includes 83M of kernel
object files in /usr/src/sys/i386/compile from a couple
of different kernel builds.  /usr/obj which holds the
object files from a buildworld is 460M.  If you're going
to do a full cvs repository then /home/ncvs on my system
is 1391M.
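
Putting those numbers against the 420MB partition from the question, as a
quick sh sketch.  The variable names are mine, and the check ignores
filesystem overhead, so the real shortfall would be somewhat larger.

```sh
#!/bin/sh
# Sizes in MB, taken from the figures above and the quoted question.
src_total=524      # /usr/src including kernel object files
kernel_objs=83     # /usr/src/sys/i386/compile
partition=420      # size of the new /usr/src partition

# Source tree alone, without the kernel build objects.
src_only=$((src_total - kernel_objs))
echo "sources without kernel objects: ${src_only}M"

if [ "$src_only" -gt "$partition" ]; then
    echo "does not fit in ${partition}M"
fi
```

Even with the kernel objects left out, the sources alone come to 441M,
which is why cvsup ran out of room.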


--On Tuesday, January 15, 2002 11:38 PM -0800 Arvind Srivaths 
<[EMAIL PROTECTED]> wrote:

>
> Hi,
>
> I created a separate partition for /usr/src (around 420MB) and cvsup
> ran out of space.  Can someone give me a rough idea of how big it is?
> Also, I should be able to use growfs (after booting off of a floppy)
> to increase the size of the partition (if the slice has space),
> right? How about moving partitions - is there an easier way than
> creating a partition at the end of the slice and copying partitions
> down?
>
> Thanks,
>
> Arvind
>
> To Unsubscribe: send mail to [EMAIL PROTECTED]
> with "unsubscribe freebsd-current" in the body of the message







Re: HEADS UP: -CURRENT switched from pam.conf to pam.d

2002-01-12 Thread Joel M. Baldwin


The convert script seems to have an error.


su-2.05# perl -w /usr/src/etc/pam.d/convert.pl /etc/pam.conf

/(\$FreeBSD: src/: unmatched () in regexp at /usr/src/etc/pam.d/convert.pl line 63.
su-2.05#


--On Saturday, January 12, 2002 3:09 PM +0100 Dag-Erling Smorgrav 
<[EMAIL PROTECTED]> wrote:

> The preferred configuration method for PAM is now /etc/pam.d/ rather
> than /etc/pam.conf.  If you have an unmodified pam.conf, just delete
> it after your next mergemaster run.  If you have local modifications,
> you can use /usr/src/etc/pam.d/convert.pl to incorporate them into
> your /etc/pam.d:
>
> # cd /etc/pam.d
> # perl -w /usr/src/etc/pam.d/convert.pl /etc/pam.conf
>
> The script will create new files for non-standard services you've
> added to pam.conf, and update existing files while taking care to
> preserve the version string so as to avoid tripping up mergemaster.
>
> If you do neither of these things, then after your next mergemaster
> run PAM will start using the policies in /etc/pam.d instead of
> /etc/pam.conf, falling back to the latter only when no appropriate
> policy was found in the former.
>
> DES
> --
> Dag-Erling Smorgrav - [EMAIL PROTECTED]
>
> To Unsubscribe: send mail to [EMAIL PROTECTED]
> with "unsubscribe freebsd-current" in the body of the message







sys/dev/usb/uhci.c needs patch to compile

2002-01-08 Thread Joel M. Baldwin


I've been having to apply the following patch to get a kernel
compile to go through.  The problem seems to be that when
you have

options UHCI_DEBUG

AND

options DIAGNOSTIC

some conditional code gets added that tries to call uhci_dump_ii
which isn't defined anywhere.  Also on line 694 there is:
   uhci_dump_qh(sc->sc_ctl_start->qh.hlink);
which produces

cc -c -O -pipe  -Wall -Wredundant-decls -Wnested-externs 
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
-Wcast-qual  -fformat-extensions -ansi  -nostdinc -I-  -I. -I../../.. 
-I../../../dev -I../../../contrib/dev/acpica 
-I../../../contrib/ipfilter -I../../../../include  -D_KERNEL 
-ffreestanding -include opt_global.h -elf  -mpreferred-stack-boundary=2 
../../../dev/usb/uhci.c
../../../dev/usb/uhci.c: In function `uhci_dump_all':
../../../dev/usb/uhci.c:693: structure has no member named `hlink'
../../../dev/usb/uhci.c: At top level:
../../../dev/usb/uhci.c:1268: warning: `uhci_reset' defined but not used
*** Error code 1


 patch starts on the next line --
*** uhci.c.orig Mon Jan  7 07:32:06 2002
--- uhci.c      Mon Jan  7 07:37:32 2002
***************
*** 258,264 ****
  Static void   uhci_dump_qh(uhci_soft_qh_t *);
  Static void   uhci_dump_tds(uhci_soft_td_t *);
  Static void   uhci_dump_td(uhci_soft_td_t *);
- Static void   uhci_dump_ii(uhci_intr_info_t *ii);
  void  uhci_dump(void);
  #endif

--- 258,263 ----
***************
*** 691,697 ****
        uhci_dumpregs(sc);
        printf("intrs=%d\n", sc->sc_bus.no_intrs);
        /*printf("framelist[i].link = %08x\n", sc->sc_framelist[0].link);*/
!       uhci_dump_qh(sc->sc_ctl_start->qh.hlink);
  }


--- 690,696 ----
        uhci_dumpregs(sc);
        printf("intrs=%d\n", sc->sc_bus.no_intrs);
        /*printf("framelist[i].link = %08x\n", sc->sc_framelist[0].link);*/
! /*    uhci_dump_qh(sc->sc_ctl_start->qh.hlink); */
  }


***************
*** 1093,1099 ****
                splx(s);
  #ifdef UHCI_DEBUG
                printf("uhci_idone: ii is done!\n   ");
-               uhci_dump_ii(ii);
  #else
                printf("uhci_idone: ii=%p is done!\n", ii);
  #endif
--- 1092,1097 ----
***************
*** 2296,2302 ****
        if (ii->stdend == NULL) {
                printf("uhci_device_isoc_done: xfer=%p stdend==NULL\n", xfer);
  #ifdef UHCI_DEBUG
-               uhci_dump_ii(ii);
  #endif
                return;
        }
--- 2294,2299 ----





Re: -CURRENT boot problems: loader, kernel

2001-11-09 Thread Joel M. Baldwin


So I'm not the only one having problems. .  .

FreeBSD/i386 bootstrap loader, Revision 1.0
([EMAIL PROTECTED], Fri Nov  9 01:58:33 PST 2001)
name not found
Assert failed: (false), function ficlCompileSoftcore, file softcore.c,
line 291

I got the system running by:

booting up with a set of 'fixit' floppies
mounting the root drive
cd'ing into /boot,
mv loader loader.new
cp loader.old loader
reboot
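
The same steps as a small sh sketch, plus a helper for keeping a
known-good copy before an upgrade in case no loader.old is lying around.
The function names are made up for illustration, and the boot directory is
passed in because on a fixit boot it sits under wherever root got mounted.

```sh
#!/bin/sh
# save_loader BOOTDIR
#   Keep a copy of the currently working loader as loader.old.
save_loader() {
    bootdir=$1
    [ -f "$bootdir/loader" ] || { echo "no loader in $bootdir" >&2; return 1; }
    cp -p "$bootdir/loader" "$bootdir/loader.old"
}

# restore_loader BOOTDIR
#   Set the broken loader aside as loader.new and put loader.old back.
restore_loader() {
    bootdir=$1
    [ -f "$bootdir/loader.old" ] || { echo "no loader.old in $bootdir" >&2; return 1; }
    mv "$bootdir/loader" "$bootdir/loader.new" &&
    cp -p "$bootdir/loader.old" "$bootdir/loader"
}

# Usage (not executed here; /mnt is a hypothetical fixit mount point):
#   save_loader /boot            # before installworld
#   restore_loader /mnt/boot     # from the fixit floppies
```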


Robert Watson wrote:
> 
> Upgraded a box to yesterday's -CURRENT, and am experiencing two problems:
> 
> (1) the machine spins rebooting after loading /boot/loader.  I don't get a
> chance to interrupt the boot once /boot/loader starts.  Unfortunately, my
> serial console support also seems to be broken, so I can't read the error
> that flicks up before the reboot.
> 
> (2) if I try to boot /boot/kernel/kernel directly, rather than via
> /boot/loader, it hangs in the twiddling bar.
> 
> If I load the old loader and kernel, things work fine.  I'm currently
> trying to diagnose the serial console problem, and will post more as I
> figure something out.
> 
> There are reports on that channel about other machines having the same
> problem, so if you're upgrading, make sure to keep an old loader around.
> 
> Robert N M Watson FreeBSD Core Team, TrustedBSD Project
> [EMAIL PROTECTED]  NAI Labs, Safeport Network Services
> 
> To Unsubscribe: send mail to [EMAIL PROTECTED]
> with "unsubscribe freebsd-current" in the body of the message




miniperl not found error?

2000-09-14 Thread Joel M. Baldwin


ok, I give up!

I've been getting this error for ages when doing a
'make depend' on the current tree and up to now
I've just done a 'make -k' to get it to work.

What's the final solution so I don't have to do this?

What stupid thing have I missed somewhere along the line?

===> gnu/usr.bin/perl/perl
Extracting config.h (with variable substitutions)
Extracting cflags (with variable substitutions)
Extracting writemain (with variable substitutions)
Extracting myconfig (with variable substitutions)
/usr/obj/disk2/usr.src/gnu/usr.bin/perl/perl/../miniperl/miniperl: not found
*** Error code 127
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error
su-2.02#

