malloc message with nfs transfer

2003-08-21 Thread cosmin
malloc() of "64" with the following non-sleepable locks held:
exclusive sleep mutex inpr = 0 (0xcef0) locked @ /usr/src/sys/netinet/udp_usrreq.c:378
exclusive sleep mutex netisr lock r = 0 (0xc061be80) locked @ /usr/src/sys/net/netisr.c:215

I'm getting those on the console, and it seems that they only happen when 
users start an nfs transfer to the nfs exported filesystem.  The exported 
filesystem is a vinum raid5 array but I don't know if that has anything to 
do with the messages.

Before I upgraded from 4.8, I used to be able to send at about 8mb/s to the nfs 
exported raid5.  After upgrading to 5.1-CURRENT, the maximum speed has been only 
4mb/s.  I'm wondering if the messages above have anything to do with the performance 
drop.


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: malloc message with nfs transfer

2003-08-21 Thread cosmin
On Thu, Aug 21, 2003 at 02:37:34PM -0400, Robert Watson wrote:
> 
> On Thu, 21 Aug 2003, cosmin wrote:
> 
> > malloc() of "64" with the following non-sleepable locks held:  exclusive
> > sleep mutex inpr = 0 (0xcef0) locked @
> > /usr/src/sys/netinet/udp_usrreq.c:378 exclusive sleep mutex netisr lock
> > r = 0 (0xc061be80) locked @ /usr/src/sys/net/netisr.c:215
> > 
> > I'm getting those on the console, and it seems that they only happen
> > when users start an nfs transfer to the nfs exported filesystem.  The
> > exported filesystem is a vinum raid5 array but I don't know if that has
> > anything to do with the messages.
> 
> Sorry, just to be clear -- is the message you're getting on the NFS
> client, or the NFS server?  Could you turn on debug.witness_ddb and get a
> stack trace for the warning?

This is on the NFS server.  I turned on debug.witness_ddb, but I'm not sure if this 
will help, because the system isn't locking up, or otherwise stopping.  I have tried 
setting a breakpoint in ddb for 0xcef0, but it starts breaking right away.  The 
malloc() messages are many minutes apart.

I'm not sure if these messages indicate anything critical.  I was mainly concerned 
with the nfs performance.

I tried reading the developer's handbook to figure out how to make it break only when 
there's a malloc message but right now I'm stuck.

> 
> > Before I upgraded from 4.8, I used to be able to send at about 8mb/s to
> > the nfs exported raid5.  After upgrading to 5.1-CURRENT, the maximum
> > speed has been only 4mb/s.  I'm wondering if the messages above have
> > anything to do with the performance drop. 
> 
> You appear to have the kernel debugging features turned to high (which
> will be useful for resolving this problem :-).  Turn off WITNESS and
> INVARIANTS and you should see a substantial performance improvement.  It
> may not be back up to 4.x levels -- we hope that with ongoing network
> stack locking work we'll be back to 4.x (and exceed them) in the next few
> months.
> 
> Thanks,
> 
> Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
> [EMAIL PROTECTED]  Network Associates Laboratories
> 


LOR (swap_pager.c , uma_core.c)

2003-11-11 Thread cosmin
I'm getting a lock order reversal when the system first boots, and that seems to be 
the only time it happens.

Nov 11 15:16:31 cosmin kernel: lock order reversal
Nov 11 15:16:31 cosmin kernel: 1st 0xc1ee165c vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
Nov 11 15:16:31 cosmin kernel: 2nd 0xc07100c0 swap_pager swhash (swap_pager swhash) @ /usr/src/sys/vm/swap_pager.c:1838
Nov 11 15:16:31 cosmin kernel: 3rd 0xc0c3440c vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
Nov 11 15:16:31 cosmin kernel: Stack backtrace:
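
For readers unfamiliar with the three-line report, here is a toy model of the check, assuming WITNESS keys its ordering on lock names rather than on individual lock addresses. The OrderChecker class and its methods are made up for illustration and are not kernel code.

```python
class OrderChecker:
    """Toy model of lock-order checking keyed by lock name."""

    def __init__(self):
        self.seen = set()  # (earlier, later) name pairs observed so far
        self.held = []     # names of locks currently held, oldest first

    def acquire(self, name):
        """Take a lock; return (held, wanted) pairs that reverse a known order."""
        reversals = [
            (held, name)
            for held in self.held
            if held != name and (name, held) in self.seen
        ]
        # Remember the order in which this acquisition happened.
        for held in self.held:
            if held != name:
                self.seen.add((held, name))
        self.held.append(name)
        return reversals

    def release(self, name):
        self.held.remove(name)


# Replay the sequence from the report: 1st, 2nd, 3rd.
checker = OrderChecker()
for lock in ("vm object", "swap_pager swhash", "vm object"):
    for held, wanted in checker.acquire(lock):
        print(f"reversal: {wanted!r} requested while holding {held!r}")
```

In this model the third acquisition is flagged because "vm object" was first seen before "swap_pager swhash" and is now requested after it, even though the 1st and 3rd entries are different lock instances.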

I'm running the sources from yesterday:

FreeBSD cosmin.phy.uic.edu 5.1-CURRENT FreeBSD 5.1-CURRENT #7: Mon Nov 10 12:15:53 CST 
2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GALAXY  i386

If you need more information, please let me know.

Cosmin Stroe.


NFS problem (non-sleepable locks held)

2003-11-11 Thread cosmin
I'm getting the following message when transferring data to a FreeBSD-CURRENT server 
over an NFS mount from another FreeBSD client.  

malloc() of "64" with the following non-sleepable locks held:
exclusive sleep mutex inp r = 0 (0xc1d250ac) locked @ /usr/src/sys/netinet/udp_usrreq.c:378

The message shows up 12 times and then it doesn't show up anymore, even if I stop the 
transfer and start it again.  This server uses the nge driver for its network card.  
It's running the sources from yesterday, Nov 10 2003.
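
To count how often each lock shows up in warnings like the one above, the console text can be parsed with a short script. This is a minimal sketch that assumes the warning format shown in this message stays the same; LOCK_RE and held_locks are illustrative names, not part of any FreeBSD tool.

```python
import re
from collections import Counter

# One held-lock line of a WITNESS warning. The path may be wrapped onto
# the next line by the mail client, so whitespace after "@" is flexible.
LOCK_RE = re.compile(
    r"exclusive sleep mutex (?P<name>.+?) r = (?P<recurse>\d+) "
    r"\((?P<addr>0x[0-9a-f]+)\) locked @\s+(?P<file>\S+):(?P<line>\d+)"
)


def held_locks(console_text):
    """Return (lock name, source file, line number) for every held lock."""
    return [
        (m.group("name"), m.group("file"), int(m.group("line")))
        for m in LOCK_RE.finditer(console_text)
    ]


sample = """malloc() of "64" with the following non-sleepable locks held:
exclusive sleep mutex inp r = 0 (0xc1d250ac) locked @
/usr/src/sys/netinet/udp_usrreq.c:378
"""

# Tally identical (name, file, line) triples across the whole log.
for (name, path, line), count in Counter(held_locks(sample)).items():
    print(f"{count}x {name} @ {path}:{line}")
```

Feeding it the full console log instead of the one-warning sample would show whether all 12 occurrences point at the same lock and source line.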

I've been having problems with one of our machines freezing up during long NFS 
transfers, and now I'm trying to reproduce the freeze on this test machine.  So far 
no luck, and the only oddity I've seen is the above message.

Could the above message be causing the freezes?

Cosmin Stroe.



Vinum doesn't work anymore

2003-11-13 Thread cosmin
Hello, I'm having major problems with the latest kernel sources and vinum.  I don't 
really know where to begin debugging.  For now, I'll have to run older sources just to 
have vinum working.

Vinum isn't able to detect my volume, even though it worked fine before the install 
and reboot. Here's some info.

Script started on Thu Nov 13 02:14:17 2003
# kldunload vinum.ko 
# ls -la /boot/kernel/vinum.ko 
-r-xr-xr-x  1 root  wheel  97304 Nov 13 02:10 /boot/kernel/vinum.ko 
# ls -la /sbin/vinum 
-r-xr-xr-x  1 root  wheel  752300 Nov 13 01:58 /sbin/vinum 
# kldload /boot/kernel/vinum.ko 
# /sbin/vinum 
vinum -> start 
** no drives found: No such file or directory 
vinum -> dumpconfig 
Drive a:Device /dev/ad1s1e 
Created on  at Sat Jul 19 03:10:25 2003 
Config last updated Thu Nov 13 01:45:34 2003 
Size: 200047002624 bytes (190779 MB) 
volume raid5 state up 
plex name raid5.p0 state up org raid5 1020s vol raid5  
sd name raid5.p0.s0 drive a len 390715080s driveoffset 265s state up plex raid5.p0 plexoffset 0s
sd name raid5.p0.s1 drive b len 390715080s driveoffset 265s state up plex raid5.p0 plexoffset 1020s
sd name raid5.p0.s2 drive c len 390715080s driveoffset 265s state up plex raid5.p0 plexoffset 2040s
sd name raid5.p0.s3 drive d len 390715080s driveoffset 265s state up plex raid5.p0 plexoffset 3060s
sd name raid5.p0.s4 drive e len 390715080s driveoffset 265s state up plex raid5.p0 plexoffset 4080s
sd name raid5.p0.s5 drive f len 390715080s driveoffset 265s state up plex raid5.p0 plexoffset 5100s
 
[... etc for all the drives ...]

vinum -> read /dev/ad1 
** no drives found: No such file or directory 
Can't save Vinum config: No child processes 
vinum -> quit 
# exit 

Script done on Thu Nov 13 02:15:17 2003
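
The numbers in the dumpconfig output can be sanity-checked with a little arithmetic. The sketch below assumes that the "s" suffix counts 512-byte sectors; the function names are illustrative, and the subdisk count passed in is only an example, since the listing above is abbreviated.

```python
SECTOR_BYTES = 512  # assumption: vinum's "s" suffix counts 512-byte sectors

# Figures copied from the dumpconfig output above.
DRIVE_BYTES = 200047002624
SUBDISK_SECTORS = 390715080
DRIVE_OFFSET_SECTORS = 265
STRIPE_SECTORS = 1020


def plex_offsets(stripe_sectors, subdisks):
    """Plex offset (in sectors) of each subdisk: one stripe per subdisk."""
    return [i * stripe_sectors for i in range(subdisks)]


def raid5_usable_bytes(subdisk_sectors, subdisks):
    """Usable bytes of a RAID-5 plex: one subdisk's worth goes to parity."""
    return (subdisks - 1) * subdisk_sectors * SECTOR_BYTES


# Drive size in MB, to compare with the "Size:" line.
print(DRIVE_BYTES // 2**20, "MB per drive")
# A subdisk should hold a whole number of stripes.
print(SUBDISK_SECTORS % STRIPE_SECTORS, "sectors left over after whole stripes")
# The subdisk plus its offset has to fit on the drive.
print((DRIVE_OFFSET_SECTORS + SUBDISK_SECTORS) * SECTOR_BYTES <= DRIVE_BYTES)
# Offsets for the six subdisks that are listed explicitly.
print(plex_offsets(STRIPE_SECTORS, 6))
```

Calling raid5_usable_bytes(SUBDISK_SECTORS, n) with the real number of subdisks n gives the expected volume size, which is a quick way to confirm that the on-disk configuration itself is still self-consistent.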


Also, when I tried to fix this problem by compiling vinum into the kernel 
(it might help, though probably not), I got a compile error.

The kernel is just the GENERIC kernel with device vinum added at the end of it.
Here is the error:


cc -c -O -pipe -mcpu=pentiumpro -Wall -Wredundant-decls -Wnested-externs 
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  
-fformat-extensions -std=c99 -g -nostdinc -I-  -I. -I/usr/src/sys 
-I/usr/src/sys/contrib/dev/acpica -I/usr/src/sys/contrib/ipfilter 
-I/usr/src/sys/contrib/dev/ath -I/usr/src/sys/contrib/dev/ath/freebsd 
-I/usr/src/sys/contrib/ngatm -D_KERNEL -include opt_global.h -fno-common 
-finline-limit=15000 -fno-strict-aliasing  -mno-align-long-strings 
-mpreferred-stack-boundary=2 -ffreestanding -Werror  /usr/src/sys/dev/vinum/vinum.c
/usr/src/sys/dev/vinum/vinum.c: In function `vinumattach':
/usr/src/sys/dev/vinum/vinum.c:136: error: structure has no member named `p_intr_nesting_level'
/usr/src/sys/dev/vinum/vinum.c:143: error: structure has no member named `p_intr_nesting_level'
/usr/src/sys/dev/vinum/vinum.c:150: error: structure has no member named `p_intr_nesting_level'
/usr/src/sys/dev/vinum/vinum.c:162: error: structure has no member named `p_intr_nesting_level'
*** Error code 1

Stop in /usr/obj/usr/src/sys/GENERIC_WITH_VINUM.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.

The sources are freshly cvsuped from Nov 12, 2003.

Cosmin Stroe.



exclusive sleep mutex ... /usr/src/sys/kern/kern_synch.c:293

2003-11-14 Thread Cosmin Stroe
Hello,

I'm getting the following messages on my console with a compile from today's (Nov 14, 
2003) sources:

Nov 14 19:38:26  syslogd: /var/log/debug.log: No such file or directory
Nov 14 19:38:26 cosmin syslogd: kernel boot file is /boot/kernel/kernel
checking stopevent 2 with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc1cb8aa8) locked @ 
/usr/src/sys/kern/kern_synch.c:293
Debugger("witness_warn")
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db> trace
Debugger(c0675228,c93f4b88,1,c93f4b84,0) at Debugger+0x54
witness_warn(5,c1c0fcc8,c068d494,2,c06f37a0) at witness_warn+0x19f
issignal(c1bb8dc0,2,c068fc5b,bd,c1c0fcc8) at issignal+0x16b
cursig(c1bb8dc0,0,c0690152,125,1) at cursig+0xe8
msleep(c1c0fc5c,c1c0fcc8,15c,c068fb80,0) at msleep+0x631
wait1(c1bb8dc0,c93f4d10,0,c93f4d40,c065bca0) at wait1+0x990
wait4(c1bb8dc0,c93f4d10,c06a868e,3ee,4) at wait4+0x20
syscall(2f,2f,2f,bfbfeec0,bfbfeec0) at syscall+0x2e0
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (7, FreeBSD ELF32, wait4), eip = 0x280d0b1f, esp = 0xbfbfe84c, ebp = 
0xbfbfe868 ---
db> 


Also, if I don't have debug.witness_ddb=1, I get the following messages:

checking stopevent 2 with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc1cbdaa8) locked @ 
/usr/src/sys/kern/kern_synch.c:293
checking stopevent 2 with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc1cbdaa8) locked @ 
/usr/src/sys/kern/subr_trap.c:260
checking stopevent 2 with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc1cbdaa8) locked @ 
/usr/src/sys/kern/subr_trap.c:260

I will also try a buildworld and installworld using the same sources; maybe that will 
fix it.
If you need more information, please ask.

Here is the full dmesg:

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #10: Fri Nov 14 18:35:12 CST 2003
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GALAXY
Preloaded elf kernel "/boot/kernel/kernel" at 0xc07cf000.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) Processor (1100.05-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x671  Stepping = 1
  
Features=0x183f9ff
  AMD Features=0xc048
real memory  = 134217728 (128 MB)
avail memory = 125054976 (119 MB)
Pentium Pro MTRR support enabled
npx0: [FAST]
npx0:  on motherboard
npx0: INT 16 interface
pcibios: BIOS version 2.10
Using $PIR table, 7 entries at 0xc00fdc90
pcib0:  at pcibus 0 on motherboard
pci0:  on pcib0
pci_cfgintr: 0:8 INTA BIOS irq 10
pci_cfgintr: 0:9 INTA BIOS irq 7
pci_cfgintr: 0:9 INTB BIOS irq 11
pci_cfgintr: 0:9 INTC BIOS irq 5
pci_cfgintr: 0:10 INTA BIOS irq 11
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
pci_cfgintr: 0:1 INTA routed to irq 10
pcib1: slot 0 INTA is routed to irq 10
pci1:  at device 0.0 (no driver attached)
isab0:  at device 7.0 on pci0
isa0:  on isab0
atapci0:  port 0xc000-0xc00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
nge0:  port 0xcc00-0xccff mem 
0xdc005000-0xdc005fff irq 10 at device 8.0 on pci0
nge0: Ethernet address: 00:50:ba:39:06:d6
miibus0:  on nge0
nsgphy0:  on miibus0
nsgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, 
auto
uhci0:  port 0xd000-0xd01f irq 7 at device 9.0 on pci0
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub0: port error, restarting port 1
uhub0: port error, giving up port 1
uhub0: port error, restarting port 2
uhub0: port error, giving up port 2
uhci1:  port 0xd400-0xd41f irq 11 at device 9.1 on pci0
usb1:  on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub1: port error, restarting port 1
uhub1: port error, giving up port 1
uhub1: port error, restarting port 2
uhub1: port error, giving up port 2
pci0:  at device 9.2 (no driver attached)
atapci1

Re: checking stopevent 2!

2003-11-15 Thread Cosmin Stroe
On Sat, Nov 15, 2003 at 09:38:37AM -0500, Robert Watson wrote:
> 
> On Sat, 15 Nov 2003, Andy Farkas wrote:
> 
> would probably be useful if you could drop to DDB and generate a trace for
> the event.
> 

I've done that, in this email message:

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=2157067+0+current/freebsd-current

> > 
> > ...
> > Nov 15 16:05:44  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:44  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/kern_condvar.c:289
> > Nov 15 16:05:44  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:44  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:44  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:45  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:45  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:45  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4663aa8) locked @ /hummer/src-current/src/sys/kern/kern_synch.c:293
> > Nov 15 16:05:45  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:45  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4663aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:45  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:45  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4663aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:45  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:45  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/kern_condvar.c:289
> > Nov 15 16:05:45  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:46  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:46  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:46  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/kern_condvar.c:289
> > Nov 15 16:05:46  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:46  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > Nov 15 16:05:46  hummer kernel: checking stopevent 2 with the following 
> > non-sleepable locks held:
> > Nov 15 16:05:46  hummer kernel: exclusive sleep mutex sigacts r = 0 
> > (0xc4656aa8) locked @ /hummer/src-current/src/sys/kern/subr_trap.c:260
> > ...
> > 
> > 
> > 
> > This is latest -current (cvsup'd a few hours ago)
> > 
> > 
> > --
> > 
> >  :{ [EMAIL PROTECTED]
> > 
> > Andy Farkas
> > System Administrator
> >Speednet Communications
> >  http://www.speednet.com.au/
> > 
> > 


Cosmin Stroe


LOR (swap_pager.c:1323, swap_pager.c:1838, uma_core.c:876) (current:Nov17)

2003-11-18 Thread Cosmin Stroe
Here is the stack backtrace:

lock order reversal
 1st 0xc1da318c vm object (vm object) @ /usr/src/sys/vm/swap_pager.c:1323
 2nd 0xc0724900 swap_pager swhash (swap_pager swhash) @ 
/usr/src/sys/vm/swap_pager.c:1838
 3rd 0xc0c358c4 vm object (vm object) @ /usr/src/sys/vm/uma_core.c:876
Stack backtrace:
backtrace(c0692be9,c0c358c4,c06a376c,c06a376c,c06a464d) at backtrace+0x17
witness_lock(c0c358c4,8,c06a464d,36c,1) at witness_lock+0x672
_mtx_lock_flags(c0c358c4,0,c06a464d,36c,1) at _mtx_lock_flags+0xba
obj_alloc(c0c22480,1000,c976f9db,101,c06f3f50) at obj_alloc+0x3f
slab_zalloc(c0c22480,1,c06a464d,68c,c0c22494) at slab_zalloc+0xb3
uma_zone_slab(c0c22480,1,c06a464d,68c,c0c22520) at uma_zone_slab+0xd6
uma_zalloc_internal(c0c22480,0,1,5c1,72e,c06f55a8) at uma_zalloc_internal+0x3e
uma_zalloc_arg(c0c22480,0,1,72e,2) at uma_zalloc_arg+0x3ab
swp_pager_meta_build(c1da318c,7,0,2,0) at swp_pager_meta_build+0x174
swap_pager_putpages(c1da318c,c976fbb8,8,0,c976fb20) at swap_pager_putpages+0x32d
default_pager_putpages(c1da318c,c976fbb8,8,0,c976fb20) at default_pager_putpages+0x2e
vm_pageout_flush(c976fbb8,8,0,0,c06f36a0) at vm_pageout_flush+0x17a
vm_pageout_clean(c0dae2d8,0,c06a4468,32a,0) at vm_pageout_clean+0x305
vm_pageout_scan(0,0,c06a4468,5a9,1f4) at vm_pageout_scan+0x65f
vm_pageout(0,c976fd48,c068d4ed,311,0) at vm_pageout+0x31b
fork_exit(c0625250,0,c976fd48) at fork_exit+0xb4
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xc976fd7c, ebp = 0 ---
Debugger("witness_lock")
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db>

I'm running the sources from yesterday, Nov 17:

FreeBSD 5.1-CURRENT #0: Mon Nov 17 06:40:05 CST 2003 
root@:/usr/obj/usr/src/sys/GALAXY 



Re: requesting vinum help

2003-11-26 Thread Cosmin Stroe
On Wed, 26 Nov 2003, Poul-Henning Kamp wrote:

> In message <[EMAIL PROTECTED]>, "Joel M. Baldwin" writes:
> 
> >I was trying to use some restraint and not rant and rave in public like
> >I wanted to do.  I'm rather miffed that nothing appeared in UPDATING.
> >Rather than an unproductive public RANT I thought I'd ask for private assistance.
> >I can post a summary afterwards if you like, or even better write a better
> >FAQ/tutorial on vinum.
> 
> Joel,
> 
> The problem is that vinum is a hot political potato in the project.
> 
> In the eyes of a fair number of competent people, vinum has never
> quite "made it".  I think most of them have given it a shot and
> lost data to it.  Some of them, after looking in the code to "fix
> the problem", said "never again!" and now hate vinum of a good
> heart.
> 
> Greg disclaimed maintainership of vinum some time ago for reasons
> of politics, and he now is of the opinion that it is everybody
> else's task to maintain vinum.  Everybody else disagrees and believes
> that "vinum is very much Greg's own problem".
> 
> With Greg being a core@ member, and well known for his ability to
> talk an Arcturan megadonkey into taking a stroll after first having
> talked its legs off about procedural issues, "Doing something about
> vinum" is permanently on the "we should really..." list and everybody
> hopes somebody else will "deal with it".  Of course, in the end
> nobody does.
> 
> As matters stand, we are doing our users a disservice by continuing
> to pretend everything is OK when in fact it is not at all.
> 
> Personally, I think vinum(8) should not be in our 5-STABLE featureset
> if it is not brought up to current standards and actively maintained.
> 
> But at the very least we should have the release notes reflect that
> vinum is unmaintained and believed to be unreliable, and have vinum(8)
> issue a very stern warning to people along those lines.
> 
> I'm sure that a major bikeshed will now ensue and people will argue
> that there is a lot more to this dispute than what I've said above.
> 
> They're right of course, this is a very short summary :-)
> 
> Poul-Henning
> 
> 


I am using vinum atm, and I am having serious problems with it.  After 
about 16 hrs of writing data to a vinum volume via NFS at a constant data 
stream of 200k/sec and reading at 400k/sec at the same time, the whole 
machine just freezes, hard.  The only thing I can do is reboot.  This 
behavior appears in 4.8 and 5-CURRENT.  I have no indication of what is 
wrong, or how to go about finding it out.  The problem is either with NFS 
or Vinum, and I'm leaning towards Vinum (because of the failure in both 
-STABLE and -CURRENT).

I'm not the kind of person that relies on other people, and I like to fix 
my own problems, but this is a problem which I cannot fix at this time.  
So, I'm planning to look through the code of vinum and start messing with 
it to figure out how it works and how to debug it.  This is how important 
Vinum is to me at the moment.

I'm not a kernel coder, or an intense coder in general (but I'm proficient 
in C/C++, and have used FreeBSD for quite some years now), so I'm reading  
the Kernel Developer's Handbook as a starting point.  If anyone has other 
online documentation on FreeBSD Kernel programming, it would be much 
appreciated.  

What would also be appreciated is an overall "map" of how vinum is 
organized and how it works.  Otherwise, I'll have to painstakingly 
go through the code and figure everything out little by little 
(which I plan to do anyway, but knowing how Vinum works makes everything 
easier to follow and takes less time).

Thank you in advance.

Cosmin Stroe.



Re: requesting vinum help

2003-11-26 Thread Cosmin Stroe
On Thu, 27 Nov 2003, Greg 'groggy' Lehey wrote:

> On Wednesday, 26 November 2003 at 12:04:52 -0600, Cosmin Stroe wrote:
> >
> > I am using vinum atm, and I am having serious problems with it.  After
> > about 16 hrs of writing data to a vinum volume via NFS at a constant data
> > stream of 200k/sec and reading at 400k/sec at the same time, the whole
> > machine just freezes, hard.  The only thing I can do is reboot.  This
> > behavior appears in 4.8 and 5-CURRENT.  I have no indication of what is
> > wrong, or how to go about finding it out.  The problem is either with NFS
> > or Vinum, and I'm leaning towards Vinum (because of the failure in both
> > -STABLE and -CURRENT).
> >
> > I'm not the kind of person that relies on other people, and I like to fix
> > my own problems, but this is a problem which I cannot fix at this time.
> > So, I'm planning to look through the code of vinum and start messing with
> > it to figure out how it works and how to debug it.
> 
> This is unlikely to get you very far.  Some more details (offline if
> you prefer) would be handy, but as you say, you can't even be sure
> that it's Vinum.  The best thing would be to get the system into the
> kernel debugger at the point of freeze, if that's possible, and try to
> work out what has happened.
> 

Quick question:  If this is a software problem with vinum, there should be 
no way it can hard lock a machine.  Is this assumption correct?  I should 
be able to invoke the kernel debugger by pressing the hotkey 
(ctrl+alt+esc) while the machine is locked and get a backtrace (although I'd 
be in an ISR servicing the hotkey, so I'm not sure it'd do much good).

Any special suggestions on debugging this kind of freezing problem?  The 
hardware (CPU, RAM, HDs) has been tested and it's good.  Would some kind of 
software watchdog help?
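
One low-tech stand-in for a software watchdog, sketched under the assumption that the disk keeps accepting writes until the moment of the freeze: append a timestamp to a file at a fixed interval and sync it, then read the last entry after the reboot to see roughly when the machine stopped. It gives no backtrace, only a time window. The function names and the log path are hypothetical.

```python
import os
import time


def beat(path, now=None):
    """Append one timestamp to the heartbeat file and force it to disk."""
    stamp = time.time() if now is None else now
    with open(path, "a") as f:
        f.write(f"{stamp:.0f}\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the entry survives a hard freeze
    return stamp


def last_beat(path):
    """Return the most recent timestamp in the file, or None if it is empty."""
    with open(path) as f:
        entries = f.read().split()
    return float(entries[-1]) if entries else None


def run(path, interval_s=10, beats=None):
    """Beat forever (or `beats` times), sleeping `interval_s` between beats."""
    count = 0
    while beats is None or count < beats:
        beat(path)
        count += 1
        time.sleep(interval_s)
```

Running run("/var/tmp/heartbeat.log") in the background during a transfer and calling last_beat() on the same file after the reboot narrows the freeze down to one interval, which can then be matched against the NFS client's logs.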


> > What would also be appreciated is an overall "map" of how vinum is
> > organized and how it works.
> 
> You've read the documentation on http://www.vinumvm.org/, right?  If
> you have any questions, I'm sure it can be improved on.
> 

Yes :).

> Greg
> --
> See complete headers for address and phone numbers.
> 


Cosmin Stroe.
