Re: amd(8) cores dump when load high

2008-12-27 Thread Rong-en Fan
On Sat, Dec 27, 2008 at 7:03 PM, Danny Braniss da...@cs.huji.ac.il wrote:
 No, we are not running amd with -S.

 # ps auxww | grep amd
 root  706  0.0  0.1  7660  5416  ??  Ss   Wed05PM   4:48.12
 /usr/sbin/amd -p -k amd64 -x all /net amd.map

 Well, I'm running 7.1-PRERELEASE. What do the amd logs show?

[...]
 Dec 27 10:37:01 sf-02 amd[857]: Locked process pages in memory

Hmm, interesting. I got this instead:

Dec 26 15:32:11 bsd2 amd[39723]: Couldn't lock process pages in memory using mlockall(): Resource temporarily unavailable

That was with 7-STABLE from around Sep 4. I have not set plock = no in
amd.conf, so amd is plock'ed by default.
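
The two outcomes quoted in this thread (pages locked versus the mlockall() failure) can be told apart mechanically when comparing several hosts. The following is a minimal Python sketch, not something that ships with amd. The function name `classify_amd_lines` is made up for illustration, and the message prefixes are copied from the log lines quoted here, so they may differ in other am-utils versions.

```python
import re

# Message prefixes copied from the log lines quoted in this thread.
LOCKED = "Locked process pages in memory"
NOT_LOCKED = "Couldn't lock process pages in memory"

# Matches the "amd[PID]: message" part of a syslog line.
AMD_TAG = re.compile(r"\bamd\[(\d+)\]: (.*)$")


def classify_amd_lines(lines):
    """Return (pid, status) pairs for amd page-locking messages.

    status is "locked" or "not-locked"; unrelated lines are skipped.
    """
    results = []
    for line in lines:
        match = AMD_TAG.search(line)
        if match is None:
            continue
        pid, message = int(match.group(1)), match.group(2)
        if message.startswith(NOT_LOCKED):
            results.append((pid, "not-locked"))
        elif message.startswith(LOCKED):
            results.append((pid, "locked"))
    return results
```

Assuming amd logs to /var/log/messages on your system (this depends on syslog.conf), `classify_amd_lines(open("/var/log/messages"))` would list which amd instances managed to lock their pages.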

Regards,
Rong-En Fan
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: amd(8) cores dump when load high

2008-12-26 Thread Rong-en Fan
On Tue, Dec 23, 2008 at 12:44 AM, Lin Jui-Nan Eric eric...@tamama.org wrote:
 Dear listers,

 We recently found that amd frequently dumps core when the load is
 high (about 4~5) after we upgraded world and kernel from 7.0-RELEASE to
 7.1-PRERELEASE.

 I have read -stable and the svn log of 7-STABLE, but could not find a
 report or a solution. Has anyone had the same issue? Thank you very
 much.


In my experience, amd 6.1.5 crashes in low-memory situations,
not necessarily under high load.

Regards,
Rong-En Fan


Re: lock problem: nfs server on FreeBSD 7-stable, client on linux

2008-04-06 Thread Rong-en Fan
On Sun, Apr 6, 2008 at 1:18 PM, Tz-Huan Huang [EMAIL PROTECTED] wrote:
 Hi,

  Thanks for your suggestion, but we don't accept this workaround.

  After a binary search, I found that this commit breaks the working lockd:

   http://lists.freebsd.org/pipermail/cvs-src/2008-March/089037.html

  I have rolled lockd.c back to rev 1.20 on our NFS server and it works
  fine, as before.

Adding dfr@ to the CC list.

I'm curious about this change. Could you check which sockets rpc.lockd
and rpc.statd bind before and after the lockd.c rev 1.21 and 1.22 changes?

Thanks,
Rong-En Fan

  On Thu, Apr 3, 2008 at 1:02 AM, Ken Chen [EMAIL PROTECTED] wrote:
   I have a similar problem with a FreeBSD 7 client + FreeBSD 6 server.
  
Now, I use 'mount_nfs -L' on the client to do local locking only. Of
course, it may cause other problems.
  
  
2008/4/2, Tz-Huan Huang [EMAIL PROTECTED]:
  
  
   
 Hi,

 We have one NFS server (Mar 27's 7-stable, AMD64) and many clients.
 One of the clients is also 7-stable (Mar 30's, i386), and the others are
 Debian Linux. The problem is that fcntl locks work fine on the FreeBSD
 client but not on the Linux ones.

 We have tested Linux server + Linux client, and that works fine.
 The following are all the combinations we have tried:

 FreeBSD server + FreeBSD client: ok
 FreeBSD server + Linux client: fail
 Linux server + Linux client: ok
 Linux server + FreeBSD client: ok

 Is there some issue with 7-stable's rpc.lockd?
 More information will be available if necessary, thanks.
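
The server/client matrix above can be re-checked with a small probe run on each client against a file on the NFS mount. This is a minimal Python sketch, not the test the posters actually used. The function name `try_fcntl_lock` is made up for illustration, and any path you pass in is your own choice.

```python
import fcntl
import os


def try_fcntl_lock(path):
    """Try to take and release an exclusive fcntl()-style lock on path.

    Returns None on success, or the OSError raised by the lock attempt.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # fcntl.lockf() wraps the fcntl() record-locking calls; LOCK_NB
            # asks for an immediate error if the lock is held elsewhere.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            return exc
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return None
    finally:
        os.close(fd)
```

On a healthy mount the call should return None quickly. With a misbehaving lockd the request may instead return an error or simply hang while the client waits for the server, so it is worth running the probe under a timeout.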

 Tz-Huan


Re: SchedULE vs BSD scheduler - Was: HP ProLiant DL360 G5 success stories?

2008-03-14 Thread Rong-en Fan
On Sat, Mar 15, 2008 at 12:14 AM, Christopher Sean Hilton
[EMAIL PROTECTED] wrote:

  On Mar 12, 2008, at 12:05 PM, Oliver Fromme wrote:

  
   Those machines work very well with both FreeBSD 6 and 7.
   If you install FreeBSD 7, remember to enable ULE instead
   of the default BSD scheduler.
  

  What's the advantage of ULE / disadvantage of the default? Is it
  specific to this hardware?

It gives you better performance. You may want to check Kris's slides:

http://people.freebsd.org/~kris/scaling/7.0%20and%20beyond.pdf

Regards,
Rong-En Fan



panic: locking against myself on 7.0-R

2008-03-11 Thread Rong-en Fan
It's 7.0-RELEASE amd64, GENERIC modulo some devices,
using 4BSD, IPSEC, and IPFW. The backtrace seems to be related
to the softupdates code. This box is just an NFS server that serves
~25 6.x + Linux clients.

Any ideas?

Regards,
Rong-En Fan

panic: lockmgr: locking against myself
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x17a
_lockmgr() at _lockmgr+0x85a
getblk() at getblk+0x149
breadn() at breadn+0x3f
bread() at bread+0x1e
indir_trunc() at indir_trunc+0x11f
indir_trunc() at indir_trunc+0x287
indir_trunc() at indir_trunc+0x287
handle_workitem_freeblocks() at handle_workitem_freeblocks+0x2aa
process_worklist_item() at process_worklist_item+0x293
softdep_process_worklist() at softdep_process_worklist+0xed
softdep_flush() at softdep_flush+0x12a
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xb91f1d30, rbp = 0 ---
Uptime: 8d15h33m46s
Physical memory: 3064 MB
Dumping 470 MB: 455 439 423 407 391 375 359 343 327 311 295 279 263
247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7

#0 doadump () at pcpu.h:194
194 pcpu.h: No such file or directory.
 in pcpu.h

#0 doadump () at pcpu.h:194
#1 0x802b1ad8 in boot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:409
#2 0x802b1f37 in panic (fmt=Variable fmt is not available.
) at /usr/src/sys/kern/kern_shutdown.c:563
#3 0x802a258a in _lockmgr (lkp=0xa65f1c38, flags=0,
interlkp=Variable interlkp is not available.
)
 at /usr/src/sys/kern/kern_lock.c:366
#4 0x80319dc9 in getblk (vp=0xff000151e7c0, blkno=21058528,
 size=16384, slpflag=0, slptimeo=0, flags=Variable flags is not available.
) at buf.h:301
#5 0x8031aa8f in breadn (vp=0xff000151e7c0, blkno=Variable
blkno is not available.
)
 at /usr/src/sys/kern/vfs_bio.c:786
#6 0x8031abae in bread (vp=Variable vp is not available.
) at /usr/src/sys/kern/vfs_bio.c:734
#7 0x803e897f in indir_trunc (freeblks=0xff009c6db600,
 dbn=21058528, level=0, lbn=6303756, countp=0xb91f1b10)
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:2866
#8 0x803e8ae7 in indir_trunc (freeblks=0xff009c6db600,
dbn=Variable dbn is not available.
)
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:2892
#9 0x803e8ae7 in indir_trunc (freeblks=0xff009c6db600,
dbn=Variable dbn is not available.
)
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:2892
#10 0x803e8f0a in handle_workitem_freeblocks (
 freeblks=0xff009c6db600, flags=0)
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:2746
#11 0x803ea473 in process_worklist_item
(mp=0xff000159d978, flags=Variable flags is not available.
)
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:963
#12 0x803eb4cd in softdep_process_worklist (mp=0xff000159d978,
 full=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:847
#13 0x803ed42a in softdep_flush ()
 at /usr/src/sys/ufs/ffs/ffs_softdep.c:758
#14 0x802924bf in fork_exit (
 callout=0x803ed300 softdep_flush, arg=0x0,
 frame=0xb91f1c80) at /usr/src/sys/kern/kern_fork.c:781
#15 0x8043170e in fork_trampoline ()
 at /usr/src/sys/amd64/amd64/exception.S:415


Re: broken buildkernel (scsi_low and -Os) and duplicate manpages

2008-02-13 Thread Rong-En Fan
On Wed, Feb 13, 2008 at 10:52:50AM +0100, Christian Brueffer wrote:
 On Wed, Feb 13, 2008 at 11:15:29AM +0200, David Naylor wrote:
  Hi,
  
  Building the kernel with CFLAGS=-Os breaks when compiling module
  scsi_low.  Sorry no output available.
  
  Placing CFLAGS+= -O in the Makefile fixes the problem.  Last build
  with -O2 did work (for everything, world, kernel and ports).
  
  From my research it appears the -Os produces code faster than -O2 and
  generally slower than -O3 but the smallest binary (and quicker compile
  times), does anyone have a better understanding of such things
  (performance and -O? flags).
  
  When doing an installworld DEST=? it fails twice when trying to
  install duplicate man pages:
  1) lib/ncurses/ncurses: tputs.3
  2) share/man/man9: rman_fini.9
  
 
 The rman_fini.9 one was a mistake, I've just fixed it.  Thanks!  rafan@
 (CCed) did the last few ncurses updates.  Rong-En, could you take a look
 at the tputs.3 issue?

Interesting. I actually use installworld with DESTDIR, and it does not
fail. Nevertheless, I have just removed the duplicate (both
curs_terminfo and curs_termcap had a tputs.3 link; since we use termcap
in base, I removed the one that links to curs_terminfo).

Regards,
Rong-En Fan


6.3 panic (seems hptmv related)

2008-02-04 Thread Rong-en Fan
We have a box that runs 6.2-RELEASE smoothly. Once we boot it
with 6.3-RELEASE, it panics in hptmv0. I have a kernel dump available.
Any ideas?

Regards,
Rong-En Fan

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xfffb5444efc5
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0x8032ac66
stack pointer           = 0x10:0xa5334b90
frame pointer           = 0x10:0x0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 22 (irq19: hptmv0)
trap number             = 12
panic: page fault
Uptime: 2m46s
Dumping 1023 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 1023MB (261808 pages) 1007 991 975 959 943 927 911 895 879
863 847 831 815 799 783 767 751 735 719 703 687 671 655 639 623 607
591 575 559 543 527 511 495 479 463 447 431 415 399 383 367 351 335
319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31
15

#0  doadump () at pcpu.h:172
172 __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt full
#0  doadump () at pcpu.h:172
No locals.
#1  0x0004 in ?? ()
No symbol table info available.
#2  0x8021a083 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:409
first_buf_printf = 1
#3  0x8021a686 in panic (fmt=0xff003dba2980 °6¹=)
at /usr/src/sys/kern/kern_shutdown.c:565
bootopt = 260
newpanic = 0
ap = {{gp_offset = 16, fp_offset = 48,
overflow_arg_area = 0xa5334a00,
reg_save_area = 0xa5334930}}
buf = page fault, '\0' repeats 245 times
#4  0x80349e41 in trap_fatal (frame=0xff003dba2980,
eva=18446742975233472176) at /usr/src/sys/amd64/amd64/trap.c:669
code = 12
ss = 12
type = 12
esp = 0
softseg = {ssd_base = 0, ssd_limit = 1048575, ssd_type = 27,
  ssd_dpl = 0, ssd_p = 1, ssd_long = 1, ssd_def32 = 0, ssd_gran = 1}
msg = 0x0
#5  0x8034a1b2 in trap_pfault (frame=0xa5334ae0, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:580
va = 18446744053648515072
vm = (struct vmspace *) 0x0
map = 0x1
rv = 1
ftype = 1 '\001'
p = (struct proc *) 0x0
eva = 18446744053648519109
#6  0x8034a463 in trap (frame=
  {tf_rdi = -2144149195, tf_rsi = -2142164024, tf_rdx = 0, tf_rcx
= 3175162082, tf_r8 = 1536, tf_r9 = 97, tf_rax = -20061032619, tf_rbx
= -2144149195, tf_rbp = 0, tf_r10 = -2142280936, tf_r11 = -2054259168,
tf_r12 = 4, tf_r13 = 0, tf_r14 = -1099503442816, tf_r15 =
-1098476017280, tf_trapno = 12, tf_addr = -20061032507, tf_flags =
-1098476079440, tf_err = 0, tf_rip = -2144162714, tf_cs = 8, tf_rflags
= 66183, tf_rsp = -1523364960, tf_ss = 16})
at /usr/src/sys/amd64/amd64/trap.c:353
p = (struct proc *) 0xff003db936b0
sticks = 4294967295
type = 3
i = 0
ucode = 0
code = 0
#7  0x80334dbb in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:168
No locals.
#8  0x8032ac66 in CheckPendingCall ()
No symbol table info available.
#9  0x8035681c in hpt_intr (arg=0x8032df25)
at /usr/src/sys/dev/hptmv/entry.c:2039
_vbus_p = 0x8032e135
oldspl = 0
#10 0x80200335 in ithread_loop (arg=0xff7ce480)
at /usr/src/sys/kern/kern_intr.c:682
ie = (struct intr_event *) 0xff009800
#11 0x801fed83 in fork_exit (
callout=0x802001f0 ithread_loop, arg=0xff7ce480,
frame=0xa5334c50) at /usr/src/sys/kern/kern_fork.c:788
p = (struct proc *) 0xff003db936b0
#12 0x8033517e in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:411
No locals.
#13 0x in ?? ()
No symbol table info available.
#14 0x in ?? ()
No symbol table info available.
#15 0x0001 in ?? ()
No symbol table info available.
#16 0x in ?? ()
No symbol table info available.
#17 0x in ?? ()
No symbol table info available.
#18 0x in ?? ()
No symbol table info available.
#19 0x in ?? ()
No symbol table info available.
#20 0x in ?? ()
No symbol table info available.
#21 0x in ?? ()
No symbol table info available.
#22 0x in ?? ()
No symbol table info available.
#23 0x in ?? ()
No symbol table info available.
#24 0x in ?? ()
No symbol table info available.
#25 0x in ?? ()
No symbol table info available.
#26 0x in ?? ()
No symbol table info available.
#27 0x in ?? ()
No symbol table info available.
#28 0x in ?? ()
No symbol table info available.
#29 0x in ?? ()
No symbol table info available.
#30

if you see undefined symbol '__mb_sb_limit' on 6.x

2007-11-22 Thread Rong-en Fan
The ctype fix for the UTF-8 locale unfortunately introduced some
new symbols into libc. As a result, binaries built on a system with
that fix cannot be used on an older system. For that reason, the
fix has been backed out of 6-STABLE. If you see the undefined symbol
'__mb_sb_limit', please rebuild the affected binary; everything
will be fine after that. Binaries built between 20071025 and 20071030
are affected.
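
To find candidates for a rebuild, one rough approach is to look for the symbol name inside each binary. The sketch below is a crude heuristic in Python, not an official tool. It only checks whether the file contains the NUL-terminated string `__mb_sb_limit`, which is how a dynamic symbol name would appear in the file's string table. The function names `mentions_symbol` and `affected_binaries` are made up for illustration. A libc that defines the symbol will match too, so treat the hits as candidates only.

```python
SYMBOL = b"__mb_sb_limit"


def mentions_symbol(path, symbol=SYMBOL):
    """Return True if the file contains the symbol name as a NUL-terminated string."""
    with open(path, "rb") as handle:
        data = handle.read()
    return symbol + b"\0" in data


def affected_binaries(paths, symbol=SYMBOL):
    """Return the subset of paths that mention the symbol."""
    return [path for path in paths if mentions_symbol(path, symbol)]
```

For example, passing the files under /usr/local/bin and /usr/local/lib to `affected_binaries` would give a list of binaries to inspect and, if needed, rebuild.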

Sorry for the inconvenience.

Regards,
Rong-En Fan


Re: IBM xSeries 336 dual Xeon hangs on boot when APIC enabled

2007-11-06 Thread Rong-en Fan
On Aug 13, 2006 11:41 PM, Arjan van Leeuwen [EMAIL PROTECTED] wrote:
 I'm trying to boot FreeBSD 6.1-RELEASE/amd64 on an IBM xSeries 336 machine
 with dual Xeons 3.2GHz installed.

 The installation was successful, but
 if I try to boot the SMP kernel, it hangs after detection of SCSI and ATA
 devices (possibly when doing the initialization of the mpt0 RAID controller,
 or when it tries to start the second CPU?).

Recently, I had an opportunity to access one xSeries 336 box. With
7.0-BETA2 amd64, it boots just fine without any tuning. SMP is also
working.

Something must have changed in the past two years. ;-)

Regards,
Rong-En Fan


Re: HEADSUP: don't upgrade to RELENG_6 now [FIXED]

2007-10-30 Thread Rong-en Fan
On 10/30/07, Byung-Hee HWANG [EMAIL PROTECTED] wrote:
 Hello,

 On Thu, 2007-10-25 at 20:51 +0800, Rong-en Fan wrote:
  The breakage introduced by the MFC of ctype(3) after 2007/10/24 14:23 UTC
  is now fixed. Make sure you have lib/Makefile rev 1.205.2.4 before upgrading
  your world. If your world is already broken, please follow the instructions
  in src/UPDATING to recover.

 In that case,
 should I re-build only userland ? or
 should I re-build both userland and kernel?
 (Yep, of course, I have updated source tree with CVSup for now)

If you have already fixed your world, you should be okay. But I suggest
upgrading to the latest 6-STABLE to pick up the ctype ABI
forward-compatibility fix.

Regards,
Rong-En Fan


  Sorry for all the trouble.

 No problem, I'm always OK!

  Regards,
  Rong-En Fan

 Byung-Hee

  On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
   On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
On 10/25/07, Alson van der Meulen [EMAIL PROTECTED] wrote:
 Hello,

 My installworld of RELENG_6 from a few hours ago failed with this error
 (from memory):
 /lib/libncurses.so.6: undefined symbol: __mb_sb_limit

 This broke everything that depended on libncurses, plus PAM. I had to
 force a reboot via DDB and copy /usr/obj/lib/libc/libc.so.6 to /lib
 using binaries from /rescue to fix it so I could run make installworld
 again.
   
I will take a look. Until then, do not upgrade your system.
  
   I did some tests, and it turns out that only RELENG_6 is affected. To be
   more specific, the ncurses library is installed before libc, and that is
   what breaks. For 7 and above, it is fine because we install libc right
   after csu and before everything else.
  
   One way to solve this is to install libc as early as possible, but I
   think that may be too risky during the release cycle, so I would like to
   back out this change and add an UPDATING entry, then check later whether
   we can change the installation order of libc.
  
   If you cvsup'ed after 'Oct 24 14:32:33 2007 UTC', please do not
   upgrade until I send out an all-clear message.
  
   If you have already broken your world, recover this way in *single user* mode:
  
   /rescue/chflags noschg /lib/libc.so.6
   /rescue/cp /usr/obj/usr/src/lib/libc/libc.so.6 /lib/
  
   Then reboot and continue with installworld (this is what I
   just did).
  
   Sorry for all the trouble.
  
   Thanks,
   Rong-En Fan
  
   
Regards,
Rong-En Fan
 --
 After super, can you drive me and the kids to New York in your car?
 That's what I came for.
 -- Kay Adams and Tom Hagen, Chapter 32, page 443




Re: HEADSUP: don't upgrade to RELENG_6 now (7 is fine)

2007-10-25 Thread Rong-en Fan
On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
 On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
  On 10/25/07, Alson van der Meulen [EMAIL PROTECTED] wrote:
   Hello,
  
   My installworld of RELENG_6 from a few hours ago failed with this error
   (from memory):
   /lib/libncurses.so.6: undefined symbol: __mb_sb_limit
  
   This broke everything that depended on libncurses, plus PAM. I had to
   force a reboot via DDB and copy /usr/obj/lib/libc/libc.so.6 to /lib
   using binaries from /rescue to fix it so I could run make installworld
   again.
 
  I will take a look. Until then, do not upgrade your system.

 I did some tests, and it turns out that only RELENG_6 is affected. To be
 more specific, the ncurses library is installed before libc, and that is
 what breaks. For 7 and above, it is fine because we install libc right
 after csu and before everything else.

 One way to solve this is to install libc as early as possible, but I
 think that may be too risky during the release cycle, so I would like to
 back out this change and add an UPDATING entry, then check later whether
 we can change the installation order of libc.

 If you cvsup'ed after 'Oct 24 14:32:33 2007 UTC', please do not
 upgrade until I send out an all-clear message.

 If you have already broken your world, recover this way in *single user* mode:

 /rescue/chflags noschg /lib/libc.so.6
 /rescue/cp /usr/obj/usr/src/lib/libc/libc.so.6 /lib/

 Then reboot and continue with installworld (this is what I
 just did).

 Sorry for all the trouble.

An entry has been added to UPDATING, and I'm working on a proper fix.

Regards,
Rong-En Fan


Re: HEADSUP: don't upgrade to RELENG_6 now (7 is fine)

2007-10-25 Thread Rong-en Fan
On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
 On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
  On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
   On 10/25/07, Alson van der Meulen [EMAIL PROTECTED] wrote:
Hello,
   
My installworld of RELENG_6 from a few hours ago failed with this error
(from memory):
/lib/libncurses.so.6: undefined symbol: __mb_sb_limit
   
This broke everything that depended on libncurses, plus PAM. I had to
force a reboot via DDB and copy /usr/obj/lib/libc/libc.so.6 to /lib
using binaries from /rescue to fix it so I could run make installworld
again.
  
   I will take a look. Until then, do not upgrade your system.
 
  I did some tests, and it turns out that only RELENG_6 is affected. To be
  more specific, the ncurses library is installed before libc, and that is
  what breaks. For 7 and above, it is fine because we install libc right
  after csu and before everything else.
 
  One way to solve this is to install libc as early as possible, but I
  think that may be too risky during the release cycle, so I would like to
  back out this change and add an UPDATING entry, then check later whether
  we can change the installation order of libc.
 
  If you cvsup'ed after 'Oct 24 14:32:33 2007 UTC', please do not
  upgrade until I send out an all-clear message.
 
  If you have already broken your world, recover this way in *single user* mode:
 
  /rescue/chflags noschg /lib/libc.so.6
  /rescue/cp /usr/obj/usr/src/lib/libc/libc.so.6 /lib/
 
  Then reboot and continue with installworld (this is what I
  just did).
 
  Sorry for all the trouble.

 An entry has been added to UPDATING, and I'm working on a proper fix.

If you have updated your source after Oct 24, please apply this
patch

http://people.freebsd.org/~rafan/libc-order.diff

before building world.

Regards,
Rong-En Fan


Re: HEADSUP: don't upgrade to RELENG_6 now [FIXED]

2007-10-25 Thread Rong-en Fan
The breakage introduced by the MFC of ctype(3) after 2007/10/24 14:23 UTC
is now fixed. Make sure you have lib/Makefile rev 1.205.2.4 before upgrading
your world. If your world is already broken, please follow the instructions in
src/UPDATING to recover.

Sorry for all the trouble.

Regards,
Rong-En Fan

On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
 On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
  On 10/25/07, Alson van der Meulen [EMAIL PROTECTED] wrote:
   Hello,
  
   My installworld of RELENG_6 from a few hours ago failed with this error
   (from memory):
   /lib/libncurses.so.6: undefined symbol: __mb_sb_limit
  
   This broke everything that depended on libncurses, plus PAM. I had to
   force a reboot via DDB and copy /usr/obj/lib/libc/libc.so.6 to /lib
   using binaries from /rescue to fix it so I could run make installworld
   again.
 
  I will take a look. Until then, do not upgrade your system.

 I did some tests, and it turns out that only RELENG_6 is affected. To be
 more specific, the ncurses library is installed before libc, and that is
 what breaks. For 7 and above, it is fine because we install libc right
 after csu and before everything else.

 One way to solve this is to install libc as early as possible, but I
 think that may be too risky during the release cycle, so I would like to
 back out this change and add an UPDATING entry, then check later whether
 we can change the installation order of libc.

 If you cvsup'ed after 'Oct 24 14:32:33 2007 UTC', please do not
 upgrade until I send out an all-clear message.

 If you have already broken your world, recover this way in *single user* mode:

 /rescue/chflags noschg /lib/libc.so.6
 /rescue/cp /usr/obj/usr/src/lib/libc/libc.so.6 /lib/

 Then reboot and continue with installworld (this is what I
 just did).

 Sorry for all the trouble.

 Thanks,
 Rong-En Fan

 
  Regards,
  Rong-En Fan


HEADSUP: don't upgrade to RELENG_[67] now (Re: Installworld broken on RELENG_6 by libc commit?)

2007-10-24 Thread Rong-en Fan
On 10/25/07, Alson van der Meulen [EMAIL PROTECTED] wrote:
 Hello,

 My installworld of RELENG_6 from a few hours ago failed with this error
 (from memory):
 /lib/libncurses.so.6: undefined symbol: __mb_sb_limit

 This broke everything that depended on libncurses, plus PAM. I had to
 force a reboot via DDB and copy /usr/obj/lib/libc/libc.so.6 to /lib
 using binaries from /rescue to fix it so I could run make installworld
 again.

I will take a look. Until then, do not upgrade your system.

Regards,
Rong-En Fan


 I upgraded from RELENG_6 of October, 22. I believe I followed the
 procedure from /usr/src/UPDATING fairly closely, except for the reboot
 to single user part after installing the kernel:
 mergemaster -p && make buildworld && make kernel && make installworld &&
 mergemaster && make delete-old

 I would expect libc to be installed before other libs. The securelevel
 was -1, so it should be no problem to overwrite libc.

 Did I do something wrong or is this a bug/missing entry in UPDATING?

 regards,
 Alson

 The csup output since my last make world:
 --
  Running /usr/bin/csup
 --
 Parsing supfile /usr/share/examples/cvsup/stable-supfile
 Connecting to cvsup3.nl.freebsd.org
 Connected to 62.250.3.15
 Server software version: SNAP_16_1h
 Negotiating file attribute support
 Exchanging collection information
 Establishing multiplexed-mode data connection
 Running
 Updating collection src-all/cvs
  Edit src/include/_ctype.h
   Add delta 1.30.2.1 2007.10.24.14.32.32 rafan
  Edit src/include/ctype.h
   Add delta 1.28.8.1 2007.10.24.14.32.32 rafan
  Edit src/lib/libc/locale/big5.c
   Add delta 1.17.2.1 2007.10.24.14.32.32 rafan
  Edit src/lib/libc/locale/euc.c
   Add delta 1.21.2.1 2007.10.24.14.32.32 rafan
  Edit src/lib/libc/locale/gb18030.c
   Add delta 1.7.2.1 2007.10.24.14.32.32 rafan
  Edit src/lib/libc/locale/gb2312.c
   Add delta 1.9.2.1 2007.10.24.14.32.33 rafan
  Edit src/lib/libc/locale/gbk.c
   Add delta 1.12.2.1 2007.10.24.14.32.33 rafan
  Edit src/lib/libc/locale/isctype.c
   Add delta 1.9.14.1 2007.10.24.14.32.33 rafan
  Edit src/lib/libc/locale/mskanji.c
   Add delta 1.17.2.1 2007.10.24.14.32.33 rafan
  Edit src/lib/libc/locale/none.c
   Add delta 1.13.2.1 2007.10.24.14.32.33 rafan
  Edit src/lib/libc/locale/setrunelocale.c
   Add delta 1.45.2.1 2007.10.24.14.32.33 rafan
  Edit src/lib/libc/locale/utf8.c
   Add delta 1.13.2.2 2007.10.24.14.32.33 rafan
  Edit src/lib/libstand/Makefile
   Add delta 1.54.2.1 2007.10.24.11.50.06 nyan
  Edit src/release/Makefile
   Add delta 1.887.2.21 2007.10.23.23.45.14 kensmith
  Edit src/sbin/mount_unionfs/mount_unionfs.8
   Add delta 1.20.2.2 2007.10.23.03.37.09 daichi
  Edit src/share/mklocale/UTF-8.src
   Add delta 1.1.8.2 2007.10.24.14.32.33 rafan
  Edit src/sys/alpha/pci/pcibus.c
   Add delta 1.36.2.2 2007.10.24.12.36.25 jhb
  Edit src/sys/boot/ficl/Makefile
   Add delta 1.41.2.2 2007.10.24.11.50.07 nyan
  Edit src/sys/boot/pc98/Makefile.inc
   Add delta 1.5.8.2 2007.10.24.11.50.07 nyan
  Edit src/sys/conf/newvers.sh
   Add delta 1.69.2.15 2007.10.23.23.41.24 kensmith
  Edit src/sys/ddb/db_command.c
   Add delta 1.60.2.4 2007.10.23.16.07.30 obrien
  Edit src/sys/fs/nullfs/null_subr.c
   Add delta 1.48.2.2 2007.10.23.03.38.31 daichi
  Edit src/sys/fs/nullfs/null_vnops.c
   Add delta 1.87.2.4 2007.10.23.03.38.32 daichi
  Edit src/sys/fs/unionfs/union.h
   Add delta 1.31.2.2 2007.10.23.03.28.22 daichi
   Add delta 1.31.2.3 2007.10.23.03.37.09 daichi
  Edit src/sys/fs/unionfs/union_subr.c
   Add delta 1.86.2.2 2007.10.23.03.22.48 daichi
   Add delta 1.86.2.3 2007.10.23.03.28.22 daichi
  Edit src/sys/fs/unionfs/union_vfsops.c
   Add delta 1.76.2.3 2007.10.23.03.32.17 daichi
   Add delta 1.76.2.4 2007.10.23.03.34.58 daichi
   Add delta 1.76.2.5 2007.10.23.03.37.09 daichi
  Edit src/sys/fs/unionfs/union_vnops.c
   Add delta 1.132.2.2 2007.10.23.03.24.37 daichi
   Add delta 1.132.2.3 2007.10.23.03.26.37 daichi
   Add delta 1.132.2.4 2007.10.23.03.28.22 daichi
   Add delta 1.132.2.5 2007.10.23.03.30.13 daichi
   Add delta 1.132.2.6 2007.10.23.03.32.17 daichi
   Add delta 1.132.2.7 2007.10.23.03.33.43 daichi
   Add delta 1.132.2.8 2007.10.23.03.37.10 daichi
 Shutting down connection to server
 Finished successfully


Re: HEADSUP: don't upgrade to RELENG_6 now (7 is fine)

2007-10-24 Thread Rong-en Fan
On 10/25/07, Rong-en Fan [EMAIL PROTECTED] wrote:
 On 10/25/07, Alson van der Meulen [EMAIL PROTECTED] wrote:
  Hello,
 
  My installworld of RELENG_6 from a few hours ago failed with this error
  (from memory):
  /lib/libncurses.so.6: undefined symbol: __mb_sb_limit
 
  This broke everything that depended on libncurses, plus PAM. I had to
  force a reboot via DDB and copy /usr/obj/lib/libc/libc.so.6 to /lib
  using binaries from /rescue to fix it so I could run make installworld
  again.

 I will take a look. Until then, do not upgrade your system.

I did some tests, and it turns out that only RELENG_6 is affected. To be
more specific, the ncurses library is installed before libc, and that is
what breaks. For 7 and above, it is fine because we install libc right
after csu and before everything else.
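
The ordering argument can be illustrated with a toy model. This is only a Python sketch under simplifying assumptions, not how installworld actually works: it treats each library as a set of symbols it provides and requires, and reports the first library installed before its symbols are available. The function name `first_broken_library` and the two order lists are made up for illustration; only the library names and `__mb_sb_limit` come from this thread.

```python
def first_broken_library(install_order, provides, requires, old_symbols=()):
    """Return (library, missing_symbols) for the first library installed
    before the symbols it needs are available, or None if the order is safe."""
    available = set(old_symbols)
    for library in install_order:
        # A library can always use the symbols it defines itself.
        available |= set(provides.get(library, ()))
        missing = set(requires.get(library, ())) - available
        if missing:
            return library, missing
    return None


NEW_SYMBOL = "__mb_sb_limit"
provides = {"libc": {NEW_SYMBOL}}        # the new libc defines the symbol
requires = {"libncurses": {NEW_SYMBOL}}  # the new libncurses references it

releng_6_order = ["libncurses", "libc"]         # ncurses before libc
releng_7_order = ["csu", "libc", "libncurses"]  # libc right after csu

print(first_broken_library(releng_6_order, provides, requires))
print(first_broken_library(releng_7_order, provides, requires))
```

In this model the first order stops at libncurses with `__mb_sb_limit` missing, while the second order completes, which matches the behaviour described for RELENG_6 and 7.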

One way to solve this is to install libc as early as possible, but I
think that may be too risky during the release cycle, so I would like to
back out this change and add an UPDATING entry, then check later whether
we can change the installation order of libc.

If you cvsup'ed after 'Oct 24 14:32:33 2007 UTC', please do not
upgrade until I send out an all-clear message.
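
If you kept the csup output from your last update, you can check whether it pulled in deltas from the affected window. The following is a minimal Python sketch with some assumptions: it expects "Add delta REV YYYY.MM.DD.HH.MM.SS AUTHOR" lines like those in the csup output quoted in this thread, it treats those timestamps as UTC, and it uses the earliest ctype delta shown there (14:32:32) as the default cutoff. The function names are made up for illustration.

```python
from datetime import datetime, timezone

# Earliest ctype delta in the csup output quoted in this thread (assumed UTC).
CUTOFF = datetime(2007, 10, 24, 14, 32, 32, tzinfo=timezone.utc)


def parse_delta_line(line):
    """Parse an 'Add delta' line into (revision, timestamp, author), or None."""
    fields = line.split()
    if len(fields) != 5 or fields[:2] != ["Add", "delta"]:
        return None
    when = datetime.strptime(fields[3], "%Y.%m.%d.%H.%M.%S")
    return fields[2], when.replace(tzinfo=timezone.utc), fields[4]


def deltas_at_or_after(lines, cutoff=CUTOFF):
    """Return the parsed deltas whose timestamp is at or after the cutoff."""
    hits = []
    for line in lines:
        parsed = parse_delta_line(line)
        if parsed is not None and parsed[1] >= cutoff:
            hits.append(parsed)
    return hits
```

A non-empty result from `deltas_at_or_after(open("csup.log"))`, where csup.log is whatever file you saved the output to, suggests the tree includes the ctype change and should not be installed until the fix is in.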

If you have already broken your world, recover this way in *single user* mode:

/rescue/chflags noschg /lib/libc.so.6
/rescue/cp /usr/obj/usr/src/lib/libc/libc.so.6 /lib/

Then reboot and continue with installworld (this is what I
just did).

Sorry for all the trouble.

Thanks,
Rong-En Fan


 Regards,
 Rong-En Fan

 
  I upgraded from RELENG_6 of October, 22. I believe I followed the
  procedure from /usr/src/UPDATING fairly closely, except for the reboot
  to single user part after installing the kernel:
  mergemaster -p && make buildworld && make kernel && make installworld &&
  mergemaster && make delete-old
 
  I would expect libc to be installed before other libs. The securelevel
  was -1, so it should be no problem to overwrite libc.
 
  Did I do something wrong or is this a bug/missing entry in UPDATING?
 
  regards,
  Alson
 
  The csup output since my last make world:
  --
   Running /usr/bin/csup
  --
  Parsing supfile /usr/share/examples/cvsup/stable-supfile
  Connecting to cvsup3.nl.freebsd.org
  Connected to 62.250.3.15
  Server software version: SNAP_16_1h
  Negotiating file attribute support
  Exchanging collection information
  Establishing multiplexed-mode data connection
  Running
  Updating collection src-all/cvs
   Edit src/include/_ctype.h
Add delta 1.30.2.1 2007.10.24.14.32.32 rafan
   Edit src/include/ctype.h
Add delta 1.28.8.1 2007.10.24.14.32.32 rafan
   Edit src/lib/libc/locale/big5.c
Add delta 1.17.2.1 2007.10.24.14.32.32 rafan
   Edit src/lib/libc/locale/euc.c
Add delta 1.21.2.1 2007.10.24.14.32.32 rafan
   Edit src/lib/libc/locale/gb18030.c
Add delta 1.7.2.1 2007.10.24.14.32.32 rafan
   Edit src/lib/libc/locale/gb2312.c
Add delta 1.9.2.1 2007.10.24.14.32.33 rafan
   Edit src/lib/libc/locale/gbk.c
Add delta 1.12.2.1 2007.10.24.14.32.33 rafan
   Edit src/lib/libc/locale/isctype.c
Add delta 1.9.14.1 2007.10.24.14.32.33 rafan
   Edit src/lib/libc/locale/mskanji.c
Add delta 1.17.2.1 2007.10.24.14.32.33 rafan
   Edit src/lib/libc/locale/none.c
Add delta 1.13.2.1 2007.10.24.14.32.33 rafan
   Edit src/lib/libc/locale/setrunelocale.c
Add delta 1.45.2.1 2007.10.24.14.32.33 rafan
   Edit src/lib/libc/locale/utf8.c
Add delta 1.13.2.2 2007.10.24.14.32.33 rafan
   Edit src/lib/libstand/Makefile
Add delta 1.54.2.1 2007.10.24.11.50.06 nyan
   Edit src/release/Makefile
Add delta 1.887.2.21 2007.10.23.23.45.14 kensmith
   Edit src/sbin/mount_unionfs/mount_unionfs.8
Add delta 1.20.2.2 2007.10.23.03.37.09 daichi
   Edit src/share/mklocale/UTF-8.src
Add delta 1.1.8.2 2007.10.24.14.32.33 rafan
   Edit src/sys/alpha/pci/pcibus.c
Add delta 1.36.2.2 2007.10.24.12.36.25 jhb
   Edit src/sys/boot/ficl/Makefile
Add delta 1.41.2.2 2007.10.24.11.50.07 nyan
   Edit src/sys/boot/pc98/Makefile.inc
Add delta 1.5.8.2 2007.10.24.11.50.07 nyan
   Edit src/sys/conf/newvers.sh
Add delta 1.69.2.15 2007.10.23.23.41.24 kensmith
   Edit src/sys/ddb/db_command.c
Add delta 1.60.2.4 2007.10.23.16.07.30 obrien
   Edit src/sys/fs/nullfs/null_subr.c
Add delta 1.48.2.2 2007.10.23.03.38.31 daichi
   Edit src/sys/fs/nullfs/null_vnops.c
Add delta 1.87.2.4 2007.10.23.03.38.32 daichi
   Edit src/sys/fs/unionfs/union.h
Add delta 1.31.2.2 2007.10.23.03.28.22 daichi
Add delta 1.31.2.3 2007.10.23.03.37.09 daichi
   Edit src/sys/fs/unionfs/union_subr.c
Add delta 1.86.2.2 2007.10.23.03.22.48 daichi
Add delta 1.86.2.3 2007.10.23.03.28.22 daichi
   Edit src/sys/fs/unionfs/union_vfsops.c
Add delta 1.76.2.3 2007.10.23.03.32.17 daichi
Add delta 1.76.2.4 2007.10.23.03.34.58 daichi
Add delta 1.76.2.5 2007.10.23.03.37.09 daichi
   Edit src/sys/fs/unionfs/union_vnops.c
Add delta 1.132.2.2

Re: Installworld broken on RELENG_6 by libc commit?

2007-10-24 Thread Rong-en Fan
On 10/25/07, David Booth [EMAIL PROTECTED] wrote:
 On Wednesday 24 October 2007, Alson van der Meulen wrote:
  Hello,
 
  My installworld of RELENG_6 from a few hours ago failed with this
  error (from memory):
  /lib/libncurses.so.6: undefined symbol: __mb_sb_limit
 
  This broke everything that depended on libncurses, plus PAM. I had
  to force a reboot via DDB and copy /usr/obj/lib/libc/libc.so.6 to
  /lib using binaries from /rescue to fix it so I could run make
  installworld again.
 
  I upgraded from RELENG_6 of October, 22. I believe I followed the
  procedure from /usr/src/UPDATING fairly closely, except for the
  reboot to single user part after installing the kernel:
  mergemaster -p && make buildworld && make kernel && make
  installworld && mergemaster && make delete-old
 
  I would expect libc to be installed before other libs. The
  securelevel was -1, so it should be no problem to overwrite libc.
 
  Did I do something wrong or is this a bug/missing entry in
  UPDATING?
 
  regards,
  Alson
 
 It is not something you did.  I had the same problem and have just
 recovered from it.

Sorry for the trouble, see my HEADSUP message on [EMAIL PROTECTED]

Regards,
Rong-En Fan
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: is read-write nullfs safe?

2007-06-21 Thread Rong-en Fan

On 6/21/07, Peter Jeremy [EMAIL PROTECTED] wrote:

On 2007-Jun-19 02:58:20 -0400, Kris Kennaway [EMAIL PROTECTED] wrote:
On Tue, Jun 19, 2007 at 02:39:22PM +0800, Rong-en Fan wrote:
 I was asking about nullfs because the following lines
 in sys/conf/NOTES:

 # NB: The NULL, PORTAL, UMAP and UNION filesystems are known to be
 # buggy, and WILL panic your system if you attempt to do anything with
 # them.  They are included here as an incentive for some enterprising
 # soul to sit down and fix them.

Yeah, that's almost completely stale for both 6.x and 7.x.

Since this issue pops up fairly regularly, would it be possible to
correct, tone down or remove this warning before 6.3/7.0?


Ya, how about this one:

http://people.freebsd.org/~rafan/remove-nullfs-warning.diff

It just removes NULL from the warning list. If no one objects,
I will ask re@ for approval tomorrow. BTW, since umapfs
is disconnected from build, shall we axe it?

Regards,
Rong-En Fan


Re: is read-write nullfs safe?

2007-06-21 Thread Rong-en Fan

On 6/21/07, Kris Kennaway [EMAIL PROTECTED] wrote:

On Thu, Jun 21, 2007 at 09:49:16PM +0800, Rong-en Fan wrote:
 On 6/21/07, Peter Jeremy [EMAIL PROTECTED] wrote:
 On 2007-Jun-19 02:58:20 -0400, Kris Kennaway [EMAIL PROTECTED] wrote:
 On Tue, Jun 19, 2007 at 02:39:22PM +0800, Rong-en Fan wrote:
  I was asking about nullfs because the following lines
  in sys/conf/NOTES:
 
  # NB: The NULL, PORTAL, UMAP and UNION filesystems are known to be
  # buggy, and WILL panic your system if you attempt to do anything with
  # them.  They are included here as an incentive for some enterprising
  # soul to sit down and fix them.
 
 Yeah, that's almost completely stale for both 6.x and 7.x.
 
 Since this issue pops up fairly regularly, would it be possible to
 correct, tone down or remove this warning before 6.3/7.0?

 Ya, how about this one:

 http://people.freebsd.org/~rafan/remove-nullfs-warning.diff

Maybe note that UNIONFS is being maintained now and is in a much
better state, although there are still some issues being resolved.


OK, I added them. See the patch at the URL above.


 It just removes NULL from the warning list. If no one objects,
 I will ask re@ for approval tomorrow. BTW, since umapfs
 is disconnected from build, shall we axe it?

I would recommend it.

Kris




Re: is read-write nullfs safe?

2007-06-19 Thread Rong-en Fan

On 6/19/07, Josh Paetzel [EMAIL PROTECTED] wrote:

On Tuesday 19 June 2007, Rong-en Fan wrote:
 I'm running 6.2-RELEASE, and I am wondering
 if using nullfs w/ rw is safe in a production environment?
 My impression is that ro nullfs is ok, but not rw.
 Is this still the case?

 Regards,
 Rong-En Fan

I've been using r/w nullfs in production for ages without issue...sure
you're not confusing nullfs with unionfs?


I'm aware of the unionfs status, and I think it's usable
in 7.x, right?

I was asking about nullfs because the following lines
in sys/conf/NOTES:

# NB: The NULL, PORTAL, UMAP and UNION filesystems are known to be
# buggy, and WILL panic your system if you attempt to do anything with
# them.  They are included here as an incentive for some enterprising
# soul to sit down and fix them.

Regards,
Rong-En Fan


--
Thanks,

Josh Paetzel





is read-write nullfs safe?

2007-06-18 Thread Rong-en Fan

I'm running 6.2-RELEASE, and I am wondering
if using nullfs w/ rw is safe in a production environment?
My impression is that ro nullfs is ok, but not rw.
Is this still the case?

Regards,
Rong-En Fan


Re: Unable to install FreeBSD from external USB cdrom

2007-05-27 Thread Rong-en Fan

On 5/28/07, Daniel O'Connor [EMAIL PROTECTED] wrote:

Daniel O'Connor wrote:
 kib@ has real mode BTX code which appears to work with affected
 systems of mine, however, the code has not yet made it into CVS. I
 spliced it into a 6.2 miniboot ISO and it worked.

 Ooh ahh, please sir, can I have some more^Wit? :)

I did some googling.. Is this the patch?

http://people.freebsd.org/~kib/realbtx/realbtx.2.patch

(Going to try it today anyway :)


[I'm CC'ing [EMAIL PROTECTED]]

Yes, there is also a loader/pxeboot in the same directory.
As kib@ told me, do not install this loader on your disk;
doing so may destroy your data.

Regards,
Rong-En Fan



--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
   -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C



Re: Unable to install FreeBSD from external USB cdrom

2007-05-26 Thread Rong-en Fan

On 5/26/07, Bruce M. Simpson [EMAIL PROTECTED] wrote:

Daniel O'Connor wrote:

 I believe this is most likely this issue...
 http://www.nabble.com/BTX-issues-when-booting-from-a-USB-CD-ROM-t3047441.html

 Alas no solution yet as far as I am aware :(


Forgot to Cc: my reply to the list:

kib@ has real mode BTX code which appears to work with affected systems
of mine, however, the code has not yet made it into CVS. I spliced it
into a 6.2 miniboot ISO and it worked.


It also works on my ThinkPad X60 with a 7.x boot CD.

Regards,
Rong-En Fan


regards,
BMS



Re: bge watchdog timeout -- resetting problem on recent update

2007-04-18 Thread Rong-en Fan

On 4/18/07, Tom Evans [EMAIL PROTECTED] wrote:

On Wed, 2007-04-18 at 17:38 +0800, Jason Chang 張傑生 wrote:
 Dear All,

 After a recent cvsup and make world, my server suffered from the
 bge watchdog timeout -- resetting problem. Manually reverting the
 bge-related source to an older version and compiling a new kernel
 may solve the problem. So I guess the recently committed source
 does not work well with the onboard bge NICs of IBM e326m servers.

Hi Jason

Can you add

hw.pci.enable_msi=0
hw.pci.enable_msix=0

to /boot/loader.conf and reboot and see if that solves the problem.
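
For anyone who wants to apply the two tunables above without ending up with duplicate lines in loader.conf, here is a minimal sketch. set_tunable is a hypothetical helper, not a FreeBSD tool; it only assumes a POSIX sh and awk, and the file path is passed in explicitly so it can be tried on a copy first.

```shell
#!/bin/sh
# set_tunable FILE NAME VALUE
# Ensure FILE contains exactly one line NAME="VALUE", replacing any
# earlier assignment of NAME. Hypothetical helper, not a FreeBSD tool.
set_tunable() {
    file=$1 name=$2 value=$3
    [ -f "$file" ] || : > "$file"
    # Keep every line that does not start with NAME=, then append the
    # new assignment. index() is a literal match, so the dots in the
    # tunable name are not treated as regex wildcards.
    awk -v n="$name" 'index($0, n "=") != 1' "$file" > "$file.new" &&
        printf '%s="%s"\n' "$name" "$value" >> "$file.new" &&
        mv "$file.new" "$file"
}

# Example (as root; keep a backup of the original file):
#   set_tunable /boot/loader.conf hw.pci.enable_msi 0
#   set_tunable /boot/loader.conf hw.pci.enable_msix 0
```

The tunables are read by the loader, so they take effect only after a reboot.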

I saw this problem a while ago on my -CURRENT with a misbehaving MSI on
my motherboard. (I may have this completely wrong; I'm not even 100%
sure the MSI stuff has been MFC'ed to -STABLE).


The MSI changes have been MFC'ed to 6.x since April 1 by jhb@, but I
thought that to use MSI, one MUST set hw.pci.enable_{msi,msix} to
1 in loader.conf, as the commit log said.

But from the revision changes that Jason posted, the most
suspicious part is MSI...

Regards,
Rong-En Fan



Cheers

Tom






Re: HEADS UP: ncurses is updated

2007-04-09 Thread Rong-en Fan

On 4/10/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:

On Sat, Apr 07, 2007 at 02:05:45AM +0800, Rong-en Fan wrote:
 I just merged ncurses 5.6 and wide character support from
 HEAD to 6.x. That means ncurses in 6.x is now up-to-date and
 has wide character support, i.e., ncursesw library.

I just wanted to take a moment to thank you for this.  You have no idea
how long I've been waiting (okay, now you do: years!), as I never felt
comfortable with having two versions of ncurses installed on a single
box (base + port).

So far it works great.  Thank you so much!


You are welcome.


The only thing I've found, though, is that dialog(1) does not appear to
properly handle UTF-8 encoding.  Line drawing characters show up as
gibberish (alphanumeric characters).  I realise dialog isn't part of
ncurses, but it does rely on it.  We should consider updating dialog to
match this change.


You mean it displays something like tqxu instead of line drawing characters?
Last time I checked, I thought it was terminal related. When I use screen, it
uses line drawing characters. For PuTTY, see:

http://lists.freebsd.org/pipermail/freebsd-questions/2007-April/146577.html

The current dialog with UTF-8 in Mac OS X's Terminal.app seems to work just fine.
I'm playing with devel/cdialog, and whether it uses ncurses or ncursesw,
the result is the same.

I'm CCing ache@, who imported GNU's dialog into our base, and the
cdialog/ncurses author; I hope they can comment  :-)

Regards,
Rong-En Fan


--
| Jeremy Chadwick   jdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: HEADS UP: ncurses is updated

2007-04-09 Thread Rong-en Fan

On 4/10/07, Nikolay Pavlov [EMAIL PROTECTED] wrote:

On Monday,  9 April 2007 at 11:48:08 -0700, Jeremy Chadwick wrote:
 On Mon, Apr 09, 2007 at 11:21:08AM -0700, Jeremy Chadwick wrote:
  On Tue, Apr 10, 2007 at 01:49:32AM +0800, Rong-en Fan wrote:
   On 4/10/07, Jeremy Chadwick [EMAIL PROTECTED] wrote:
   The only thing I've found, though, is that dialog(1) does not appear to
   properly handle UTF-8 encoding.  Line drawing characters show up as
   gibberish (alphanumeric characters).  I realise dialog isn't part of
   ncurses, but it does rely on it.  We should consider updating dialog to
   match this change.
  
   You mean it displays something like tqxu instead of line drawing
   characters?
   Last time I checked, I thought it was terminal related. When I use
   screen, it uses line drawing characters. For PuTTY, see:
  
   http://lists.freebsd.org/pipermail/freebsd-questions/2007-April/146577.html

 This is quite applicable.  I just now got around to reading it
 (should've done this before I sent my previous Email).  Yep, that's the
 exact problem:

 /usr/bin/dialog:
   libdialog.so.5 = /usr/lib/libdialog.so.5 (0x3807e000)
   libncurses.so.6 = /lib/libncurses.so.6 (0x38099000)
   libc.so.6 = /lib/libc.so.6 (0x380dd000)

 At least I have a workaround with NCURSES_NO_UTF8_ACS=1.  :-)
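
The workaround quoted above can be limited to a single program instead of being exported for the whole session. A minimal sketch, assuming a POSIX sh with env(1); with_no_utf8_acs is a hypothetical wrapper name.

```shell
#!/bin/sh
# with_no_utf8_acs COMMAND [ARGS...]
# Run COMMAND with NCURSES_NO_UTF8_ACS=1 in its environment only, so
# the rest of the session is unaffected. Hypothetical wrapper name.
with_no_utf8_acs() {
    env NCURSES_NO_UTF8_ACS=1 "$@"
}

# Example:
#   with_no_utf8_acs dialog --msgbox "Line drawing test" 6 30
```

Scoping the variable to one command makes it easy to compare the output with and without the workaround in the same terminal.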

I am not sure, but maybe this is related to ncurses update. I am getting
this trying to run sysinstall utility:

Probing devices, please wait (this can take a while)...BARF 170 105

Then comes an EOL and it exits...

It's -CURRENT from April 6.


The ncurses update to 5.6 was in late January, and wide character support
was enabled in late February. My sysinstall runs just fine on the console and
in rxvt-unicode on my -CURRENT as of yesterday.



--
==
- Best regards, Nikolay Pavlov. ---
==





Re: Call for Testers: ncurses 5.6 update

2007-04-06 Thread Rong-en Fan

On 4/7/07, Stefan Lambrev [EMAIL PROTECTED] wrote:

Hi list,

Rong-en Fan wrote:
 On 3/13/07, Stefan Lambrev [EMAIL PROTECTED] wrote:
 Hello,

 Rong-en Fan wrote:
  On 3/12/07, Stefan Lambrev [EMAIL PROTECTED] wrote:
  Rong-en Fan wrote:
   Hi folks,
  
   ncurses in 6.x is pretty old. We have up-to-date ncurses in 7.x
   with wide character support now. The patch at
  
   http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070310.diff.gz
  
   gives you ncurses 5.6 and wide character support in 6.x. Please
   apply with 'patch -p0' under /usr/src.
  
   For more information, please visit
  
   http://people.freebsd.org/~rafan/ncurses/
  
   You can also find individual patches, say ncurses update and wide
   character support, there.
  
   Feedback and suggestions are welcome.
  
   P.S. Due to some lib32 issues, the patch above contains changes
   made by ru@ recently for src/Makefile.inc1.
  make installworld failed:
 
  cd /usr/src; /usr/obj/usr/src/make.amd64/make -f Makefile.inc1
 install32
  mkdir -p /usr/lib32 # XXX add to mtree
  [...]
 
  Sorry about this. I messed up the lib32 changes in the all-in-one
  patch. Could you please use this one instead?
 
  http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070312.diff.gz

This patch doesn't seem to work anymore on 6.2-stable i386 (my previous
test was on amd64).
It seems that part of this patch is already in -stable :)

If I'm right the patch for src/Makefile.inc1 should be replaced by :

--- Makefile.inc1   Fri Apr  6 20:03:35 2007
+++ /root/Makefile.inc1.orig   Fri Apr  6 20:03:17 2007
@@ -894,8 +894,7 @@
 bin/csh \
 bin/sh \
 ${_rescue} \
-lib/ncurses/ncurses \
-lib/ncurses/ncursesw \
+lib/libncurses \
 ${_share} \
 ${_aicasm} \
 usr.bin/awk \
@@ -1000,8 +999,7 @@

 _prebuild_libs+= lib/libbz2 lib/libcom_err lib/libcrypt lib/libexpat \
lib/libkvm lib/libmd \
-   lib/ncurses/ncurses lib/ncurses/ncursesw \
-   lib/libnetgraph lib/libopie lib/libpam \
+   lib/libncurses lib/libnetgraph lib/libopie lib/libpam \
lib/libradius \
lib/libsbuf lib/libtacplus lib/libutil \
lib/libz lib/msun

I'm still compiling and will let you know if things still work.


Yes, you are right. I merged the Makefile.inc1 changes two days ago. I'm
going to merge the rest of the changes later.

Enjoy!

Regards,
Rong-En Fan


HEADS UP: ncurses is updated

2007-04-06 Thread Rong-en Fan

Hi all,

I just merged ncurses 5.6 and wide character support from
HEAD to 6.x. That means ncurses in 6.x is now up-to-date and
has wide character support, i.e., ncursesw library.

Regards,
Rong-En Fan


Re: Lenovo X60 em workaround

2007-03-15 Thread Rong-en Fan

On 1/23/07, Jack Vogel [EMAIL PROTECTED] wrote:


Hey Gleb,

Acknowledge... I can do better than that, I have a fix for this problem, and
it's not temporary. Here is the code change (not a patch, I'm very busy);
it's in hardware_init, and it should be obvious how to patch:

        /* Make sure we have a good EEPROM before we read from it */
        if (e1000_validate_nvm_checksum(&adapter->hw) < 0) {
                /*
                ** Some PCI-E parts fail the first check due to
                ** the link being in sleep state, call it again,
                ** if it fails a second time its a real issue.
                */
                if (e1000_validate_nvm_checksum(&adapter->hw) < 0) {
                        device_printf(dev,
                            "The EEPROM Checksum Is Not Valid\n");
                        return (EIO);
                }
        }

This is already checked into my code base at Intel, I've just been too
busy to do anything with it, be my guest if you wish to check it in after
testing...


I accidentally found this:

http://www-307.ibm.com/pc/support/site.wss/document.do?sitestyle=lenovo&lndocid=MIGR-67166

which patches the EEPROM, and it solves my problem.

Regards,
Rong-En Fan


Re: Call for Testers: ncurses 5.6 update

2007-03-12 Thread Rong-en Fan

On 3/12/07, Stefan Lambrev [EMAIL PROTECTED] wrote:

Rong-en Fan wrote:
 Hi folks,

 ncurses in 6.x is pretty old. We have up-to-date ncurses in 7.x
 with wide character support now. The patch at
 
 http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070310.diff.gz


 gives you ncurses 5.6 and wide character support in 6.x. Please
 apply with 'patch -p0' under /usr/src.

 For more information, please visit

 http://people.freebsd.org/~rafan/ncurses/

 You can also find individual patches, say ncurses update and wide
 character support, there.

 Feedback and suggestions are welcome.

 P.S. Due to some lib32 issues, the patch above contains changes
 made by ru@ recently for src/Makefile.inc1.
make installworld failed:

cd /usr/src; /usr/obj/usr/src/make.amd64/make -f Makefile.inc1 install32
mkdir -p /usr/lib32 # XXX add to mtree

[...]

Sorry about this. I messed up the lib32 changes in the all-in-one patch.
Could you please use this one instead?

http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070312.diff.gz

It should solve lib32 problem. Note that individual patches work well.

Regards,
Rong-En Fan


Re: Call for Testers: ncurses 5.6 update

2007-03-12 Thread Rong-en Fan

On 3/13/07, Stefan Lambrev [EMAIL PROTECTED] wrote:

Hello,

Rong-en Fan wrote:
 On 3/12/07, Stefan Lambrev [EMAIL PROTECTED] wrote:
 Rong-en Fan wrote:
  Hi folks,
 
  ncurses in 6.x is pretty old. We have up-to-date ncurses in 7.x
  with wide character support now. The patch at
 
  http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070310.diff.gz
 
  gives you ncurses 5.6 and wide character support in 6.x. Please
  apply with 'patch -p0' under /usr/src.
 
  For more information, please visit
 
  http://people.freebsd.org/~rafan/ncurses/
 
  You can also find individual patches, say ncurses update and wide
  character support, there.
 
  Feedback and suggestions are welcome.
 
  P.S. Due to some lib32 issues, the patch above contains changes
  made by ru@ recently for src/Makefile.inc1.
 make installworld failed:

 cd /usr/src; /usr/obj/usr/src/make.amd64/make -f Makefile.inc1 install32
 mkdir -p /usr/lib32 # XXX add to mtree
 [...]

 Sorry about this. I messed up the lib32 changes in the all-in-one patch.
 Could you please use this one instead?

 http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070312.diff.gz


This works for me (at least make buildworld && make installworld
finished without problems).


Thanks for testing.


Should I recompile the kernel again, or is the patch only in contrib? :)


No, you don't need to.

Regards,
Rong-En Fan


Call for Testers: ncurses 5.6 update

2007-03-11 Thread Rong-en Fan

Hi folks,

ncurses in 6.x is pretty old. We have up-to-date ncurses in 7.x
with wide character support now. The patch at

http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070310.diff.gz

gives you ncurses 5.6 and wide character support in 6.x. Please
apply with 'patch -p0' under /usr/src.

For more information, please visit

http://people.freebsd.org/~rafan/ncurses/

You can also find individual patches, say ncurses update and wide
character support, there.

Feedback and suggestions are welcome.

P.S. Due to some lib32 issues, the patch above contains changes
made by ru@ recently for src/Makefile.inc1.

Regards,
Rong-En Fan


Re: ncurses

2007-01-17 Thread Rong-en Fan

On 1/18/07, Stephen Montgomery-Smith [EMAIL PROTECTED] wrote:

In the cvs repository, there has appeared src/lib/ncurses, which seems
to be a copy of lib/libncurses.  Is this meant to be?


Yes, it's for the upcoming ncurses update, which will happen within a week.
I'm waiting for the current exp run on pointyhat to finish.

Regards,
Rong-En Fan
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: using md-mounted ISO as NFS root for PXE booting and installing

2007-01-11 Thread Rong-en Fan

On 1/12/07, Andrew N. Below [EMAIL PROTECTED] wrote:

Hello.

I have 6.1-STABLE FreeBSD box I want to use as network
boot and install server for PXE clients.

At this moment I'm experimenting with 4.11-RELEASE-i386-disc1-gnome.iso
and 6.2-RC2-i386-disc1.iso images.


[...]

All is fine if we boot into 4.11 (root is mounted from MFS,
sysinstall runs, and we are able to install the OS to local disks via NFS).

But in the case of 6.2-RC2, root is mounted from NFS, not MFS:


Does adding

vfs.root.mountfrom=ufs:/dev/md0c

to boot/loader.conf from the CD help?
Of course, you have to copy the boot/ directory off the CD.

Regards,
Rong-En Fan


gpt device node does not show at boot

2006-12-19 Thread Rong-en Fan

I'm running 6.2-RC1 on i386. I use gpt(8) to partition my
disk. After a reboot, the device node, say da1p1, does not
show up until 'gpt show da1' is issued. This prevents the
GPT partition from being mounted from fstab, so it
cannot be NFS-exported at boot time!

My kernel config is simply GENERIC+QUOTA+SMP.

I also noticed that it is not possible to modify the partition
table of an in-use disk. There is also a PR, 85772, about
it. Can someone comment on it? Thanks.

Regards,
Rong-En Fan


Re: ips(4) in toaster mode

2006-12-15 Thread Rong-en Fan

It seems that upgrading the ips firmware to the latest version
available on ibm.com solves this problem. One changelog entry caught
my eye: increase timeout when tape driver is attached to the adapter.
Indeed, we have a tape on ahd(4), and I think ips(4) shares
that adapter.

However, the mystery is why ips suddenly stopped working. Perhaps I
tweaked some hardware settings and forgot about it.

On 11/18/06, Scott Long [EMAIL PROTECTED] wrote:

I'll look at this.

Scott


Rong-en Fan wrote:
 Hi,

 After upgrading RELENG_6 from Jul 11 to Sep 30 on an i386 box,
 every time I run tar to back up my system to a mounted NFS volume,
 it panics with "sleeping thread" after about one hour of operation.
 Upgrading to RELENG_6_2 does not help. Also, the console hangs
 completely and I cannot break into DDB at all. The only thing I can
 do is power-cycle the machine.

 Also, the only hard disk on that host is on ips(4), so I cannot obtain
 a kernel dump. I'm not sure whether this is a hardware failure; at
 least, no LED on the panel shows red...

 The only information on the console is attached below. Any suggestions
 are welcome.
 Thanks,
 Rong-En Fan

 ==

 ips0: WARNING: command timeout. Adapter is in toaster mode, resetting to
 known state
 ips0: resetting adapter, this may take up to 5 minutes
 ips0: syncing config
 Sleeping thread (tid 12, pid 14) owns a non-sleepable lock
 sched_switch(c5feec00,0,1,8577f833,d14d2103,...) at sched_switch+0x158
 mi_switch(1,0) at mi_switch+0x1d5
 sleepq_switch(c60c2604,e9f77b8c,c051acd3,4,1,...) at sleepq_switch+0x93
 sleepq_wait(c60c2604,c60c25e0,c06e6957,1,1,...) at sleepq_wait+0x75
 cv_wait(c60c2604,c60c25e0,a,e9f77c04,5,...) at cv_wait+0x151
 _sema_wait(c60c25e0,0,0,c60c2400,c60c2400,...) at _sema_wait+0x64
 ips_send_config_sync_cmd(c60f5000,e9f77c08,1,c60f5000,7,...) at
 ips_send_config_sync_cmd+0x78
 ips_clear_adapter(c60c2400,c60b6e00,0,4,c60f5000,...) at
 ips_clear_adapter+0x60
 ips_morpheus_reinit(c60c2400,1,c053abf7,c0740100,c5feec00,...) at
 ips_morpheus_reinit+0x2ac
 ips_timeout(c60c2400,c053a7f5,c5feec00,c5feea80,d69f60ed,...) at
 ips_timeout+0xf8
 softclock(0,e9f77cd4,15dbe,c43ec589,c5feec00,...) at softclock+0x35d
 ithread_execute_handlers(c5fed430,c6042000,0,0,0,...) at
 ithread_execute_handlers+0x162
 ithread_loop(c5fbc880,e9f77d38,0,0,0,...) at ithread_loop+0x64
 fork_exit(c050b22a,c5fbc880,e9f77d38) at fork_exit+0x7b
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xe9f77d6c, ebp = 0 ---
 panic: sleeping thread
 cpuid = 2
 KDB: stack backtrace:
 kdb_backtrace(c0702303,2,c06ef01b,e9f98bf0,0,...) at kdb_backtrace+0x2f
 panic(c06ef01b,,e,c051ac54,1,...) at panic+0x129
 propagate_priority(c5fefd80,c5fefd80,0,0,0,...) at propagate_priority+0x69
 turnstile_wait(c60c25a8,c5feec00,c610c000,c7d064a4,4,...) at
 turnstile_wait+0x32f
 _mtx_lock_sleep(c60c25a8,c5fefd80,0,0,0,...) at _mtx_lock_sleep+0xfd
 ipsd_strategy(c7d064a4,43,200,0,c04e31a1,...) at ipsd_strategy+0x70
 g_disk_start(c7d1e4a4,c073bac8,24c,c06e8406,64,...) at g_disk_start+0x1b1
 g_io_schedule_down(c5fefd80,4c,c5fefd80,c04e39c1,e9f98d04,...) at
 g_io_schedule_down+0x15f
 g_down_procbody(0,e9f98d38,0,0,0,...) at g_down_procbody+0xb3
 fork_exit(c04e39c1,0,e9f98d38) at fork_exit+0x7b
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xe9f98d6c, ebp = 0 ---


Re: Panic in thread taskq on RELENG_6

2006-11-29 Thread Rong-en Fan

On 11/29/06, Kevin Oberman [EMAIL PROTECTED] wrote:

 From: Markus Oestreicher [EMAIL PROTECTED]
 Date: Tue, 28 Nov 2006 20:01:06 +0100
 Sender: [EMAIL PROTECTED]

 Good Day,

 I get a panic on latest RELENG_6 every 6-12 hours. The server is a
 Dual Xeon FSB800 with 2 GB RAM and aac(4)-disks running postfix and
 amavisd-new for SPAM scanning.


 kernel trap 12 with interrupts disabled

 Fatal trap 12: page fault while in kernel mode
 cpuid = 3; apic id = 07
 fault virtual address = 0x104
 fault code= supervisor read, page not present
 instruction pointer   = 0x20:0xc06774e1
 stack pointer = 0x28:0xe4f93c90
 frame pointer = 0x28:0xe4f93c9c
 code segment  = base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, def32 1, gran 1
 processor eflags  = resume, IOPL = 0
 current process   = 5 (thread taskq)

 The panic is always in process thread taskq.

 db> trace
 _mtx_lock_sleep(cb031e5c,c63f7180) at _mtx_lock_sleep+0x9d
 unp_gc(0,1) at unp_gc+0x222
 taskqueue_run(c6439d80) at taskqueue_run+0x13f
 taskqueue_thread_loop(c09f8988,e4f93d38) at taskqueue_thread_loop+0x92
 fork_exit(c06a1bc0,c09f8988,e4f93d38) at fork_exit+0x71
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp=0xe4f93d6c, ebp = 0

 FreeBSD mx.local 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1:
  Tue Nov 28 02:12:58 CET 2006
  [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386


 Does that look like a hardware problem or a software issue?
 I will try to swap RAM in the next few days.

You are the third person to report this panic. (I am one of the other
two.)


I reported an unp_gc() panic recently. See
"Re: LOR (intr table and sio) and instability" on [EMAIL PROTECTED].
jhb@ told me that he also saw this and that there is no
fix yet.

Regards,
Rong-En Fan



I am guessing from the name of your kernel that this is an SMP
system. So are the other two.

Are you running gnome-2.16 with hald? This is about all we found
in common on the first two systems.

Robert Watson would like some added data. Can you build a kernel with
the following options and connect something to the serial port to record
output?
options WITNESS
options INVARIANT_SUPPORT
options DDB
options KDB
options INVARIANTS

At the debugger prompt:
 show pcpu
 trace
 show allpcpu
 traceall
 show alllocks

At least my system has been totally uncooperative in crashing when I am
anywhere near it, so I have not yet collected any information other than
dumps.
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]  Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751






Re: ips(4) in toaster mode

2006-11-18 Thread Rong-en Fan

On 11/18/06, Martin Blapp [EMAIL PROTECTED] wrote:


Hi,

 Also, the only harddisk on that host is the ips(4), so I can not obtain
 a kernel dump. I'm not sure if this is a hardware failure, at least, no
 led on the panel is shown red...

Hmm ? We do kernel dumps on ips(4) and it works.

dumpdev=/dev/ipsd0s1b

Martin


ips(4) can do kernel dumps, but in my case above, ips(4) was already
in command-timeout mode...

Regards,
Rong-En Fan


ips(4) in toaster mode

2006-11-17 Thread Rong-en Fan

Hi,

After upgrading RELENG_6 from Jul 11 to Sep 30 on an i386 box, the
machine panics with "sleeping thread" every time I run tar to back up
my system to a mounted NFS volume, after about one hour of operation.
Upgrading to RELENG_6_2 does not help. Also, the console is completely
hung; I cannot break into DDB at all. The only thing I can do is power
cycle the box.

Also, the only harddisk on that host is the ips(4), so I can not obtain
a kernel dump. I'm not sure if this is a hardware failure, at least, no
led on the panel is shown red...

OK, the only information on the console is attached below. Any suggestions
are welcome.

Thanks,
Rong-En Fan

==

ips0: WARNING: command timeout. Adapter is in toaster mode, resetting to known state
ips0: resetting adapter, this may take up to 5 minutes
ips0: syncing config
Sleeping thread (tid 12, pid 14) owns a non-sleepable lock
sched_switch(c5feec00,0,1,8577f833,d14d2103,...) at sched_switch+0x158
mi_switch(1,0) at mi_switch+0x1d5
sleepq_switch(c60c2604,e9f77b8c,c051acd3,4,1,...) at sleepq_switch+0x93
sleepq_wait(c60c2604,c60c25e0,c06e6957,1,1,...) at sleepq_wait+0x75
cv_wait(c60c2604,c60c25e0,a,e9f77c04,5,...) at cv_wait+0x151
_sema_wait(c60c25e0,0,0,c60c2400,c60c2400,...) at _sema_wait+0x64
ips_send_config_sync_cmd(c60f5000,e9f77c08,1,c60f5000,7,...) at ips_send_config_sync_cmd+0x78
ips_clear_adapter(c60c2400,c60b6e00,0,4,c60f5000,...) at ips_clear_adapter+0x60
ips_morpheus_reinit(c60c2400,1,c053abf7,c0740100,c5feec00,...) at ips_morpheus_reinit+0x2ac
ips_timeout(c60c2400,c053a7f5,c5feec00,c5feea80,d69f60ed,...) at ips_timeout+0xf8
softclock(0,e9f77cd4,15dbe,c43ec589,c5feec00,...) at softclock+0x35d
ithread_execute_handlers(c5fed430,c6042000,0,0,0,...) at ithread_execute_handlers+0x162
ithread_loop(c5fbc880,e9f77d38,0,0,0,...) at ithread_loop+0x64
fork_exit(c050b22a,c5fbc880,e9f77d38) at fork_exit+0x7b
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe9f77d6c, ebp = 0 ---
panic: sleeping thread
cpuid = 2
KDB: stack backtrace:
kdb_backtrace(c0702303,2,c06ef01b,e9f98bf0,0,...) at kdb_backtrace+0x2f
panic(c06ef01b,,e,c051ac54,1,...) at panic+0x129
propagate_priority(c5fefd80,c5fefd80,0,0,0,...) at propagate_priority+0x69
turnstile_wait(c60c25a8,c5feec00,c610c000,c7d064a4,4,...) at turnstile_wait+0x32f
_mtx_lock_sleep(c60c25a8,c5fefd80,0,0,0,...) at _mtx_lock_sleep+0xfd
ipsd_strategy(c7d064a4,43,200,0,c04e31a1,...) at ipsd_strategy+0x70
g_disk_start(c7d1e4a4,c073bac8,24c,c06e8406,64,...) at g_disk_start+0x1b1
g_io_schedule_down(c5fefd80,4c,c5fefd80,c04e39c1,e9f98d04,...) at g_io_schedule_down+0x15f
g_down_procbody(0,e9f98d38,0,0,0,...) at g_down_procbody+0xb3
fork_exit(c04e39c1,0,e9f98d38) at fork_exit+0x7b
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe9f98d6c, ebp = 0 ---


panic when portupgrade in jail (devfs related?)

2006-11-05 Thread Rong-en Fan

Hi,

I'm running RELENG_6 as of yesterday on an amd64 box.
This host has one jail running, and every time I try to run
portupgrade inside the jail, it panics. INVARIANTS does not
catch anything. I don't think this happened on RELENG_6 two
months ago. The panic messages and backtrace are shown below:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0xfffe75b851b0
fault code  = supervisor read, page not present
instruction pointer = 0x8:0x80231118
stack pointer   = 0x10:0xb40a5860
frame pointer   = 0x10:0xb40a5880
code segment= base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 60200 (script)
[thread pid 60200 tid 100268 ]
Stopped at  ptcclose+0x19:  movq    linesw(,%rdx,8),%rax
db> bt
Tracing pid 60200 tid 100268 td 0xff006c4ec720
ptcclose() at ptcclose+0x19
giant_close() at giant_close+0x5f
devfs_close() at devfs_close+0x28f
VOP_CLOSE_APV() at VOP_CLOSE_APV+0x6e
vn_close() at vn_close+0x90
vn_closefile() at vn_closefile+0x88
fdrop_locked() at fdrop_locked+0xa5
closef() at closef+0x35f
close() at close+0x173
syscall() at syscall+0x4a1
Xfast_syscall() at Xfast_syscall+0xa8
--- syscall (6, FreeBSD ELF64, close), rip = 0x800807f9c, rsp = 0x7fffdfa8, rbp = 0 ---

I put the box back to production. If anyone needs more information,
I can reproduce this panic and gather them in ddb. BTW, I used
'call doadump' in ddb, but after rebooting, savecore complains
there is no dump?

Regards,
Rong-En Fan


LOR (intr table and sio) and instability

2006-10-23 Thread Rong-en Fan

I'm running today's 6-stable on an amd64 SMP (Pentium-D) machine.
With WITNESS turned on, I got a LOR halfway through booting:

ata0-slave: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=40 wire
ata2-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad4: 157066MB HDT722516DLA380 V43OA96A at ata2-master SATA150
ad4: 321672960 sectors [319120C/16H/63S] 16 sectors/interrupt 1 depth queue
SMP: AP CPU #1 Launched!
cpu1 AP:
ID: 0x0100   VER: 0x00050014 LDR: 0x DFR: 0x
 lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
 timer: 0x000200ef therm: 0x0001 err: 0x0001 pcm: 0x0001
lock order reversal:
1st 0x805643c0 intr table (intr table) @ /home/admin/usr/src/sys/amd64/amd64/intr_machdep.c:417
2nd 0x8056fb00 sio (sio) @ /home/admin/usr/src/sys/dev/sio/sio.c:2586
KDB: stack backtrace:
witness_checkorder() at witness_checkorder+0x4e1
_mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x5d
siocnputc() at siocnputc+0xf5
cnputc() at cnputc+0x60
putchar() at putchar+0xcb
kvprintf() at kvprintf+0x9b
printf() at printf+0xe7
intr_assign_next_cpu() at intr_assign_next_cpu+0x7a
intr_shuffle_irqs() at intr_shuffle_irqs+0x6b
mi_startup() at mi_startup+0xc0
btext() at btext+0x2c
INTR: Assigning IRQ 1 to local APIC 0
ioapic0: Assigning ISA IRQ 1 to local APIC 0
INTR: Assigning IRQ 3 to local APIC 1
ioapic0: Assigning ISA IRQ 3 to local APIC 1
INTR: Assigning IRQ 4 to local APIC 0
ioapic0: Assigning ISA IRQ 4 to local APIC 0
INTR: Assigning IRQ 9 to local APIC 1
ioapic0: Assigning ISA IRQ 9 to local APIC 1
INTR: Assigning IRQ 12 to local APIC 0
ioapic0: Assigning ISA IRQ 12 to local APIC 0
INTR: Assigning IRQ 14 to local APIC 1
ioapic0: Assigning ISA IRQ 14 to local APIC 1
INTR: Assigning IRQ 15 to local APIC 0
ioapic0: Assigning ISA IRQ 15 to local APIC 0
INTR: Assigning IRQ 17 to local APIC 1
ioapic0: Assigning PCI IRQ 17 to local APIC 1
GEOM: new disk ad4
Trying to mount root from ufs:/dev/ad4s1a

I checked the LOR page, and there is no such report. I'm wondering
if this is related to the console hang that happens every few days on
this machine. When it hangs, the keyboard does not work. The only way
out is to break into ddb and reset it. I hooked up the serial console
a few days ago. No hang yet, but I got a panic this noon:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address  = 0x18c
fault code = supervisor read, page not present
instruction pointer= 0x8:0x801f107e
stack pointer  = 0x10:0xb1839b30
frame pointer  = 0x10:0xb1839b60
code segment   = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags   = resume, IOPL = 0
current process= 9 (thread taskq)


[thread pid 9 tid 100011 ]
Stopped at  _mtx_lock_sleep+0x80:   movl    0x18824(%r12),%r8d
db> bt
Tracing pid 9 tid 100011 td 0xff007b218980
_mtx_lock_sleep() at _mtx_lock_sleep+0x80
unp_gc() at unp_gc+0x4c4
taskqueue_run() at taskqueue_run+0xd5
taskqueue_thread_loop() at taskqueue_thread_loop+0x88
fork_exit() at fork_exit+0x8b
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xb1839d00, rbp = 0 ---

I searched the mail archive, but there seems to be no similar report.
I called 'doadump' in ddb, but after rebooting, savecore says there is
no dump there. (I have dump_dev=AUTO.)

If you need more information, please let me know.

Thanks,
Rong-En Fan


Re: nfsd stuck in ufs/biord/biowr/getblk

2006-10-17 Thread Rong-en Fan

On 10/18/06, Vivek Khera [EMAIL PROTECTED] wrote:


On Oct 16, 2006, at 5:03 PM, Rong-en Fan wrote:

 Yesterday, I saw all my nfsd stuck in ufs/biord/biowr/getblk.
 I saw the same thing some time ago. I broke into ddb and did an
 'alltrace':

do you have an em or bge ethernet?



Yes, I do have an em0. My em0 does not share an IRQ with any other
device. I get a watchdog timeout message every few days, but there
were none between boot and the deadlock above.

Regards,
Rong-En Fan


nfsd stuck in ufs/biord/biowr/getblk

2006-10-16 Thread Rong-en Fan

Hi,

Yesterday, I saw all my nfsd stuck in ufs/biord/biowr/getblk.
I saw the same thing some time ago. I broke into ddb and did an
'alltrace':

http://www.rafan.org/FreeBSD/ufs/20061017.txt

The system in question is running 6-STABLE as of Sep 20. It's an i386
SMP box. When all nfsd are stuck in ufs/biord/biowr/getblk, I can
still log in to the system (all exported fs are on an external RAID).
I'm not sure how to trigger this behavior. Any suggestions are
welcome. If there is anything I can provide from ddb to help track
this down, please let me know.

Thanks,
Rong-En Fan


Re: NFS locking question

2006-08-15 Thread Rong-en Fan

On 8/15/06, Kris Kennaway [EMAIL PROTECTED] wrote:

On Tue, Aug 15, 2006 at 11:59:50AM +0200, Morten A. Middelthon wrote:
 On Tue, Feb 28, 2006 at 05:21:50AM -0500, Kris Kennaway wrote:
  On Tue, Feb 28, 2006 at 11:14:53AM +0100, Patrick M. Hausen wrote:
   Hi, all!
  
   In our local office network we have a rather old FreeBSD 5.2.1
   server acting as an NFS server for several other systems, mostly
   running 6.0.
  
   From time to time we experience processes on the NFS clients
   hanging in statd D with wchan lockd when accessing files
   over NFS.
 
  Try the attached patch on the 6.0 machines:
 
  Index: usr.sbin/rpc.lockd/lock_proc.c
 snip

 Hi,

 I have been plagued with this NFS lockd issue for quite some time now. It has
 kept me from installing FreeBSD 6.x on our workstations at work. I just tried
 applying your patch to my own 6.1-RELEASE-p3 workstation, and so far it has
 been working nicely. Has anyone else had the same experience? If so, maybe it
 should go into production?

I was unable to obtain confirmation from anyone else (including the
submitter who previously claimed it was necessary, and my own testing)
that the patch actually solved a problem.  Since it involves reverting
useful functionality, someone would need to obtain further debugging
from your system (tcpdump traces before/after, etc) to determine what
it's actually solving.

Kris


In my experience, rpc.lockd dies unexpectedly on both server and
client. When this happens, every process that wants to lock a file
gets stuck in lockd (top will tell). In my case, rpc.lockd dies because
a write fails and a SIGPIPE is generated. Two months ago, bin/97768
was submitted and rodrigc@ committed it (also MFC'ed to RELENG_6). The
patch in that PR ignores SIGPIPE (the code in the server/client already
handles the failed-write case). After I applied it, I'm quite happy with
NFS locking.
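
The failure mode can be illustrated with a small Python sketch. This is
not the rpc.lockd code or the patch from bin/97768; the function name
write_after_reader_gone is made up for illustration, and a POSIX system
is assumed. With SIGPIPE ignored, a write to a pipe whose reader has
gone away fails with EPIPE, which the caller can handle like any other
I/O error, instead of the process being killed by the signal.

```python
import errno
import os
import signal


def write_after_reader_gone(payload: bytes) -> int:
    """Write to a pipe with no reader; return the errno seen (0 on success)."""
    # Ignore SIGPIPE so a failed write is reported as an error
    # rather than terminating the process.
    previous = signal.signal(signal.SIGPIPE, signal.SIG_IGN)
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate the peer going away
    try:
        os.write(write_fd, payload)
        return 0
    except OSError as exc:
        # The write failure is handled here, like any other I/O error.
        return exc.errno
    finally:
        os.close(write_fd)
        signal.signal(signal.SIGPIPE, previous)


if __name__ == "__main__":
    print(write_after_reader_gone(b"lock request") == errno.EPIPE)
```

If the handler were left at its default disposition instead, the same
write would terminate the process, which matches the "rpc.lockd dies"
symptom described above.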

Regards,
Rong-En Fan


Re: NFS Locking Issue

2006-06-30 Thread Rong-en Fan

On 6/29/06, Michael Collette [EMAIL PROTECTED] wrote:

This last week I had been working on a test network to test out 6.1
prior to upgrading our production boxes from 5.4.  That's when I ran
across the rpc.lockd issues that have been discussed earlier.

Our production setup has diskless clients running KDE, which due to this
bug is now dead on 6.1.  I also have my mail server delivering messages
to a file server via NFS.  I even have servers booting diskless with NFS
provided file systems... all of which are dead on 6.1.

The last discussion or bug updates I've seen on this issue were about 3
months ago.  This leaves me with a number of questions I hope can be
answered here on this list.

Is NFS a big deal for most other users, or am I out here on the fringe
using it as much as I do?

Is anyone working on a fix for this?  If so, is there any kind of time
frame where this fix might be MFC'd to 6-STABLE?

I guess I'm still just a bit stunned that a bug this obvious not only
found its way into the STABLE branch, but is still there.  Maybe it's
not as obvious as I think, or not many folks are using it?  All I know
for sure here is that if I had upgraded to 6.1 my network would have
been crippled.


Try 6.1-STABLE, especially make sure you have

$FreeBSD: src/usr.sbin/rpc.lockd/kern.c,v 1.16.2.1 2006/06/02 01:20:58 rodrigc Exp $

for usr.sbin/rpc.lockd/kern.c, and see if this helps.

Regards,
Rong-En Fan


Updating ncurses in base

2006-06-09 Thread Rong-en Fan

Hi,

[I'm also CC'ed peter@ since he is the maintainer of ncurses in base]

As you may know, the current ncurses in the base system is rather old
(it is 4 years old). I have been working on updating ncurses to the latest
version, 5.5, and enabling wide character support by default. I have put the
description, goals, issues, current status, and a tarball for testing at the
following URL:

http://www.rafan.org/FreeBSD/ncurses/

I have used the updated ncurses on my 7-CURRENT for some time, and everything
works well. As I note on that page, there are some issues related to the build
framework of libncurses and related libraries. I hope some experienced people
here can show me which approach is most likely to be accepted into the base
system.

In addition to those issues, I hope some of you can test it and give feedback.
I really would like to see ncurses in base updated in the near future.

Regards,
Rong-En Fan


Re: [patch, try 1] Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-25 Thread Rong-en Fan

On 5/25/06, Konstantin Belousov [EMAIL PROTECTED] wrote:

On Thu, May 25, 2006 at 01:19:26AM -0400, Kris Kennaway wrote:
 On Wed, May 24, 2006 at 11:48:53PM -0400, Howard Leadmon wrote:

  So what's changed at that delta, under the one that works vfs_lookup.c is:
 
   Edit src/sys/kern/vfs_lookup.c
Add delta 1.80.2.6 2006.03.31.07.39.24 kris
 
 
  Under the one that fails the vfs_lookup.c is:
 
   Edit src/sys/kern/vfs_lookup.c
Add delta 1.80.2.7 2006.04.30.03.57.46 kris
 
 
 
   So I stand corrected on my last post, the issue is in fact in this module, as
  just taking that module back to 1.80.2.6 fixes the problem with my server.  I
  even took multiple NFS clients and gave them a heavy workload, and CPU still
  remained reasonable, and very responsive.  As soon as I rev to the new
  version, NFS breaks badly and even a single client doing something like a du
  of a directory structure results in sluggishness and extreme CPU usage.

 Yep, unfortunately this commit was necessary to fix other bugs.  Jeff
 said he should have time to look at it next week.

 Kris

I tried to debug the problem. First, I have to admit that I cannot
reproduce the problem on a GENERIC kernel. Only after QUOTAS were added
and, correspondingly, UFS started to require Giant did
I get the described behaviour. Below are the changes to the GENERIC config
file that I made to reproduce the problem.


[...]

After that, the server machine easily panics on

KASSERT(!(debug_mpsafenet == 1 && mtx_owned(&Giant)),
    ("nfssvc_nfsd(): debug.mpsafenet=1 && Giant"));

from nfsserver/nfs_syscalls.c, line 570.

As I understand the problem, kern/vfs_lookup.c:lookup() can
acquire additional locks on Giant, indicating this by the GIANTHELD
flag in nd. All processing in nfsserver already runs with Giant held,
so I just dropped those excess locks after returning from lookup.
The system with the patch applied survived a smoke test (the client did
du on a mounted dir, the patch was generated from an exported fs, etc.).
nfsd eats no more than 25% of CPU (with INVARIANTS).
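
A rough Python model of that idea follows. It is only a sketch of the
locking pattern as described in this message, not the actual patch and
not kernel code: RecursiveLock, lookup, and serve_request are
hypothetical names, and the lock is a toy that only counts how many
times it is held.

```python
class RecursiveLock:
    """Toy stand-in for a recursive mutex that only tracks its hold count."""

    def __init__(self) -> None:
        self.depth = 0

    def acquire(self) -> None:
        self.depth += 1

    def release(self) -> None:
        assert self.depth > 0, "release without a matching acquire"
        self.depth -= 1


def lookup(giant: RecursiveLock, needs_giant: bool) -> bool:
    """Hypothetical callee: may take the lock once more and reports whether it did."""
    if needs_giant:
        giant.acquire()
        return True  # plays the role of the GIANTHELD flag
    return False


def serve_request(giant: RecursiveLock, needs_giant: bool) -> int:
    """Hypothetical caller that already holds the lock around the whole request."""
    giant.acquire()
    held_by_lookup = lookup(giant, needs_giant)
    if held_by_lookup:
        giant.release()  # drop the extra acquisition right after lookup returns
    depth_during_work = giant.depth  # exactly 1 if the extra hold was dropped
    giant.release()
    return depth_during_work
```

In this model the caller ends every request with the hold count back at
zero, whether or not the callee took the lock a second time.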

Users who reported the problem and are willing to help: please
try the patch (generated against STABLE) and give feedback.


[...]

Hi Konstantin and others,

I'm now running RELENG_6_1 as of Apr 30 04:00 UTC source + your
patch. The nfsd is quite happy! After client's du finishes, it
stays idle as expected (eats 0.00% CPU).

Thank you very much.

Regards,
Rong-En Fan


Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-23 Thread Rong-en Fan

On 5/23/06, Konstantin Belousov [EMAIL PROTECTED] wrote:

On Mon, May 22, 2006 at 05:43:32PM -0400, Rong-en Fan wrote:
 On 5/14/06, Kris Kennaway [EMAIL PROTECTED] wrote:
 On Sun, May 14, 2006 at 02:28:55PM -0400, Howard Leadmon wrote:
 
 Hello All,
 
   I have been running FBSD a long while, and actually running since the 5.x
  releases on the server I am having troubles with.   I basically have a small
  network and just use NIS/NFS to link my various FBSD and Solaris machines
  together.
 
   This has all been running fine up till a few days ago, when all of a sudden
  NFS came to a crawl, and CPU usage so high the box appears to freeze almost.
  When I had 6.1-RC running all seemed well, then came the announcement for the
  official 6.1 release, so I did the cvs updates, made world, kernel, and ran
  mergemaster to get everything up to the 6.1 stable version.
 
   Now after doing this, something is wrong with NFS.   It works, it will return
  information and open files, just it's very very slow, and while performing a
  request the CPU spike is astounding.  A simple du of my home directory can
  take minutes, and machine all but locks up if the request is done over NFS.
  Here is top snip:
 
    PID USERNAME   THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
    497 root 1   40  1252K   780K -  2  50:42 188.48% nfsd
 
 
   This is a nice IBM eServer with dual P4-XEON's and a couple GB of RAM on a
  disk array, and locally it screams, heck NFS used to scream till I updated.  I
  am not really sure what info would be useful in debugging, so won't post tons
  of misc junk in this eMail, but if anyone has any ideas as to how best to
  figure out and resolve this issue it would sure be appreciated...
 
 Use tcpdump and related tools to find out what traffic is being sent.
 
 Also verify that you did not change your system configuration in any
 way: there have been no changes to NFS since the release, so it is
 unclear why an update would cause the problem to suddenly occur.
 
 Kris

 Hi Kris and Howard,

 As I posted few days ago, I have similar problems like Howard's
 (some details in the thread 6.1-RELEASE, em0 high interrupt rate
 and nfsd eats lots of cpu on stable@). After binary searching
 the source tree, I found that

 RELENG_6_1, 2006.04.30.03.57 ok
 RELENG_6_1, 2006.04.30.04.00 bad

 The only commit is kern/vfs_lookup.c, an MFC of rev 1.90 and 1.91.
 With 04.30 03.57's source + manually patched vfs_lookup.c rev 1.90,
 the same problem occurs.

 Let me refresh what problems I'm seeing

 1. a client (no matter Linux 2.6.16 or FreeBSD 6.1) runs du on
   a nfs directory
 2. on server-side, nfsd starts to eats lots of CPU
 3. the du finishes
 4. on server-side, nfsd still eats lots of CPU, but there is no
   nfs traffic. Wait for 5 minutes, you can still see that nfsd is
   running and eats lots of CPU.

 On FreeBSD 6.1R client, it uses UDP mount and fstab is like
 rw,-L,nosuid,bg,nodev. On Linux client, it uses UDP mount and
 fstab is like defaults,udp,hard,intr,nfsvers=3,rsize=8192,wsize=8192.
 The server's kernel conf is at

 http://www.rafan.org/FreeBSD/nfs/KERNEL

 Some related configuration files:

 /etc/export
  /export/dir1 host1 host2...
  /export/dir2 host1 host2...

 /etc/rc.conf
 nfs_server_enable=YES
 nfs_server_flags=-u -t -n 16
 mountd_enable=YES
 mountd_flags=-r -l -n
 rpc_lockd_enable=YES
 rpc_statd_enable=YES
 rpcbind_enable=YES

 /etc/fstab:
 /dev/...  /export/dir1 ufs rw,nosuid,noexec 2 2
 /dev/...  /export/dir2 ufs rw,nosuid,noexec,userquota 2 2

 The NFS server is also using amd to mount some backup directories
 from another NFS server. the amd.conf is

 [global]
 browsable_dirs = yes
 map_type = file
 mount_type = nfs
 auto_dir = /nfs
 fully_qualified_hosts = no
 log_file = syslog
 nfs_proto = udp
 nfs_allow_insecure_port = no
 nfs_vers = 3
 # plock = yes
 selectors_on_default = yes
 restart_mounts = yes

 [/backup]
 map_options = type:=direct
 map_name = /etc/amd.direct

 /etc/amd.direct:
 /defaults
 opts:=rw,grpid,resvport,vers=3,proto=udp,nosuid,nodev,rsize=8192,wsize=8192
 backup  type:=nfs;rhost:=nfs2;rfs:=/nfs2/${host}


 If there is anything I can provide to help track this down, please
 let me know. By the way, I tried truss/kdump to see what happens
 when nfsd eats lots of CPU, but in vain. They do not return anything.

I tried your recipe on 7-CURRENT with a locally exported fs, remounted
over nfs. I did not get the behaviour you described.


As noted in my previous thread, I have another 6.1-RELEASE nfs server,
which does not have this problem.


Could you please provide a backtrace for the nfsd that
eats the CPU (from ddb)? I think it would be helpful to get several
backtraces (i.e., bt nfsd pid, cont, bt nfsd pid ...) to
see where it is running.


I'm afraid that I can not do that. Last time I tried breaking into ddb (on 5.x),
it hangs my serial console and the server is miles away :-( . Perhaps we

Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-23 Thread Rong-en Fan

On 5/23/06, Howard Leadmon [EMAIL PROTECTED] wrote:



   If there is anything I can provide to help track this down,
   please let me know. By the way, I tried truss/kdump to see what
   happens when nfsd eats lots of CPU, but in vain. They do not
   return anything.
  
  I tried your recipe on 7-CURRENT with a locally exported fs, remounted
  over nfs. I did not get the behaviour you described.

 As noted in my previous thread, I have another 6.1-RELEASE
 nfs server, which does not have this problem.

  Could you please provide a backtrace for the nfsd that eats the
  CPU (from ddb)? I think it would be helpful to get several
  backtraces (i.e., bt nfsd pid, cont, bt nfsd pid ...) to see where
  it is running.

 I'm afraid that I can not do that. Last time I tried breaking
 into ddb (on 5.x), it hangs my serial console and the server
 is miles away :-( . Perhaps we can ask Howard to do that?

 I am more than willing to do that, as this machine runs here with me, so if
needed I can easily get on a console, or perform a reboot.  Can one of you
shed a little light on exactly what I need to do, and how to do this?  I ask
as I have never used this ddb stuff, so I have no clue how to go about
getting the information you're looking to find.  Guess I have been lucky, and
just never had an issue that took things to this level.


At least you have to add the following to your kernel config:

options KDB
options DDB

Recompile it and reboot. You had better set up a serial console
so you can easily copy things from the ddb output. To do that, you have
to put device sio in your kernel configuration and add the following
to some files:

/boot.config
-Dh

/boot/loader.conf
comconsole_speed=115200
machdep.conspeed=115200

/etc/ttys
ttyd0   /usr/libexec/getty std.115200 cons25  on secure

On the other machine, /etc/remote:
com1:dv=/dev/cuad0:br#115200:pa=none:

Then, use tip com1 to attach to the nfs server. The above settings
assume the serial console on the nfs server is on COM1 and the
client side also uses COM1. If that's not the case, please follow
the Handbook on how to set up a serial console on a port other than COM1.
To break into ddb, either use ctrl+alt+esc or send a BREAK (I think ^b
will do) via the serial line. After that, you should see

db>

Then first use ps to find the nfsd pid (better to note
the pid which eats lots of cpu before entering ddb). After that, do
what Konstantin suggests. I have never tried cont in ddb. I guess
it will return execution to the kernel, and you will need to break
into ddb again to do another bt pid.
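
To note the busy nfsd pid before breaking into ddb, something like the
following Python sketch could be used from a normal shell. It is only an
illustration: busiest_pid is a made-up helper, and it assumes
three-column input (pid, %cpu, command) such as the output of
ps -axo pid,pcpu,comm.

```python
from typing import Optional


def busiest_pid(ps_output: str, command: str = "nfsd") -> Optional[int]:
    """Return the PID of the `command` process with the highest %CPU, or None."""
    best_pid, best_cpu = None, -1.0
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, cpu, comm = fields
        if comm.strip() == command and float(cpu) > best_cpu:
            best_pid, best_cpu = int(pid), float(cpu)
    return best_pid


if __name__ == "__main__":
    # Sample text for illustration only; real input would come from ps.
    sample = (
        "  PID %CPU COMMAND\n"
        "  497 188.5 nfsd\n"
        "  498   0.0 nfsd\n"
        "   40   5.1 softdepflush\n"
    )
    print(busiest_pid(sample))
```

The pid it reports is the one to pass to bt once you are at the ddb
prompt.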

By the way, could you verify that backing out vfs_lookup.c rev 1.90
helps in your situation? If not, maybe we are seeing different problems,
and then I have to figure out how to make my serial console work
here.

Thanks,
Rong-En Fan


Re: quota and snapshots in 6.1-RELEASE

2006-05-23 Thread Rong-en Fan

On 5/23/06, Dmitriy Kirhlarov [EMAIL PROTECTED] wrote:

Hi, list.

Some time ago quota and, AFAIR, snapshots in 6.1-RELEASE had deadlock
problems. What is the current situation with this? I'm ready to test
patches, if needed.

WBR


IIRC, there are some quota and snapshot changes merged into
6.1-STABLE after 6.1-RELEASE was released. So I think you may want
to try that.

Regards,
Rong-En Fan


Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-23 Thread Rong-en Fan

On 5/23/06, Howard Leadmon [EMAIL PROTECTED] wrote:


   Hello Rong-en,

 Thanks for the info on getting the debugger configured, and on the serial
console.   I will have to try and play with the serial console thing more, I
just tried putting in the flags and the damn thing hung, I had to boot from CD
and take the stuff back out.

 One thing you mention below that concerns me is that you have version 1.90 of
the vfs_lookup.c file.   I just did a less on /usr/src/sys/kern/vfs_lookup.c
and I see the following:

FreeBSD: src/sys/kern/vfs_lookup.c,v 1.80.2.7 2006/04/30 03:57:46 kris Exp


 I even did a cvsup (I use cvsup2.FreeBSD.org) to make sure I had the current
stuff before rebuilding the kernel just now, and still I see the same thing.
Is something fishy going on here, or did you by chance make a typo??


Sorry for the confusion. rev 1.90 is the revision number in -HEAD. To back
out this MFC'ed change on RELENG_6_1, please cvsup to
RELENG_6_1 date=2006.04.30.03.57.00. Then you should see that it
is

1.80.2.6 2006/03/31 07:39:24 kris

To verify the effect of this revision, please run RELENG_6_1 as of
2006.04.30.03.57.00 and as of 2006.04.30.04.00.00.

Regards,
Rong-En Fan


Re: Trouble with NFSd under 6.1-Stable, any ideas?

2006-05-22 Thread Rong-en Fan

On 5/14/06, Kris Kennaway [EMAIL PROTECTED] wrote:

On Sun, May 14, 2006 at 02:28:55PM -0400, Howard Leadmon wrote:

Hello All,

  I have been running FBSD a long while, and actually running since the 5.x
 releases on the server I am having troubles with.   I basically have a small
 network and just use NIS/NFS to link my various FBSD and Solaris machines
 together.

  This has all been running fine up till a few days ago, when all of a sudden
 NFS came to a crawl, and CPU usage so high the box appears to freeze almost.
 When I had 6.1-RC running all seemed well, then came the announcement for the
 official 6.1 release, so I did the cvs updates, made world, kernel, and ran
 mergemaster to get everything up to the 6.1 stable version.

  Now after doing this, something is wrong with NFS.   It works, it will return
 information and open files, just it's very very slow, and while performing a
 request the CPU spike is astounding.  A simple du of my home directory can
 take minutes, and machine all but locks up if the request is done over NFS.
 Here is top snip:

   PID USERNAME   THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   497 root 1   40  1252K   780K -  2  50:42 188.48% nfsd


  This is a nice IBM eServer with dual P4-XEON's and a couple GB of RAM on a
 disk array, and locally it screams, heck NFS used to scream till I updated.  I
 am not really sure what info would be useful in debugging, so won't post tons
 of misc junk in this eMail, but if anyone has any ideas as to how best to
 figure out and resolve this issue it would sure be appreciated...

Use tcpdump and related tools to find out what traffic is being sent.

Also verify that you did not change your system configuration in any
way: there have been no changes to NFS since the release, so it is
unclear why an update would cause the problem to suddenly occur.

Kris


Hi Kris and Howard,

As I posted a few days ago, I have problems similar to Howard's
(some details are in the thread "6.1-RELEASE, em0 high interrupt rate
and nfsd eats lots of cpu" on stable@). After binary searching
the source tree, I found that

RELENG_6_1, 2006.04.30.03.57 ok
RELENG_6_1, 2006.04.30.04.00 bad

The only commit in between is to kern/vfs_lookup.c, an MFC of rev 1.90 and 1.91.
With 04.30 03.57's source + manually patched vfs_lookup.c rev 1.90,
the same problem occurs.
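
The binary search over checkout dates can be sketched in Python as
below. This is only an illustration of the search itself:
bisect_checkouts and is_bad are hypothetical names. In practice, is_bad
would mean checking out the tree at that date, rebuilding, and running
the du test by hand. Apart from the two dates reported above, the dates
in the example are made up.

```python
from typing import Callable, Sequence, Tuple


def bisect_checkouts(dates: Sequence[str],
                     is_bad: Callable[[str], bool]) -> Tuple[str, str]:
    """Return (last_good, first_bad) from a sorted list of checkout dates.

    Assumes dates[0] is known good, dates[-1] is known bad, and that the
    problem, once introduced, is present in every later checkout.
    """
    lo, hi = 0, len(dates) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(dates[mid]):
            hi = mid
        else:
            lo = mid
    return dates[lo], dates[hi]


if __name__ == "__main__":
    # Hypothetical candidate dates around the two reported in this message.
    candidates = [
        "2006.04.01.00.00",
        "2006.04.15.00.00",
        "2006.04.30.03.57",
        "2006.04.30.04.00",
        "2006.05.06.00.00",
    ]
    # Stand-in for "build, boot, run du, watch nfsd".
    print(bisect_checkouts(candidates, lambda d: d >= "2006.04.30.04.00"))
```

Each step halves the remaining range, so even a few weeks of commits
need only a handful of rebuilds.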

Let me restate the problem I'm seeing:

1. a client (either Linux 2.6.16 or FreeBSD 6.1) runs du on
  an nfs directory
2. on the server side, nfsd starts to eat lots of CPU
3. the du finishes
4. on the server side, nfsd still eats lots of CPU, but there is no
  nfs traffic. After waiting 5 minutes, you can still see that nfsd is
  running and eating lots of CPU.

The FreeBSD 6.1R client uses a UDP mount and its fstab options are like
rw,-L,nosuid,bg,nodev. The Linux client uses a UDP mount and its
fstab options are like defaults,udp,hard,intr,nfsvers=3,rsize=8192,wsize=8192.
The server's kernel conf is at

http://www.rafan.org/FreeBSD/nfs/KERNEL

Some related configuration files:

/etc/exports
 /export/dir1 host1 host2...
 /export/dir2 host1 host2...

/etc/rc.conf
nfs_server_enable=YES
nfs_server_flags=-u -t -n 16
mountd_enable=YES
mountd_flags=-r -l -n
rpc_lockd_enable=YES
rpc_statd_enable=YES
rpcbind_enable=YES

/etc/fstab:
/dev/...  /export/dir1 ufs rw,nosuid,noexec 2 2
/dev/...  /export/dir2 ufs rw,nosuid,noexec,userquota 2 2

The NFS server also uses amd to mount some backup directories
from another NFS server. The amd.conf is

[global]
browsable_dirs = yes
map_type = file
mount_type = nfs
auto_dir = /nfs
fully_qualified_hosts = no
log_file = syslog
nfs_proto = udp
nfs_allow_insecure_port = no
nfs_vers = 3
# plock = yes
selectors_on_default = yes
restart_mounts = yes

[/backup]
map_options = type:=direct
map_name = /etc/amd.direct

/etc/amd.direct:
/defaults
opts:=rw,grpid,resvport,vers=3,proto=udp,nosuid,nodev,rsize=8192,wsize=8192
backup  type:=nfs;rhost:=nfs2;rfs:=/nfs2/${host}


If there is anything I can provide to help track this down, please
let me know. By the way, I tried truss/kdump to see what happens
when nfsd eats lots of CPU, but in vain. They do not return anything.

Regards,
Rong-En Fan


6.1-RELEASE, em0 high interrupt rate and nfsd eats lots of cpu

2006-05-15 Thread Rong-en Fan
 root   1 -40 -159 0K 8K CPU0   0   8:19  9.18% swi2: camb
  40 root   1 -160 0K 8K sdflus 1   6:04  5.13% softdepflu


The wait channels of nfsd are usually biord, biowd, ufs, RUN, CPUX, and -.

The kernel conf is GENERIC without unneeded hardware + ipfw2, FAST_IPSEC,
QUOTA (but I don't have any userquota or groupquota in fstab). I also tuned some
sysctls:

machdep.hyperthreading_allowed=1
vm.kmem_size_max=419430400
vm.kmem_size_scale=2
net.link.ether.inet.log_arp_wrong_iface=0
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536
net.inet.udp.recvspace=65536
kern.ipc.somaxconn=4096
kern.maxfiles=65535
kern.ipc.shmmax=104857600
kern.ipc.shmall=25600
net.inet.ip.random_id=1
kern.maxvnodes=10
vfs.read_max=16
kern.cam.da.retry_count=20
kern.cam.da.default_timeout=300


Anything that I can provide to help nail this problem down?

Thanks,
Rong-En Fan


Re: 6.1-RELEASE, em0 high interrupt rate and nfsd eats lots of cpu

2006-05-15 Thread Rong-en Fan

On 5/15/06, Dmitriy Kirhlarov [EMAIL PROTECTED] wrote:

On Mon, May 15, 2006 at 02:15:08PM -0400, Rong-en Fan wrote:
 Hi,

 After upgrading from 5.5-PRERELEASE to 6.1-RELEASE on one
 nfs server today, I noticed that the load is very high, ranging from 4.x
 to 30.x, depends how many nfsd I run. From mrtg traffic graph, I did
 not notice there is high traffic. This box is 2 physical Xeon CPU w/

I have same situation today on RC2.
One client installing world from nfs share.
nfsd eat 91% CPU, load average 6-8. Very small disk activitie.
I don't look interrupt rate.
I, also, have em0.


Hi,

It looks to me that after rebooting the machine and running du from a client,
nfsd eats lots of cpu while du is running. However, after du
exits, nfsd still eats lots of cpu. I don't know what happened;
I will give the latest RELENG_6 a shot. If that does not work, I will probably
go back to 6.0-R or even 5.5-PR :(

By the way, I have another nfs server (mainly for backup) that runs
6.1-RELEASE and does not have this behavior.

When did you start to notice this problem? Since 6.1-RC?

Thanks,
Rong-En Fan


Re: 6.1-RELEASE, em0 high interrupt rate and nfsd eats lots of cpu

2006-05-15 Thread Rong-en Fan

On 5/15/06, Dmitriy Kirhlarov [EMAIL PROTECTED] wrote:

On Mon, May 15, 2006 at 02:15:08PM -0400, Rong-en Fan wrote:
 Hi,

 After upgrading from 5.5-PRERELEASE to 6.1-RELEASE on one
 nfs server today, I noticed that the load is very high, ranging from 4.x
 to 30.x, depends how many nfsd I run. From mrtg traffic graph, I did
 not notice there is high traffic. This box is 2 physical Xeon CPU w/

I have same situation today on RC2.
One client installing world from nfs share.
nfsd eat 91% CPU, load average 6-8. Very small disk activitie.
I don't look interrupt rate.
I, also, have em0.


After some digging, I found that the cpu-eating nfsd can be triggered by
running ``du'' on an nfs client (both FreeBSD 6.1-R and a Linux box).
nfsd will eat lots of CPU, and after the client's du has finished, nfsd
still eats lots of CPU. The workaround is to run

/etc/rc.d/nfsd restart

after which everything is fine again. Besides du, on a FreeBSD 6.1-R client,
buildkernel over nfs will trigger the same behavior.

I just downgraded this box to 6.0-RELEASE and everything works
fine. Running du or buildkernel from an nfs client does not trigger the
same behavior. I will try to do a binary search from 6.0-R to 6.1-R
to see if I can find the related commits.
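
The binary search between 6.0-R and 6.1-R could be sketched in Python
roughly as follows. It assumes an ordered list of candidate source dates
where the first is known good and the last is known bad, and an `is_bad`
callback that stands for "build that tree, boot it, run du from a client
and see whether nfsd keeps spinning". Both the date list and the callback
are hypothetical placeholders:

```python
def first_bad(candidates, is_bad):
    """Return the first entry of `candidates` for which is_bad() is true.

    Assumes candidates[0] is good, candidates[-1] is bad, and that once
    the bug appears it stays (good, ..., good, bad, ..., bad).
    """
    lo, hi = 0, len(candidates) - 1      # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid                     # bug already present at mid
        else:
            lo = mid                     # bug not yet present at mid
    return candidates[hi]


if __name__ == "__main__":
    # Hypothetical checkout dates; real ones would come from cvs/cvsup.
    dates = ["2005-11-01", "2005-12-01", "2006-01-01", "2006-02-01",
             "2006-03-01", "2006-04-01", "2006-05-01"]
    # Stand-in for a real build-and-test cycle, pretending the
    # regression went in some time during February 2006.
    print(first_bad(dates, lambda d: d >= "2006-02-15"))
```

With N candidate dates this needs about log2(N) build-and-test cycles,
which matters when every cycle is a full buildworld/buildkernel.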

By the way, I have another nfs server running 6.1-RELEASE,
but it does not exhibit this behavior. Kernel conf and sysctl
are basically the same for both boxes.

Regards,
Rong-En Fan


Re: 6.x on an IBM T42 laptop

2006-05-08 Thread Rong-En Fan

On 5/8/06, Nik Clayton [EMAIL PROTECTED] wrote:

Mark Willson wrote:
 I've got an IBM T42 laptop that's currently running 5.4, and it's working
 nicely at the moment.  ACPI works well enough that suspend to RAM works
 ('zzz'), the audio works, USB devices are recognised, and the battery life's
 reasonable (with est enabled).

 Is anyone aware of any regressions in laptop functionality going from 5.4 to
 6.x?

 I've been running 6-STABLE on a T42 for a while and not noticed any
 problems in the subjects mentioned.  The addition of iwi has made life
 a little simpler.  I think it is ok to take the leap...

Thanks.

I noticed that the acpi_ibm manual page talks about the suspend-to-disk
functionality.

Does that work?



From my experience on an X31, Fn+F12 (suspend to disk)
only works with apm and not acpi. Of course, you have
to create a partition first (via phdisk(?), available at ibm's site).

Regards,
Rong-En Fan


ypwhich -m

2006-04-19 Thread Rong-En Fan
Hi,

I found that ypwhich -m does not work on 6.1-RC, it shows

ypwhich: can't find the master of `�`: reason: No such map in server's domain

IIRC, there was a commit last year to fix this. After some search, I think
it is include/rpcsvc/yp_prot.h revision 1.13 done by peter@ (CC'ed).
As far as I can tell, ypwhich -m is also broken on 5.4 and 5.5-PRERELEASE.

I have tested that revision on a 5.5-PRERELEASE machine, it
fixes ypwhich -m. I would like to see this MFC'ed to RELENG_6
and RELENG_5, so the newer releases will have this fixed.

Thanks,
Rong-En Fan

Re: rpc.lockd brokenness (2)

2006-04-08 Thread Rong-En Fan
On 4/8/06, Kris Kennaway [EMAIL PROTECTED] wrote:
 On Sat, Apr 08, 2006 at 01:28:55AM -0400, Rong-En Fan wrote:
  On 3/6/06, Jun Kuriyama [EMAIL PROTECTED] wrote:
  
   I'm not yet received enough information to track rpc.lockd problem.
  
   As Kris posted before, here is a patch to backout my suspected
   commit.  If someone can easily reproduce this problem, please try with
   this patch on both of server/client side of rpc.lockd (I'm not sure
   which of server/client side this affects).
  
   http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/80389
   http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/84953
  
   Any reports about this patch (OK or still problem) are welcome!
 
  Hi,
 
  Somehow I have problems with lockd after 3 boxes upgraded from
  Feb's RELENG_6 to Apr 6's. One of them has problems with lockd.
  For example, mutt and irssi will stuck in lockd (shown by
  top). I tried to back out changes in revision 1.18 for lock_proc.c,
  and do /etc/rc.d/nfslocking stop then a start. After backout it,
  mutt and irssi work well. If I put 1.18 back, mutt and irssi will stuck
  in lockd again.
 
  Last month, I played with the test program/script in those two PRs,
  found that revision 1.18 does not make any difference. I'm not 100%
  sure the problem I encountered now is related to rev 1.18. But
  it is a report that  backout 1.18 really helps.
 
  For record, all my clients involved in this mail are running RELENG_6.
  Server is RELENG_5 as of March 9. Only IPv4 here, no IPv6.

 1.18 was merged 15 months ago, so it cannot be the cause if you
 updated from Feb 2006.

Yes, I know that. But how to explain that after backing out 1.18
and restarting rpc.lockd, my mutt and irssi work, and after putting
it back, they don't? I tried backing out and putting back
three times. And if I simply restart lockd, it does not help.

Rong-En Fan


RELENG_6_1

2006-04-08 Thread Rong-En Fan
Hi,

According to the webpage [1], 6.1 has been branched on April 5. However,
I noticed that there is a tag called RELENG_6_1, not a branch called
RELENG_6_1. For example, sys/conf/newvers.sh [2], rev 1.69.2.11,
is on RELENG_6 branch with tag RELENG_6_1_BP and RELENG_6_1.

This looks a bit strange to me. Previously, at least, we had a RELENG_X_Y
branch and a RELENG_X_Y_BP tag. Is there any special reason that we have
a tag instead of a branch for 6.1?

Regards,
Rong-En Fan

[1] http://www.freebsd.org/releases/6.1R/schedule.html
[2] http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/conf/newvers.sh


Re: RELENG_6_1

2006-04-08 Thread Rong-En Fan
On 4/8/06, Scott Long [EMAIL PROTECTED] wrote:
 Rong-En Fan wrote:
  Hi,
 
  According to the webpage [1], 6.1 has been branched on April 5. However,
  I noticed that there is a tag called RELENG_6_1, not a branch called
  RELENG_6_1. For example, sys/conf/newvers.sh [2], rev 1.69.2.11,
  is on RELENG_6 branch with tag RELENG_6_1_BP and RELENG_6_1.
 
  It is a bit strange for me. At least, we have RELENG_X_Y branch before
  and RELENG_X_Y_BP tag. Is there any special reason that we have
  a tag instead of a branch for 6.1?

 RELENG_6_1 is a branch tag (or at least it should have been unless I
 screwed it up).  The _BP tag always comes before the branch tag.  I
 just checked CVS and it appears to agree with this.  Can you give an
 example of what is wrong?

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/conf/newvers.sh

When 6.0 is branched and moves to RC, it shows

Revision 1.69.2.8 / (download) - annotate - [select for diffs], Sun
Oct 9 16:59:34 2005 UTC (5 months, 4 weeks ago) by scottl
Branch: RELENG_6
CVS Tags: RELENG_6_0_BP
Branch point for: RELENG_6_0

When 6.1 moves to RC, it shows

Revision 1.69.2.11 / (download) - annotate - [select for diffs], Sat
Apr 8 14:42:23 2006 UTC (9 hours, 9 minutes ago) by scottl
Branch: RELENG_6
CVS Tags: RELENG_6_1_BP, RELENG_6_1

I expected to see something like the case for 6.0. I didn't see a
"Branch point for: RELENG_6_1" here. Did I miss something,
or does cvsweb show the wrong information?
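
For reference, the tag/branch distinction can also be read off the raw
numbers in the RCS file rather than from how cvsweb renders them. In CVS
a branch symbol is normally stored as a "magic" number with a 0 in the
second-to-last position (for example 1.69.2.11.0.2), while a plain tag
such as a _BP tag points at an ordinary revision (1.69.2.11). A small
Python sketch of that rule follows. The concrete magic number used for
RELENG_6_1 is an assumed value for illustration and was not read out of
newvers.sh,v:

```python
def classify_symbol(number):
    """Classify a CVS symbolic name by the number it is bound to.

    "branch": magic branch number (0 in the second-to-last field, e.g.
              1.69.2.11.0.2) or a bare branch number with an odd field
              count (e.g. the vendor branch 1.1.1).
    "tag":    an ordinary revision number such as 1.69.2.11.
    """
    fields = number.split(".")
    if len(fields) % 2 == 1:
        return "branch"
    if len(fields) >= 4 and fields[-2] == "0":
        return "branch"
    return "tag"


def branch_prefix(number):
    """Turn a magic branch number into the prefix its revisions share.

    1.69.2.11.0.2 -> 1.69.2.11.2, so revisions on that branch would be
    1.69.2.11.2.1, 1.69.2.11.2.2, and so on.
    """
    fields = number.split(".")
    if len(fields) >= 4 and len(fields) % 2 == 0 and fields[-2] == "0":
        return ".".join(fields[:-2] + fields[-1:])
    return number


if __name__ == "__main__":
    # RELENG_6_1_BP is shown by cvsweb on 1.69.2.11; the RELENG_6_1
    # value is an assumed magic number, used here only as an example.
    for name, num in [("RELENG_6_1_BP", "1.69.2.11"),
                      ("RELENG_6_1", "1.69.2.11.0.2")]:
        print(name, num, classify_symbol(num), branch_prefix(num))
```

If the number bound to RELENG_6_1 has the 0 in that position, it is a
branch tag regardless of how cvsweb displays it.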

Hope we can see 6.1 RELEASE soon :-)

Thanks,
Rong-En Fan


Re: rpc.lockd brokenness (2)

2006-04-07 Thread Rong-En Fan
On 3/6/06, Jun Kuriyama [EMAIL PROTECTED] wrote:

 I'm not yet received enough information to track rpc.lockd problem.

 As Kris posted before, here is a patch to backout my suspected
 commit.  If someone can easily reproduce this problem, please try with
 this patch on both of server/client side of rpc.lockd (I'm not sure
 which of server/client side this affects).

 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/80389
 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/84953

 Any reports about this patch (OK or still problem) are welcome!

Hi,

Somehow I have problems with lockd after 3 boxes were upgraded from
Feb's RELENG_6 to Apr 6's. One of them has problems with lockd.
For example, mutt and irssi get stuck in lockd (as shown by
top). I tried backing out the changes in revision 1.18 of lock_proc.c,
and did /etc/rc.d/nfslocking stop and then start. After backing it out,
mutt and irssi work well. If I put 1.18 back, mutt and irssi get stuck
in lockd again.

Last month, I played with the test program/script in those two PRs and
found that revision 1.18 does not make any difference. I'm not 100%
sure the problem I encounter now is related to rev 1.18, but
this is a report that backing out 1.18 really helps.

For record, all my clients involved in this mail are running RELENG_6.
Server is RELENG_5 as of March 9. Only IPv4 here, no IPv6.

Regards,
Rong-En Fan


Re: ps column header case changes

2006-04-06 Thread Rong-En Fan
On 4/5/06, Garance A Drosehn [EMAIL PROTECTED] wrote:
 At 10:08 PM -0400 4/5/06, Rong-En Fan wrote:
 I just updated my world from Feb's RELENG_6 as of today.  I
 noticed that the column header of ps's output is changed
 from upper to lower case.
 
 $ ps awx -r -o user|head -1
 user
 
 This is used to be USER. I found that changes in ps/keyword.c rev 1.75
 causes this (this is already MFC'ed).

 Ugh.  Sometimes the simple changes are the easiest ones to
 screw up.  That's what I get for trying to fix the previous
 bug between meetings, I guess.

 I'll look into it.  Many apologies.

Thanks for committing rev 1.76. It fixes the column header problem.
If there are no further problems, could you please MFC it to 6.1, which is
still broken. :-)

Best,
Rong-En Fan


ps column header case changes

2006-04-05 Thread Rong-En Fan
Hi,

I just updated my world from Feb's RELENG_6 as of today. I noticed that
the column header of ps's output is changed from upper to lower case.

$ ps awx -r -o user|head -1
user

This used to be USER. I found that the change in ps/keyword.c rev 1.75
causes this (it is already MFC'ed).

Regards,
Rong-En Fan


NFS data corruption listed in 6.1 show stopper

2006-03-18 Thread Rong-En Fan
Hi,

We are planning to upgrade an NFS server from 5.x to 6.x. However,
we found that there is a 6.1 show stopper about NFS data corruption.
The todo page says this item is a work in progress. Has it
been fixed already, or not yet?  According to the description, it was found
by running fsx (in regression/fsx?). Can somebody show how to
reproduce this? We would like to do some tests.

Regards,
Rong-En Fan


6.1 ata panic if dma enabled

2006-03-16 Thread Rong-En Fan
Hi,

Recently, we upgraded a 4.11 box to 6.1-BETA2 by reinstall+newfs of everything.
After that, we found that if hw.ata.ata_dma=1 at boot, it panics as soon as it
starts fsck -p. This happens only if ad0 is set to UDMA66 or above.
My current solution is to set hw.ata.ata_dma=0 in loader.conf and manually
set ad0 to UDMA33 and the rest, ad4~ad7, to UDMA100. In the days of
4.x, there was something wrong with DMA on ad0, but it would fall back to
PIO4 automatically without problems. We have tried to 1) change the
cable, 2) move from the primary ata controller to the second, and 3) upgrade to
RELENG_6 as of March 11, but all of these failed. There is no option in the
bios to turn off DMA for the onboard ATA controller.

The ata controller and ad0 are:
atapci0: VIA 82C686B UDMA100 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xffa0
ata0: ATA channel 0 on atapci0
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0x1f0
atapci0: Reserved 0x1 bytes for rid 0x14 type 4 at 0x3f6
ata0: reset tp1 mask=03 ostat0=50 ostat1=00
ata0: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata0: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
ata0: reset tp2 stat0=50 stat1=00 devices=0x1<ATA_MASTER>
ata0: [MPSAFE]
ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
ad0: setting PIO4 on 82C686B chip
ad0: setting UDMA100 on 82C686B chip
ad0: 38166MB Seagate ST340016A 3.10 at ata0-master UDMA100
ad0: 78165360 sectors [19158C/16H/255S] 16 sectors/interrupt 1 depth queue

I'm pretty sure this HD is capable of UDMA100 (by the specification on Seagate
website).

The console messages are:
/dev/ad0s1e: clean, 823031 free (447 frags, 102823 blocks, 0.0% fragmentation)
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=131647
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=131647
ad0: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=131647
g_vfs_done():ad0s1a[WRITE(offset=67371008, length=16384)]error = 5
[...]
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x24
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc04eef95
stack pointer   = 0x28:0xe4c714f0
frame pointer   = 0x28:0xe4c71500
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= resume, IOPL = 0
current process = 127 (cp)
[thread pid 127 tid 100028 ]
Stopped at  turnstile_broadcast+0x9:movl0x24(%eax),%eax
db> bt
Tracing pid 127 tid 100028 td 0xc474e000
turnstile_broadcast(0) at turnstile_broadcast+0x9
_mtx_unlock_sleep(c068aa60,0,0,0) at _mtx_unlock_sleep+0x6c
softdep_sync_metadata(c4958880) at softdep_sync_metadata+0x7d4
ffs_syncvnode(c4958880,1) at ffs_syncvnode+0x43d
ffs_truncate(c4958880,200,0,880,c4695d00,c474e000) at ffs_truncate+0x77e
ufs_direnter(c4958880,c49de880,e4c7192c,e4c71bd0,0) at ufs_direnter+0x85d
ufs_makeinode(81a4,c4958880,e4c71bbc,e4c71bd0) at ufs_makeinode+0x30f
ufs_create(e4c71a84) at ufs_create+0x37
VOP_CREATE_APV(c0670ec0,e4c71a84) at VOP_CREATE_APV+0x3c
VOP_CREATE(c4958880,e4c71bbc,e4c71bd0,e4c71ae0) at VOP_CREATE+0x34
vn_open_cred(e4c71ba8,e4c71cc4,1a4,c4695d00,4) at vn_open_cred+0x20c
vn_open(e4c71ba8,e4c71cc4,1a4,4) at vn_open+0x29
kern_open(c474e000,804c1c8,0,602,21b6) at kern_open+0xd4
open(c474e000,e4c71cf0) at open+0x22
syscall(3b,3b,3b,8060100,bfbfeec4) at syscall+0x337
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (5, FreeBSD ELF32, open), eip = 0x28137ccf, esp =
0xbfbfec7c, ebp = 0xbfbfecc8 ---
db> call doadump
Cannot dump. No dump device defined.
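
To see at a glance which blocks the retries hit, the ICRC warnings in a
console log like the one above can be tallied per LBA. A rough Python
sketch follows. The regular expression is written against the message
text shown here and is an assumption about the format, not something
taken from the ata(4) driver source:

```python
import re
from collections import Counter

# Matches lines such as:
#   ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ICRC_RE = re.compile(r"^(\w+): WARNING - (\w+) UDMA ICRC error .*LBA=(\d+)")


def icrc_retries(log_text):
    """Count ICRC retry warnings per (device, command, LBA)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = ICRC_RE.match(line.strip())
        if m:
            counts[(m.group(1), m.group(2), int(m.group(3)))] += 1
    return counts


if __name__ == "__main__":
    log = """\
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=131647
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=131647
"""
    for (dev, cmd, lba), n in sorted(icrc_retries(log).items()):
        print("%s %s LBA=%d: %d retries" % (dev, cmd, lba, n))
```

Retries spread over unrelated LBAs, as here, would point more towards
the cable or controller path than towards a single bad spot on the disk,
though that is only a rule of thumb.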

The full dmesg (with boot_verbose) is available at
http://www.rafan.org/FreeBSD/ata/20060316-dmesg+db.txt

I also did an alltrace in ddb:
http://www.rafan.org/FreeBSD/ata/20060311-dball.txt

Regards,
Rong-En Fan


Re: 6.1 ata panic if dma enabled

2006-03-16 Thread Rong-En Fan
On 3/16/06, Scott Long [EMAIL PROTECTED] wrote:
 Rong-En Fan wrote:
  Hi,
 
  Recently, we upgrade a 4.11 box to 6.1-BETA2 by reinstall+newfs everything.
  After that,  we found that if hw.ata.ata_dma=1 at boot, then as soon as it
  starts fsck -p, it panics. It happens only if ad0 is setted to UDMA66 or 
  above.
  My current solution is set hw.ata.ata_dma=0 in loader.conf and manually
  turn DMA on ad0 to UDMA33 and rest ad4~ad7 to UDMA100. In the days of
  4.x, there is something wrong with DMA on ad0, but it will fall back to
  PIO4 automatically without problem. We have been tried to 1) change the
  cable 2) change from primary ata controller to the second, 3) upgrade to
  RELENG_6 as of March 11, but all these are failed. There is no options in
  bios to turn off DMA for the onboard ATA controller.

 Please review the release notes from the 6.1-BETA2 announcement.  Fixes
 went into 6.1 shortly after BETA2 was released, and are in BETA3 and BETA4.

I upgraded to today's RELENG_6, and it is the same. I'm not quite sure if
this is a hardware problem. However, why can't ata fall back
to PIO4 on a DMA write error, just like 4.x does?

ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
ad0: setting PIO4 on 82C686B chip
ad0: setting UDMA100 on 82C686B chip
ad0: 38166MB Seagate ST340016A 3.10 at ata0-master UDMA100
ad0: 78165360 sectors [19158C/16H/255S] 16 sectors/interrupt 1 depth queue

/dev/ad0s1d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1d: clean, 624587 free (28411 frags, 74522 blocks, 1.9% fragmentation)
/dev/ad0s1e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1e: clean, 826458 free (466 frags, 103249 blocks, 0.0% fragmentation)
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=191
ad0: FAILURE - WRITE_DMA status=51<READY,DSC,ERROR> error=84<ICRC,ABORTED> LBA=191
g_vfs_done():ad0s1a[WRITE(offset=65536, length=2048)]error = 5
mount: /dev/ad0s1a: Input/output error
Mounting root filesystem rw failed, startup aborted
Boot interrupted
Enter root password, or ^D to go multi-user

Then I just continue, and finally it panics:

kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x24
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc045
stack pointer   = 0x28:0xe4cfb4f0
frame pointer   = 0x28:0xe4cfb500
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= resume, IOPL = 0
current process = 168 (cp)
[thread pid 168 tid 100044 ]
Stopped at  turnstile_broadcast+0x9:movl0x24(%eax),%eax
db> bt
Tracing pid 168 tid 100044 td 0xc48de180
turnstile_broadcast(0) at turnstile_broadcast+0x9
_mtx_unlock_sleep(c068aca0,0,0,0) at _mtx_unlock_sleep+0x6c
softdep_sync_metadata(c495d660) at softdep_sync_metadata+0x7d4
ffs_syncvnode(c495d660,1) at ffs_syncvnode+0x43d
ffs_truncate(c495d660,200,0,880,c4695d00,c48de180) at ffs_truncate+0x77e
ufs_direnter(c495d660,c49e1880,e4cfb92c,e4cfbbd0,0) at ufs_direnter+0x85d
ufs_makeinode(81a4,c495d660,e4cfbbbc,e4cfbbd0) at ufs_makeinode+0x30f
ufs_create(e4cfba84) at ufs_create+0x37
VOP_CREATE_APV(c0671100,e4cfba84) at VOP_CREATE_APV+0x3c
VOP_CREATE(c495d660,e4cfbbbc,e4cfbbd0,e4cfbae0) at VOP_CREATE+0x34
vn_open_cred(e4cfbba8,e4cfbcc4,1a4,c4695d00,4) at vn_open_cred+0x20c
vn_open(e4cfbba8,e4cfbcc4,1a4,4) at vn_open+0x29
kern_open(c48de180,804c1c8,0,602,21b6) at kern_open+0xd4
open(c48de180,e4cfbcf0) at open+0x22
syscall(3b,3b,3b,8060100,bfbfeec4) at syscall+0x337
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (5, FreeBSD ELF32, open), eip = 0x28137ccf, esp =
0xbfbfec7c, ebp = 0xbfbfecc8 ---

Regards,
Rong-En Fan


Re: nfsclient process stucks in nfsaio

2006-03-10 Thread Rong-En Fan
On 3/10/06, Rong-En Fan [EMAIL PROTECTED] wrote:
 With INVARIANT, WITNESS enabled, when I tried to ^C
 to exit dd, it panics immediately. Some ddb  kgdb
 messages below (I have KDB_TRACE, KDB_UNATTENDED).
 Core file is available. Any help is appreciated :-)

 UPDATE: sometimes, I cant ^C or kill -9 the dd process even
 with mpsafenet=0. In that situation, a panic with similar trace
 as below, which is mpsafenet=1.

Hi,

After trying an SMP kernel with different combinations of
debug.mpsafe{net,vm,vfs}, and a UP kernel, the result is all the same.

Also, I did tune

options MAXDSIZ=(2048UL*1024*1024)
options MAXSSIZ=(128UL*1024*1024)
options DFLDSIZ=(2048UL*1024*1024)

in my kernel. I don't know if this affects it or not. I will try removing
them and test again.

Thanks,
Rong-En Fan


Re: nfsclient process stucks in nfsaio

2006-03-10 Thread Rong-En Fan
Hi,

With INVARIANTS and WITNESS enabled, when I tried ^C
to exit dd, it panicked immediately. Some ddb & kgdb
messages are below (I have KDB_TRACE, KDB_UNATTENDED).
A core file is available. Any help is appreciated :-)

UPDATE: sometimes I can't ^C or kill -9 the dd process even
with mpsafenet=0. In that situation, it panics with a trace similar
to the one below, which is from mpsafenet=1.

panic: VOP_STRATEGY failed bp=0xd835acd8 vp=0xc4a1baa0
cpuid = 1
KDB: stack backtrace:
kdb_backtrace(1,c05056b4,1,e7f1b7d0,1) at kdb_backtrace+0x2e
panic(c061782c,d835acd8,c4a1baa0,c4a1baa0,4) at panic+0x12b
bufstrategy(c4a1bb60,d835acd8,e7f1b80c,c471ee63,d835acd8) at bufstrategy+0x7d
bstrategy(d835acd8,c060be84,23c,a00200a6,0) at bstrategy+0x60
nfs_writebp(d835acd8,1,c4369000,e7f1b82c,c471eb73) at nfs_writebp+0xf3
nfs_bwrite(d835acd8,e7f1b904,c471e92b,d835acd8,1dd88000) at nfs_bwrite+0x13
bwrite(d835acd8,1dd88000,0,1dd86000,0) at bwrite+0x5b
nfs_flush(c4a1baa0,1,c4369000,1,e7f1b92c) at nfs_flush+0x78b
nfs_fsync(e7f1b93c) at nfs_fsync+0x1c
VOP_FSYNC_APV(c4735fc0,e7f1b93c) at VOP_FSYNC_APV+0x99
VOP_FSYNC(c4a1baa0,1,c4369000) at VOP_FSYNC+0x2e
bufsync(c4a1bb60,1,c4369000) at bufsync+0x14
bufobj_invalbuf(c4a1bb60,1,c4369000,100,0) at bufobj_invalbuf+0xda
vinvalbuf(c4a1baa0,1,c4369000,100,0) at vinvalbuf+0x1d
nfs_vinvalbuf(c4a1baa0,1,c4369000,1,c04d5738) at nfs_vinvalbuf+0xda
nfs_write(e7f1bbc8) at nfs_write+0x16f
VOP_WRITE_APV(c4735fc0,e7f1bbc8) at VOP_WRITE_APV+0x11e
VOP_WRITE(c4a1baa0,e7f1bcb0,7f0001,c49f5180) at VOP_WRITE+0x34
vn_write(c46d6ca8,e7f1bcb0,c49f5180,0,c4369000) at vn_write+0x1ad
fo_write(c46d6ca8,e7f1bcb0,c49f5180,0,c4369000) at fo_write+0x1d
dofilewrite(c4369000,4,c46d6ca8,e7f1bcb0,,,0) at dofilewrite+0x8e
kern_writev(c4369000,4,e7f1bcb0) at kern_writev+0x41
write(c4369000,e7f1bcf0) at write+0x58
syscall(3b,3b,3b,8076000,10) at syscall+0x2cf
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (4, FreeBSD ELF32, write), eip = 0x880b9813, esp =
0xbfbfeaac, ebp = 0xbfbfead8 ---
Uptime: 4m18s
Dumping 3062 MB (2 chunks)
[...]

(kgdb) bt full
#0  0xc04a8181 in doadump () at /usr/src/sys/kern/kern_shutdown.c:233
No locals.
#1  0xc04a8841 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399
first_buf_printf = 1
#2  0xc04a8bf9 in panic (fmt=0xc061782c VOP_STRATEGY failed bp=%p vp=%p)
at /usr/src/sys/kern/kern_shutdown.c:555
td = (struct thread *) 0xc4369000
bootopt = 260
newpanic = 1
ap = 0xe7f1b7d0 ج5ؠ��Ġ���\004
buf = VOP_STRATEGY failed bp=0xd835acd8 vp=0xc4a1baa0, '\0'
repeats 208 times
#3  0xc0505689 in bufstrategy (bo=0xc4a1bb60, bp=0xd835acd8)
at /usr/src/sys/kern/vfs_bio.c:3690
i = 4
vp = (struct vnode *) 0xc4a1baa0
#4  0xc471ef28 in ?? ()
No symbol table info available.
#5  0xc4a1bb60 in ?? ()
No symbol table info available.
#6  0xd835acd8 in ?? ()
No symbol table info available.
#7  0xe7f1b80c in ?? ()
No symbol table info available.
#8  0xc471ee63 in ?? ()
No symbol table info available.
#9  0xd835acd8 in ?? ()
No symbol table info available.
#10 0xc060be84 in __func__.2 ()
No symbol table info available.
#11 0x023c in ?? ()
No symbol table info available.
#12 0xa00200a6 in ?? ()
No symbol table info available.
#13 0x in ?? ()
No symbol table info available.
#14 0xe7f1b820 in ?? ()
No symbol table info available.
#15 0xc471f2a3 in ?? ()
No symbol table info available.
#16 0xd835acd8 in ?? ()
No symbol table info available.
#17 0x0001 in ?? ()
No symbol table info available.
#18 0xc4369000 in ?? ()
No symbol table info available.
#19 0xe7f1b82c in ?? ()
No symbol table info available.
#20 0xc471eb73 in ?? ()
No symbol table info available.
#21 0xd835acd8 in ?? ()
No symbol table info available.
#22 0xe7f1b904 in ?? ()
No symbol table info available.
#23 0xc471e92b in ?? ()
No symbol table info available.
#24 0xd835acd8 in ?? ()
No symbol table info available.
#25 0x1dd88000 in ?? ()
No symbol table info available.
#26 0x in ?? ()
No symbol table info available.
#27 0x1dd86000 in ?? ()
No symbol table info available.
#28 0x in ?? ()
No symbol table info available.
#29 0xe7f1b858 in ?? ()
No symbol table info available.
#30 0xc049ee97 in _mtx_assert (m=0xd835acd8, what=-1067401596,
file=0x23c Address 0x23c out of bounds, line=-1610481498)
at /usr/src/sys/kern/kern_mutex.c:754
No locals.
Previous frame inner to this frame (corrupt stack?)
(kgdb) l *0xc0505689
0xc0505689 is in bufstrategy (/usr/src/sys/kern/vfs_bio.c:3691).
3686        KASSERT(vp == bo->bo_private, ("Inconsistent vnode bufstrategy"));
3687        KASSERT(vp->v_type != VCHR && vp->v_type != VBLK,
3688            ("Wrong vnode in bufstrategy(bp=%p, vp=%p)", bp, vp));
3689        i = VOP_STRATEGY(vp, bp);
3690        KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
3691    }
3692
3693    void
3694    bufobj_wrefl(struct bufobj *bo)
3695    {


On 3/10/06, Rong-En Fan [EMAIL PROTECTED] wrote:
 Hi

nfsclient process stucks in nfsaio

2006-03-10 Thread Rong-En Fan
Hi,

After upgrading several of our nfs clients from 5.4-RELEASE to 6.0-RELEASE
(some are now 6.1-PRERELEASE as of a week ago), from time to time
we see some processes stuck in nfsaio and unkillable. These processes
generate lots of traffic to the nfs server (writes to nfs, but the nfs server's
disk is not really writing: netstat shows the client sending ~100Mbps, but on
the nfs server iostat does not show ~12.5MB/s). The nfsd on the server side is
either in RUN or in ufs state. The server is running 5.5-PRERELEASE as of
yesterday.
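
The ~12.5MB/s figure is just the wire rate converted to bytes: 100 Mbit/s
divided by 8 bits per byte. A trivial Python sketch of that comparison
follows. The 2 MB/s iostat reading is an invented number for illustration
only, not a measurement from these machines:

```python
def mbps_to_mbytes(mbit_per_s):
    """Convert a network rate in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8.0


def write_shortfall(net_mbit_per_s, disk_mbyte_per_s):
    """Fraction of the incoming NFS write traffic that never shows up
    as disk writes on the server (0.0 = all of it reaches the disk)."""
    expected = mbps_to_mbytes(net_mbit_per_s)
    if expected <= 0:
        return 0.0
    return max(0.0, 1.0 - disk_mbyte_per_s / expected)


if __name__ == "__main__":
    print(mbps_to_mbytes(100))          # 12.5
    # Hypothetical iostat reading of 2 MB/s against ~100 Mbit/s inbound.
    print("%.0f%% of the traffic is not reaching the disk"
          % (100 * write_shortfall(100, 2.0)))
```

This ignores RPC and UDP/IP overhead, so the real payload rate is a bit
below the converted wire rate.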

Client mount options: rw,nosuid,bg,intr,nodev. Both client and server
are running rpc.lockd and rpc.statd. I'm sure it's not related to any
locking problems.

I have another nfs server/client pair, both running 6.0-RELEASE, and I can
easily reproduce this situation on these two boxes, just by running

  dd if=/dev/zero of=/nfs/ooo bs=1m

If I do not add bs=1m, it works fine. On all the boxes I mentioned above,
I did not do anything special to the kernel config, i.e., they are GENERIC w/o
unnecessary devices and w/ firewall.  Basically, I can do anything on these
two boxes (they are not in production). Any suggestions are welcome.

Thanks,
Rong-En Fan


Re: nfsclient process stucks in nfsaio

2006-03-10 Thread Rong-En Fan
Hi,

I forgot to mention that all the clients/servers here run SMP kernels.
After some Googling, I found a post on current@ from 2005/01/12,
"NFS problems, locking up", that is highly related to my situation.
A workaround is to set debug.mpsafenet=0; I just verified that this
indeed works.

Now I'm turning on INVARIANTS and WITNESS to see if there
is any output. However, I'm afraid that I cannot get
serial console access to these machines (and thus no ddb
output :( ).

Thanks,
Rong-En Fan

On 3/10/06, Rong-En Fan [EMAIL PROTECTED] wrote:
 Hi,

 After upgrading several our nfs clients from 5.4-RELEASE to 6.0-RELEASE
 and some are now 6.1-PRERELEASE (a weeks ago). From time to time,
 we saw some processes stuck in nfsaio, and unkillable. These processes
 generate lots of traffic to nfs server (write to nfs, but nfs server's disk 
 does
 not really in write. from netstat, client sends ~100Mbps, on nfs server, 
 iostat
 does not show me ~12.5MB/s). The nfsd on the server side is either in RUN
 or in ufs state. Server is running 5.5-PRELEASE as of yesterday.

 Client mount options: rw,nosuid,bg,intr,nodev. Both client and server
 are running
 rpc.lockd, rpc.statd. I'm sure it's not related to any locking problems.

 I have another set of nfs server/client both running 6.0-RELEASE. And I can
 easily reproduce this situation on these two boxesnes, just by running

   dd if=/dev/zero of=/nfs/ooo bs=1m

 If I do not add bs=1m, it works fine. Of all the boxes I mentioned above,
 I did not do anything special to kernel config, i.e., they are GENERIC w/o
 unnecessary devices and w/ firewal.  Basically, I can do anything on these
 two boxes (they are not in production mode). Any suggestion are welcome.

 Thanks,
 Rong-En Fan



Re: nfsclient process stucks in nfsaio [SOLVED]

2006-03-10 Thread Rong-En Fan
Hi all,

I believe that this behavior is caused by the ``intr'' (-i) option to
mount_nfs(8). As noted by Stephan Uphoff in PR/79700, he does
not recommend using the intr option. After removing the option, the dd
works well.

Sorry for the noise :-)

However, I think some warnings could be added to mount_nfs(8)
about the usage of intr and its consequences, so this won't
happen again.

Regards,
Rong-En Fan
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: RELENG_6: serial console drops back from 115200 to 9600 baud

2006-02-27 Thread Rong-En Fan
On 2/27/06, Ruslan Ermilov [EMAIL PROTECTED] wrote:
 On Sun, Feb 26, 2006 at 08:26:42PM +0100, Dimitry Andric wrote:
  Ian Dowse wrote:
   Okay, but why did 4.x through 5.x through 6.x (these have all been on
   this particular machine) always boot with 115200 until now? :)
 
   They probably used 9600 for the boot blocks, and then switched to
   115200 when /boot/loader started, so you didn't notice. Now the
   settings from the boot blocks get used by /boot/loader.
 
  Ah, but this still means that /boot/loader used to use a hardcoded
  default specified in /etc/make.conf, and now it doesn't honor that anymore.
 
 Have you checked with documentation?

 : comconsole_speed
 :   Defines the speed of the serial console (i386 and amd64 only).
 :   If the previous boot stage indicated that a serial console is
 :   in use then this variable is initialized to the current speed
 :   of the console serial port.  Otherwise it is set to 9600 unless
 :   this was overridden using the BOOT_COMCONSOLE_SPEED variable
 :   when loader was compiled.  Changes to the comconsole_speed
 :   variable take effect immediately.

Which way is preferred: setting comconsole_speed, -S in
boot.config, or using a hardcoded BOOT_COMCONSOLE_SPEED in make.conf?
If the preferred way is now -S or comconsole_speed in loader.conf,
please update Handbook section 22.6.5.1, "Setting a Faster Serial
Port Speed", accordingly.
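
For reference, the three alternatives asked about might look roughly like this. It is only a sketch: 115200 is the speed from the subject line, the file locations are the usual FreeBSD ones, and which of the three is preferred is exactly the open question here.

```
# Option 1: /boot/loader.conf
comconsole_speed="115200"

# Option 2: /boot.config (read by the boot blocks)
-S115200

# Option 3: /etc/make.conf (takes effect when the boot code is rebuilt)
BOOT_COMCONSOLE_SPEED=115200
```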

Thanks,
Rong-En Fan


Re: IBM x236 with Serveraid

2006-02-21 Thread Rong-En Fan
On 2/21/06, Balgansuren Batsukh [EMAIL PROTECTED] wrote:
 Hello,

 We installed FreeBSD-6.0-RELEASE and cvsuped to STABLE.

 We configured machine as IPFW+NAT and installed SQUID+SQUIDGUARD+DANSGUARDIAN.

 It works well under light load, but on heavy load suddenly no response whole 
 machine.

 We guess FreeBSD-6.0 doesn't support ServeRaid-7k/7t on IBM eSeries x236 
 server.

We have two IBM eServer xSeries 236 (Model 8841) machines running 5.4-STABLE.
The ServeRAID 7k works well:

ips0: Adaptec ServeRAID Adapter mem 0xcfffd000-0xcfffdfff irq 38 at
device 14.0 on pci3
ips0: logical drives: 1
ips0: Logical Drive 0: RAID1 sectors: 860239872, state OK
ipsd0: Logical Drive on ips0
ipsd0: Logical Drive  (420039MB)

rafan.

 How can we to solve problem? Is there anyway to get work/support 
 ServeRAID-7k/7t controller?

 I check most of FreeBSD mailing list archive and search on google. Didn't 
 find any good answer of above question.


 Regards,
 Balgaa


6.0/amd64 boot hang if apic enabled on IBM x336

2005-12-19 Thread Rong-En Fan
Hi all,

We have an IBM xSeries 336 with two Pentium 4 (EM64T) 3.2GHz CPUs and 2GB of memory.
When we boot it with 6.0-RELEASE/amd64, it hangs after acd0 is shown.
However, if apic is disabled, it boots. 5.4-RELEASE/amd64 works
fine. A 7.0-CURRENT snapshot (SNAP009, Nov/2005) shows the same behavior.

To get it to boot, I set the following in loader.conf:

hint.atkbd.0.disabled=0x4 # otherwise, I lose my keyboard
hint.apic.0.disabled=1

The verbose dmesg with/without apic, and the asl dump (with bios version 1.12,
the latest one found on IBM site) are available at:

http://www.rafan.org/FreeBSD/x336/

BTW, the SCSI timeout message in dmesg-noapic.txt is a bit weird. If I boot
without verbose messages it does not show up and everything works fine. With
verbose booting, those messages appear and it takes a long time to reach
multiuser mode.

Regards,
Rong-En Fan


panic on RELENG_5 on em(4)

2005-10-27 Thread Rong-En Fan
Hi,

I'm running RELENG_5 from around Oct 12 and got a panic related to em(4).
After some searching, I found a similar panic reported on -current
(that system is also RELENG_5) in May, but with no further replies.
The kernel is similar to GENERIC with IPFW, and HTT is
enabled in loader.conf. The box is a dual Xeon with HTT and an SMP
kernel, so there are 4 logical CPUs. For some reason,
I did not have DDB compiled in. The kgdb output is enclosed.
If anyone is interested in helping debug this, I can send
more information on request.

Thanks,
Rong-En Fan

(kgdb and console):
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
fault virtual address   = 0xbfc38018
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc05fb49f
stack pointer   = 0x10:0xe6448bc0
frame pointer   = 0x10:0xe6448c24
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 77 (irq16: em0)
trap number = 12
panic: page fault
cpuid = 2

#0  doadump () at pcpu.h:160
No locals.
#1  0xc04c1268 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:412
first_buf_printf = 1
#2  0xc04c1616 in panic (fmt=0xc062dcbe %s)
at /usr/src/sys/kern/kern_shutdown.c:568
td = (struct thread *) 0xc313bd80
bootopt = 260
newpanic = 0
ap = 0xc313bd80 L\023\020\t
buf = page fault, '\0' repeats 245 times
#3  0xc06121dd in trap_fatal (frame=0xe6448b80, eva=0)
at /usr/src/sys/i386/i386/trap.c:817
code = 16
type = 12
ss = 16
esp = 0
softseg = {ssd_base = 0, ssd_limit = 1048575, ssd_type = 27,
  ssd_dpl = 0, ssd_p = 1, ssd_xx = 1, ssd_xx1 = 0, ssd_def32 = 1, ssd_gran = 1}
#4  0xc0611ed4 in trap_pfault (frame=0xe6448b80, usermode=0, eva=3217260568)
at /usr/src/sys/i386/i386/trap.c:735
va = 3217260544
vm = (struct vmspace *) 0x0
map = 0xc0673280
rv = 1
ftype = 1 '\001'
td = (struct thread *) 0xc313bd80
p = (struct proc *) 0xc313a54c
#5  0xc0611ab9 in trap (frame=
  {tf_fs = -1022033896, tf_es = 16, tf_ds = -431751152, tf_edi = -10217512\
96, tf_esi = -1017843008, tf_ebp = -431715292, tf_isp = -431715412, tf_ebx = -\
1008379904, tf_edx = 0, tf_ecx = 234907650, tf_eax = 57350, tf_trapno = 12, tf\
_err = 0, tf_eip = -1067469665, tf_cs = 8, tf_eflags = 66055, tf_esp = -431715\
256, tf_ss = -1021353488}) at /usr/src/sys/i386/i386/trap.c:425
td = (struct thread *) 0xc313bd80
p = (struct proc *) 0xc313a54c
sticks = 3863251848
i = 0
ucode = 0
type = 12
code = 0
eva = 3217260568
#6  0xc05fdc4a in calltrap () at /usr/src/sys/i386/i386/exception.s:140
No locals.
#7  0xc3150018 in ?? ()
No symbol table info available.
#8  0x0010 in ?? ()
No symbol table info available.
#9  0xe6440010 in ?? ()
No symbol table info available.
#10 0xc3195000 in ?? ()
No symbol table info available.
#11 0xc354f2c0 in ?? ()
No symbol table info available.
#12 0xe6448c24 in ?? ()
No symbol table info available.
#13 0xe6448bac in ?? ()
No symbol table info available.
#14 0xc3e55800 in ?? ()
No symbol table info available.
#15 0x in ?? ()
No symbol table info available.
#16 0x0e006802 in ?? ()
No symbol table info available.
#17 0xe006 in ?? ()
No symbol table info available.
#18 0x000c in ?? ()
No symbol table info available.
#19 0x in ?? ()
No symbol table info available.
#20 0xc05fb49f in bus_dmamap_load (dmat=0xc3353400, map=0x0, buf=0xe006802,
buflen=2046, callback=0xc045f8e8 em_dmamap_cb, callback_arg=0xe6448c48,
flags=0) at pmap.h:200
lastaddr = 0
error = 0
nsegs = 0
195 vm_paddr_t pa;
196
197 if ((pa = PTD[va >> PDRSHIFT]) & PG_PS) {
198 pa = (pa & ~(NBPDR - 1)) | (va & (NBPDR - 1));
199 } else {
200 pa = *vtopte(va);
201 pa = (pa & PG_FRAME) | (va & PAGE_MASK);
202 }
203 return pa;
204 }
#21 0xc04602f1 in em_get_buf (i=88, adapter=0xc3195000, nmp=0x0)
at /usr/src/sys/dev/em/if_em.c:2531
mp = (struct mbuf *) 0xc3e55800
rx_buffer = (struct em_buffer *) 0xc354f2c0
ifp = (struct ifnet *) 0xc354f2c0
paddr = 3272850816
error = -1021751296
2526
2527 /*
2528  * Using memory from the mbuf cluster pool, invoke the
2529  * bus_dma machinery to arrange the memory mapping.
2530  */
2531 error = bus_dmamap_load(adapter->rxtag, rx_buffer->map,
2532     mtod(mp, void *), mp->m_len,
2533     em_dmamap_cb, &paddr, 0);
2534 if (error) {
2535 m_free(mp);
#22 0xc0460b6e in em_process_receive_interrupts (adapter

Re: got a panic on 5.4-STABLE

2005-09-21 Thread Rong-En Fan
On 8/28/05, Rong-En Fan [EMAIL PROTECTED] wrote:
 And I also looked at the dump file. It looks like, when calling
 m_copym(), m->m_len is 20, off is 1500, and m->m_next is NULL.
 After the first iteration, m becomes NULL...
 
 #20 0xc051d62f in m_copym (m=0x0, off0=1500, len=1480, wait=1)
 at /usr/src/sys/kern/uipc_mbuf.c:389

I have turned off mpsafenet and got another panic yesterday.
The panicstr is:
kmem_malloc(4096): kmem_map too small: 335544320 total allocated

The backtrace is here: (sorry, no console log, it was flushed by the fsck
messages)
(kgdb) bt full
#0  doadump () at pcpu.h:160
No locals.
#1  0xc04e0385 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410
first_buf_printf = 1
#2  0xc04e0733 in panic (
fmt=0xc0677392 kmem_malloc(%ld): kmem_map too small: %ld total allocated)
at /usr/src/sys/kern/kern_shutdown.c:566
td = (struct thread *) 0xc1f59c00
bootopt = 260
newpanic = 0
ap = 0xc1f59c00 L�:� �:�
buf = kmem_malloc(4096): kmem_map too small: 335544320 total
allocated, '\0' repeats 191 times
#3  0xc05fb1c4 in kmem_malloc (map=0xc0c590c0, size=4096, flags=1026)
at /usr/src/sys/vm/vm_kern.c:299
offset = 0
i = 0
entry = 0xd603f0c0
addr = 3253096448
m = 0xc0c64b48
pflags = -704384832
#4  0xc060da82 in page_alloc (zone=0xc0c52840, bytes=0, pflag=0x0, wait=0)
at /usr/src/sys/vm/uma_core.c:957
p = (void *) 0x0
#5  0xc060d4df in slab_zalloc (zone=0xc0c52840, wait=1026)
at /usr/src/sys/vm/uma_core.c:827
slabref = 0x0
slab = 0x0
flags = 2 '\002'
i = -1060746424
#6  0xc060f0ec in uma_zone_slab (zone=0xc0c52840, flags=1282)
at /usr/src/sys/vm/uma_core.c:1994
slab = 0x0
keg = 0xc0c64b40
#7  0xc060f352 in uma_zalloc_bucket (zone=0xc0c52840, flags=1282)
at /usr/src/sys/vm/uma_core.c:2103
bucket = 0xc3e6f624
slab = 0xc0c52840
saved = 0
max = 128
origflags = 1282
#8  0xc060ef3a in uma_zalloc_arg (zone=0xc0c52840, udata=0x0, flags=1282)
at /usr/src/sys/vm/uma_core.c:1911
item = (void *) 0xe4b39730
cache = 0xc0c52878
bucket = 0x0
cpu = 0
#9  0xc04d3f00 in malloc (size=192, type=0xc069d0e0, flags=1282) at uma.h:276
indx = 4
va = 0x800 Address 0x800 out of bounds
zone = 0x0
keg = 0xc0c64b40
#10 0xc05d7e67 in softdep_setup_freeblocks (ip=0xc439a604,
length=Unhandled dwarf expression opcode 0x93
)
at /usr/src/sys/ufs/ffs/ffs_softdep.c:1963
freeblks = (struct freeblks *) 0xc2330900
inodedep = (struct inodedep *) 0x1
adp = (struct allocdirect *) 0x0
vp = (struct vnode *) 0x0
bp = (struct buf *) 0x2
fs = (struct fs *) 0xc229f800
extblocks = -3098686512764636259
datablocks = Unhandled dwarf expression opcode 0x93

I read FreeBSD FAQ 5.10 and 5.11, which describe this kind of panic.
This machine has 1GB of memory, so I changed

kern.ipc.nmbclusters: 25600 -> 32768
vm.kmem_size_max: 335544320 -> 419430400
vm.kmem_size_scale: 3 -> 2

Now vm.kmem_size is 419430400 (it was 335544320 before).
I'm wondering whether these 3 panics are related to insufficient kmem.
Is there any way to monitor kmem usage?
BTW, after the first panic I turned off SACK, but it did not help. After the
second one, I turned off mpsafenet.
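
The before and after values of vm.kmem_size are consistent with a simple model in which the kernel takes physical memory divided by vm.kmem_size_scale and caps the result at vm.kmem_size_max. The sketch below only illustrates that arithmetic and is not the kernel's actual code: the helper name `kmem_size` is made up, physical memory is assumed to be exactly 1GB, and any other limits the kernel applies are ignored.

```shell
#!/bin/sh
# kmem_size PHYSMEM SCALE MAX
# Hypothetical helper: simplified model of how vm.kmem_size is derived.
kmem_size() {
    size=$(( $1 / $2 ))          # physical memory divided by the scale
    if [ "$size" -gt "$3" ]; then
        size=$3                  # capped by vm.kmem_size_max
    fi
    echo "$size"
}

kmem_size 1073741824 3 335544320   # old settings
kmem_size 1073741824 2 419430400   # new settings
```

Under this model both settings hit the cap, which matches the two reported vm.kmem_size values. For watching actual usage, `vmstat -m` (kernel malloc statistics) and `vmstat -z` (UMA zone statistics) may be a reasonable starting point.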

Regards,
Rong-En Fan

got a panic on 5.4-STABLE

2005-08-28 Thread Rong-En Fan
Hi,

I got a panic on an i386 5.4-STABLE (around Aug 28) with SMP enabled.
It has 2 physical CPUs with HTT enabled (so 4 logical CPUs in total).
It is an NFS-only server with an external SCSI RAID attached.

The console log, kgdb output and sysctl.conf are below.
I'll keep this core, and if someone is interested, I can send any other
information requested.

Regards,
Rong-En Fan

[sysctl.conf]
net.link.ether.inet.log_arp_wrong_iface=0
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536
net.inet.udp.recvspace=65536
kern.ipc.somaxconn=4096
kern.maxfiles=65535
kern.ipc.shmmax=104857600
kern.ipc.shmall=25600
net.inet.ip.random_id=1
kern.cam.da.retry_count=20
kern.cam.da.default_timeout=300
kern.maxvnodes=10
vfs.read_max=16

[console log]
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xc
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc051d62f
stack pointer   = 0x10:0xe7088a64
frame pointer   = 0x10:0xe7088a98
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 523 (nfsd)
trap number = 12
panic: page fault
cpuid = 0
boot() called on cpu#0
Uptime: 2h59m59s
Dumping 1023 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304
320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576
592 608 624 640 656 672 688 704 720 736 752 768 784 800 816 832 848
864 880 896 912 928 944 960 976 992 1008
Dump complete
Automatic reboot in 1 seconds - press a key on the console to abort
Rebooting...
cpu_reset called on cpu#0
cpu_reset: Stopping other CPUs

[kgdb output]
Unread portion of the kernel message buffer:
[unprintable bytes omitted]

#0  doadump () at pcpu.h:160
160 __asm __volatile(movl %%fs:0,%0 : =r (td));
(kgdb) up
#1  0xc04e0385 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410
410 doadump();
(kgdb)
#2  0xc04e0733 in panic (fmt=0xc065d085 %s) at
/usr/src/sys/kern/kern_shutdown.c:566
566 boot(bootopt);
(kgdb)
#3  0xc064503d in trap_fatal (frame=0xe7088a24, eva=0) at
/usr/src/sys/i386/i386/trap.c:817
817 panic(%s, trap_msg[type]);
(kgdb)
#4  0xc0644d34 in trap_pfault (frame=0xe7088a24, usermode=0, eva=12)
at /usr/src/sys/i386/i386/trap.c:735
735 trap_fatal(frame, eva);
(kgdb)
#5  0xc0644919 in trap (frame=
  {tf_fs = -1038614504, tf_es = -418906096, tf_ds = 16, tf_edi =
1480, tf_esi = 0, tf_ebp = -418870632, tf_isp = -418870704, tf_ebx =
-1039299264, tf_edx = 20, tf_ecx = 1500, tf_eax = -1039299328,
tf_trapno = 12, tf_err = 0, tf_eip = -1068378577, tf_cs = 8, tf_eflags
= 66050, tf_esp = -1039299328, tf_ss = 1480}) at
/usr/src/sys/i386/i386/trap.c:425
425 (void) trap_pfault(frame, FALSE, eva);
(kgdb)
#6  0xc063122a in calltrap () at /usr/src/sys/i386/i386/exception.s:140
140 calltrap
Current language:  auto; currently asm
(kgdb)
#7  0xc2180018 in ?? ()
(kgdb)
#8  0xe7080010 in ?? ()
(kgdb)
#9  0x0010 in ?? ()
(kgdb)
#10 0x05c8 in ?? ()
(kgdb)
#11 0x in ?? ()
(kgdb)
#12 0xe7088a98 in ?? ()
(kgdb)
#13 0xe7088a50 in ?? ()
(kgdb)
#14 0xc20d8d40 in ?? ()
(kgdb)
#15 0x0014 in ?? ()
(kgdb)
#16 0x05dc in ?? ()
(kgdb)
#17 0xc20d8d00 in ?? ()
(kgdb)
#18 0x000c in ?? ()
(kgdb)
#19 0x in ?? ()
(kgdb)
#20 0xc051d62f in m_copym (m=0x0, off0=1500, len=1480, wait=1)
at /usr/src/sys/kern/uipc_mbuf.c:389
389 m = m-m_next;
Current language:  auto; currently c
(kgdb)
#21 0xc0576835 in ip_fragment (ip=0xc20d8de4, m_frag=0xe7088b48,
mtu=-1039299264,
if_hwassist_flags=6, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967
967 m-m_next = m_copy(m0, off, len);
(kgdb)
#22 0xc05764a6 in ip_output (m=0xc20d8d00, opt=0xc20d8de4,
ro=0xe7088b14, flags=0, imo=0x0,
inp=0xc24317bc) at /usr/src/sys/netinet/ip_output.c:796
796 error = ip_fragment(ip, m, ifp-if_mtu,
ifp-if_hwassist, sw_csum);
(kgdb)
#23 0xc0589954 in udp_output (inp=0xc24317bc, m=0xc20d8d00,
addr=0xc36fe6d0, control=0x0,
td=0xc243ea80) at /usr/src/sys/netinet/udp_usrreq.c:870
870 error = ip_output(m, inp-inp_options, NULL, ipflags,
(kgdb)
#24 0xc058a5b2 in udp_send (so=0x0, flags=0, m=0x0, addr=0x0,
control=0x0, td=0x0)
at /usr/src/sys/netinet/udp_usrreq.c:1047
1047return udp_output(inp, m, addr, control, td);
(kgdb)
#25 0xc0520aa6 in sosend (so=0xc2443144, addr=0xc36fe6d0, uio=0x0,
top=0xc5b87b00,
control=0x0, flags=0, td=0xc243ea80) at /usr/src/sys/kern/uipc_socket.c:835
835 error = (*so-so_proto-pr_usrreqs-pru_send)(so,
(kgdb)
#26 0xc05c09f4 in nfsrv_send (so=0xc2443144, nam=0xc36fe6d0, top=0x0)
at pcpu.h:157
157 {
(kgdb)
#27 0xc05c3b5a in nfssvc_nfsd (td=0x0) at
/usr

Re: got a panic on 5.4-STABLE

2005-08-28 Thread Rong-En Fan
On 8/28/05, Rong-En Fan [EMAIL PROTECTED] wrote:
 Hi,
 
 I got a panic on an i386 5.4-STABLE around Aug 28 with SMP enabled.
 It has 2 physical CPU with HTT enabled (so, total 4 cpus).
 This is a NFS server only with external scsi raid attached.
 
 The console log, kgdb output and sysctl.conf are as below.
 I'll keep this core and if someone is interested, I can send any other
 information requested.

I have the following in make.conf:
CPUTYPE?= p4
CFLAGS= -O -pipe
COPTFLAGS= -O -pipe

The difference between GENERIC and my kernel is here:
http://www.rafan.org/FreeBSD/panic/m_copym/kernel-diff-against-GENERIC.txt

I also looked at the dump file. It looks like, when calling
m_copym(), m->m_len is 20, off is 1500, and m->m_next is NULL.
After the first iteration, m becomes NULL...
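
The walk described above can be replayed outside the kernel. The sketch below is illustrative only: `walk_chain` is a made-up helper that takes the offset followed by the m_len of each mbuf in the chain, and it mirrors the offset-skipping loop of m_copym() rather than reproducing kernel code.

```shell
#!/bin/sh
# walk_chain OFF LEN1 [LEN2 ...]
# Hypothetical helper: skip OFF bytes along a chain of mbuf lengths,
# the way the first loop in m_copym() does.
walk_chain() {
    off=$1
    shift
    while [ "$off" -gt 0 ]; do
        if [ "$#" -eq 0 ]; then
            # No mbuf left: the kernel would dereference a NULL m here.
            echo "ran off end of chain with off=$off"
            return 0
        fi
        if [ "$off" -lt "$1" ]; then
            break                # offset falls inside this mbuf
        fi
        off=$(( off - $1 ))      # skip this mbuf entirely
        shift                    # m = m->m_next
    done
    echo "stopped inside chain with off=$off"
}

walk_chain 1500 20
```

With a single 20-byte mbuf and off=1500, the loop runs out of chain with 1480 bytes still to skip. If the i386 mbuf header lays out mh_next, mh_nextpkt, mh_data and mh_len as four consecutive 4-byte fields (an assumption based on the field order in the dump), m_len sits at offset 0xc, which is the fault address in the console log.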

#20 0xc051d62f in m_copym (m=0x0, off0=1500, len=1480, wait=1)
at /usr/src/sys/kern/uipc_mbuf.c:389
389 m = m->m_next;
(kgdb) l
384 while (off > 0) {
385 KASSERT(m != NULL, ("m_copym, offset > size of mbuf chain"));
386 if (off < m->m_len)
387 break;
388 off -= m->m_len;
389 m = m->m_next;
390 }
391 np = &top;
392 top = 0;
393 while (len > 0) {
(kgdb) p off
$15 = 1480
(kgdb) up
#21 0xc0576835 in ip_fragment (ip=0xc20d8de4, m_frag=0xe7088b48,
mtu=-1039299264,
if_hwassist_flags=6, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967
967 m-m_next = m_copy(m0, off, len);
(kgdb) p off
$9 = 1500
(kgdb) p len
$10 = 1480
(kgdb) p m0
$11 = (struct mbuf *) 0xc20d8d00
(kgdb) p *m0
$12 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc20d8d40 "E", mh_len = 20,
    mh_flags = 2050, mh_type = 2}, M_dat = {MH = {MH_pkthdr = {rcvif = 0x0, len = 8348,
        header = 0xc5b25010, csum_flags = 0, csum_data = 6, tags = {slh_first = 0x0}},
      MH_dat = {MH_ext = {
          ext_buf = 0xc33ae000 [stale cluster contents omitted], ext_free = 0,
          ext_args = 0x0, ext_size = 2048, ref_cnt = 0xdc050045,
          ext_type = 549021963},
        MH_databuf = [binary data omitted]}},
    M_databuf = [binary data omitted]}}
(kgdb) l
962 len = ip->ip_len - off;
963 m->m_flags |= M_LASTFRAG;
964 } else
965 mhip->ip_off |= IP_MF;
966 mhip->ip_len = htons((u_short)(len + mhlen));
967 m->m_next = m_copy(m0, off, len);
968 if (m->m_next == NULL) { /* copy failed */
969 m_free(m);
970 error = ENOBUFS; /* ??? */
971 ipstat.ips_odropped++;
(kgdb) up
#22 0xc05764a6 in ip_output (m=0xc20d8d00, opt=0xc20d8de4,
ro=0xe7088b14, flags=0, imo=0x0,
inp=0xc24317bc) at /usr/src/sys/netinet/ip_output.c:796
796 error = ip_fragment(ip, m, ifp-if_mtu,
ifp

panic: page fault while in kernel mode

2005-08-18 Thread Rong-En Fan
Hi all,

This is a 5.4-STABLE system running on i386, built around Aug 10, 4am UTC.
While I was running
cat /var/log/maillog | ./log.pl
to do some log analysis, the system panicked.
Here are the console log and kgdb output. I'll keep this dump
for some time, so if anyone wants more information, feel free to contact
me :-)

Regards,
Rong-En Fan

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1c
fault code  = supervisor write, page not present
instruction pointer = 0x8:0xc05123aa
stack pointer   = 0x10:0xea39b9cc
frame pointer   = 0x10:0xea39b9ec
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 63117 (cat)
trap number = 12
panic: page fault
cpuid = 0
boot() called on cpu#0
Uptime: 8d14h34m14s

(kgdb) bt
#0  doadump () at pcpu.h:160
#1  0xc04c141d in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410
#2  0xc04c17cb in panic (fmt=0xc062d55e %s)
at /usr/src/sys/kern/kern_shutdown.c:566
#3  0xc0611a8d in trap_fatal (frame=0xea39b98c, eva=0)
at /usr/src/sys/i386/i386/trap.c:817
#4  0xc0611784 in trap_pfault (frame=0xea39b98c, usermode=0, eva=28)
at /usr/src/sys/i386/i386/trap.c:735
#5  0xc0611369 in trap (frame=
  {tf_fs = 24, tf_es = -899547120, tf_ds = -365363184, tf_edi =
-684394740, tf_esi = -684394740, tf_ebp = -365315604, tf_isp =
-365315656, tf_ebx = -684394740, tf_edx = 0, tf_ecx = -899544064,
tf_eax = 4, tf_trapno = 12, tf_err = 2, tf_eip = -1068424278, tf_cs =
8, tf_eflags = 66198, tf_esp = 1049856, tf_ss = 33554464}) at
/usr/src/sys/i386/i386/trap.c:425
#6  0xc05fd51a in calltrap () at /usr/src/sys/i386/i386/exception.s:140
#7  0x0018 in ?? ()
#8  0xca620010 in ?? ()
#9  0xea390010 in ?? ()
#10 0xd734f70c in ?? ()
#11 0xd734f70c in ?? ()
#12 0xea39b9ec in ?? ()
#13 0xea39b9b8 in ?? ()
#14 0xd734f70c in ?? ()
#15 0x in ?? ()
#16 0xca620c00 in ?? ()
#17 0x0004 in ?? ()
#18 0x000c in ?? ()
#19 0x0002 in ?? ()
#20 0xc05123aa in vfs_vmio_release (bp=0xd734f70c) at atomic.h:365
#21 0xc0512db9 in getnewbuf (slpflag=0, slptimeo=0, size=16384, maxsize=16384)
at /usr/src/sys/kern/vfs_bio.c:1885
#22 0xc0514619 in getblk (vp=0xc4b96840, blkno=1019, size=16384, slpflag=0, 
slptimeo=0, flags=0) at /usr/src/sys/kern/vfs_bio.c:2585
#23 0xc0519810 in cluster_read (vp=0xc4b96840, filesize=25188657, lblkno=1019, 
size=16384, cred=0x0, totread=4096, seqcount=127, bpp=0x0)
at /usr/src/sys/kern/vfs_cluster.c:123
#24 0xc05af374 in ffs_read (ap=0x0) at /usr/src/sys/ufs/ffs/ffs_vnops.c:462
#25 0xc05324d2 in vn_read (fp=0xcab176a4, uio=0xea39bcb0, 
active_cred=0xc794d800, flags=0, td=0xca620c00) at vnode_if.h:398
#26 0xc04e79e0 in dofileread (td=0xca620c00, fd=0, fp=0xcab176a4, 
auio=0xea39bcb0, offset=Unhandled dwarf expression opcode 0x93
) at file.h:233
#27 0xc04e7809 in kern_readv (td=0xca620c00, fd=3, auio=0x0)
at /usr/src/sys/kern/sys_generic.c:191
#28 0xc04e76df in read (td=0x0, uap=0x0) at /usr/src/sys/kern/sys_generic.c:115
#29 0xc0611e6a in syscall (frame=
  {tf_fs = 47, tf_es = 47, tf_ds = -1078001617, tf_edi = 1, tf_esi
= 4096, tf_ebp = -1077941336, tf_isp = -365314716, tf_ebx = 0, tf_edx
= 134541312, tf_ecx = 1, tf_eax = 3, tf_trapno = 0, tf_err = 2, tf_eip
= 671949351, tf_cs = 31, tf_eflags = 582, tf_esp = -1077941476, tf_ss
= 47})
at /usr/src/sys/i386/i386/trap.c:1009
#30 0xc05fd56f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:201
#31 0x002f in ?? ()
#32 0x002f in ?? ()
#33 0xbfbf002f in ?? ()
#34 0x0001 in ?? ()
#35 0x1000 in ?? ()
#36 0xbfbfeba8 in ?? ()
#37 0xea39bd64 in ?? ()
#38 0x in ?? ()
#39 0x0804f000 in ?? ()
#40 0x0001 in ?? ()
#41 0x0003 in ?? ()
#42 0x in ?? ()
#43 0x0002 in ?? ()
#44 0x280d2227 in ?? ()
#45 0x001f in ?? ()
#46 0x0246 in ?? ()
#47 0xbfbfeb1c in ?? ()
#48 0x002f in ?? ()
#49 0x in ?? ()
#50 0x in ?? ()
#51 0x in ?? ()
#52 0x in ?? ()
#53 0x0eb8f000 in ?? ()
#54 0xca615e20 in ?? ()
#55 0xca620c00 in ?? ()
#56 0xea39bb30 in ?? ()
#57 0xea39bb18 in ?? ()
#58 0xc3097900 in ?? ()
#59 0xc04d46f8 in sched_switch (td=0x1000, newtd=0x0, flags=Cannot
access memory at address 0xbfbfebb8
)
at /usr/src/sys/kern/sched_4bsd.c:881
Previous frame inner to this frame (corrupt stack?)
(kgdb) up
#20 0xc05123aa in vfs_vmio_release (bp=0xd734f70c) at atomic.h:365
365 atomic.h: No such file or directory.
in atomic.h
Current language:  auto; currently c
(kgdb)
#21 0xc0512db9 in getnewbuf (slpflag=0, slptimeo=0, size=16384, maxsize=16384)
at /usr/src/sys/kern/vfs_bio.c:1885
1885vfs_vmio_release(bp);
(kgdb)
#22 0xc0514619 in getblk (vp=0xc4b96840, blkno=1019, size=16384, slpflag=0,
slptimeo=0, flags=0) at /usr/src/sys/kern

Re: panic: sbflush_locked on 5.4-p5/i386

2005-08-08 Thread Rong-En Fan
Hi,

After upgrading to 5-STABLE (from about Aug 6), it works very well.
With mpsafenet=1, it has run for more than one day without a
panic. With 5.4-p5, it would panic within half a day or so. This bug seems
to have been fixed after 5.4 was released. I'll keep watching this machine and
will let you know if it still has similar panics ;-)

Regards,
Rong-En Fan

On 7/29/05, Robert Watson [EMAIL PROTECTED] wrote:
 On Mon, 25 Jul 2005, Rong-En Fan wrote:
  I have a 5.4-p5 running on i386. Got a panic: panic: sbflush_locked: cc
  0 || mb 0xc33bf000 || mbcnt 4294967040 It is an web server running
  Apache and Postfix as a backup MX. I'm using gmirror on all partitions
  and thus cannot get a dump (swap is on gmirror). Some ddb outputs are
  below.



Re: panic: sbflush_locked on 5.4-p5/i386

2005-08-01 Thread Rong-En Fan
On 8/1/05, Alexander S. Usov [EMAIL PROTECTED] wrote:
 In just 2 days of waiting I got it, however it looks that it has fired in a
 bit different place.
 
 (kgdb) bt
 #0  doadump () at pcpu.h:159
 #1  0xc0513899 in boot (howto=260) at ../../../kern/kern_shutdown.c:410
 #2  0xc0513ede in panic (fmt=0xc06ac87f sbdrop)
 at ../../../kern/kern_shutdown.c:566
 #3  0xc05556f4 in sbdrop_locked (sb=0xc2285ad8, len=112)
 at ../../../kern/uipc_socket2.c:1149
 #4  0xc05b27f2 in tcp_input (m=0xc1e31800, off0=152)
 at ../../../netinet/tcp_input.c:2209
 #5  0xc05a9b13 in ip_input (m=0xc1e31800) at ../../../netinet/ip_input.c:776
 #6  0xc059215e in netisr_processqueue (ni=0xc070b0f8)
 at ../../../net/netisr.c:233
 #7  0xc059241d in swi_net (dummy=0x0) at ../../../net/netisr.c:346
 #8  0xc04fb9a1 in ithread_loop (arg=0xc1979500)
 at ../../../kern/kern_intr.c:547
 #9  0xc04fa9dc in fork_exit (callout=0xc04fb8ea ithread_loop, arg=0x0,
 frame=0x0)at ../../../kern/kern_fork.c:791
 #10 0xc0656a8c in fork_trampoline () at ../../../i386/i386/exception.s:209

I also got a panic: sbdrop, but via a slightly different path:
http://www.rafan.org/FreeBSD/5.4-so/panic-sbdrop
(but no dump)

And another one with trap 12:
http://www.rafan.org/FreeBSD/5.4-so/panic-trap12
I have a dump for this, but kgdb can't read it :-(
It says "cannot read PTD".

This machine is our main www server, so I have to keep it as stable
as possible. I might be able to switch mpsafenet on at night.

By the way, I have accf_http(4) and accf_data(4) compiled in. But
with or without them, I can easily get a panic with mpsafenet in less
than half a day.



 
 I am going to keep it around for some time, so I can easisy do a full bt or
 variables.
 
 
 Corresponding dmesg  sysctl output can be found at
 https://kvip88.kvi.nl/~usov
 
 
 --
 Best regards,
   Alexander.
 


Re: panic: sbflush_locked on 5.4-p5/i386

2005-07-29 Thread Rong-En Fan
On 7/29/05, Robert Watson [EMAIL PROTECTED] wrote:
 On Mon, 25 Jul 2005, Rong-En Fan wrote:
  I have a 5.4-p5 running on i386. Got a panic: panic: sbflush_locked: cc
  0 || mb 0xc33bf000 || mbcnt 4294967040 It is an web server running
  Apache and Postfix as a backup MX. I'm using gmirror on all partitions
  and thus cannot get a dump (swap is on gmirror). Some ddb outputs are
  below.
 
 Is this system an SMP and/or HTT system?

SMP (2 physical CPUs) with HTT enabled, so there are 4 logical CPUs in total.

 If this problem is reproduceable, could I ask you to capture the following
 serial console output from DDB:

will do.

 Would it be possible to add an extra ATA disk to use for swap and
 capturing a core dump?

it's a bit hard for me to add this :-(


panic: sbflush_locked on 5.4-p5/i386

2005-07-24 Thread Rong-En Fan
hello,

I have a 5.4-p5 system running on i386. I got a panic:
panic: sbflush_locked: cc 0 || mb 0xc33bf000 || mbcnt 4294967040
It is a web server running Apache, plus Postfix as a backup MX.
I'm using gmirror on all partitions and thus cannot get a dump (swap
is on gmirror). Some ddb output is below.
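
One observation about the number in the panic string: 4294967040 is just below 2^32, which is what a small negative count looks like when printed as an unsigned 32-bit value. The sketch below only does that arithmetic. Reading it as an underflowed counter is an interpretation, not something confirmed in this thread, and `as_signed32` is a made-up helper name.

```shell
#!/bin/sh
# as_signed32 VALUE
# Hypothetical helper: reinterpret an unsigned 32-bit value as signed.
as_signed32() {
    if [ "$1" -ge 2147483648 ]; then
        echo $(( $1 - 4294967296 ))   # subtract 2^32 for the high half
    else
        echo "$1"
    fi
}

as_signed32 4294967040
```

Read this way, mbcnt is 256 below zero (0xffffff00 in hex). 256 is also the usual mbuf size (MSIZE) on FreeBSD, which would fit an accounting error of exactly one mbuf, but that is speculation.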

Google told me that
http://lists.freebsd.org/pipermail/freebsd-current/2004-December/044535.html
looks related. But the code path is different. Note that the patch in that mail
is already in 5.4.

If needed, I can provide kernel conf. I also tuned following sysctls:
vfs.hirunningspace=2097152
kern.ipc.somaxconn=4096
kern.maxfiles=3
kern.maxfilesperproc=3
net.inet.ip.random_id=1
machdep.hyperthreading_allowed=1

The DDB messages go here:
cpuid = 3
KDB: enter: panic
[thread pid 61 tid 100061 ]
Stopped at  kdb_enter+0x2b: nop
db wh
Tracing pid 61 tid 100061 td 0xc311e180
kdb_enter(c05f3bc6) at kdb_enter+0x2b
panic(c05f6f09,0,c33bf000,ff00,c3a1970c) at panic+0x127
sbflush_locked(c3a1970c,c3a19654,e74aeba4,c04e4cb4,c3a1970c) at
sbflush_locked+0x6f
sbrelease_locked(c3a1970c,c3a19654) at sbrelease_locked+0xd
sofree(c3a19654) at sofree+0x26c
in_pcbdetach(c371d870,c3e996f0,c3e996f0,e74aec9c,c05355df) at in_pcbdetach+0xb6
tcp_close(c3e996f0,1,1,1042e,1) at tcp_close+0x16
tcp_input(c4513400,14,1c1e708c,0,0) at tcp_input+0x2297
ip_input(c4513400) at ip_input+0x4f1
netisr_processqueue(c0643298) at netisr_processqueue+0xa3
swi_net(0) at swi_net+0xf2
ithread_loop(c3094c80,e74aed48) at ithread_loop+0x159
fork_exit(c049c138,c3094c80,e74aed48) at fork_exit+0x75
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe74aed7c, ebp = 0 ---
db ps
   61 c311ce200 0 0 204 [CPU 3] swi1: net

Regards,
Rong-En Fan


Re: IBM xSeries 335 and FreeBSD 5 STABLE. SMP problem

2005-07-19 Thread Rong-En Fan
On 7/19/05, Alexander Markov [EMAIL PROTECTED] wrote:
 If you unload kernel and load it again at boot manually, can 335 boot?
 I have one 336 with 5.4 that must use this trick to boot, otherwise
 it hangs after ipfw2 initialized. On the other hand, I have 3 335 installed
 with 5.4 running SMP smoothly.
 
 Nope, this trick doesn't work for me :-(
 And btw, do you have LSI Logic SCSI controller on your 335?

Sure. It is
mpt0: LSILogic 1030 Ultra4 Adapter port 0x2300-0x23ff mem 0xfbfe-0xfbfefff

I have tried it with RAID1 (with the patch, performance is fine), and it can
boot with or without the patch. Anyway, I use gmirror now.

 I'll try to upgrade BIOS today, for it seems to be the only difference
 between my and people in the list's hardware.


Re: IBM xSeries 335 and FreeBSD 5 STABLE. SMP problem

2005-07-14 Thread Rong-En Fan
On 7/14/05, Alexander Markov [EMAIL PROTECTED] wrote:
 Hello!
 
 I've got an IBM xSeries 335 with FreeBSD 5.4 installed, which hangs during
 boot without a panic. If I boot it with hint.apic.0.disabled=1, everything
 is OK, except that only one of the two CPUs is detected. I tried kernels
 with and without SMP and nothing changed: booting with the APIC enabled
 kills the OS.
 
 Please look at the boot log below. It seems that mounting / from the
 LSILogic 1030 Ultra4 Adapter (da0 over mpt0) results in a hang.
 
 Cvsupping to STABLE had no effect :-(
 If any information or tests are required, I'd be glad to provide them.

Hello,

If you unload the kernel and load it again manually at boot, can the 335
boot? I have one 336 with 5.4 that must use this trick to boot; otherwise
it hangs after ipfw2 is initialized. On the other hand, I have three 335s
installed with 5.4 running SMP smoothly.

Regards,
Rong-En Fan


5.3-p16/i386 unknown reason console hang

2005-06-15 Thread Rong-En Fan
Hello all,

Recently, one of our 5.3-p16/i386 machines has been hanging frequently.
Details:

1. I can switch vtys, but can't log in (it hangs after I type the username)
2. it responds to ping, but not to other TCP/UDP services
3. I can break into ddb

It's an IBM X236 (EM64T) with 2G of memory, running Postfix/Amavisd/clamav
and mail/openwebmail with apache2. Some time ago, I also reported a similar
hang on 5.4/amd64; in fact they are the same machines, but at that time I
had some non-default NFS mount options. Now the NFS mount options include
only -L, nodev, and nosuid. I'm wondering if it is some kind of hardware
problem, since similar things happen on 5.4/amd64 and on 5.3/5.4 i386.
Anyway, I have the kernel conf, loader.conf, dmesg, and two ddb outputs
(ps, show lockedvn, show threads) at
http://rafan.infor.org/tmp/236/.

By the way, a strange thing is that sometimes, when I reboot the machine
after a hang, the *foreground* fsck completes and it enters multiuser, but
after the login prompt appears I get another hang. This time I can't break
into ddb; the only solution is a power cycle.

If you need more information, please let me know :-)

Regards,
Rong-En Fan


panic: ffs_blkfree: freeing free block

2005-05-25 Thread Rong-En Fan
Hi,

I have seen this kind of panic on 5.3/i386 (-p15) and 4.x. The machine
is an i386 box with an external hardware RAID, shared over NFS. There
are 20+ NFS clients in total. When one exported filesystem gets full
and some user's program keeps writing data to it (probably along with
removes), I sometimes get this panic after lots of "/dev/x is full"
messages.

dev = da0s1d, block = 44652432, fs = /export/b1
panic: ffs_blkfree: freeing free block
cpuid = 3
KDB: enter: panic
[thread 100112]
Stopped at  kdb_enter+0x30: leave
db> wh
kdb_enter(c066992d,3,c0672f0b,e4ba6ae0,5) at kdb_enter+0x30
panic(c0672f0b,c22b58a8,2a95790,0,c23638d4) at panic+0x13e
ffs_blkfree(c2363800,c23cb318,2a95790,0,4000) at ffs_blkfree+0x3d2
indir_trunc(c2690e00,aa55e20,0,1,80c) at indir_trunc+0x30d
handle_workitem_freeblocks(c2690e00,0,2,6,0) at handle_workitem_freeblocks+0x20e
process_worklist_item(0,0,42935dad,0,0) at process_worklist_item+0x1e1
softdep_process_worklist(0,1e,c1f1a320,0,0) at softdep_process_worklist+0xcc
sched_sync(0,e4ba6d48,0,0,0) at sched_sync+0x5f2
fork_exit(c0541295,0,e4ba6d48) at fork_exit+0x80
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe4ba6d7c, ebp = 0 ---

The system disk is on ips(4), which does not support dumps in 5.3
(supported in 5.4), so there is no dump available. I don't know exactly
what kind of access pattern causes this, so I have no idea where to
start. I wonder if this can be fixed.
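
As a small cross-check of the trace above, the third argument passed to
ffs_blkfree() (2a95790) appears to be the same block number that the panic
message prints in decimal. A minimal sketch, assuming ddb prints stack
arguments as hexadecimal without a 0x prefix (the variable names are only
for illustration):

```python
# Values copied from the panic message and the ddb backtrace above.
panic_block = 44652432        # "block = 44652432" in the panic message
ffs_blkfree_arg = "2a95790"   # third argument of ffs_blkfree() in the trace

# Assumption: ddb shows arguments in hex, so parse the string as base 16.
arg_as_int = int(ffs_blkfree_arg, 16)

# True if the traced argument is the block named in the panic message.
same_block = arg_as_int == panic_block
print(arg_as_int, same_block)
```

If the two values agree, the backtrace and the panic message refer to the
same block on da0s1d, which at least confirms the trace belongs to this
panic.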

Cheers,
Rong-En Fan


Re: 5.4/amd64 console hang

2005-04-16 Thread Rong-En Fan
On 4/16/05, Anders Nordby [EMAIL PROTECTED] wrote:
 Hi,
 
 On Fri, Apr 15, 2005 at 03:27:11PM +0800, Rong-En Fan wrote:
  I'm using a Pentium Xeon 3.2G * 2 running 5.3/5.4 amd64 RELEASE.
 
 That's a strange combination. Don't use FreeBSD/amd64 with Intel Pentium
 Xeon processors. Maybe you made a typing error or two? :-)

Those Xeons are EM64T, compatible with x86-64 :-)

By the way, I suspect that the more frequent hangs might be related to the
large read/write block size set with mount_nfs -r/-w (I use 8192; originally
it was 1024).


Regards,
Rong-En Fan


5.4/amd64 console hang

2005-04-15 Thread Rong-En Fan
Hi,

I'm using a Pentium Xeon 3.2G * 2 running 5.3/5.4 amd64 RELEASE. Recently,
it has been hanging frequently, say 4 times in 3 days. Originally I was
running 5.3-RELEASE-p5 or so, and it happened there, so I decided to
upgrade to 5.4-RC1/RC2 and disable HTT in the BIOS. However, I noticed that
the hangs happen more frequently after the upgrade to 5.4-RC1/RC2. This is
a mail server running Postfix with clamd/amavisd and apache2 with some
webmail applications. All users' home directories (10 in total) are
mounted from another NFS server running 5.4-PRERELEASE/i386.
I have some ddb output and the kernel config here:
http://rafan.infor.org/tmp/5.4-hang/
I ran ps, show threads, and show lockedvn.

When the console hangs, the serial console does not respond. On the front
console, I can use alt+f? to switch vtys and the caps/numlock LEDs work,
but the keyboard does not respond. I can break into ddb.

Any suggestions?

Regards,
Rong-En Fan


Re: 5.3-S (Mar 6) softdep stack backtrace from getdirtybuf()... problem?

2005-04-11 Thread Rong-En Fan
On Apr 11, 2005 3:16 AM, Brandon S. Allbery KF8NH [EMAIL PROTECTED] wrote:
 I have twice so far had the kernel syslog a stack backtrace with no
 other information.  Inspection of the kernel source, to the best of my
 limited understanding, suggests that getdirtybuf() was handed a buffer
 without an associated vnode.  Kernel config file and make.conf attached.
 
 Should I be concerned?
 
 Note that this system is an older 600MHz Athlon with only 256MB RAM, and
 both times this triggered it was thrashing quite a bit (that's more or
 less its usual state...).

I saw similar traces on a 5.4-RC1/amd64 box with 9 NFS mounts. I suspect
this is an issue with a busy NFS server?

rafan.


Re: xSeries346 and FreeBSD 5.3

2005-03-09 Thread Rong-En Fan
On Wed, 9 Mar 2005 12:10:09 +0100, pck [EMAIL PROTECTED] wrote:
 Welcome,
 
 I've just bought IBM xSeries346 server. There was no problem with
 installing FreeBSD 5.3, but problem is with LAN card - BCM5721 (I
 think, reseller told me this name).
 Is there any possibility to run this card?

You have to install 5-STABLE or manually get the following files

src/sys/dev/bge/*
src/sys/dev/mii/miidevs
src/sys/dev/mii/brgphy.c

and recompile the kernel.
Without those two mii files, you can only run 100baseTX.

Regards,
Rong-En Fan


Re: panic: Fatal trap 12: page fault while in kernel mode

2005-02-17 Thread Rong-En Fan
On Wed, 16 Feb 2005 15:36:25 -0800 (PST), Doug White
[EMAIL PROTECTED] wrote:
 On Wed, 16 Feb 2005, Rong-En Fan wrote:
 
  Hello,
 
  This is a 5.3-RELEASE-p5/amd64 on IBM X236 (EM64T) with 2GB RAM
  and a LSI 21320 rmpt(4) running at 160MB/s with a hardware
  RAID (da0, da1). HTT is enabled. When I run benchmark/blogbench on
  /da0/ I can *reproduce* this panic again and again:
  (I'm getting a dump now, let me fsck first)
  kernel conf & dmesg (boot -v) are at
   http://rafan.infor.org/tmp/236/
 
 I only have an 2x244 Opteron box so I'm not sure if this is a problem with
 KSE or with hyperthreading.  I'll try the benchmark anyway and see if I
 can reproduce.
 
 Looks like I'll need to rebuild first, I'm getting the exiting from
 __thread_start error...

If I use machdep.hlt_logical_cpus=1, I get the same panic.
And when I use kgdb to read the kernel dump, I see only
#1 ?? (??) in the backtrace.

I just reinstalled the system with 5.3-p5/i386. It does not
panic, and it finished the test twice. I'll run more to see if it
panics.

Regards,
Rong-En Fan


panic: Fatal trap 12: page fault while in kernel mode

2005-02-15 Thread Rong-En Fan
] irq40:
   53 ff007b7712e8 b1a8d0000 0 0 204 [IWAIT] irq39:
   52 ff007b7715d0 b1a8e0000 0 0 204 [IWAIT] irq38: ips0
   51 ff007b7718b8 b1a8f0000 0 0 204 [IWAIT] irq37:
   50 ff007b771ba0 b1a90 0 0 204 [IWAIT] irq36:
   49 ff007b7d6000 b1a910000 0 0 204 [IWAIT] irq35:
   48 ff007b78 b1a050000 0 0 204 [IWAIT] irq34:
   47 ff007b7802e8 b1a060000 0 0 204 [IWAIT] irq33:
   46 ff007b7805d0 b1a070000 0 0 204 [IWAIT] irq32:
   45 ff007b7808b8 b1a080000 0 0 204 [IWAIT] irq31:
   44 ff007b780ba0 b1a090000 0 0 204 [IWAIT] irq30:
   43 ff007b783000 b1a460000 0 0 204 [IWAIT] irq29:
   42 ff007b7832e8 b1a470000 0 0 204 [IWAIT] irq28:
   41 ff007b7835d0 b1a480000 0 0 204 [IWAIT] irq27:
   40 ff007b7838b8 b1a490000 0 0 204 [IWAIT] irq26:
   39 ff007b783ba0 b1a4a0000 0 0 204 [IWAIT] irq25:
   38 ff007b7662e8 b19c0 0 0 204 [IWAIT] irq24:
   37 ff007b7665d0 b19c10000 0 0 204 [IWAIT] irq71:
   36 ff007b7668b8 b19c20000 0 0 204 [IWAIT] irq70:
   35 ff007b766ba0 b19c30000 0 0 204 [IWAIT] irq69:
   34 ff007b7b5000 b1a00 0 0 204 [IWAIT] irq68:
   33 ff007b7b52e8 b1a010000 0 0 204 [IWAIT] irq67:
   32 ff007b7b55d0 b1a020000 0 0 204 [IWAIT] irq66:
   31 ff007b7b58b8 b1a030000 0 0 204 [IWAIT] irq65:
   30 ff007b7b5ba0 b1a040000 0 0 204 [IWAIT] irq64:
   29 ff007b7dc8b8 b199a0000 0 0 204 [IWAIT] irq63:
   28 ff007b7dcba0 b199b0000 0 0 204 [IWAIT] irq62:
   27 ff007b784000 b199c0000 0 0 204 [IWAIT] irq61:
   26 ff007b7842e8 b19bb0000 0 0 204 [IWAIT] irq60:
   25 ff007b7845d0 b19bc0000 0 0 204 [IWAIT] irq59:
   24 ff007b7848b8 b19bd0000 0 0 204 [IWAIT] irq58:
   23 ff007b784ba0 b19be0000 0 0 204 [IWAIT] irq57:
   22 ff007b766000 b19bf0000 0 0 204 [IWAIT] irq56:
   21 ff007b7d42e8 b19750000 0 0 204 [IWAIT] irq55:
   20 ff007b7d45d0 b19760000 0 0 204 [IWAIT] irq54:
   19 ff007b7d48b8 b19950000 0 0 204 [IWAIT] irq53: mpt1
   18 ff007b7d4ba0 b19960000 0 0 204 [CPU 1] irq52: mpt0
   17 ff007b7dc000 b19970000 0 0 204 [IWAIT] irq51:
   16 ff007b7dc2e8 b19980000 0 0 204 [IWAIT] irq50:
   15 ff007b7dc5d0 b19990000 0 0 204 [IWAIT] irq49:
   14 ff007b76d000 b19330000 0 0 204 [IWAIT] irq48:
   13 ff007b76d2e8 b1970 0 0 20c [Can run] idle: cpu0
   12 ff007b76d5d0 b19710000 0 0 20c [Can run] idle: cpu1
   11 ff007b76d8b8 b19720000 0 0 20c [Can run] idle: cpu2
   10 ff007b76dba0 b19730000 0 0 20c [Can run] idle: cpu3
    1 ff007b7d4000 b19740000 0 1 0004200 [SLPQ wait 0xff007b7d4000][SLP] init
    0 8051e580 805f50000 0 0 200 [SLPQ sched 0x8051e580][SLP] swapper

Regards,
Rong-En Fan


IBM ServeRAID 7k 5.3

2004-12-22 Thread Rong-En Fan
[just for the record]

Hi all,

It seems ips(4) doesn't support the ServeRAID 7k; however, I just
installed 5.3-RELEASE/i386 on an IBM x236 which has a ServeRAID
7k installed. Everything looks fine here (I'm running RAID-5 over
4 HDDs).

One small problem is that when an HDD fails, FreeBSD doesn't
notice until I reboot and see that the ips state is DEGRADED.


pciconf and dmesg output are listed below:

[EMAIL PROTECTED]:14:0: class=0x010400 card=0x028e1014 chip=0x02509005 rev=0x07 hdr=0x00
vendor   = 'Adaptec Inc'
class= mass storage
subclass = RAID

ips0: Adaptec ServeRAID Adapter mem 0xcfffd000-0xcfffdfff irq 38 at device 14.0 on pci3
ips0: Reserved 0x1000 bytes for rid 0x10 type 3 at 0xcfffd000
ips0: [GIANT-LOCKED]
ips0: logical drives: 1
ips0: Logical Drive 0: RAID5 sectors: 430116864, state OK
ipsd0: Logical Drive on ips0
ipsd0: Logical Drive  (210018MB)
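
As a rough sanity check of the numbers in the dmesg above, the sector count
reported for Logical Drive 0 matches the 210018MB shown for ipsd0. A minimal
sketch, assuming 512-byte sectors and binary megabytes (neither is stated in
the output), and using the usual RAID-5 rule that one disk's worth of space
goes to parity:

```python
SECTOR_BYTES = 512            # assumption: 512-byte sectors
BYTES_PER_MB = 1024 * 1024    # assumption: dmesg reports binary megabytes

sectors = 430116864           # "RAID5 sectors: 430116864" from dmesg
disks = 4                     # "RAID-5 over 4 HDDs" from the message

# Usable size of the logical drive in MB.
size_mb = sectors * SECTOR_BYTES // BYTES_PER_MB

# RAID-5 keeps one disk's worth of parity, so data is spread over disks - 1.
per_disk_mb = size_mb // (disks - 1)

print(size_mb, per_disk_mb)
```

Under those assumptions the logical drive size works out to 210018 MB, which
agrees with the ipsd0 line, and each member disk contributes about 70006 MB.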

Regards,
Rong-En Fan