Re: Extreme console latency during disk IO (8.0-RC1, previous releases also affected according to others)
Ivan Voras wrote:

> 2009/10/13 Larry Rosenman l...@lerctr.org:
>
> note huge packet loss. It looks like it's VM fault or something like it.
>
> It sounds like the VM is failing to execute the guest during certain types of I/O. A bit of scheduler tracing in the host OS probably wouldn't go amiss to confirm that the VM really is suspending the guest
>
> It's VMWare ESXi underneath, which is *Officially Not Linux* though some ducks may disagree - anyway, I suspect tracing the host in this way is next to impossible without some kind of diamondium-level contract.
>
> What information do you need? I have a platinum VMWare contract. What version of ESXi?
>
> Hi, It is ESXi 3.5 - but if the problem is really in ESXi I presume anyone could reproduce it. My setup is nothing special - Xeon 5405, 8 GB RAM, SATA drives on ICH9.

I recall others having various weird problems in 3.5 that went away when they upgraded to 4.0.

Kris
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Extreme console latency during disk IO (8.0-RC1, previous releases also affected according to others)
Luigi Rizzo wrote:

> On Mon, Oct 12, 2009 at 09:48:42PM +0200, Thomas Backman wrote:
>> I'm copying this over from the freebsd-performance list, as I'm looking for a few more opinions - not on the problems *I* am having, but rather to check whether the problem is universal or not, and if not, find a possible common factor. In other words: I want to hear about your experiences, *good or bad*!
>>
>> Here's the original thread (not from the beginning, though): http://lists.freebsd.org/pipermail/freebsd-performance/2009-October/003843.html
>>
>> Long story short, my version: when the disk is stressed hard enough, console IO becomes COMPLETELY unbearable. 10+ seconds to switch between windows in screen(1), running (or even typing) simple commands, etc. This happens both via SSH and the serial console.
>
> hi, this issue (not specific to FreeBSD, and not new -- it has been like this forever) is discussed in some detail here http://www.bsdcan.org/2009/schedule/events/122.en.html
>
> The following code (a bit outdated) can help http://lists.freebsd.org/pipermail/freebsd-stable/2009-March/048704.html

Are you certain? The reported symptoms sound very unusual. Can you reproduce the problem with the provided instructions yourself?

Kris
Re: Patch for FreeBSD 7.0 deadlock
Santosh Rao Gururajan wrote:

> I am seeing problems with FreeBSD 7.0 machines that have the symptoms described in http://lists.freebsd.org/pipermail/freebsd-stable/2008-June/043241.html
>
> Can someone please point me to a patch which has a fix for the issue described in that thread?

The fix to that particular problem is included in later releases.

Kris
Re: 7.1 stable panics
Nathanael Jean-Francois wrote:

> Hello all,
>
> I've been getting some panics with a 7.1 stable machine from March 14th. I've not been able to determine the cause nor reproduce them at will. Here's a backtrace from the latest panic on March 23rd. Let me know if any more information is needed. Thanks.
>
> Unread portion of the kernel message buffer:
> panic: lock (sleep mutex) Giant not locked @ /usr/src/sys/kern/kern_ntptime.c:965

This is strange because the corresponding mtx_lock is only a few lines above. Can you provide your kernel config?

Kris

> cpuid = 1
> Uptime: 1d15h34m6s
> Physical memory: 2034 MB
> Dumping 377 MB: 362 346 330 314 298 282 266 250 234 218 202 186 170 154 138 122 106 90 74 58 42em0: watchdog timeout -- resetting 5em0: link state changed to DOWN 26 10
>
> #0 doadump () at pcpu.h:195
> 195 __asm __volatile(movq %%gs:0,%0 : =r (td));
> (kgdb) list
> 190 static __inline struct thread *
> 191 __curthread(void)
> 192 {
> 193 struct thread *td;
> 194
> 195 __asm __volatile(movq %%gs:0,%0 : =r (td));
> 196 return (td);
> 197 }
> 198 #define curthread (__curthread())
> 199
> (kgdb) backtrace
> #0 doadump () at pcpu.h:195
> #1 0x0104 in ?? ()
> #2 0x804e6fc2 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
> #3 0x804e73f2 in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:574
> #4 0x805218b6 in witness_unlock (lock=0x80b0a400, flags=8, file=0x0, line=965) at /usr/src/sys/kern/subr_witness.c:1284
> #5 0x804dadb2 in _mtx_unlock_flags (m=0x80b0a400, opts=0, file=0x8087a008 /usr/src/sys/kern/kern_ntptime.c, line=965) at /usr/src/sys/kern/kern_mutex.c:203
> #6 0x804dc062 in kern_adjtime (td=0x80884bd0, delta=0x674, olddelta=Variable olddelta is not available.) at /usr/src/sys/kern/kern_ntptime.c:965
> #7 0xff00019cf870 in ?? ()
> #8 0x05a8 in ?? ()
> #9 0x805430b6 in soreceive_generic (so=0xff0078a9f600, psa=0x0, uio=0xfffebe695b10, mp0=Variable mp0 is not available.) at /usr/src/sys/kern/uipc_socket.c:1652
> #10 0x80523e6d in dofileread (td=0xff000181b000, fd=3, fp=0xff00016af200, auio=0xfffebe695b10, offset=Variable offset is not available.) at file.h:245
> #11 0x805241de in kern_readv (td=0xff000181b000, fd=3, auio=0xfffebe695b10) at /usr/src/sys/kern/sys_generic.c:192
> #12 0x805242cc in read (td=0x0, uap=0x0) at /usr/src/sys/kern/sys_generic.c:108
> #13 0x8079d0dc in syscall (frame=0xfffebe695c80) at /usr/src/sys/amd64/amd64/trap.c:907
> #14 0x80781cbb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:330
> #15 0x00080076ad7c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
Re: Is some combination of gmirror, md file systems, snapshots and, maybe, quotas considered harmful?
Scott Lambert wrote:

> On Fri, Mar 20, 2009 at 02:41:57PM -0500, Scott Lambert wrote:
>> I have a previously stable machine, other than a one time panic in soft-updates which I could never reproduce, running RELENG_7 from July 23, 2008.
>>
>> Starting update: Wed Jul 23 01:29:47 CDT 2008
>> Finished update: Wed Jul 23 01:31:13 CDT 2008
>>
>> I had the userquota option in the fstab for /home, but I did not yet have anything in /etc/rc.conf to enable them. I have been running an unmodified GENERIC kernel config.
>>
>> /dev/mirror/gm0s1g on /home (ufs, local, soft-updates)
>>
>> It runs a few jails, using ezjails. Two of them were image based jails, 1GB and 2GB. There is also one non-image file jail. The jails live in /home/ezjails. I added another image based jail, 3GB image, on March 12th.
>>
>> I added this machine to our AMANDA setup on March 13, 2009. Things seemed to be okay until the 19th. On the 19th, during the dump of /home, things gradually started to hang. Nagios paged me about services not responding. I did not find any explanation for it. The disks were idle according to systat -vm. I was able to grep the log files on /var for a while, and then I could no longer do anything with it. I eventually had to go to the office and power cycle it. I tried C-A-D first, but shutdown timed out after 30 seconds.
>>
>> Just to make sure it wasn't something that had since been fixed, I updated to RELENG_7 as of Mar 19th.
>>
>> Starting update: Thu Mar 19 03:40:41 CDT 2009
>> Finished update: Thu Mar 19 03:48:45 CDT 2009
>>
>> I rebooted to the new kernel and installed the world just after midnight on the 20th. I started getting paged by Nagios again at 3:40am. I noticed that mksnap_ffs was running on /home, cpu time used: 0:00.77, as things began to circle the drain. That was about 30 minutes after the dump attempt had been started by AMANDA. There were many processes waiting in state D. This time I did a reboot -n -q and the box rebooted but was still fscking when I got to the office.
>>
>> # ls -l /home/.snap
>> -r 1 root operator 117285093376 Mar 20 03:18 dump_snapshot
>> # df /home
>> Filesystem          Size  Used  Avail  Capacity  Mounted on
>> /dev/mirror/gm0s1g  106G   11G    86G       11%  /home
>>
>> I removed userquota from the fstab entry for /home and rebooted, just to be sure. The last danger combination I remember for snapshots was in combination with quotas. Am I even in the danger zone for quotas without having them compiled into the kernel?
>>
>> It looks like removing the .snap directory should be enough to prevent any future snapshots during the backup process. Does that sound like a reasonable workaround? It would at least remove one variable from the trouble shooting process.
>>
>> Any other suggestions?
>>
>> Thank you for any help you may be able to provide,
>
> Did it to me again tonight. I was unable to get in to look at anything. Just pushed the power button. It did give me the same shutdown timed out after 30 seconds.
>
> So, I tuned the /home fs to disable softupdates. I also removed the .snap directory.
>
> I would appreciate any suggestions...

http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html

Kris
Re: Big problems with 7.1 locking up :-(
Pete Carah wrote:

> Well, following up on my own reply earlier, I csup'd releng_7 with a date of last dec 1; the result works fine in the laptop. I'll reload the eastern soekris tonight and see how it does. If the soekris is fine also then this gives a data point for whenever the bad commit(s) happened.
>
> I had apparently made the mistaken assumption that a general release should be better debugged than the work-in-progress leading up to it...

I'm sorry FreeBSD has failed to live up to your expectations. As ever, we strive to fix more bugs than we introduce, but in a changing codebase this is never possible to guarantee or to always achieve.

Kris
Re: Big problems with 7.1 locking up :-(
Pete Carah wrote:

> Kris writes:
>> You and anyone else seeing performance problems should try to work through the advice given here: http://people.freebsd.org/~kris/scaling/Help_my_system_is_slow.pdf
>
> Well, all the people in this thread have noticed that WITH NO CONFIG CHANGES from configs that worked fine in the past, their systems are very slow and/or locking up (mine are both) with the stable branch sometime (I noticed it sometime in December, but it got worse with the release.) Most were OK in October; mine (I think) were OK in late November - may narrow things down?
>
> Two of my systems that lock up have no internal visibility when they do (Soekris 4801's routing; the only time-intensive things running are routing (done in irq context) and pflog. These run with 60+ meg ram free.) These are complete lockups, though I did manage to get a ps out of my laptop last night by waiting 20 _minutes_ for it to start (!). This is not a generic performance problem.
>
> The laptop had 55 minutes of cpu time in the softdepflush thread after being up about an hour and 10 mins; this might give a hint. I didn't spot LL/RL state threads at the same time because I didn't know to. Now I do. BTW - the same ps showed 8 or so user-space procs in R state with NO cpu time; the kernel was hogging all of it for over an hour. Firefox did indeed trigger this one as someone else noted. A soekris doing only routing+nat has no such excuse...
>
> At least PHK was nice enough to note the watchdog in another thread :-)

Actually, there have been several apparently different problems reported in this thread, some of which (including the message I replied to) *are* generic "my system is slower" problems. For generic "my system hangs" problems, see the chapter on kernel debugging in the handbook or follow the (same) advice given by Robert earlier in the thread.

Kris
Re: Big problems with 7.1 locking up :-(
Tomas Randa wrote:

> Hello,
>
> I have similar problems. The last good kernel I have is from the stable branch, October 8. Then in the next upgrade, I saw big problems with performance. I tried ULE, 4BSD etc, but nothing helps, only downgrading the system back.
>
> Now I am trying 7.1-p1 and the problems are here again. Mysql is waiting a lot of time with status "waiting for opening table" or "waiting for close tables".
>
> I have 32bit FreeBSD with PAE, 1x xeon 5420, supermicro motherboard, areca SATA controller. Could the problem be in the da device, for example?

You and anyone else seeing performance problems should try to work through the advice given here: http://people.freebsd.org/~kris/scaling/Help_my_system_is_slow.pdf

Kris
Re: FreeBSD 7.0 kernel panic
l1nyx...@googlemail.com wrote:

> Hello, FreeBSD-stable.
>
> Last week I have a lot of kernel panics like:
>
> [r...@router1 /usr/obj/usr/src/sys/TINCOKERNEL2]# kgdb kernel.debug /var/crash/vmcore.20
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
> Type show copying to see the conditions.
> There is absolutely no warranty for GDB. Type show warranty for details.
> This GDB was configured as i386-marcel-freebsd.
>
> Unread portion of the kernel message buffer:
> kernel trap 12 with interrupts disabled
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x9
> fault code = supervisor read, page not present
> instruction pointer = 0x20:0xc079f1af
> stack pointer = 0x28:0xe5697c80
> frame pointer = 0x28:0xe5697cbc
> code segment = base 0x0, limit 0xf, type 0x1b
>              = DPL 0, pres 1, def32 1, gran 1
> processor eflags = resume, IOPL = 0
> current process = 14 (swi4: clock)
> trap number = 12
> panic: page fault
> cpuid = 0
> Uptime: 51m49s
> Physical memory: 2032 MB
> Dumping 177 MB: 162 146 130 114 98 82 66 50 34 18 2
>
> #0 doadump () at pcpu.h:195
> 195 __asm __volatile(movl %%fs:0,%0 : =r (td));
> (kgdb) bt
> #0 doadump () at pcpu.h:195
> #1 0xc078d1b7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2 0xc078d479 in panic (fmt=Variable fmt is not available.) at /usr/src/sys/kern/kern_shutdown.c:563
> #3 0xc0a0eaac in trap_fatal (frame=0xe5697c40, eva=9) at /usr/src/sys/i386/i386/trap.c:899
> #4 0xc0a0f42f in trap (frame=0xe5697c40) at /usr/src/sys/i386/i386/trap.c:280
> #5 0xc09f565b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #6 0xc079f1af in softclock (dummy=0x0) at /usr/src/sys/kern/kern_timeout.c:202
> #7 0xc076f31b in ithread_loop (arg=0xc5101250) at /usr/src/sys/kern/kern_intr.c:1036
> #8 0xc076c119 in fork_exit (callout=0xc076f170 ithread_loop, arg=0xc5101250, frame=0xe5697d38) at /usr/src/sys/kern/kern_fork.c:781
> #9 0xc09f56d0 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:205
> (kgdb)

Looks likely to be a hardware fault, to me.

Kris
Re: Kernel Trap during installworld caused unrecoverable system
Glen Barber wrote:

> Hi folks.
>
> I was unfortunate enough to encounter a kernel trap in single user mode yesterday when upgrading from 7.1-RC1 to -RC2 using the typical build/install world. I'm still not sure what caused the trap, as the system was rebooting when I saw the screen.
>
> As one would expect, this led to an irrecoverable system; the system would automatically drop me into single user mode, as it could only mount the root directory; /bin/sh and /bin/csh would not work (had to use /restore/csh for the minimal digging I actually could do).
>
> So, now to the question. Given I could not mount anything other than /, would there have been any way for me to gather debugging information on what caused this failure? (FWIW, I am certain it is not a hardware failure.)

You can proceed recovering the system using the tools in /rescue, and once you have shared libraries working again, run savecore to save the crashdump (assuming one was made).

Kris
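As a rough illustration of the savecore step mentioned above, here is a minimal sketch. The dump device and crash directory are assumptions (the thread names neither), and the script only prints the commands unless DRY_RUN is set to 0 on the recovered FreeBSD system.

```sh
#!/bin/sh
# Sketch only. DUMPDEV and CRASHDIR are assumptions; adjust for the real system.
DUMPDEV="${DUMPDEV:-/dev/ad0s1b}"   # hypothetical dump (swap) partition
CRASHDIR="${CRASHDIR:-/var/crash}"  # directory savecore should write into
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

save_crashdump() {
    # savecore -C only checks whether the device holds a dump.
    run savecore -C "$DUMPDEV" || return 1
    # Copy the dump into the crash directory for later inspection with kgdb.
    run savecore "$CRASHDIR" "$DUMPDEV"
}

save_crashdump
```

If a dump is found, the resulting vmcore file can then be opened with kgdb against the matching debug kernel.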
Re: repeatable crash on RELENG7
Mike Tancsa wrote:

> At 09:20 AM 12/2/2008, Kostik Belousov wrote:
>
> On Tue, Dec 02, 2008 at 09:12:54AM -0500, Mike Tancsa wrote:
>
> At 08:38 AM 12/2/2008, Kostik Belousov wrote:
>
> mdconfig -a -t malloc -s 1800M
>
> You cannot have ~ 2Gb of kernel memory allocated for md, at least not on i386.
>
> Thanks, how do I find out what the limit is on a machine ? Is it vm.kvm_size ?
>
> It is much less, and highly depends on your load, since KVA is used for all kind of allocations made by kernel. I think either md(4) or mdconfig(8) have a warning about malloc backing for md.
>
> Thanks! A warning might be helpful to prevent such foot shooting :)

    malloc  Storage for this type of memory disk is allocated with malloc(9). This limits the size to the malloc bucket limit in the kernel. If the -o reserve option is not set, creating and filling a large malloc-backed memory disk is a very easy way to panic a system.

You almost never want to use malloc backing for a md, in favour of swap backing.

Kris
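For comparison with the malloc-backed command quoted above, a swap-backed memory disk of the same size could be requested roughly as follows. This is a sketch: the helper names are made up for illustration, and the command is only printed unless DRY_RUN=0 is set on a FreeBSD host.

```sh
#!/bin/sh
# Sketch only: same size as in the thread, but with swap backing.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

make_swap_md() {
    # $1 is the requested size, e.g. 1800M.
    # When really executed, mdconfig prints the name of the new unit.
    run mdconfig -a -t swap -s "$1"
}

make_swap_md 1800M
```

Swap backing keeps the disk contents in pageable memory rather than in kernel malloc space, which is presumably why it is the recommended choice here.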
Re: panic: spin lock held too long on 7.1-PRERELEASE (sio)
Holger Kipp wrote:

> Hi,
>
> I currently encounter
>
> spin lock 0xc0c83ba8 (sio) held by 0xc6337460 (tid 15) too long
> panic: spin lock held too long
> cpuid=3
>
> on hp server with two dual-core intel xeon (3.8GHz) and 3GB RAM. System is 7.1-PRERELEASE last fetched 29.11.2008, 17:58 UTC.
>
> This happens with VSCom Card that was previously in use in a 5.3-STABLE system (also hp server, but older hardware) without problems. The card is a 8-port serial card using puc0 and puc1 with 4 ports each.
>
> Interestingly, problem seems to occur only if sendfax is using devices attached to puc1. Works without problems at the moment with sendfax only using puc0-sios. See dmesg below.
>
> Any ideas what might be causing this, or is this even related to kern/121322, 118044 or 117913?

Enable WITNESS in your kernel and try to reproduce the problem. It should give additional information about the condition leading to the deadlock.

Kris
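A minimal sketch of what enabling WITNESS might look like in practice. The helper below appends the relevant options to a custom kernel config file; the helper names and the choice to add INVARIANTS alongside WITNESS are assumptions, not something stated in the thread. WITNESS_SKIPSPIN is deliberately left out, since the lock reported here is a spin lock.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical.

witness_options() {
    # Kernel config lines commonly used together for lock debugging.
    cat <<'EOF'
options WITNESS
options INVARIANTS
options INVARIANT_SUPPORT
EOF
}

add_witness() {
    # $1 is the path to a custom kernel config file.
    # Append each option only if that exact line is not already present.
    conf="$1"
    witness_options | while IFS= read -r line; do
        grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
    done
}

# Show the lines that would be added.
witness_options
```

After editing the config, the kernel has to be rebuilt and installed in the usual way (make buildkernel and make installkernel with KERNCONF set to the custom config name) before trying to reproduce the panic.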
Re: Random hangs with 7.1-PRE
Richard Tector wrote:

> I'm not discounting hardware here, but I'm having problems with a previously stable amd64 system (dmesg attached) now running:
>
> FreeBSD 7.1-PRERELEASE #1: Wed Nov 26 00:10:41 GMT 2008
>
> and previously running a RELENG_7 from around Oct 15th which also exhibited the problem.
>
> The system appears to hang with SSH terminals eventually timing out, and no other services being available. The odd thing is, when I go to the console, I can switch between terminals with Alt-F2, etc, but can not type anything. Pressing the power button a couple of times gives an acpi not ready message, so it doesn't appear the system has completely hung.
>
> The system is in a cool room and under little load running basic services: samba, postgres, dhcp, etc. Never seen problems with the machine previously.
>
> Does anyone have any thoughts or suggestions on narrowing down the cause?

See the developers handbook chapter on kernel debugging for full instructions on how to gather the necessary information to proceed.

Kris
Re: MFC ZFS: when?
Andrew Snow wrote:

> The problem appears to be that the latest ZFS commit in 8-CURRENT relies on too many other new features that aren't in 7.1.
>
> After 7.1 is released, then perhaps ZFS and the other new code it requires can be moved into 7-STABLE?

That is certainly the intention, but I trust that everyone will appreciate the need to watch and wait before dumping an enormous and potentially risky filesystem change into the laps of all 7-STABLE users.

Kris
Re: MFC ZFS: when?
Wes Morgan wrote:

> On Tue, 25 Nov 2008, Dillon Kass wrote:
>> I'm very excited and can't wait! I have this clone I need to promote but I'm encountering this bug http://bugs.opensolaris.org/view_bug.do?bug_id=6738349
>>
>> Hopefully it gets mfc before the diff between the real fs and the clone becomes so large that my pool fills up :-) I should have a few months before that happens though.
>
> Is it possible you could boot a -current system and promote the clone without upgrading your pools/filesystems, then reboot to -stable?

The promotion is not backwards compatible.

Kris
Re: System hanging during dump
Jeremy Chadwick wrote:

>> Has anyone else seen anything similar?
>
> It's a known problem documented in my Wiki -- see dump/restore. Note the part about UFS2 snapshot generation. I'm almost certain this is what you're describing.

I don't know how you can say this so confidently without even comparing the process wait channels.

Peter, there was a bug causing dump to hang (completely unrelated to UFS2 snapshot generation) merged to RELENG_7 a month or so ago. Can you try updating?

Kris
Re: FreeBSD 7.1-PRERELEASE panic rw_rlock (udpinp)
Mario Sergio Fujikawa Ferreira wrote:

> Hi,
>
> I've been running FreeBSD 7.0-STABLE from August 1 without problems. I tried updating to the latest -STABLE but I got a system panic.
>
> panic: _rw_rlock (udpinp): wlock already held @ /usr/src/sys/netinet/udp_usrreq.c
>
> FreeBSD 7.1-PRERELEASE #18: Sat Sep 20 23:38:22 BRT 2008
>
> Regards,
> Mario Ferreira
>
> - Kernel configuration
>   http://people.freebsd.org/~lioux/panic/2008092100/KERNCONF
> - syslog output
>   http://people.freebsd.org/~lioux/panic/2008092100/all.log
>   http://people.freebsd.org/~lioux/panic/2008092100/messages
> - dmesg.boot
>   http://people.freebsd.org/~lioux/panic/2008092100/dmesg.boot
> - kldstat
>   http://people.freebsd.org/~lioux/panic/2008092100/kldstat.txt
> - /boot/loader.conf
>   http://people.freebsd.org/~lioux/panic/2008092100/loader.conf
> - 'pciconf -lv'
>   http://people.freebsd.org/~lioux/panic/2008092100/pciconf.txt
> - 'sysctl -a'
>   http://people.freebsd.org/~lioux/panic/2008092100/sysctl.txt
> - /etc/sysctl.conf
>   http://people.freebsd.org/~lioux/panic/2008092100/sysctl.conf

gdb backtrace?

Kris
Re: FreeBSD 7.1-PRERELEASE panic rw_rlock (udpinp)
Mario Sergio Fujikawa Ferreira wrote:

> On Sun, Sep 21, 2008 at 11:48:19AM +0100, Kris Kennaway wrote:
>> Mario Sergio Fujikawa Ferreira wrote:
>>> I've been running FreeBSD 7.0-STABLE from August 1 without problems. I tried updating to the latest -STABLE but I got a system panic.
>>>
>>> panic: _rw_rlock (udpinp): wlock already held @ /usr/src/sys/netinet/udp_usrreq.c
>>>
>>> FreeBSD 7.1-PRERELEASE #18: Sat Sep 20 23:38:22 BRT 2008
>>
>> gdb backtrace?
>
> I had a hard lock there. I got the system panic but it locked instead of dumping core. I had to remove the power cord to reboot it.
>
> I do not have much experience with kernel traces. How do I force a backtrace? Any unattended ddb scripts?

See the developers handbook.

Kris
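On the question of unattended ddb scripts, one possible approach is sketched below. It assumes a kernel built with the KDB and DDB options, a configured dump device, and a system recent enough to ship the ddb(8) utility with scripting and textdump support; none of that is confirmed for the machine in this thread. The script body is illustrative, and the command is only printed unless DRY_RUN=0 is set.

```sh
#!/bin/sh
# Sketch only: assumes DDB/KDB in the kernel and ddb(8) scripting support.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# DDB commands to run automatically when the debugger is entered on panic:
# capture the output, print a backtrace and process list, dump, then reboot.
PANIC_SCRIPT='textdump set; capture on; show pcpu; bt; ps; alltrace; capture off; call doadump; reset'

install_panic_script() {
    run ddb script "kdb.enter.panic=$PANIC_SCRIPT"
}

install_panic_script
```

If the machine locks up hard before the debugger can run, as described above, a script like this will not help, and a serial console is likely the more reliable way to capture the trace.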
Re: WARNING: 7-STABLE BROKEN -- please wait to upgrade / Should be OK now
O. Hartmann wrote:

> Kris Kennaway wrote:
>> O. Hartmann wrote:
>>> Steve Bertrand wrote:
>>>> Dan Allen wrote:
>>>>> Well I got bit by this and am dead in the water. Nothing builds. I tried the DEBUG_FLAGS=-g trick but to no avail.
>>>>
>>>> My 7.0 box upgraded fine this morning:
>>>>
>>>> FreeBSD ids.eagle.ca 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Fri Aug 29 11:38:12 EDT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP i386
>>>>
>>>> The dmesg if it is relevant: http://ww3.ibctech.ca/ids.dmesg
>>>>
>>>> Steve
>>>
>>> Well, mine did not. I was capable of compiling world and kernel and also capable of installing all things like I did in the past, but ZFS seems still broken - the module does not load automatically at initialization time nor is it loadable via kldload (there is an error about missing opensolaris module but I can't find anything about this module ...).
>>
>> It's new, you need to build it.
>>
>> Kris
>
> Well, I hoped the 'buildworld' build everything process would do so? Do I need an extra option in my kernel config?

It is a kernel module, it doesn't get built by buildworld (by default). If you use the default build settings for your kernel, it builds all modules including this one. If you use MODULES_OVERRIDE or similar to specify a list of modules to build, you have to add it to the list.

Kris
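The MODULES_OVERRIDE point can be checked mechanically. The sketch below is a hypothetical helper (not part of FreeBSD) that reads a make.conf-style file and reports which of the two modules discussed here, opensolaris and zfs, are missing from the override list. It assumes a single-line MODULES_OVERRIDE= assignment; if no such line exists, all modules are built and nothing is reported.

```sh
#!/bin/sh
# Sketch only: missing_zfs_modules is a hypothetical helper name.

missing_zfs_modules() {
    # $1 is a make.conf-style file, e.g. /etc/make.conf.
    conf="$1"
    # Use the last plain MODULES_OVERRIDE= assignment in the file, if any.
    line=$(grep -E '^[[:space:]]*MODULES_OVERRIDE[[:space:]]*=' "$conf" | tail -n 1)
    # No override means the default "build all modules" behaviour applies.
    [ -z "$line" ] && return 0
    # Normalise tabs to spaces and pad so whole-word matching works.
    list=" $(printf '%s' "${line#*=}" | tr '\t' ' ') "
    for m in opensolaris zfs; do
        case "$list" in
            *" $m "*) ;;          # module is already listed
            *) echo "$m" ;;       # module would not be built
        esac
    done
}

if [ -r /etc/make.conf ]; then
    missing_zfs_modules /etc/make.conf
fi
```

Any name printed by the helper would need to be added to the list before rebuilding the kernel.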
Re: WARNING: 7-STABLE BROKEN -- please wait to upgrade / Should be OK now
O. Hartmann wrote:

> O. Hartmann wrote:
>> Kris Kennaway wrote:
>>> It is a kernel module, it doesn't get built by buildworld (by default). If you use the default build settings for your kernel, it builds all modules including this one. If you use MODULES_OVERRIDE or similar to specify a list of modules to build, you have to add it to the list.
>>>
>>> Kris
>>
>> The module is indeed present, and it was present all the time. Obviously something is wrong or not in the right order with my config. When making buildworld and then rebooting the box, the ZFS module does not automatically load the OpenSOLARIS module when it detects its absence. I guess I need to fix an 'option opensolaris' in my kernel config.
>>
>> Thanks, Oliver
>
> here I am again. On a less critical machine the installation process (buildworld/installworld and the same for the kernel) went through without problems. This box does have a ZFS device for backup purposes only, not essential. The box starts/boots. When I try loading either opensolaris.ko/zfs.ko, I get this:
>
> link_elf_obj: symbol stack_save undefined
> kldload: /boot/kernel/opensolaris.ko: Unsupported file type
> link_elf_obj: symbol stack_save undefined
> kldload: /boot/kernel/opensolaris.ko: Unsupported file type
> KLD zfs.ko: depends on opensolaris - not available
> kldload: /boot/kernel/zfs.ko: Unsupported file type
>
> I guess I forgot some of the recommended compiler switches ... ??

The link_elf_obj error is important. It looks like they rely on either DDB or STACK in your kernel.

Kris
Re: WARNING: 7-STABLE BROKEN -- please wait to upgrade / Should be OK now
O. Hartmann wrote:

> Kris Kennaway wrote:
>> O. Hartmann wrote:
>>> When I try loading either opensolaris.ko/zfs.ko, I get this:
>>>
>>> link_elf_obj: symbol stack_save undefined
>>> kldload: /boot/kernel/opensolaris.ko: Unsupported file type
>>> link_elf_obj: symbol stack_save undefined
>>> kldload: /boot/kernel/opensolaris.ko: Unsupported file type
>>> KLD zfs.ko: depends on opensolaris - not available
>>> kldload: /boot/kernel/zfs.ko: Unsupported file type
>>>
>>> I guess I forgot some of the recommended compiler switches ... ??
>>
>> The link_elf_obj error is important. It looks like they rely on either DDB or STACK in your kernel.
>>
>> Kris
>
> As recommended formerly in this list I defined 'options KDTRACE_HOOKS' in my kernel config due to the DTRACE merge. But I do not have any debugging option defined yet in any of my machine's kernels because of performance reasons. So, if I understand your comment the right way, I need to enable either STACK or DDB or both? I will try this anyway, starting with STACK, but I do not understand why I'm forced to do so.

The code you are using requires it.

Kris
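To see whether a given kernel config already enables one of the two options named above, a small check along these lines could be used. The helper name and the example config path are hypothetical, and the check only looks at the file it is given, so options pulled in through an included config would be missed.

```sh
#!/bin/sh
# Sketch only: has_stack_support is a hypothetical helper name.

has_stack_support() {
    # $1 is a kernel config file. Succeeds if an uncommented
    # "options DDB" or "options STACK" line is present.
    grep -Eq '^[[:space:]]*options[[:space:]]+(DDB|STACK)([[:space:]]|$)' "$1"
}

# Hypothetical location of a custom kernel config.
KERNCONF_FILE="${KERNCONF_FILE:-/usr/src/sys/i386/conf/MYKERNEL}"

if [ -r "$KERNCONF_FILE" ]; then
    if has_stack_support "$KERNCONF_FILE"; then
        echo "DDB or STACK is enabled in $KERNCONF_FILE"
    else
        echo "neither DDB nor STACK found; consider adding: options STACK"
    fi
fi
```

Since the poster wants to avoid debugging options for performance reasons, adding only the STACK option is presumably the lighter of the two choices.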
Re: WARNING: 7-STABLE BROKEN -- please wait to upgrade / Should be OK now
O. Hartmann wrote: Steve Bertrand wrote: Dan Allen wrote: Well I got bit by this and am dead in the water. Nothing builds. I tried the DEBUG_FLAGS=-g trick but to no avail. My 7.0 box upgraded fine this morning: FreeBSD ids.eagle.ca 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #0: Fri Aug 29 11:38:12 EDT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP i386 The dmesg if it is relevant: http://ww3.ibctech.ca/ids.dmesg Steve Well, mine did not. I was capable of compiling world and kernel and also capable of installing all things like I did in the past, but ZFS seems still broken - the module does not load automatically at initialization time nor is it loadable via kldload (there is an error about missing opensolaris module but I can't find anything about this module ...). It's new, you need to build it. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: the future of sun4v
Peter Jeremy wrote: [Replies re-directed to freebsd-sun4v] On 2008-Aug-21 14:42:55 -0700, Kip Macy [EMAIL PROTECTED] wrote: I believe that there is a general expectation by freebsd users and developers that unsupported code should not be in CVS. Although sun4v is a very interesting platform for developers doing SMP work, I simply do not have the time or energy to maintain it. If someone else would like to step up and try his hand I would be supportive of his efforts. In the likely event that no one steps forward by the time that 7.1 is released I will ask that it be moved to the Attic. Since there are no other current SPARC CPUs that FreeBSD can run on (the US-II has been obsolete for about 6 years and FreeBSD won't run on any more recent sun4u chips), that will also remove the justification for maintaining a SPARC64 port. I don't have the knowledge or available time to maintain the sun4v port by myself but would be happy to be part of a team doing so. One impediment I have is that I don't have a T-1 or T-2 system that I can dedicate to FreeBSD. I could work on FreeBSD in a guest domain - but since FreeBSD doesn't support either the virtual disk or virtual network, actually getting FreeBSD running there presents somewhat of a challenge. There are two t1000 systems in the freebsd.org cluster that are available for people to work on. Rink Springer has also expressed interest in this. Perhaps Kip can explain some more about what things he looked at, but the most serious bugs might be in pmap or perhaps trap handling. Operationally, things like buildworld -jN die quickly with random signals, kernel traps, etc. Kris P.S. It looks like marius has made progress on US III but sun4u is still an architectural dead end. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: sun4v arch
Nikolay Kalev wrote: I would like to help as well. As KMacy knows, I previously asked a lot of questions about T2-type servers, but unfortunately I no longer have access to that kind of hardware. I'm willing to participate if a team is formed.

Just so everyone is on the same page, what is needed to keep sun4v viable are people with experience with (or intention to learn about) low level architectural and implementation details of the FreeBSD kernel and the sun4v hardware platform, who know their way around things like pmap.c and other MD places where the kernel interfaces with the bare metal, and who are willing to make a long term (multi-year) commitment to supporting the platform. Kris
Re: FreeBSD 6.3/amd64: cvsup: Bus error (core dumped)
Joseph Koshy wrote: This problem has been reported a couple of times but it is not clear what change caused it. I upgraded my RELENG_6/amd64 system yesterday and ran into this bug. There's a fix (and an explanation of the bug) now in PR bin/124353. Koshy Thanks very much for tracking this down! Kris
Re: umtxn and Apache 2.2
Borja Marcos wrote: ((Sorry for the long dump))

(gdb) bt
#0 0x3827cfe7 in __error () from /lib/libthr.so.3
#1 0x3827cd4a in __error () from /lib/libthr.so.3
#2 0x08702120 in ?? ()

As you can see the debugging symbols are still not available. Refer to the developers handbook if you need more assistance doing this. Also, it is worth carefully checking your php configuration. For example, php is notoriously sensitive to the order in which modules are defined, and will crash or misbehave without giving any other warnings if you don't meet its expectations. That may or may not be relevant in your situation. Kris
Re: umtxn and Apache 2.2
Borja Marcos wrote: On Aug 13, 2008, at 3:18 PM, Kris Kennaway wrote: Borja Marcos wrote: ((Sorry for the long dump))

(gdb) bt
#0 0x3827cfe7 in __error () from /lib/libthr.so.3
#1 0x3827cd4a in __error () from /lib/libthr.so.3
#2 0x08702120 in ?? ()

As you can see the debugging symbols are still not available. Refer to the developers handbook if you need more assistance doing this.

Hmm. Weird. I compiled the port with WITH_DEBUG defined (as I saw in the Makefile) and indeed the gcc invocations have the -g flag set. What is strange is the error gdb issued, offering a coredump, etc.

It is likely that the binaries are stripped on install then. You can try to run gdb against the compiled versions in the port work/ directory.

Also, it is worth carefully checking your php configuration. For example, php is notoriously sensitive to the order in which modules are defined, and will crash or misbehave without giving any other warnings if you don't meet its expectations. That may or may not be relevant in your situation.

Just in case I didn't change the order of the modules. I'll keep looking at it.

This could be why :) Some people report that previously working configurations stopped working after an upgrade until the ordering was changed. Kris
Re: umtxn and Apache 2.2
Borja Marcos wrote: Hello, I'm running a server with FreeBSD 7-STABLE as of August 8, Apache 2.2 with mpm/worker and threads support, and PHP 5.2.6. Everything works like a charm, but I see that Apache is leaking processes that get stuck in umtxn state. This graph shows it pretty well (I upgraded the system last Friday). Attaching gdb to one of the stray processes and doing a backtrace of the active thread, I see this:

[Switching to Thread 0x8705900 (LWP 100647)]
0x382a8789 in _umtx_op () from /lib/libc.so.7
(gdb) bt
#0 0x382a8789 in _umtx_op () from /lib/libc.so.7
#1 0x3825fe0d in __error () from /lib/libthr.so.3
#2 0x084b2b80 in ?? ()
#3 0x0005 in ?? ()
#4 0x in ?? ()
#5 0x in ?? ()
#6 0x in ?? ()
#7 0x38261914 in ?? () from /lib/libthr.so.3
#8 0xbe0e5ca8 in ?? ()
#9 0x3825b818 in pthread_mutex_getprioceiling () from /lib/libthr.so.3
Previous frame identical to this frame (corrupt stack?)
(gdb)

and it seems all the threads in the process are stuck here. Any ideas?

This trace doesn't show anything really. You need to recompile the binaries with debugging symbols as well. Kris
Re: Max size of one swap slice
Lin Jui-Nan Eric wrote: On Wed, Aug 6, 2008 at 12:46 AM, Chuck Swiger [EMAIL PROTECTED] wrote: It's hard to conceive of why you'd want to add so much swap space, anyway-- if you've got programs which actually need to deal with 10s of gigabytes worth of data, then they ought to maintain a smaller/reasonable-sized working set in RAM and do disk I/O as needed themselves rather than depend upon the VM pager, anyways.

We are running varnish, and found that it is not stable in mmap mode. We don't know whether the problem is in the varnish code or in the file system, but we found that if we run varnish in malloc mode with a big swap, it becomes stable. Thank you all for the information, I'll try to look into the kernel code.

See http://www.freebsd.org/cgi/getmsg.cgi?fetch=540837+0+/usr/local/www/db/text/2008/freebsd-questions/20080706.freebsd-questions Kris
Re: FreeBSD 6.3/amd64: cvsup: Bus error (core dumped)
Eugene Kazarinov wrote: Hello. I don't know whether this list is the right one for this topic, but I don't know which one is. I got 6.3-STABLE-200807-amd64-disc1.iso and installed it, then:

cd /usr/ports/net/cvsup-without-gui/
make install
make clean
# cvsup some-stable-sup-file
Connected to cvsup.xx.ru
Bus error (core dumped)

I can't get fresh src and ports trees, and can't compile a fresh 6.x-stable system with athlon64 optimization. :( How can I fix cvsup? How can I keep the 6.x-stable src tree up to date without a working cvsup? How can I keep the ports tree up to date without a working cvsup? PS: I manage about 10 servers under 6.x-stable, and cvsup under 6.3-stable is very important to me. Please help.

This problem has been reported a couple of times but it is not clear what change caused it. Someone needs to do a binary search of the kernel source changes to work out when it started, then we can determine how to fix it. Kris
Re: FreeBSD 6.3/amd64: cvsup: Bus error (core dumped)
Eugene Kazarinov wrote: Nice. After posting I thought I would look at the manual to see whether it had changed since last time, and ta-da... ;) *Note:* The implementation of *CVSup* protocol included with the FreeBSD system is called *csup*. It first appeared in FreeBSD 6.2.

Well, that is a workaround for some users, but it's not a solution. Kris
Re: 6.3-RELEASE-p3 recurring panics on multiple SM PDSMi+
Royce Williams wrote: Royce Williams wrote, on 7/22/2008 10:38 PM: Jeremy Chadwick wrote, on 7/22/2008 9:34 PM: On Tue, Jul 22, 2008 at 11:45:30AM -0800, Royce Williams wrote: We have 10 SuperMicro PDSMi+ 5015M-MTs that are panic'ing every few days. This started shortly after upgrade from 6.2-RELEASE to 6.3-RELEASE with freebsd-update.

We use the same hardware (board and chassis), and have no such problems running both RELENG_6 and RELENG_7. I don't think your issue is specific to the board or chassis. Kris's explanation makes a lot more sense. :-)

Jeremy/Kris/Clifton - Looks like we have consensus. :-) Thanks, all of you, for your helpful insight. I've bumped vm.kmem_size up to 400M on half of the affected boxes, leaving the other half as a control group. I'll report back once I have something to report.

After having bumped up to 400M, a few boxes panic'd again yesterday. I caught a core, and it is kmem_map too small, just as Kris suspected:

Jul 31 15:38:05 [redacted] savecore: reboot after panic: kmem_malloc(4096): kmem_map too small: 419430400 total allocated

The docs state that 400M should be plenty for systems up to 6G, but Kris said earlier in this thread that it's better to say 'increase until the pain stops'. :-) Accordingly, I have some follow-up questions; hopefully, this will be useful to others.

- What is a reasonable increment? (I'm trying 448M next).
- What are the practical and hard maximums?
- I suspect that it's worth trying to make kmem 'as big as I need, but no bigger', so that non-kernel memory is also maximized?
- In a larger sense, if 400M is probably big enough for 6G systems, and these are 4G systems, should I be suspicious that 400M isn't cutting it? In other words, is there a point at which I should be looking for obvious places where the kernel is eating too much memory and reduce them, rather than feeding it more?
For example, I recall now that a network guy in my group did some sysctl tuning relating to networking on these systems, and I see from man tuning(7) that a number of these tweaks (obviously) can cause increased kernel consumption.

$ egrep -v '^#|^$' /etc/sysctl.conf | sort
kern.corefile=/var/cores/%U/%N-%P.core
kern.ipc.maxsockbuf=8388608
kern.ipc.maxsockets=32768
kern.ipc.nmbclusters=65535
kern.ipc.somaxconn=4096
kern.maxfiles=262144
kern.maxfilesperproc=65535
net.inet.ip.portrange.first=8192
net.inet.ip.portrange.hifirst=8192
net.inet.ip.portrange.hilast=65535
net.inet.ip.portrange.last=65535
net.inet.ipf.fr_tcpclosed=60
net.inet.ipf.fr_tcpclosewait=120
net.inet.ipf.fr_tcphalfclosed=300
net.inet.ipf.fr_udptimeout=120
net.inet.tcp.delayed_ack=0
net.inet.tcp.inflight.enable=0
net.inet.tcp.msl=1
net.inet.tcp.mssdflt=1460
net.inet.tcp.recvspace=65536
net.inet.tcp.rfc1323=1
net.inet.tcp.sendspace=65536
net.inet.udp.maxdgram=57344
net.inet.udp.recvspace=65535
vfs.nfs.iodmax=32
vfs.nfs.iodmin=8

My apologies for not including this sooner. I didn't think of it until yesterday, primarily because it had been fine under 6.2. In retrospect, that was bad reasoning. Royce

The statement that 400MB should be enough for up to 6GB is completely bogus. The amount of memory your kernel needs is a function of the work you give it to do. On i386 the kernel only has 1GB of address space, though you can increase it by tuning KVA_PAGES at the expense of less memory available to user processes (everything comes out of the 4GB address space). So that is the upper bound, although other things need to fit in there too. On amd64 the situation is more complicated, but on versions older than 8.0 there is 2GB for the kernel address space and in practice a limit of about 1500MB of kmem. It is possible that you are hitting a memory leak, but I would just increase kmem further and see if it persists. Looking at vmstat -m etc. may help to figure out if something is leaking over time.
Kris
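The suggestion above to watch vmstat -m over time could be sketched roughly as follows. The helper name watch_growth and the 10-minute interval are assumptions for illustration; the helper simply runs a command twice and prints the lines that changed, so malloc types whose usage keeps growing between snapshots stand out.

```shell
#!/bin/sh
# Hypothetical helper: run a command twice, INTERVAL seconds apart,
# and print only the output lines that differ between the two runs.
watch_growth() {
    interval=$1
    shift
    before=$(mktemp) || return 1
    after=$(mktemp) || { rm -f "$before"; return 1; }
    "$@" > "$before"
    sleep "$interval"
    "$@" > "$after"
    diff "$before" "$after"
    rm -f "$before" "$after"
}

# On the affected FreeBSD box one might run (interval is an assumption):
# watch_growth 600 vmstat -m
```

A single pair of snapshots only shows change, not a leak; repeating the comparison over hours or days would be needed before drawing conclusions.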
Re: 7.0 Crashing
Kostik Belousov wrote: On Mon, Jul 28, 2008 at 01:16:39PM -0400, Michael toth wrote: I had someone run a Dell Diags CD on the machine and it passed all tests. Before that it core'd again; here is the backtrace from that one. Is there any other way (maybe in FreeBSD) to test the hardware better? And/or should I submit a bug report for this?

Sure, you can submit a bug report. From my POV, there is actually no data to resolve it.

i.e. you should have no expectation that anyone else will be able to resolve it either. Note that diagnostics can tell you when a machine has failed, but can never tell you when a machine is working perfectly. Kris

You may use memtest86 (Google for it) for memory/chipset/cpu cache test.
Re: 7.0 Crashing
Kostik Belousov wrote: On Mon, Jul 28, 2008 at 09:52:45PM +0200, Kris Kennaway wrote: Kostik Belousov wrote: On Mon, Jul 28, 2008 at 01:16:39PM -0400, Michael toth wrote: I had someone run a Dell Diags CD on the machine and it passed all tests. Before that it core'd again; here is the backtrace from that one. Is there any other way (maybe in FreeBSD) to test the hardware better? And/or should I submit a bug report for this?

Sure, you can submit a bug report. From my POV, there is actually no data to resolve it.

i.e. you should have no expectation that anyone else will be able to resolve it either.

Why ?!

This was directed to Michael, sorry. Kris

Note that diagnostics can tell you when a machine has failed, but can never tell you when a machine is working perfectly. Kris

You may use memtest86 (Google for it) for memory/chipset/cpu cache test.
Re: 7.0 Crashing
Michael Toth wrote: Hi, I am running 7.0 stable on a Dell Power Edge 2950 and it is core dumping on me. Below is the dmesg and some core info I am hoping that someone can help me find out why it keeps core dumping on me. Thanks # cd /usr/obj/usr/src/sys/GENERIC/ # kgdb kernel.debug /var/crash/vmcore.5 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-marcel-freebsd... Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 4; apic id = 04 fault virtual address = 0x188 fault code = supervisor read, page not present instruction pointer = 0x20:0xc0775284 stack pointer = 0x28:0xe7d6bad0 frame pointer = 0x28:0xe7d6bae8 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 4838 (egrep) trap number = 12 panic: page fault cpuid = 4 Uptime: 1h2m48s Physical memory: 2035 MB Dumping 87 MB: 72 56 40 24 8 Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done. done. Loaded symbols for /boot/kernel/acpi.ko #0 doadump () at pcpu.h:195 195 __asm __volatile(movl %%fs:0,%0 : =r (td)); (kgdb)q # cat /var/run/dmesg.boot Copyright (c) 1992-2008 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. 
FreeBSD 7.0-STABLE #0: Sun Jul 27 08:58:11 EDT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC Timecounter i8254 frequency 1193182 Hz quality 0 CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2793.20-MHz 686-class CPU) Origin = GenuineIntel Id = 0xf48 Stepping = 8 Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE Features2=0x649dSSE3,RSVD2,MON,DS_CPL,EST,CNXT-ID,CX16,xTPR AMD Features=0x2010NX,LM AMD Features2=0x1LAHF Cores per package: 2 Logical CPUs per core: 2 real memory = 2147221504 (2047 MB) avail memory = 2091675648 (1994 MB) ACPI APIC Table: DELL PE BKC FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 cpu2 (AP): APIC ID: 2 cpu3 (AP): APIC ID: 3 cpu4 (AP): APIC ID: 4 cpu5 (AP): APIC ID: 5 cpu6 (AP): APIC ID: 6 cpu7 (AP): APIC ID: 7 ioapic0: Changing APIC ID to 8 ioapic1: Changing APIC ID to 9 ioapic2: Changing APIC ID to 10 ioapic3: Changing APIC ID to 11 ioapic0 Version 2.0 irqs 0-23 on motherboard ioapic1 Version 2.0 irqs 32-55 on motherboard ioapic2 Version 2.0 irqs 64-87 on motherboard ioapic3 Version 2.0 irqs 96-119 on motherboard kbd1 at kbdmux0 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) acpi0: DELL PE BKC on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter ACPI-fast frequency 3579545 Hz quality 1000 acpi_timer0: 24-bit timer at 3.579545MHz port 0x808-0x80b on acpi0 pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0 pci0: ACPI PCI bus on pcib0 pcib1: ACPI PCI-PCI bridge at device 2.0 on pci0 pci1: ACPI PCI bus on pcib1 pcib2: ACPI PCI-PCI bridge at device 0.0 on pci1 pci2: ACPI PCI bus on pcib2 amr0: LSILogic MegaRAID 1.53 mem 0xf81f-0xf81f,0xfe9c-0xfe9f irq 46 at device 14.0 on pci2 amr0: [ITHREAD] amr0: delete logical drives supported by controller amr0: LSILogic PERC 4e/Di Firmware 521S, BIOS H430, 256MB RAM pcib3: ACPI PCI-PCI bridge at device 0.2 on pci1 pci3: ACPI 
PCI bus on pcib3 pcib4: ACPI PCI-PCI bridge at device 4.0 on pci0 pci4: ACPI PCI bus on pcib4 pcib5: ACPI PCI-PCI bridge at device 5.0 on pci0 pci5: ACPI PCI bus on pcib5 pcib6: ACPI PCI-PCI bridge at device 0.0 on pci5 pci6: ACPI PCI bus on pcib6 em0: Intel(R) PRO/1000 Network Connection 6.9.5 port 0xecc0-0xecff mem 0xfe6e-0xfe6f irq 64 at device 7.0 on pci6 em0: [FILTER] em0: Ethernet address: 00:14:22:21:da:67 pcib7: ACPI PCI-PCI bridge at device 0.2 on pci5 pci7: ACPI PCI bus on pcib7 em1: Intel(R) PRO/1000 Network Connection 6.9.5 port 0xdcc0-0xdcff mem 0xfe4e-0xfe4f irq 65 at device 8.0 on pci7 em1: [FILTER] em1: Ethernet address: 00:14:22:21:da:68 pcib8: ACPI PCI-PCI bridge at device 6.0 on pci0 pci8: ACPI PCI bus on pcib8 pcib9: ACPI PCI-PCI bridge at device 0.0 on pci8 pci9: ACPI PCI bus on pcib9 amr1: LSILogic MegaRAID 1.53 mem 0xf80f-0xf80f irq 106 at device 4.0 on pci9 amr1: [ITHREAD] amr1: delete logical drives supported by controller amr1: LSILogic PERC 4/DC Firmware 351S,
Re: 7.0 Crashing
Michael Toth wrote:

Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/acpi.ko
#0 doadump () at pcpu.h:195
195 __asm __volatile(movl %%fs:0,%0 : =r (td));
(kgdb) backtrace
#0 doadump () at pcpu.h:195
#1 0xc0782597 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0xc0782859 in panic (fmt=Variable fmt is not available. ) at /usr/src/sys/kern/kern_shutdown.c:572
#3 0xc0a8b39c in trap_fatal (frame=0xe7d6ba90, eva=392) at /usr/src/sys/i386/i386/trap.c:899
#4 0xc0a8b620 in trap_pfault (frame=0xe7d6ba90, usermode=0, eva=392) at /usr/src/sys/i386/i386/trap.c:812
#5 0xc0a8bfcc in trap (frame=0xe7d6ba90) at /usr/src/sys/i386/i386/trap.c:490
#6 0xc0a71bdb in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7 0xc0775284 in _mtx_lock_sleep (m=0xc600d174, tid=3318745216, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:339
#8 0xc09a93d7 in vm_fault (map=0xc56b5570, vaddr=671809536, fault_type=2 '\002', fault_flags=8) at /usr/src/sys/vm/vm_fault.c:293
#9 0xc0a8b50b in trap_pfault (frame=0xe7d6bd38, usermode=1, eva=671813488) at /usr/src/sys/i386/i386/trap.c:789
#10 0xc0a8be57 in trap (frame=0xe7d6bd38) at /usr/src/sys/i386/i386/trap.c:357
#11 0xc0a71bdb in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#12 0x2806e607 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) q

Not much there, check for RAM/hardware problems. Kris
Re: zfs, raidz, spare and jbod
Claus Guttesen wrote: Hi. I installed FreeBSD 7 a few days ago and upgraded to the latest stable release using the GENERIC kernel. I also added these entries to /boot/loader.conf:

vm.kmem_size=1536M
vm.kmem_size_max=1536M
vfs.zfs.prefetch_disable=1

Initially prefetch was enabled and I would experience hangs, but after disabling prefetch, copying large amounts of data went along without problems. To see if FreeBSD 8 (current) had better (copy) performance I upgraded to current as of yesterday. After upgrading and rebooting the server responded fine. The server is a Supermicro with a quad-core Harpertown E5405, two internal SATA drives and 8 GB of RAM. I installed an Areca ARC-1680 SAS controller and configured it in JBOD mode. I attached an external SAS cabinet with 16 SAS disks of 1 TB (931 binary GB) each. I created a raidz2 pool with 10 disks and added one spare. I copied approx. 1 TB of small files (each approx. 1 MB) and during the copy I simulated a disk crash by pulling one of the disks out of the cabinet. ZFS did not activate the spare and the copying stopped until I rebooted after 5-10 minutes. When I performed a 'zpool status' the command would not complete. I did not see any messages in /var/log/messages. State in top showed 'ufs-'.

That means that it was UFS that hung, not ZFS. What was the process backtrace, and what role does UFS play on this system? Kris

A similar test on Solaris Express Developer Edition b79 activated the spare after ZFS tried to write to the missing disk enough times and then marked it as faulted. Has anyone else tried to simulate a disk crash in raidz(2) and succeeded?
Re: zfs, raidz, spare and jbod
Jeremy Chadwick wrote: On Fri, Jul 25, 2008 at 09:46:34AM +0200, Claus Guttesen wrote: Hi. I installed FreeBSD 7 a few days ago and upgraded to the latest stable release using GENERIC kernel. I also added these entries to /boot/loader.conf:

vm.kmem_size=1536M
vm.kmem_size_max=1536M
vfs.zfs.prefetch_disable=1

Initially prefetch was enabled and I would experience hangs but after disabling prefetch copying large amounts of data would go along without problems. To see if FreeBSD 8 (current) had better (copy) performance I upgraded to current as of yesterday. After upgrading and rebooting the server responded fine.

With regards to RELENG_7, I completely agree with disabling prefetch. The overall performance (of the system and disk I/O) appears significantly smoother, e.g. fewer hard lock-ups and stalls, when prefetch is disabled.

FYI I do not get lock-ups when running with prefetch. It is supposed to just affect performance, i.e. if you have few disks or they have low bandwidth or high seek times (e.g. crappy ATA) then it can saturate them and you will have poor response times. However if your hardware is more capable then it is a performance optimization. Someone needs to obtain the usual debugging information. Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
John Sullivan wrote: Removing KDB_UNATTENDED from your kernel will allow you to interact with the debugger and obtain backtraces etc, which is useful when dumps are not being saved. Easier said than done, this cause a few panics - no dumps though ...g!! Still the same result ... the system seems to panic twice then hang. I will keep trying unless you have some other ideas?? Right, after trying for a number of days the system still just hung without letting me get either a dump or to interactively debug in the failed state, I reverted back to the Generic kernel, removed half the memory (2 of the 4 1GB sticks) and the system became stable. I inserted 1 of the 2 removed sticks and all was fine. I swapped that stick with the remaining stick and all was fine. I put them both back in and I started to see the crashes again - the first of which, gave me this dump -- server251# kgdb /boot/kernel/kernel /var/crash/vmcore.1 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as amd64-marcel-freebsd. 
Unread portion of the kernel message buffer:
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address= 0xb0
fault code= supervisor read data, page not present
instruction pointer= 0x8:0x8068d4bd
stack pointer= 0x10:0xb20738e0
frame pointer= 0x10:0x0
code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process= 72836 (objdump)
trap number= 12
panic: page fault
cpuid = 1
Uptime: 28m4s
Physical memory: 4082 MB
Dumping 518 MB: 503 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7
#0 doadump () at pcpu.h:194
194 pcpu.h: No such file or directory. in pcpu.h
(kgdb) backtrace
#0 doadump () at pcpu.h:194
#1 0x0004 in ?? ()
#2 0x80477699 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3 0x80477a9d in panic (fmt=0x104 Address 0x104 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:563
#4 0x8072ed44 in trap_fatal (frame=0xff003c39c000, eva=18446742974629017808) at /usr/src/sys/amd64/amd64/trap.c:724
#5 0x8072f115 in trap_pfault (frame=0xb2073830, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
#6 0x8072fa58 in trap (frame=0xb2073830) at /usr/src/sys/amd64/amd64/trap.c:410
#7 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#8 0x8068d4bd in vm_page_cache_remove (m=0xff00da9ec3b8) at /usr/src/sys/vm/vm_page.c:896
#9 0x8068e1b5 in vm_page_alloc (object=0xff00374ffc30, pindex=14, req=64) at /usr/src/sys/vm/vm_page.c:1080
#10 0x8067fa77 in vm_fault (map=0xff0005f23d00, vaddr=34365804544, fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:432
#11 0x8072efaf in trap_pfault (frame=0xb2073c70, usermode=1) at /usr/src/sys/amd64/amd64/trap.c:618
#12 0x8072fbf8 in trap (frame=0xb2073c70) at /usr/src/sys/amd64/amd64/trap.c:309
#13 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#14 0x00080059c54f in ?? ()
Previous frame inner to this frame (corrupt stack?)

So to answer your question are the backtraces always the same, no, they are not. But I am still confused as to what this means?? I would appreciate any further insight anyone can give.

That's another corrupted backtrace that doesn't point to an actual software problem. Still sounds like bad RAM, or bad hardware. Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
Michael Grant wrote: I have been having what seems like similar panics. I too cannot manage to get a crash dump, neither classic style nor minidump. Nor can I get it to work with DDB, there seems to be a problem with DDB and my Geom mirror. They're not at all similar, please don't confuse the issue :) Kris
Re: sleeping without queue ?
Mikhail Teterin wrote: Hello! My attempt to build openoffice.org-3 seems to be hanging. Pressing Ctrl-T produces:

load: 0.11 cmd: tcsh 79759 [sleeping without queue] 0.00u 0.00s 0% 0k

(tcsh is used by OOo's build-script). What is this "sleeping without queue" state, and why is the process in it for so long? This is a 4-CPU amd64 system with 4 GB of RAM. Only 16% of the swap is currently in use and the box seems to be perfectly fine otherwise. Uptime is 55 days, two different X-sessions are functional... The kernel is FreeBSD 7.0-STABLE #0: Sat Mar 8 16:02:37. Thanks!

What is the process backtrace? Kris
Re: sleeping without queue ?
Jeremy Chadwick wrote: On Tue, Jul 22, 2008 at 12:13:25PM -0400, Mikhail Teterin wrote: Kris Kennaway wrote: Mikhail Teterin wrote: Hello! My attempt to build openoffice.org-3 seems to be hanging. Pressing Ctrl-T produces:

load: 0.11 cmd: tcsh 79759 [sleeping without queue] 0.00u 0.00s 0% 0k

(tcsh is used by OOo's build-script). What is this "sleeping without queue" state, and why is the process in it for so long? This is a 4-CPU amd64 system with 4 GB of RAM. Only 16% of the swap is currently in use and the box seems to be perfectly fine otherwise. Uptime is 55 days, two different X-sessions are functional... The kernel is FreeBSD 7.0-STABLE #0: Sat Mar 8 16:02:37. Thanks!

What is the process backtrace?

Hard to say... The process ID is 79759. According to ps(1), that PID exists:

79759 p6 DE+ 0:00,00 /bin/tcsh -fc /meow/ports/editors/openoffice.org-3/work/BEB300_m3/solver/300/unxfbsdx.pro/bin/makedepend @/tmp/mk2WUYYi ../../../unxfbsdx.pro/misc/s_addincol.dpcc

According to gdb, it does not:

gdb 79759
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...79759: No such file or directory.

Syntax appears wrong; gdb [program] 79759 would be what you want.

Well, I mean kernel backtrace. Kris
Re: sleeping without queue ?
Mikhail Teterin wrote: Kris Kennaway wrote: Well, I mean kernel backtrace. Can I obtain that remotely and without restarting/panicking the box? Thanks, -mi Use kgdb on /dev/mem, or procstat. Kris
Re: 6.3-RELEASE-p3 recurring panics on multiple SM PDSMi+
Royce Williams wrote: db trace Tracing pid 71182 tid 100325 td 0xcc08b180 kdb_enter(c095f294) at kdb_enter+0x2b panic(c09768ad,1000,1400,c145bc88,1000,...) at panic+0x127 kmem_malloc(c14680c0,1000,102,eba6a8cc,c07e3fa5,...) at kmem_malloc+0x89 You forgot to include the panic, but this is probably the kmem_map too small panic. It says that your kernel ran out of memory, and the solution is to fix that situation by giving more memory to the kernel. Increase the vm.kmem_size tunable until your system stops running out of memory on your workload. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: 6.3-RELEASE-p3 recurring panics on multiple SM PDSMi+
Royce Williams wrote: Kris Kennaway wrote, on 7/22/2008 12:12 PM: Royce Williams wrote:
db> trace
Tracing pid 71182 tid 100325 td 0xcc08b180
kdb_enter(c095f294) at kdb_enter+0x2b
panic(c09768ad,1000,1400,c145bc88,1000,...) at panic+0x127
kmem_malloc(c14680c0,1000,102,eba6a8cc,c07e3fa5,...) at kmem_malloc+0x89
You forgot to include the panic
Is there a way to get the panic after dropping into the debugger? This serial console setup has no scrollback, so I couldn't see the preceding text.
You can either "show msgbuf", or "x/x panicstr" and then "x/s 0x<value>", where <value> is the hex value from the previous step. The latter only displays the format string and not the arguments, but on i386 you can read them off from the panic() line in the stack trace. Actually on i386 the panicstr is the first argument (0xc09768ad here).
but this is probably the kmem_map too small panic.
Ah, I see this now, at faq/book.html#KMEM-MAP-TOO-SMALL: "Compile your own kernel, and add the VM_KMEM_SIZE_MAX to your kernel configuration file, increasing the maximum size to 400 MB (options VM_KMEM_SIZE_MAX=419430400). 400 MB appears to be sufficient for machines with up to 6 GB of memory."
It says that your kernel ran out of memory, and the solution is to fix that situation by giving more memory to the kernel. Increase the vm.kmem_size tunable until your system stops running out of memory on your workload.
Comparing the FAQ, kern_malloc.c and your mentioning it as a tunable, please clarify: is the Right Thing to do to use vm.kmem_size, or vm.kmem_size_max?
kmem_size_max is used for automatic tuning based on RAM size. To increase the actual value explicitly you just need to tune vm.kmem_size.
I tried vm.kmem_size_max, which yields:
# sysctl -a | grep kmem
vm.kmem_size: 419430400
vm.kmem_size_max: 419430400
vm.kmem_size_scale: 3
Should I contribute some additional language to the FAQ, saying that the vm.kmem_size[_max] tunable can be used without recompiling the kernel?
Yes, that would be fantastic! I would also note that the loader tunable is usually more convenient since it doesn't require a kernel recompile, and probably reword the claim about 400MB: the memory needed depends very much on the workload you are giving your kernel, so the best advice is to increase the value until you determine empirically how much you need (i.e. the memory exhaustion stops). You can also use 400M notation for loader tunables, which is often more convenient. Thanks, Kris
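The equivalence between the 400M notation and the 419430400 figure quoted from the FAQ can be checked with a little arithmetic: 400 * 1024 * 1024 = 419430400. The following is a minimal C sketch of that conversion. The function name size_to_bytes is hypothetical, and the sketch assumes that K, M and G mean powers of 1024; it is an illustration only, not the loader's own parser.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the FreeBSD loader's parser): convert a size
 * string such as "400M" to a byte count, treating K, M and G as powers
 * of 1024. Returns 0 for a malformed string.
 */
static uint64_t
size_to_bytes(const char *s)
{
	char *end;
	uint64_t mult;
	unsigned long long n = strtoull(s, &end, 10);

	if (end == s)
		return 0;		/* no digits at all */

	switch (*end) {
	case '\0':
		return n;		/* plain byte count, no suffix */
	case 'K': case 'k':
		mult = 1024ULL;
		break;
	case 'M': case 'm':
		mult = 1024ULL * 1024;
		break;
	case 'G': case 'g':
		mult = 1024ULL * 1024 * 1024;
		break;
	default:
		return 0;		/* unknown suffix */
	}
	if (end[1] != '\0')
		return 0;		/* trailing junk after the suffix */
	return n * mult;
}
```

Under these assumptions, "400M" and "419430400" describe the same size, which matches the vm.kmem_size value shown in the sysctl output above.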
Re: Portsclean doesnt like my upgrade from 6.3 7.0
David Southwell wrote: It looks as though I have missed something!! FreeBSD dns1.vizion2000.net 7.0-STABLE FreeBSD 7.0-STABLE #0: Wed Jul 16 09:27:38 PDT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC amd64 [EMAIL PROTECTED] ~]# portsclean FFaattaall eeoorr ''Thread is not system scope. Thread is not system scope. '' aatt lliinnee 331199 iinn ffiillee /usr/src/lib/libpthread/thread/thr_sig.c/usr/src/lib/libpthread/thread/thr_sig.c ((eennoo == 22)) Segmentation fault: 11 (core dumped) Ok where do I go from here?? Find out which port(s) you didn't recompile as part of the upgrade (e.g. check mtime in /usr/local), and do that now. You may also need to recompile the ports that depend on them to undo the damage. Kris
Re: Portsclean doesnt like my upgrade from 6.3 7.0
David Southwell wrote: On Thursday 17 July 2008 06:39:26 Kris Kennaway wrote: David Southwell wrote: It looks as though I have missed something!! FreeBSD dns1.vizion2000.net 7.0-STABLE FreeBSD 7.0-STABLE #0: Wed Jul 16 09:27:38 PDT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC amd64 [EMAIL PROTECTED] ~]# portsclean FFaattaall eeoorr ''Thread is not system scope. Thread is not system scope. '' aatt lliinnee 331199 iinn ffiillee /usr/src/lib/libpthread/thread/thr_sig.c/usr/src/lib/libpthread/thread/thr_sig.c ((eennoo == 22)) Segmentation fault: 11 (core dumped) Ok where do I go from here?? Find out which port(s) you didn't recompile as part of the upgrade (e.g. check mtime in /usr/local), and do that now. You may also need to recompile the ports that depend on them to undo the damage. Kris Thanks Kris, I have been unable to find instructions in the manual about recompiling ports as part of a system upgrade process. There seems to be no reference to it. The upgrade from 6.1 to 6.3 seemed to work OK once I sorted out a problem with perl. However 6.3 to 7.0 seems to produce more difficulties than I bargained for!!! It was clearly mentioned in the release announcement :) How can I best reconfigure and recompile all the installed ports? As you can see from below: [EMAIL PROTECTED] ~]# portupgrade -a Fatal error 'Thread is not system scope. ' at line 319 in file /usr/src/lib/libpthread/thread/thr_sig.c (errno = 2) Segmentation fault: 11 (core dumped) Try pkg_deleting portupgrade and ruby* and reinstalling them, then proceed with portupgrade -fa (note -f!) or -faP. Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
John Sullivan wrote: John, a question, how is swap set up on your system? I was swapping to a file (a memory disk device /dev/md0). I was doing this because for some reason lost in ancient history, this machine was not set up with a real swap partition. Hence, no crash dump. Swap is a partition on the 1st disk. Last night I repartitioned a second disk, set up a real swap partition and now I'm currently waiting for this to happen again so I can get a crash dump. I will try creating a swap partition on my second drive to see if that improves things ... I am able to cause a panic on demand but a crash dump is rarely written (presumably because the system believes the device is not accessible?). I must have crashed it 10-20 times now with various corruptions of the panic screen - once it had blue text with trap 12 trap 12 all over the screen, I liked that one ;-). I did manage to complete a make index while the background FSCK was running, once it had finished, performing the same task caused a panic locking the machine up again with no crash dump. OK, the first thing to do is disable bg fsck, then force a full fsck of all filesystems. bg fsck does a poor job of fixing arbitrary filesystem corruption (it's not designed to do so, in fact), and you can get into a situation where corrupted filesystems cause further panics. Removing KDB_UNATTENDED from your kernel will allow you to interact with the debugger and obtain backtraces etc, which is useful when dumps are not being saved. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
Michael Grant wrote: On Wed, Jul 16, 2008 at 10:38 AM, John Sullivan [EMAIL PROTECTED] wrote: Could be memory, but I'd also suggest looking at temperatures. I've had overheating systems produce lots of such errors. Temperature is fine - it never gets that hot here in the UK ;-) Seriously, I put my hand in the box and touched a few heat sinks; it is not running hot enough to cause a problem. The BIOS reports that all is well, with the temperature inside the box just over 30 degrees C. John This looks like the same panic I reported yesterday but I'm running 6.3 patch 2. Unless you have information you haven't yet shared, no it doesn't :) Fatal trap 12 is an effect, not a cause. We still need your backtrace to make progress understanding the cause of your panic. Kris
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
John Sullivan wrote: Can the system in question run memtest86+ successfully (no errors) for an hour? It would help diminish (but not entirely rule out) hardware (memory or chipset) issues. Sorry, forgot to mention, I ran memtest over night without any problem reported. I ran Fedora 9 for a month without any issue - FreeBSD 7.0 crashes within an hour. Well, that doesn't rule out hardware failure. Different OSes may use different capabilities of the hardware, or just use it in a different way, and that can provoke failures from marginal hardware. Please collect kgdb/ddb backtraces. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Multi-machine mirroring choices
Jeremy Chadwick wrote: Compared to UFS2 snapshots (e.g. dump -L or mksnap_ffs), ZFS snapshots are fantastic. The two main positives for me were: 1) ZFS snapshots take significantly less time to create; I'm talking seconds or minutes vs. 30-45 minutes. I also remember receiving mail from someone (on -hackers? I can't remember -- let me know and I can dig through my mail archives for the specific mail/details) stating something along the lines of "over time, yes, UFS2 snapshots take longer and longer, it's a known design problem". 2) ZFS snapshots, when created, do not cause the system to more or less deadlock until the snapshot is generated; you can continue to use the system during the time the snapshot is being generated. While with UFS2, dump -L and mksnap_ffs will surely disappoint you. "A known design problem" in the sense of "intentional", yes. They were written to support bg fsck, not as a lightweight filesystem feature for general use. Kris
Re: Multi-machine mirroring choices
Oliver Fromme wrote: Yet another way would be to use DragonFly's Hammer file system, which is part of DragonFly BSD 2.0, which will be released in a few days. It supports remote mirroring, i.e. mirror source and mirror target can run on different machines. Of course it is still very new and experimental (however, ZFS is marked experimental, too), so you probably don't want to use it on critical production machines. Let's not get carried away here :) Kris
Re: Multi-machine mirroring choices
Wesley Shields wrote: On Tue, Jul 15, 2008 at 07:54:26AM -0700, Jeremy Chadwick wrote: One of the annoyances to ZFS snapshots, however, was that I had to write my own script to do snapshot rotations (think incremental dump(8) but using ZFS snapshots). There is a PR[1] to get something like this in the ports tree. I have no idea how good it is but I hope to get it in the tree soon. http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/125340 There is also sysutils/freebsd-snapshot (pkg-descr is out of date, it supports ZFS too). I found it more convenient to just write my own tiny script. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
[EMAIL PROTECTED] wrote:
(kgdb) backtrace
#0 doadump () at pcpu.h:194
#1 0xff0004742440 in ?? ()
#2 0x80477699 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3 0x80477a9d in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:563
#4 0x8072ed44 in trap_fatal (frame=0xff00048ee000, eva=18446742974275512528) at /usr/src/sys/amd64/amd64/trap.c:724
#5 0x8072f115 in trap_pfault (frame=0xb1d7f4e0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
#6 0x8072fa58 in trap (frame=0xb1d7f4e0) at /usr/src/sys/amd64/amd64/trap.c:410
#7 0x807156be in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#8 0x0064 in ?? ()
#9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835
OK, that is
    if (zone->uz_ctor != NULL) {
        if (zone->uz_ctor(item, zone->uz_keg->uk_size,
uz_ctor is indeed not null, but it's got 3 bits set. Not impossible that it's bad RAM still. I didn't spot anything that could cause it otherwise but I don't know this code in detail. Do all of the panics have the same backtrace? Kris
#10 0x80661ecf in ffs_vget (mp=0xff00047f4978, ino=47884512, flags=2, vpp=0xb1d7f728) at uma.h:277
#11 0x8066d010 in ufs_lookup (ap=0xb1d7f780) at /usr/src/sys/ufs/ufs/ufs_lookup.c:573
#12 0x804dfa89 in vfs_cache_lookup (ap=Variable "ap" is not available.) at vnode_if.h:83
#13 0x8077235f in VOP_LOOKUP_APV (vop=0x809e7de0, a=0xb1d7f840) at vnode_if.c:99
#14 0x804e6394 in lookup (ndp=0xb1d7f950) at vnode_if.h:57
#15 0x804e7228 in namei (ndp=0xb1d7f950) at /usr/src/sys/kern/vfs_lookup.c:219
#16 0x804f4717 in kern_stat (td=0xff00048ee000, path=0x8006f7040 <Address 0x8006f7040 out of bounds>, pathseg=Variable "pathseg" is not available.) at /usr/src/sys/kern/vfs_syscalls.c:2109
#17 0x804f4987 in stat (td=Variable "td" is not available.) at /usr/src/sys/kern/vfs_syscalls.c:2093
#18 0x8072f397 in syscall (frame=0xb1d7fc70) at /usr/src/sys/amd64/amd64/trap.c:852
#19 0x807158cb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#20 0x0043127c in ?? ()
Previous frame inner to this frame (corrupt stack?)
I really don't understand this - any advice you can give would really be appreciated. John
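The remark that the bad uz_ctor pointer has "3 bits set" can be checked by hand: 0x64 is 1100100 in binary. A minimal C sketch of that check follows. It assumes the corrupted value is the 0x64 that shows up in frame #8 (the thread does not print the member itself), and the helper names bits_set and bits_differ are hypothetical. A pointer that should be NULL and differs from it in only a few bit positions is consistent with, though not proof of, memory corruption.

```c
#include <assert.h>
#include <stdint.h>

/* Count the set bits in v by clearing the lowest set bit each pass. */
static unsigned
bits_set(uint64_t v)
{
	unsigned n = 0;

	while (v != 0) {
		v &= v - 1;	/* drops the lowest set bit */
		n++;
	}
	return n;
}

/* Number of bit positions in which the observed value differs
 * from the expected one. */
static unsigned
bits_differ(uint64_t expected, uint64_t observed)
{
	return bits_set(expected ^ observed);
}
```

With an expected value of 0 (a NULL constructor) and an observed value of 0x64, bits_differ reports three differing bits.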
Re: Fresh 7.0 Install: Fatal Trap 12 panic when put under load
[EMAIL PROTECTED] wrote:
#9 0x8067d3ee in uma_zalloc_arg (zone=0xff00bfed07e0, udata=0x0, flags=-256) at /usr/src/sys/vm/uma_core.c:1835
From the frame #9, please do p *zone. I am esp. interested in the value of the uz_ctor member. It seems that it becomes corrupted; its value should be 0, as this seems to be the ffs inode zone. I suspect that gdb would show 0x64 instead.
I am afraid that you may need to spell out each step for me :-(
(kgdb) p *zone
No symbol "zone" in current context.
(kgdb) list *0x8067d3ee
0x8067d3ee is in uma_zalloc_arg (/usr/src/sys/vm/uma_core.c:1835).
1830 ("uma_zalloc: Bucket pointer mangled."));
1831 cache->uc_allocs++;
1832 critical_exit();
1833 #ifdef INVARIANTS
1834 ZONE_LOCK(zone);
1835 uma_dbg_alloc(zone, NULL, item);
1836 ZONE_UNLOCK(zone);
1837 #endif
1838 if (zone->uz_ctor != NULL) {
1839 if (zone->uz_ctor(item, zone->uz_keg->uk_size,
Is this what you were looking for?
Are you sure that is the same source tree you are running? The 7.0-RELEASE source has the zone->uz_ctor on line 1835, which is consistent with your backtrace. Kris
Re: Multi-machine mirroring choices
Jeremy Chadwick wrote: On Tue, Jul 15, 2008 at 11:47:57AM -0400, Sven Willenberger wrote: On Tue, 2008-07-15 at 07:54 -0700, Jeremy Chadwick wrote: ZFS's send/recv capability (over a network) is something I didn't have time to experiment with, but it looked *very* promising. The method is documented in the manpage as Example 12, and is very simple -- as it should be. You don't have to use SSH either, by the way[1]. The examples do list ssh as the way of initiating the receiving end; I am curious as to what the alternative would be (short of installing openssh-portable and using cipher=no). rsh or netcat come to mind. I haven't tried using either though. I wouldn't recommend either for the obvious reasons: weak or no authentication and integrity protection. Even if the former is not a concern for some reason, the latter should be (your data stream could be corrupted in transit and you'd never know until you tried to verify or restore the backup). Kris
Re: trim src/UPDATING in RELENG_7?
Ronald Klop wrote: Hi, The file src/UPDATING in RELENG_7 goes back until 2004 (the RELENG_5 branchpoint) and is now almost 1000 lines long. Is it an idea to trim this file a bit? And to update this sentence: 'To upgrade in-place from 5.x-stable to current'? And footnote [5] seems a bit dated also. 'if you last updated from current before 20020224 or from -stable before 20020408.' Or is upgrading from 5.x to 7 also supported? I think so, yes. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: disk questions: geom and zfs
[EMAIL PROTECTED] wrote: hail, I have a 7-stable:
[EMAIL PROTECTED] /usr/home/matheus]$ uname -a
FreeBSD xxx 7.0-STABLE FreeBSD 7.0-STABLE #2: Sun Jul 6 15:03:26 BRT 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/xxx_7 i386
and there exist three geom things.
gconcat status
Name            Status    Components
concat/concat0  UP        ad4 ad5
gmirror status
Name            Status    Components
mirror/mirror0  COMPLETE  ad8s1 ad10s1
gstripe status
Name            Status    Components
stripe/stripe0  UP        ad8s2 ad10s2
and a small (100GB) zfs pool. The thing is, if I take all these disks to a 6.3R-p2 system, will I get in trouble? What if this 6.3R becomes 7-stable also, will this trouble disappear?
I think it should just ignore the parts it can't recognise. I have a vague memory that something about the metadata format changed in one of the geom providers (mirror/stripe/something) so there might be a problem there. Try to research whether that is the case. In general you'll be better off if you just run it on 7.0 of course. Kris
Re: disk questions: geom and zfs
[EMAIL PROTECTED] wrote: first of all thanks, and, so far no gmirror and I wouldn't be surprised if after this info other couldn't work as well. no problem cause I can upgrade, just have to plan that. But about the disk order ? will both geom_* and zfs work ? zpool works on top of GEOM, so it's all fine. e.g. you can build a zpool on top of a gmirror if you wanted to (although that example would probably be pointless) Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Makefile in FreeBSD 7.0 Stable
[EMAIL PROTECTED] wrote: I have a fresh install of FreeBSD 7.0 but it seems several system related modules are broken. One I would like to have a solution on urgently is the Makefile. I am not able to compile or install any program with it. The errors are too many no matter which application I try to install from the ports collection. I have not built a custom kernel yet. The behavior of Makefile worries me more than anything at the moment. pkg_add is minimally working itself. I am also noticing that quite a few applications in the ports collection are bad. Any idea for me to get started? You'll have to start by showing us exactly what you are doing, and what is going wrong. Kris
Re: Makefile in FreeBSD 7.0 Stable
[EMAIL PROTECTED] wrote: -- Original message -- From: Kris Kennaway [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: I have a fresh install of FreeBSD 7.0 but it seems several system related modules are broken. One I would like to have a solution on urgently is the Makefile. I am not able to compile or install any program with it. The errors are too many no matter which application I try to install from the ports collection. I have not built a custom kernel yet. The behavior of Makefile worries me more than anything at the moment. pkg_add is minimally working itself. I am also noticing that quite a few applications in the ports collection are bad. Any idea for me to get started? You'll have to start by showing us exactly what you are doing, and what is going wrong. Thanks ... say you have a package: cd /usr/ports/sysutils/usermin In this folder you have Makefile, distinfo, files, pkg-descr, pkg-message, and pkg-plist. Then issue the command while in this folder: perl Makefile and voila, you get: Semicolon seems to be missing at Makefile line 9 Bareword found where operator expected at Makefile line 11 .near /www [Missing operator before www?] syntax error at Makefile line 10 Execution of Makefile aborted due to compilation errors. And many more of these errors for each and every file/application I try to install. Why are you running that command? It's not how you build ports. Take a look at the handbook for directions. Kris
Re: tracking -stable in the enterprise
Andy Kosela wrote: On Jun 25, 2008, at 3:46 AM, Peter Wemm wrote: I think we still have FreeBSD-3.x machines in production. I know we have FreeBSD-4.3. 99.9% of security issues don't affect us. We have our own package system built on top of FreeBSD's pkg_add format and have the ability to push packages to machines. If circumstances warrant it, we can push a fix for something. It'll either push a new binary or be a source patch that is compiled directly on the machines in question. The machines run a custom software stack. More often we push fixes for driver or performance fixes or things like timezone updates. Ports infrastructure do not support such old FreeBSD versions, so how do you deal with that? Do you maintain your own CVS branches of selected packages and backports necessary security patches? I guess it demands considerable effort to compile the latest apache on FreeBSD 3.x or 4.x. It would be easy to maintain 4.x compatibility in Yahoo's package system. They probably only need a relatively small number of ports, and there is no need to stay in sync with changes to the ports infrastructure. Those changes are almost all completely gratuitous from the point of view of deploying packages within a site since they are changes to the *ports* infrastructure. The FreeBSD *package* infrastructure has changed almost not at all over time (but yahoo have their own package system anyway). To the extent that the vendor applications still support old versions, the model would be the same: vendor source + patches -- binary. You can do that with a system based on the ports collection from last century if you like :) I would guess that Yahoo actually forked the ports system long ago (in the 2.x days?) or never used it directly, and either port their changes directly or by taking patches from freebsd ports. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: FreeBSD 7-STABLE deadlock!
Lev Serebryakov wrote: Hello, John. You wrote on 23 June 2008, 18:47:33: On Monday 16 June 2008 07:21:15 am Lev Serebryakov wrote: Lev Serebryakov wrote: It seems to be an ATA/SATA or UFS2 problem: now I have the computer in a state where 4 iozone processes are hung in the Disk wait state, and I cannot cd to the filesystem which is being tested by iozone. But I can create processes, work on the system, etc., if I don't touch this filesystem. I can reproduce it by creating a gmirror on 5 disks (yes, not a very useful configuration, but I started from a non-base-system RAID5 and need to exclude it), an FS with 64Kb blocks, and 4 threads of iozone with a mixed workload (-i 8 -+p 70). All 5 disks are ICH9DO-based, SATA-II WD5000AAKS HDDs. Try getting the 'ps' output from ddb. Also, get a crash dump if you can. It was tracked down to a known deadlock in the buffer allocator when the buffer map is fragmented (thanks to [EMAIL PROTECTED]). The workaround is known: don't use FSes with 16Kb and 64Kb blocks on the same system at the same time. A 16/32 mixture works well :) Is there a PR filed with this bug? Having the specific information recorded will be very useful. Kris
Re: FreeBSD 7-STABLE deadlock!
Lev Serebryakov wrote: Hello, Kris. You wrote on 23 June 2008, 19:56:14: Is there a PR filed with this bug? Having the specific information recorded will be very useful. Kostik (kib@) says that I don't need to file a PR for this issue... OK, that is good enough for me :) Kris
Re: infinite loop when copying to ext2fs
Martin Cracauer wrote: Kris Kennaway wrote on Sat, Mar 01, 2008 at 10:22:26PM +0100: Jakub Siroky wrote: I've just confirmed the same situation on 6.2-RELEASE amd64/GENERIC. I did not notice it before because I started using ext2fs extensively some months ago. Regards, Jakub On Sat, 19 Jan 2008 16:44:34 +0100 Kris Kennaway [EMAIL PROTECTED] wrote: Kris Kennaway wrote: Jakub Siroky wrote: I have two large ext2fs partitions (368 and 313GB) to hold data shared between several OSes. While there were no problems on the 6-STABLE branch I was quite disappointed after the upgrade to 7-STABLE. Whenever I copy/write to an ext2fs partition the system freezes totally without a crashdump. So I added debugging settings to the kernel config (DEBUG,WITNESS,..) and on the console I reproduced the error situation, ending with a full screen of unstoppable running text with a lot of memory addresses and a few recognisable words: 'new block bit set for ext already' - again with no crashdump. Then I formatted a 1GB partition with ext2fs, and the problem on this small partition appears only sometimes. OK, I am able to reproduce this. Kris Is anyone able to look at this? I could not spot a candidate change that has not been merged to 6.x. Kris Sounds like it may have been broken by the change to ext2_bitops.h by cracauer. Can you confirm whether backing out 1.2.2.1 fixes it? I don't think my change can cause a new endless loop. I only reversed the order of tests to ensure we don't overrun a page boundary (into possibly unmapped space).
- while(*p == ~0U && ofs < sz) {
+ while(ofs < sz && *p == ~0U) {
It is, however, likely that the code was buggy in the first place. Linux has replaced all this (the allocation code). Also note that the code I fixed is amd64 only. If the endless loop appears on i386 it's something else. Martin It is amd64 only. I am able to reproduce using the method in the original mails, can you?
Kris
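Why the order of the two tests matters: C's && evaluates left to right and stops at the first false operand, so putting the bounds test first guarantees that the word is only read while the offset is still inside the bitmap. The following is a minimal sketch of a bounded bitmap scan written that way. It is not the FreeBSD ext2_bitops.h source; the function name first_zero_bit and the word-aligned-start restriction are assumptions made to keep the illustration short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only, not the ext2_bitops.h code. Returns the index of the
 * first zero bit at or after bit offset ofs in a bitmap of sz bits, or sz
 * if every bit in range is set. ofs must be a multiple of 32.
 */
static size_t
first_zero_bit(const uint32_t *map, size_t sz, size_t ofs)
{
	const uint32_t *p;
	size_t bit;

	assert(ofs % 32 == 0);	/* this sketch handles aligned starts only */
	p = map + ofs / 32;

	/* Bounds test first: *p is read only while ofs < sz holds. */
	while (ofs < sz && *p == UINT32_MAX) {
		ofs += 32;
		p++;
	}
	if (ofs >= sz)
		return sz;	/* ran off the end without finding a zero */

	/* The current word has a zero bit; return the lowest one in range. */
	for (bit = 0; bit < 32 && ofs + bit < sz; bit++) {
		if ((*p & (UINT32_C(1) << bit)) == 0)
			return ofs + bit;
	}
	return sz;
}
```

With the tests in the opposite order, a bitmap whose last word is all ones would cause one read past the end of the array before the bounds check fails. If the bitmap ends at a page boundary, that read can land in unmapped memory, which is the overrun described above.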
Re: infinite loop when copying to ext2fs
Martin Cracauer wrote: Kris Kennaway wrote on Mon, Jun 16, 2008 at 11:27:53AM +0200: It is amd64 only. I am able to reproduce using the method in the original mails, can you? Didn't try yet, but I did get a probably unrelated panic on ext2fs just last week :-) I'll fire it up this week. How big does the partition have to be to show the problem in this bug? Sorry, I don't remember. I probably tried it on a md that was a couple of GB. Kris
Re: ZFS version 8 on stable?
Timothy Wilson wrote: Hello everyone, I know -stable is supposed to be, well, stable, but I seem to be in a bit of a pickle. I'm trying to import my zfs file system from a Macintosh machine. It was created at version 8. Being new to zfs, I did not realise that FreeBSD would be running at a lower version; I thought zfs was zfs! Of course, trying to import my zpool fails, complaining that the version is too new. freebsd# zpool import bigstore cannot import 'bigstore': pool is formatted using a newer ZFS version So either I want to downgrade the zpool, or upgrade zfs on FreeBSD. Does anyone know if I'll be able to import zfs v8, or am I wasting my time? I'd prefer to follow -stable, but if I must follow -current, then golly goshkins, I'll have no choice! It is planned to update to a newer ZFS release in HEAD, but this has not happened yet (let alone in 7.x). I don't know if it is possible to downgrade a pool - you should check the ZFS documentation/support materials. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: [7-STABLE] ping -s 4000 with ipsec panic
Patrick Lamaizière wrote: generic_bcopy () at /usr/src/sys/i386/i386/support.s:498 #8 0xc1f7267e in ?? () #9 0x8fb82d87 in ?? () #10 0x361fe9de in ?? () #11 0x39402686 in ?? () #12 0x0fa0 in ?? () #13 0xc29cf380 in ?? () #14 0xc2ea9654 in ?? () #15 0x in ?? () #16 0xd61a095c in ?? () #17 0xc0700746 in crypto_invoke (cap=0x8, crp=0xd61a0950, hint=-1616994916) at cryptodev_if.h:53 Previous frame inner to this frame (corrupt stack?) (kgdb) Unfortunately the trace is bogus. Try to rebuild with -O instead of -O2 and reproduce the panic. Kris
Re: pkg_delete core dump when removing linux-tiff
Jona Joachim wrote: On Sun, Jun 08, 2008 at 03:57:55PM +0200, Kris Kennaway wrote: Jona Joachim wrote: Hi! pkg_delete core dumps on me when it tries to remove linux-tiff. I can reproduce this reliably. FWIW you can find the core dump here: http://www.hcl-club.lu/~jaj/stuff/pkg_delete.core You need to obtain the backtrace, see the developers handbook. I built pkg_delete with -g but gdb says 'no debugging symbols found'. Is the following information sufficient or do I need to rebuild everything with debugging information turned on? It was probably stripped at install, I think you can set STRIP= (i.e. empty value) but doesn't it also explain this in the handbook? Kris
Re: pkg_delete core dump when removing linux-tiff
Jona Joachim wrote: Hi! pkg_delete core dumps on me when it tries to remove linux-tiff. I can reproduce this reliably. FWIW you can find the core dump here: http://www.hcl-club.lu/~jaj/stuff/pkg_delete.core You need to obtain the backtrace, see the developers handbook. Kris
Re: challenge: end of life for 6.2 is premature with buggy 6.3
Chris Marlatt wrote: Kris Kennaway wrote: Jo Rhett wrote: On Jun 4, 2008, at 11:39 AM, Kris Kennaway wrote: Also, it's not like anyone should have been caught by surprise by the 6.2 EoL; the expiry date has been advertised since the 6.2 release itself. It has changed multiple times. I keep reviewing and finding 6.3 bugs outstanding, and then observe the EoL get pushed. I'm surprised that it failed to get pushed this time. I'm sorry that the FreeBSD project failed to conform to your expectations. However, I invite you to actually try 6.3 for yourself instead of assuming that it will fail. Kris In an effort to find a compromise between those who believe FreeBSD is EoL'ing previous releases too quickly and those who don't: have those in a position to set FreeBSD release schedules debated the option of setting a long term support release, a specific release picked by the team to be supported for, say, 4 or 5 years? Other projects have done this with relative success, and considering that the only work required for this release would be security patches, the work load should be minimal. Hopefully something like this could free up more time for the FreeBSD developers to continue their work on the newer release(s) while still answering the requests of what seems like quite a few of the legacy FreeBSD users. Thoughts? If this has already been discussed on-list I apologize for beating a dead horse, but I can't recall it being brought up before. Uh yeah, this has been in place for *years*. Have you actually read the support announcements? They are public ;) Kris
Re: challenge: end of life for 6.2 is premature with buggy 6.3
Chris Marlatt wrote: Kris Kennaway wrote: Chris Marlatt wrote: Kris Kennaway wrote: Jo Rhett wrote: On Jun 4, 2008, at 11:39 AM, Kris Kennaway wrote: Also, it's not like anyone should have been caught by surprise by the 6.2 EoL; the expiry date has been advertised since the 6.2 release itself. It has changed multiple times. I keep reviewing and finding 6.3 bugs outstanding, and then observe the EoL get pushed. I'm surprised that it failed to get pushed this time. I'm sorry that the FreeBSD project failed to conform to your expectations. However, I invite you to actually try 6.3 for yourself instead of assuming that it will fail. Kris In an effort to find a compromise between those who believe FreeBSD is EoL'ing previous releases too quickly and those who don't: have those in a position to set FreeBSD release schedules debated the option of setting a long term support release, a specific release picked by the team to be supported for, say, 4 or 5 years? Other projects have done this with relative success, and considering that the only work required for this release would be security patches, the work load should be minimal. Hopefully something like this could free up more time for the FreeBSD developers to continue their work on the newer release(s) while still answering the requests of what seems like quite a few of the legacy FreeBSD users. Thoughts? If this has already been discussed on-list I apologize for beating a dead horse, but I can't recall it being brought up before. Uh yeah, this has been in place for *years*. Have you actually read the support announcements? They are public ;) Kris I do actually - and when was the last release that was supported for such a duration of time... 4.11? Recently the longest I've seen has been 24 months, with others being only 12. Yes, and this is the FreeBSD definition of long term support. Don't like it? Do something about it.
Kris
Re: challenge: end of life for 6.2 is premature with buggy 6.3
Chris Marlatt wrote: Kris Kennaway wrote: Chris Marlatt wrote: Kris Kennaway wrote: Chris Marlatt wrote: Kris Kennaway wrote: Jo Rhett wrote: On Jun 4, 2008, at 11:39 AM, Kris Kennaway wrote: Also, it's not like anyone should have been caught by surprise by the 6.2 EoL; the expiry date has been advertised since the 6.2 release itself. It has changed multiple times. I keep reviewing and finding 6.3 bugs outstanding, and then observe the EoL get pushed. I'm surprised that it failed to get pushed this time. I'm sorry that the FreeBSD project failed to conform to your expectations. However, I invite you to actually try 6.3 for yourself instead of assuming that it will fail. Kris In an effort to find a compromise between those who believe FreeBSD is EoL'ing previous releases too quickly and those who don't: have those in a position to set FreeBSD release schedules debated the option of setting a long term support release, a specific release picked by the team to be supported for, say, 4 or 5 years? Other projects have done this with relative success, and considering that the only work required for this release would be security patches, the work load should be minimal. Hopefully something like this could free up more time for the FreeBSD developers to continue their work on the newer release(s) while still answering the requests of what seems like quite a few of the legacy FreeBSD users. Thoughts? If this has already been discussed on-list I apologize for beating a dead horse, but I can't recall it being brought up before. Uh yeah, this has been in place for *years*. Have you actually read the support announcements? They are public ;) Kris I do actually - and when was the last release that was supported for such a duration of time... 4.11? Recently the longest I've seen has been 24 months, with others being only 12. Yes, and this is the FreeBSD definition of long term support. Don't like it? Do something about it.
Kris You seem awfully hostile - do you really think that's the best way to represent the project you're involved with? Initially belittling someone for offering their opinion and then, when they reply, telling them to do it themselves or shut up? Try and have an open mind about these things. The option provided seems like a fairly good compromise to both interests. Pick 6.3 (or anything the release team wishes) to support for a longer period of time. Keep all other releases to 12 month support and continue doing what I believe is some fairly incredible work. I really don't see the downside to it. If anything it should reduce the work load for the team and let them focus on making considerable progress. Especially considering Ken Smith's recent post regarding future release schedules. IMHO, the attitude and opinion you have right now accomplishes nothing other than alienating your supporters. There has been nothing of value offered in this thread, and it's only served to piss off a number of developers who already put huge amounts of volunteer time into supporting FreeBSD, and who take pride in the quality of their work. Asking the volunteers to a) fix unspecified problems that the submitter will not name in detail but which are OMG SHOWSTOPPER YOU MUST FIX b) donate even more unpaid time to supporting branches because it seems like a good compromise (!) shows a complete failure of understanding and frankly beggars belief. Such people are not acting as supporters of the project, however well-intentioned they may believe themselves to be. Kris
Re: challenge: end of life for 6.2 is premature with buggy 6.3
Paul Schmehl wrote: --On Thursday, June 05, 2008 17:53:01 +0100 Tom Evans [EMAIL PROTECTED] wrote: I think that, especially with open source products, there is a large emphasis on testing in your own environments, and choosing the 'correct' version of a particular software package is important. For example, at $JOB, we had a lot of servers running 6.1 as it was an extended lifetime release, so no point jumping to 6.2, instead we waited for 6.3 to pass our integration testing. Not everyone has those kinds of resources. The domain I'm referring to is a hobby site, run by a husband and wife. They started with shared hosting and moved to a dedicated box when I volunteered to help with the backend work. For several years we ran one server hosting dns, imaps, smtps, mail lists and websites. Yes, it's not ideal, but when you have zero income you do what you can. Testing like you describe is out of the question. We now have the embarrassment of riches of two servers; one for web and the old one for the rest. The old box is still running 5.4 SECURITY. The new box is running 6.1. I'd *like* to upgrade both boxes, and the older box can go offline comfortably for several hours without anyone but me noticing. But if the web box goes down for 30 seconds, queries from the users start pouring in. Come now, even some of the biggest websites on the planet have scheduled downtime :) Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: challenge: end of life for 6.2 is premature with buggy 6.3
Doug Barton wrote: Jo Rhett wrote: Okay, I totally understand that FreeBSD wants people to upgrade from 6.2 to 6.3. It isn't that we want people to upgrade, it's that we are trying to be realistic regarding what we have the resources to support. But given that 6.3 is still experiencing bugs with things that are working fine and stable in 6.2, this is a pretty hard case to make. I admit to not having been following 6.x too closely, but are these things that have been reported, or problems you're having personally? This is also a fairly significant investment in terms of time and money for any business to handle this upgrade. Having an upgrade path is something every operation needs. Set it and forget it isn't a viable strategy in the current culture where 0-day vulnerabilities are becoming increasingly common. Also, it's not like anyone should have been caught by surprise by the 6.2 EoL; the expiry date has been advertised since the 6.2 release itself. Kris
Re: challenge: end of life for 6.2 is premature with buggy 6.3
Stephen Clark wrote: Scott Long wrote: Jo Rhett wrote: Okay, I totally understand that FreeBSD wants people to upgrade from 6.2 to 6.3. But given that 6.3 is still experiencing bugs with things that are working fine and stable in 6.2, this is a pretty hard case to make. Can you describe the bugs that are affecting you? This is also a fairly significant investment in terms of time and money for any business to handle this upgrade. I totally understand obsoleting 5.x now that 7.x is out. But 6.2 is barely a year old... The expectation is always that newer versions of a stable branch will have few regressions, and thus upgrading is a low risk. Scott Can just the kernel be upgraded, or does all of user space have to be upgraded too? Most things will work fine with slightly mismatched kernels, but it's not recommended to do this (some utilities may not work properly). How would someone recommend upgrading 500 hundred remote sites spread throughout Thoroughly test on identically configured machines, roll out incrementally and make sure you have a fallback (i.e. console access) in case something goes catastrophically wrong. Kris
Re: challenge: end of life for 6.2 is premature with buggy 6.3
Jo Rhett wrote: On Jun 4, 2008, at 11:39 AM, Kris Kennaway wrote: Also, it's not like anyone should have been caught by surprise by the 6.2 EoL; the expiry date has been advertised since the 6.2 release itself. It has changed multiple times. I keep reviewing and finding 6.3 bugs outstanding, and then observe the EoL get pushed. I'm surprised that it failed to get pushed this time. I'm sorry that the FreeBSD project failed to conform to your expectations. However, I invite you to actually try 6.3 for yourself instead of assuming that it will fail. Kris
Re: jail process limits
On Thu, May 22, 2008 at 03:26:13PM -0400, Vivek Khera wrote: While we're on the topic of jail resource limits, I think I'll ask my question again... I asked last month but got no response... I've got a jail server (FreeBSD 6.3/amd64) which runs a bunch of web site development environments. There is an apache or lighttpd running in each jail as user httpd (same UID on base system and each jail). On the jail host, I counted 231 processes owned by httpd. If I try to start an application server (or any process) as user httpd in one of the jails, it exits immediately with Cannot fork: Resource temporarily unavailable. Even if I su httpd I get the same error on any command I try to run such as ls. If I run the same on the jail host, it has no problems. The jail itself only has 34 processes running. On the jail host, the following is logged: Apr 22 16:34:38 staging kernel: maxproc limit exceeded by uid 80, please see tuning(7) and login.conf(5). Can anyone tell me where to look to find out what is limiting user httpd from creating new processes inside the jail, and what exactly that limit is? More importantly, how to increase it. I'd start by instrumenting the code path that leads to the above kernel printf, to try and differentiate any possible causes. Kris -- In God we Trust -- all others must submit an X.509 certificate. -- Charles Forsythe [EMAIL PROTECTED] ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
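The thread does not establish which limit is being hit. One candidate that can be checked from userland is the per-user process limit that login.conf(5) sets, which is what the kernel message points at. The sketch below is only a way to read that limit with getrlimit(2) as the affected user; the function name `print_nproc_limit` is made up for illustration, and it is an assumption, not a finding from the thread, that this limit is the culprit.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/resource.h>

/* Print one limit value, using "unlimited" for RLIM_INFINITY. */
static void
print_one(FILE *out, const char *label, rlim_t value)
{
	if (value == RLIM_INFINITY)
		fprintf(out, "%s: unlimited\n", label);
	else
		fprintf(out, "%s: %llu\n", label, (unsigned long long)value);
}

/*
 * Hypothetical helper: print the soft and hard per-user process limits
 * (RLIMIT_NPROC) of the calling process.
 * Returns 0 on success, -1 if getrlimit(2) fails.
 */
int
print_nproc_limit(FILE *out)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NPROC, &rl) != 0)
		return -1;
	print_one(out, "soft limit", rl.rlim_cur);
	print_one(out, "hard limit", rl.rlim_max);
	return 0;
}
```

Running this as httpd both on the host and inside a jail, and comparing the numbers with the 231 processes counted on the host, would show whether the per-user limit is even in the right range to explain the failed forks.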
Re: g_vfs_done error third part--PLEASE HELP!
Willy Offermans wrote: Hello Roland and FreeBSD friends, I'm sorry to have been so quiet for a while, but I went away for a vacation. Now that I'm back, I would like to solve this issue. On Mon, Apr 21, 2008 at 10:10:47PM +0200, Roland Smith wrote: On Mon, Apr 21, 2008 at 09:04:03PM +0200, Willy Offermans wrote: Dear FreeBSD friends, It is already the third time that I report this error. Can someone help me in solving this issue? Probably the reason that you hear so little is that you provide so little information. Most of us are not clairvoyant. Over and over again and always after heavy disk I/O I see the following errors in the log files. If I force ar0s1g to unmount the machine spontaneously reboots. Nothing seriously seems to be damaged by this act, but anyway I cannot afford something bad happening to this production machine. Why would you force an unmount? Otherwise the device keeps on reporting to be unavailable and cannot be unmounted: sun# umount /share/ umount: unmount of /share failed: Resource temporarily unavailable Apr 18 20:02:19 sun kernel: g_vfs_done():ar0s1g[WRITE(offset=290725068800, length=4096)]error = 5 I have no clue what the errors mean, since offsets of 290725068800, 290725072896, and 290725074944 seem to be ridiculous. Does anybody have a clue what is going on? For starters, how big is ar0s1g? If the offset is in bytes, it is around 270 GB, which is not that unusual in this day and age. I have to admit that I was a bit confused by an offset value of 290725068800. There is no indication of a unit, so I assumed that it was sectors, but probably it is simply bytes, and then indeed the number does make sense. I'm using FreeBSD 7.0, but found the error being reported before with previous versions of FreeBSD. I can and will provide more details on demand. What does 'df' say?
Filesystem   1K-blocks     Used     Avail Capacity  Mounted on
/dev/ar0s1a   20308398   230438  18453290     1%    /
devfs                1        1         0   100%    /dev
/dev/ar0s1d   21321454  3814482  15801256    19%    /usr
/dev/ar0s1e   50777034  5331686  41383186    11%    /var
/dev/ar0s1f  101554150 18813760  74616058    20%    /home
/dev/ar0s1g  274977824 34564876 218414724    14%    /share
pretty normal I would say. Did you notice any file corruption in the filesystem on ar0s1g? No the two disks are brand new and I did not encounter any noticeable file corruption. However I assume that nowadays bad sectors on HD are handled by the hardware and do not need any user interaction to correct this. But maybe I'm totally wrong. Unmount the filesystem and run fsck(8) on it. Does it report any errors?
sun# fsck /dev/ar0s1g
** /dev/ar0s1g
** Last Mounted on /share
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=34788357 (272 should be 264)
CORRECT? [yn] y
INCORRECT BLOCK COUNT I=34789217 (296 should be 288)
CORRECT? [yn] y
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [yn] y
SUMMARY INFORMATION BAD
SALVAGE? [yn] y
BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y
182863 files, 17282440 used, 120206472 free (12448 frags, 15024253 blocks, 0.0% fragmentation)
* FILE SYSTEM MARKED CLEAN *
* FILE SYSTEM WAS MODIFIED *
The usual stuff I would say. No, any form of filesystem corruption is not usual. Any hints are very much appreciated. Did you manage to create a partition larger than the disk is (using newfs's -s switch)? In that case it could be that you're trying to write past the end of the device.
No, look at the following output:
sun# bsdlabel -A /dev/ar0s1
# /dev/ar0s1:
type: unknown
disk: amnesiac
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 60799
sectors/unit: 976751937
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # milliseconds
track-to-track seek: 0 # milliseconds
drivedata: 0
8 partitions:
#        size     offset  fstype  [fsize bsize bps/cpg]
a:   41943040          0  4.2BSD       0     0     0
b:    8388608   41943040    swap
c:  976751937          0  unused       0     0        # raw part, don't edit
d:   44040192   50331648  4.2BSD    2048 16384 28552
e:  104857600   94371840  4.2BSD    2048 16384 28552
f:  209715200  199229440  4.2BSD    2048 16384 28552
g:  567807297  408944640  4.2BSD    2048 16384 28552
/dev/ar0s1g starts after 408944640*512/1024/1024=199680MB So I have to conclude that the write error message does make sense and that something seems to be wrong with the disks. The next question is what can I do about it? Should I return the disks to the shop and ask for new ones? #define EIO 5 /* Input/output error */ At least one of your disks is toast. Kris
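The unit conversions in this exchange are easy to get wrong by hand, so here is a small sketch of the arithmetic. The helper names are made up for illustration, the 512-byte sector size is taken from the bsdlabel output, and `write_fits` assumes the offset printed by g_vfs_done is in bytes counted from the start of the partition, which the thread does not confirm.

```c
#include <stdint.h>

#define SECTOR_BYTES UINT64_C(512)	/* bytes/sector from the bsdlabel output */

/* Convert a sector count to bytes. */
uint64_t
sectors_to_bytes(uint64_t sectors)
{
	return sectors * SECTOR_BYTES;
}

/* Convert a sector count to whole mebibytes (1 MiB = 2^20 bytes). */
uint64_t
sectors_to_mib(uint64_t sectors)
{
	return sectors_to_bytes(sectors) >> 20;
}

/*
 * Return 1 if a write of `len` bytes at byte offset `off` fits inside a
 * partition of `part_sectors` sectors, 0 otherwise. The offset is assumed
 * to be relative to the start of the partition.
 */
int
write_fits(uint64_t off, uint64_t len, uint64_t part_sectors)
{
	uint64_t size = sectors_to_bytes(part_sectors);

	return off <= size && len <= size - off;
}
```

`sectors_to_mib(408944640)` reproduces the 199680 MB figure quoted above, and `write_fits(290725068800, 4096, 567807297)` applies Roland's "past the end of the device" question to the g partition under the stated assumption.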
Re: Panic after hung rsync, probably zfs related
Xin LI wrote: Ben Stuyts wrote: | Hi, | | While doing an rsync from a zfs filesystem to an external usb hd (also | zfs), the rsync processes hung in zfs state. I could not kill these | processes, although the rest of the server seemingly continued to run | fine. The reboot command did not work. Next I tried a shutdown now | command. This caused a panic: Sounds like you somehow ran out of memory; there is a known issue with ZFS which causes livelock when there is memory pressure. Which rsync version are you using? With rsync 3.x the memory usage would drop drastically, which would help to prevent this from happening. This isn't known to cause a double fault though. Unfortunately we probably need more information than was available in order to proceed. Kris
Re: nfs-server silent data corruption
On Mon, Apr 21, 2008 at 01:02:33AM +0200, Arno J. Klaassen wrote: I didn't stress-test this MB for a while, but last time I did was with 7-PRELEASE/RC?/CANTremember-exactly-but-close-to-release and all worked great. I did add 2G ECC to the 2nd CPU since, though I doubt that interferes with NFS. Uh, you're getting server-side data corruption, it could definitely be because of the memory you added. Kris -- In God we Trust -- all others must submit an X.509 certificate. -- Charles Forsythe [EMAIL PROTECTED]
Re: 7.0 kernel crash: page fault while in kernel mode
Toni Schmidbauer wrote: hi, i'm running FreeBSD murus 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Mon Mar 3 20:53:07 CET 2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386 today this machine crashed, but lucky me i did get a crash dump. i'm not a kernel developer so any help would be great in finding out the reason for the crash. the machine is running as a firewall with various services (imap, smtp,dns ...). i'm also using if_bridge/vlans to filter traffic for internal clients. thanks for your time toni [EMAIL PROTECTED] /usr/obj/usr/src/sys/GENERIC {1027}# kgdb kernel.debug /var/crash/vmcore.0 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-marcel-freebsd. Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0xbc fault code = supervisor read, page not present instruction pointer = 0x20:0xc078075e stack pointer = 0x28:0xe5550ab4 frame pointer = 0x28:0xe5550ac4 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 950 (apcupsd) trap number = 12 panic: page fault cpuid = 0 Uptime: 48d14h37m48s Physical memory: 946 MB Dumping 202 MB: 187 171 155 139 123 107 91 75 59 43 27 11 #0 doadump () at pcpu.h:195 195 __asm __volatile(movl %%fs:0,%0 : =r (td)); (kgdb) bt #0 doadump () at pcpu.h:195 #1 0xc0754457 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409 #2 0xc0754719 in panic (fmt=Variable fmt is not available. 
) at /usr/src/sys/kern/kern_shutdown.c:563
#3 0xc0a4905c in trap_fatal (frame=0xe5550a74, eva=188) at /usr/src/sys/i386/i386/trap.c:899
#4 0xc0a492e0 in trap_pfault (frame=0xe5550a74, usermode=0, eva=188) at /usr/src/sys/i386/i386/trap.c:812
#5 0xc0a49c8c in trap (frame=0xe5550a74) at /usr/src/sys/i386/i386/trap.c:490
#6 0xc0a2fc0b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7 0xc078075e in rman_reserve_resource_bound (rm=0x0, start=0, end=3232598804, count=753, bound=0, flags=590486, dev=0xe5550b2c) at /usr/src/sys/kern/subr_rman.c:325
#8 0xc0788b72 in kern_select (td=0xc437a420, nd=5, fd_in=0xbfbfecc0, fd_ou=0x0, fd_ex=0x0, tvp=0xe5550c70) at /usr/src/sys/kern/sys_generic.c:845
#9 0xc07890de in select (td=0xc437a420, uap=0xe5550cfc) at /usr/src/sys/kern/sys_generic.c:663
#10 0xc0a49635 in syscall (frame=0xe5550d38) at /usr/src/sys/i386/i386/trap.c:1035
#11 0xc0a2fc70 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:196
#12 0x0033 in ?? ()
kern_select() does not call rman_reserve_resource_bound(), so either this trace is corrupt or you have a RAM error. Note that the IP for this function (0xc078075e) is a single bit flip from being very close to kern_select (0xc0788b72), which is what you would expect if kern_select tried to call an associated function in the same source file but the address was corrupted in RAM. Kris
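The "single bit flip" observation can be checked mechanically by toggling each bit of the faulting address in turn and seeing which candidate lands nearest to the kern_select frame address. This is only a sketch of that check: the helper names are made up for illustration, and it treats both values from the trace as plain 32-bit numbers.

```c
#include <stdint.h>

/* Absolute distance between two 32-bit addresses. */
uint32_t
addr_distance(uint32_t a, uint32_t b)
{
	return a > b ? a - b : b - a;
}

/*
 * Hypothetical helper: flip each of the 32 bits of `addr` in turn and
 * return the index of the bit whose flip lands closest to `target`.
 * The winning flipped address is stored in *flipped.
 */
int
closest_single_bit_flip(uint32_t addr, uint32_t target, uint32_t *flipped)
{
	int best = 0;
	uint32_t best_dist;

	/* Start with bit 0 as the provisional winner. */
	*flipped = addr ^ UINT32_C(1);
	best_dist = addr_distance(*flipped, target);

	for (int bit = 1; bit < 32; bit++) {
		uint32_t cand = addr ^ (UINT32_C(1) << bit);
		uint32_t dist = addr_distance(cand, target);

		if (dist < best_dist) {
			best_dist = dist;
			best = bit;
			*flipped = cand;
		}
	}
	return best;
}
```

Called as `closest_single_bit_flip(0xc078075e, 0xc0788b72, &f)`, it reports which bit would have to be wrong for the instruction pointer to sit next to kern_select, and `addr_distance(f, 0xc0788b72)` gives how close that corrected address is.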
Re: Digitally Signed Binaries w/ Kernel support, etc.
Peter Wemm wrote: On Fri, Apr 4, 2008 at 9:55 AM, Roland Smith [EMAIL PROTECTED] wrote: On Fri, Apr 04, 2008 at 10:58:40AM +0200, Ivan Voras wrote: Signing binaries could be naturally tied in with securelevel, where some securelevel (1?) would mean kernel no longer accepts new keys. If you set the system immutable flag on the binaries, you cannot modify them at all at securelevel 0. Signing the binaries would be pointless in that case. I think these are separate things. Modifying binaries is separate from introducing new binaries. SCHG would prevent the former, but not the latter. If you set the SCHG flag on the directories in $PATH, you can't put anything new there as well. There's nothing magical about $PATH. A person could put a malicious binary in /tmp or $HOME and run it with /tmp/crashme or whatever. Sure, you could set SCHG on every single writeable directory on the system to prevent any files being created. MNT_NOEXEC might be an option. The existence of script languages or even scriptable binaries does diminish the strength of a lockdown, but it depends on what you're trying to achieve. eg: If you're trying to prevent your users from downloading a self-built irc client or bot and running it, then yes, requiring signed binaries would be useful. In any case, there are legitimate uses for signed binaries. But I'm not volunteering to do it. csjp@ had a mac_chkexec module that looks like it was never committed. http://groups.google.com/group/mailing.freebsd.hackers/msg/074eec7def84c52b Shouldn't be hard to update it. Kris ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: NFS server on FreeBSD 6, client on FreeBSD 7 ?
Ken Chen wrote: Hi, The NFS server is running FreeBSD 6.0, and there are no problems with other NFS clients running FreeBSD 6. When a new client with FreeBSD 7 connects, the NFS server always says: Apr 2 03:52:01 rpc.lockd: clntudp_create: RPC: Program not registered Apr 2 03:52:01 rpc.lockd: Unable to return result to 192.168.4.248 The OS on 192.168.4.248 is FreeBSD 7, and it can successfully mount the directories on the NFS server. Is something wrong with rpc.lockd? The 'rpc.lockd' seems totally different in FreeBSD 6 & 7. How should I fix this problem? Thanks! It should be the same code in 6 and 7, modulo small changes. Sounds like it's not running properly on your 6 system. Kris
Re: s/stable/broken/g
Peter Much wrote: And the party continues... When starting my usual environment, there is already the next pagefault kernel panic! Tracking it down... it's the type keyword in devfs rules. According to the manpage, support for this was _not_ withdrawn. But actually, entering something like devfs rule apply type tape WHATEVER crashes the system. Software always has bugs, and it is a mistake to think that the stable designation does not mean has no bugs. It's unfortunate that you have hit a couple of them, but please continue to work through the process of documenting them. With regards to your ethernet problems, old cards like ed do not get much testing these days because few people use them. Combined with the fact that ethernet problems are often specific to certain hardware models or revisions, you may be the only person to have tried this particular case in many years. By the same token, these problems are difficult to fix without a developer having access to the same problem hardware. You might consider offering to ship it to an interested developer if one can be found. Kris
Re: s/stable/broken/g
Kris Kennaway wrote: Peter Much wrote: And the party continues... When starting my usual environment, there is already the next pagefault kernel panic! Tracking it down... it's the type keyword in devfs rules. According to the manpage, support for this was _not_ withdrawn. But actually, entering something like devfs rule apply type tape WHATEVER crashes the system. Software always has bugs, and it is a mistake to think that the stable designation does not mean has no bugs. One too many negatives in that sentence. Kris
Re: gcc -O2 error
Mikael Ikivesi wrote: On Sat, 22 Mar 2008 19:39:32 +0100 Kris Kennaway [EMAIL PROTECTED] wrote: So, did you consider perhaps following this advice? ;-) Kris Yes I did. The reason I also sent this to this list is that the make.conf manual says: CFLAGS (str) Controls the compiler setting when compiling C code. Optimization levels other than -O and -O2 are not supported. That means that -O2 should be supported in FreeBSD. And now it happens to produce bad code. The GCC people think that this should be fixed in gcc 4.3. I have not yet installed it and verified that. However, I tried the code on a Linux installation with gcc 4.1.2 and it was OK. I don't know whether gcc is maintained in the FreeBSD tree by patching or just by integrating the next snapshot from time to time, so I just thought I'd report it in case the maintainers of the FreeBSD version of gcc want to take a look at this. -Mikael The latter. When the gcc people fix it, if there is a patch that applies to 4.2 then we could import it. Kris
Re: 7.0 kernel panic on Intel SR2400
Ken Chen wrote: Hello, I upgraded from 6.2 to 7.0 on this Intel SR2400 server, and it panics after mounting storage when booted with the 7.0 kernel. I didn't leave enough space for a core dump, so I can't get anything more from the panic. Is there any way to gather enough information for a bug report? Configure DDB in the kernel and then obtain a backtrace from the panic, and either hand-transcribe or take a photo of it. See the Developers' Handbook for detailed instructions. Kris
Re: gcc -O2 error
Mikael Ikivesi wrote: Hi, I am running an up-to-date RELENG_7. It has gcc (GCC) 4.2.1 20070719 [FreeBSD]. I tried to track down segfaults in my code and accidentally found an optimization error. The code did not segfault when compiled without optimization but crashed when -O2 was used. While tracking it down, I could make gcc give me the following error by simply stripping a few lines: --- wrong.c: In function 'wrong': wrong.c:11: error: Attempt to delete prologue/epilogue insn: (insn/f 47 46 48 2 (set (mem:SI (plus:SI (reg/f:SI 6 bp) (const_int -8 [0xfff8])) [0 S4 A8]) (reg:SI 3 bx)) -1 (nil) (nil)) wrong.c:11: internal compiler error: in propagate_one_insn, at flow.c:1735 Please submit a full bug report, with preprocessed source if appropriate. See <URL:http://gcc.gnu.org/bugs.html> for instructions. So, did you consider perhaps following this advice? ;-) Kris
Re: machine wedged - KDB: enter: lock violation
Brad Pitney wrote: Not sure why it keeps wedging, at first I thought it was something to do with the LORs, now after adding some more debugging options I think I might have found the answer! KDB: stack backtrace: db_trace_self_wrapper(c074b5ee,e70599ac,c05b6853,c4a9e000,e70599ac,...) at db_trace_self_wrapper+0x26 kdb_backtrace(c4a9e000,e70599ac,c07025c5,e70599bc,c4c44d98,...) at kdb_backtrace+0x29 vfs_badlock(c4a37900,e70599bc,c07b00a0,c4c44d98,c4a9e000) at vfs_badlock+0x23 assert_vop_elocked(c4c44d98,c0752ee7,c4a9e000,1b9,0,...) at assert_vop_elocked+0x53 cache_lookup(c4c4815c,e7059bc0,e7059bd4,e7059bc0,c4aa4400,...) at cache_lookup+0x53c vfs_cache_lookup(e7059aa8,c07545ba,c4c4815c,2,c4c4815c,...) at vfs_cache_lookup+0xaa VOP_LOOKUP_APV(c4a37900,e7059aa8,c4a9e000,c075356a,19b,...) at VOP_LOOKUP_APV+0xe5 lookup(e7059bac,e7059ae8,c6,bf,c4aa542c,...) at lookup+0x53e namei(e7059bac,2,c0754d92,c0577808,c0811ae0,...) at namei+0x28e kern_stat(c4a9e000,2820258c,0,e7059c1c,c074d152,...) at kern_stat+0x3d stat(c4a9e000,e7059cfc,8,c074e1dc,c0785e00,...) at stat+0x2f syscall(e7059d38) at syscall+0x273 Xint0x80_syscall() at Xint0x80_syscall+0x20 --- syscall (188, FreeBSD ELF32, stat), eip = 0x281aa48f, esp = 0xbfbfea4c, ebp = 0xbfbfeae8 --- cache_lookup: 0xc4c44d98 is not exclusive locked but should be KDB: enter: lock violation Locked vnodes [...] Apparently 0xc4c44d98 is not locked at all; it didn't appear in your list. Are you sure that was all of it? What does 'show vnode 0xc4c44d98' report? This is likely to be a unionfs bug. Kris
Re: Upgrading to 7.0 - stupid requirements
Marko Lerota wrote: Kris Kennaway [EMAIL PROTECTED] writes: Then the servers. Why should I reinstall all my databases and such? I always liked that FreeBSD base (OS) is separated from packages. And no matter what I do with the packages, my OS will always work. I don't want dependency hell like in Linux. Now you are telling me that my database might not work after upgrade to a new version. Is that it? First, try to relax. Sorry, but I'm pissed off now, not relaxed any more. portupgrade -faP requests to reinstall everything from precompiled packages. It will only fall back to compiling them locally if the package is unavailable (e.g. for legal reasons). Two days have passed since I started portupgrade -faP, and it hasn't finished yet. To make it worse, I have to do it again because the PC had to be rebooted. So for the next 2-3 days I can sit with my PC and wait for it to finish the upgrade. It will be three days because of [EMAIL PROTECTED]@[EMAIL PROTECTED] Are you connected via a modem or something? 2-3 days to download some packages cannot be right if you have a decent internet connection. And I have to pray to God that I don't have a power loss. Now apache and acroread don't work any more and I'm afraid that I'll find some other stuff that doesn't work too. So can anyone tell me this is not stupid??? Reinstalling all applications because of an upgrade? This can be called a new installation, not an upgrade. Now I'm thinking that it would be much easier to back up my files, databases and other stuff and do a fresh installation. But why? So I can do the same thing when 8_0 comes out? This is the worst thing that I have found about FreeBSD so far. This has to be changed or fixed somehow, because the upgrade is not possible if you have lots of ports installed, and certainly can't be called an upgrade! I'm sorry you feel that way, but you haven't understood the explanations that you have been given already.
Kris
Re: Upgrading to 7.0 - stupid requirements
Kevin Oberman wrote: Or, is the system failing to retrieve the packages and failing over to building the ports? This would take a long time! I always tee the output of portupgrade to a file so, if it dies in the middle, it's pretty easy to pick up where it left off and not re-build everything twice. Yes, also I am pretty sure that if you rerun portupgrade -faP a second time it will reuse the cached packages it downloaded last time, if they are not out of date. Kris
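The tee habit Kevin describes can be sketched as a small sh function. This is only an illustration: run_logged and the log path in the usage comment are hypothetical names, not part of portupgrade, and the sketch assumes a POSIX shell with tee available.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of portupgrade): runs COMMAND, shows its
# output on the terminal, and appends stdout and stderr to LOGFILE.
# Note that the pipeline's exit status is tee's, not COMMAND's.
run_logged() {
    log="$1"
    shift
    "$@" 2>&1 | tee -a "$log"
}

# Example use (the log path is an assumption, pick any writable file):
#   run_logged /var/tmp/portupgrade.log portupgrade -faP
```

If the run dies part way through, the tail of the log shows which port was being handled at the time, which is the point of keeping the file.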
Re: 6.3 to 7.0 release out of memory allocating error
Ruben Lara wrote: Hi all! I'm trying to buildworld from 6.3 release to 7.0. All ok, but when the system is building world, I get a cc1: out of memory allocating 97582896 bytes error. I googled it and got this response in a forum: Check /etc/make.conf for CFLAGS, and if present remove it. I removed CFLAGS and the installation continued a few seconds more, but then: cc1: out of memory allocating 15133360 bytes I don't know what I can do. Set CFLAGS=-O (i.e. reduce optimization level, which has high memory requirements with gcc 4.x). Kris
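Before rerunning buildworld it may help to see exactly what CFLAGS is set to. A minimal sh sketch, assuming the setting lives in /etc/make.conf as mentioned above; show_cflags is a hypothetical helper name, not a FreeBSD tool.

```shell
# show_cflags [FILE]
# Hypothetical helper: prints every CFLAGS assignment in a make.conf-style
# file, with line numbers. FILE defaults to /etc/make.conf.
# grep exits non-zero when no CFLAGS line is present.
show_cflags() {
    grep -n '^[[:space:]]*CFLAGS' "${1:-/etc/make.conf}"
}

# Example use:
#   show_cflags
# Following the advice above, the line would then be edited to read:
#   CFLAGS=-O
```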
Re: Implicit declaration of built-in function upgrading to 7.0
Ruben Lara wrote: Hi! I solved the problem with memory, but now make buildworld crashes with the following error: /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:238: warning: incompatible implicit declaration of built-in function 'abort' /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc: In function '_Unwind_Resume_or_Rethrow': /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:263: warning: incompatible implicit declaration of built-in function 'abort' *** Error code 1 . Can anybody help me? Probably because you set CFLAGS= (no optimization). Note that no-one told you to do that :-) Kris
Re: swap_pager: indefinite wait buffer
Michael Grant wrote: On Wed, Mar 5, 2008 at 11:08 AM, Ruben van Staveren [EMAIL PROTECTED] wrote: On 5 Mar 2008, at 10:06, Michael Grant wrote: My server was just literally brought to its knees with this message spewing on the console: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1203133, size: 4096 (blkno and size were varying) Some searching says that this is or was a bug. Has this been fixed yet? If so, what should I upgrade to? I'm currently running 6.3 You may consider partition backed swap instead of file backed swap if that is the case. Hmm, I can't easily do that, I didn't leave any empty partitions around as I never considered swapping to a file to be so bad. Is swapping to a file so bad under normal conditions? The message indicates that it took 30 seconds to complete an operation, so it was timed out assuming the I/O was lost by the device. In your case it was probably not lost, just delayed for more than 30 seconds by an overloaded filesystem. Does this mean that this bug is still not fixed in 7.0? It's not clear whether it's a bug or your disk is just too overloaded to complete the filesystem operation in a reasonable time period (swapping to a file is slower than swapping to a partition, which is already something you never want to do in normal operation). You can increase the timeout by editing the kernel. Kris
Re: What's new on the 127.0.0/24 block in 7?
Chris H. wrote: Greetings, I'm having some difficulty working with anything past 127.0.0.1. It seems impossible to use (create) any addresses on the loopback past 127.0.0.1. What evidence do you have for this? Show your ifconfig commands, etc. I use 127/8 addresses all the time without problems. Kris
Re: jerky mouse still in 7.0-RELEASE
Eric L. Chen wrote: Hi Kris, I have this problem, too. If moused is enabled and I use /dev/sysmouse in xorg.conf, X11 freezes when the mouse is not moving. If moused is disabled and I use /dev/psm0 in xorg.conf, everything works fine. I am running 7-STABLE/i386. OK, please start a new thread so we don't confuse the issue further. Kris
Re: FreeBSD 7.9-stable: weird messages in /var/log/messages?
Torfinn Ingolfsen wrote: Hello, on one of my stable machines I see these messages in /var/log/messages: Mar 3 18:37:41 kg-i82 kernel: 16.011e9e3975b3aa06 too long Mar 3 21:41:42 kg-i82 kernel: 16.016a24cf0742715c too long Mar 3 21:41:58 kg-i82 kernel: 15.feb784aee196608c too short Does anyone know what the messages mean, or which part of the kernel they are from? Googling didn't help me. It is reporting large variations in the rate of your time clock (see kern_tc.c). Also, you appear to be emailing from the distant future. Please reply with stock tips :) Kris
Re: portupgrade, recommended by 7 release notes, breaks perl
Steven Hartland wrote: Would fix this particular package but again: how many others do this? Maybe this is something that BSDPAN could / should override? It might be possible, you should talk to the BSDPAN maintainer. Kris
Re: Very large kernel
Alex de Kruijff wrote: I noticed that the kernel directory was very large compared to 6.1. Is this for debugging, and can I safely remove the symbol files if I want to save some space? Yes, but if you encounter a panic and need to submit a bug report then you will need at least the kernel.debug and whatever modules you are using. Kris