GEOM problems again...
Hi,

I've had problems before with GEOM mirror and my SATA drives, and I've posted about it here before too. The solution seemed to be a change of motherboard, which greatly reduced the number of crashes (and also greatly improved the speeds achieved, from 10-15MB/s to 40-50MB/s). After the change I still had one or two crashes, but after that it ran for well over 50-60 days without any problems.

Then, 11 days ago, I upgraded to 6.1, and now I get these crashes again (that is, the mirror crashes; the system still runs fine):

May 21 02:04:58 elfi kernel: ad6: FAILURE - device detached
May 21 02:04:58 elfi kernel: subdisk6: detached
May 21 02:04:58 elfi kernel: ad6: detached
May 21 02:04:58 elfi kernel: GEOM_MIRROR: Device gm0s1: provider ad6s1 disconnected.
May 21 02:04:58 elfi kernel: g_vfs_done():mirror/gm0s1f[READ (offset=11006308352, length=2048)]error = 6
May 21 02:04:58 elfi kernel: g_vfs_done():mirror/gm0s1f[READ (offset=164847927296, length=131072)]error = 6
May 21 02:04:58 elfi kernel: g_vfs_done():mirror/gm0s1f[READ (offset=256680296448, length=32768)]error = 6

Some info about the controller and disks:

May 9 22:46:52 elfi kernel: ata1: ATA channel 1 on atapci0
May 9 22:46:52 elfi kernel: atapci1: nVidia nForce2 Pro SATA150 controller port 0xec00-0xec07,0xe880-0xe883,0xe800-0xe807,0xe480-0xe483,0x7f00-0x7f0f,0x7c00-0x7c7f irq 22 at device 11.0 on pci0
May 9 22:46:52 elfi kernel: ad4: 286188MB Maxtor 7L300S0 BANC1G10 at ata2-master SATA150
May 9 22:46:52 elfi kernel: ad6: 286188MB Maxtor 7L300S0 BANC1G10 at ata3-master SATA150
May 9 22:46:52 elfi kernel: GEOM_MIRROR: Device gm0s1 created (id=4118114647).
May 9 22:46:52 elfi kernel: GEOM_MIRROR: Device gm0s1: provider ad4s1 detected.
May 9 22:46:52 elfi kernel: GEOM_MIRROR: Device gm0s1: provider ad6s1 detected.
May 9 22:46:52 elfi kernel: GEOM_MIRROR: Device gm0s1: provider ad6s1 activated.
May 9 22:46:52 elfi kernel: GEOM_MIRROR: Device gm0s1: provider ad4s1 activated.
May 9 22:46:52 elfi kernel: GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 launched.
May 9 22:46:52 elfi kernel: Trying to mount root from ufs:/dev/mirror/gm0s1a

Anyone got any new clues? As far as I know the disks should be working fine (they are 6 months old, and this same problem has occurred multiple times).

Hope to solve this ;)

Thanks
Johan
Re: improper handling of dlopened's C++/atexit() code?
Any hints on this available? Suggestions, more info, anything else?

On 5/15/06, m m [EMAIL PROTECTED] wrote:
> On 5/14/06, Alexander Kabaev [EMAIL PROTECTED] wrote:
> > On Thu, 11 May 2006 20:57:20 -0400 m m [EMAIL PROTECTED] wrote:
> > > I am writing in regard to PR at http://www.freebsd.org/cgi/query-pr.cgi?pr=bin%2F59552 . I am experiencing behavior on 6.1-PRERELEASE
> > >
> > > FreeBSD 6.1-PRERELEASE #11: Sun Mar 26 00:03:52 EST 2006
> > >
> > > which looks a lot like something that would be caused by this PR. This happens when apache-1.3 processes that run with Mason code receive a SIGUSR1 (when newsyslog does log rotation) and apache gracefully kills off all processes when restarting. The following is the stack trace that led me to this PR:
> >
> > You'll need to build ld-elf.so.1 and libc.so.6 to get a sensible backtrace.
>
> Please find the new stack trace below. If there is more information I can provide, just ask. (This is 6.1-STABLE, cvsup very shortly before May 11 23:14 EDT)
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x in ?? ()
> (gdb) bt
> #0 0x in ?? ()
> #1 0x294c0ad8 in __do_global_dtors_aux () from /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so
> #2 0x294c1d4c in _fini () from /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so
> #3 0x280b4c80 in ?? ()
> #4 0x280aaab8 in ?? () from /libexec/ld-elf.so.1
> #5 0xbfbfe6e8 in ?? ()
> #6 0x2808dca6 in objlist_call_fini (list=0x280a96d8) at /usr/src/libexec/rtld-elf/rtld.c:1336
> #7 0x2808e1d4 in rtld_exit () at /usr/src/libexec/rtld-elf/rtld.c:1528
> #8 0x281d58ea in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:184
> #9 0x281d55ba in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:69
> #10 0x0805d0cb in clean_child_exit ()
> #11 0x0805ea77 in just_die ()
> #12 0x0805ea9a in usr1_handler ()
> #13 0xbfbfffb4 in ?? ()
> #14 0x001e in ?? ()
> #15 0x in ?? ()
> #16 0xbfbfe7c0 in ?? ()
> #17 0x0002 in ?? ()
> #18 0x0805ea80 in just_die ()
> #19 0x0806011e in child_main ()
> #20 0x080607de in make_child ()
> #21 0x08060868 in startup_children ()
> #22 0x08060e81 in standalone_main ()
> #23 0x08061702 in main ()

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: 4.11 snapshots?
On Wed, May 17, 2006 at 01:22:59PM +0300, Dmitry Pryanishnikov wrote:
> Hello!
>
> On Tue, 16 May 2006, Colin Percival wrote:
> > Personally, since FreeBSD 4.11 will reach its EoL about 8 months from now, and the 4.x->[56].x upgrade path is non-trivial, I recommend installing FreeBSD 6.1 instead.
>
> Well, have you seen my simple performance benchmarking RELENG_4 vs 6? IMHO it mimics quite a common usage pattern: it just downloads a large file at a 10Mbps rate and stores it on a UFS filesystem. On the same hardware (i386 uniprocessor Celeron-333 system with 128Mb RAM and fast SAMSUNG SP0802N HDD using UDMA33) under the same conditions, using a more optimal config (INVARIANTS removed), RELENG_6 (and 5) _still_ uses >= 50% of CPU time for (Intr+Sys), while RELENG_4 doesn't use more than 28% for them. So (unless this performance difference is minimized) I predict _a lot_ of requests to extend RELENG_4 support further, because people just couldn't afford the 4->6 upgrade due to the loss of performance.

This is a network+filesystem benchmark, and it's probably the network part that is using extra CPU, not the filesystem part. But until you run those profiling tests we can't be sure.

Kris
Re: 4.11 snapshots?
On Mon, May 15, 2006 at 07:35:53PM -0600, Brett Glass wrote:
> Is there a server currently furnishing snapshots of the FreeBSD 4.11 security branch? We have some servers running various 4.x versions that might not be happy with 6.x due to memory requirements. They also might have slower file access (The file system in FreeBSD 6.x still isn't as snappy as the one in 4.x, though I hope this will change). So, we'd like to upgrade them to a patch level that includes all recent security fixes. Are ISOs available?

FYI, it is only soft updates (really bufdaemon, through which all the I/O passes) that has performance problems under high load on 6.x compared to 4.x. Other mount modes perform quite a bit better (e.g. 30% faster for async write performance) than 4.x on the hardware I tested.

http://people.freebsd.org/~kris/bsdcan/Filesystem%20Performance.pdf

Kris

P.S. Don't link to the above URL, I need to send it to Dan for publication on the BSDCan site, to which you should link instead.
Fatal trap 12: page fault while in kernel mode
Hi,

got the following trap on an i386 SMP system running with recent RELENG_6 sources. The system was doing copies from/to geli encrypted disks, using hifn(4) hardware crypto.

Core and debug kernel are available, but the trace appears to be corrupted.

- Christian

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xc43cb554
fault code = supervisor write, page not present
instruction pointer = 0x20:0xc07854fe
stack pointer = 0x28:0xd44c1c34
frame pointer = 0x28:0xd44c1c5c
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 32 (irq19: hifn0)
[thread pid 32 tid 100033 ]
Stopped at generic_bcopy+0x1a: repe movsl (%esi),%es:(%edi)
db> tr
generic_bcopy(c4ebe118,ff0,10,c43cb554,c4ebe0a0) at generic_bcopy+0x1a
hifn_callback(c351e800,c3d2e000,0,868,0) at hifn_callback+0x333
hifn_intr(c351e800,d44c1cd8,c05a0e8d,c084a8a0,1) at hifn_intr+0x2a7
ithread_execute_handlers(c33f1a3c,c3345400,c07d243c,30f,c3310780) at ithread_execute_handlers+0x128
ithread_loop(c35260f0,d44c1d38,c07d220e,31d,dfff) at ithread_loop+0x84
fork_exit(c0590a00,c35260f0,d44c1d38) at fork_exit+0xc1
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xd44c1d6c, ebp = 0 ---

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0xc43cb554
fault code = supervisor write, page not present
instruction pointer = 0x20:0xc07854fe
stack pointer = 0x28:0xd44c1c34
frame pointer = 0x28:0xd44c1c5c
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 32 (irq19: hifn0)
Dumping 511 MB (2 chunks)
chunk 0: 1MB (159 pages) ... ok
chunk 1: 511MB (130796 pages) 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15

#0 doadump () at pcpu.h:165
165 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt
#0 doadump () at pcpu.h:165
#1 0xc048f326 in db_fncall (dummy1=0, dummy2=0, dummy3=1999, dummy4=0xd44c1a14 `\027\204À\f) at /data/build/STABLE/src/sys/ddb/db_command.c:492
#2 0xc048f0a2 in db_command (last_cmdp=0xc0840e64, cmd_table=0x0, aux_cmd_tablep=0xc07fbbc0, aux_cmd_tablep_end=0xc07fbbc4) at /data/build/STABLE/src/sys/ddb/db_command.c:350
#3 0xc048f1b5 in db_command_loop () at /data/build/STABLE/src/sys/ddb/db_command.c:458
#4 0xc04913a5 in db_trap (type=12, code=0) at /data/build/STABLE/src/sys/ddb/db_main.c:221
#5 0xc05cb1be in kdb_trap (type=0, code=0, tf=0xd44c1bf4) at /data/build/STABLE/src/sys/kern/subr_kdb.c:473
#6 0xc0787e1b in trap_fatal (frame=0xd44c1bf4, eva=0) at /data/build/STABLE/src/sys/i386/i386/trap.c:827
#7 0xc0787ac2 in trap_pfault (frame=0xd44c1bf4, usermode=0, eva=3292312916) at /data/build/STABLE/src/sys/i386/i386/trap.c:744
#8 0xc078766d in trap (frame= {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = -1002654380, tf_esi = -991182864, tf_ebp = -733209508, tf_isp = -733209568, tf_ebx = 16, tf_edx = 4080, tf_ecx = 4, tf_eax = -11471516, tf_trapno = 12, tf_err = 2, tf_eip = -1065855746, tf_cs = 32, tf_eflags = 66066, tf_esp = 16, tf_ss = -991174344}) at /data/build/STABLE/src/sys/i386/i386/trap.c:434
#9 0xc07713da in calltrap () at /data/build/STABLE/src/sys/i386/i386/exception.s:139
#10 0xc07854fe in generic_bcopy () at /data/build/STABLE/src/sys/i386/i386/support.s:489
Previous frame inner to this frame (corrupt stack?)

--
Christian Brueffer [EMAIL PROTECTED] [EMAIL PROTECTED]
GPG Key: http://people.freebsd.org/~brueffer/brueffer.key.asc
GPG Fingerprint: A5C8 2099 19FF AACA F41B B29B 6C76 178C A0ED 982D
Re: 4.11 snapshots?
Well, y'know, if they could release a FreeBSD 2.2.9 (as was done last month), it shouldn't be a problem to do a 4.12 release as a last gasp to tide us over until September. (Hopefully, Colin and the summer of code folks can work on performance enhancements to the network stack, UFS2, and the hard disk device drivers in time for the 6.2 release in September. I'm just a little scared of 6.1, given the known problems that couldn't be fixed in time and the slower performance we're seeing on databases, etc.)

--Brett
generic_bcopy() corrupts backtrace?
On Sun, May 21, 2006 at 09:04:05PM +0200, Christian Brueffer wrote:
> Core and debug kernel are available, but the trace appears to be corrupted.

Sorry to hijack your thread, but I'm also seeing corrupted backtraces from kgdb involving generic_bcopy(). Is there something about its asm implementation that confuses kgdb?

> #10 0xc07854fe in generic_bcopy () at /data/build/STABLE/src/sys/i386/i386/support.s:489
> Previous frame inner to this frame (corrupt stack?)

Kris
Re: 4.11 snapshots?
On Sun, 2006-May-21 13:20:24 -0600, Brett Glass wrote:
> Well, y'know, if they could release a FreeBSD 2.2.9 (as was done last month), it shouldn't be a problem to do a 4.12 release as a last gasp to tide us over

Maybe for April 1st next year - though novel April Fools Day jokes are always much better.

--
Peter Jeremy
Re: improper handling of dlopened's C++/atexit() code?
On Sun, May 21, 2006 at 01:13:35PM -0400, m m wrote: Any hints on this available? Suggestions, more info, anything else? On 5/15/06, m m [EMAIL PROTECTED] wrote: On 5/14/06, Alexander Kabaev [EMAIL PROTECTED] wrote: On Thu, 11 May 2006 20:57:20 -0400 m m [EMAIL PROTECTED] wrote: I am writing in regard to PR at http://www.freebsd.org/cgi/query-pr.cgi?pr=bin%2F59552 . I am experiencing behavior on 6.1-PRERELEASE FreeBSD 6.1-PRERELEASE #11: Sun Mar 26 00:03:52 EST 2006 which looks a lot like something that would be caused by this PR. This happens when apache-1.3 processes that run with Mason code receive a SIGUSR1 (when newsyslog does log rotation) and apache gracefully kills off all processes when restarting. The following is the stack trace that lead me to this PR: You'll need to build ld-elf.so.1 and libc.so.6 to get a sensible backtrace. Please find the new stack trace below. If there is more information I can provide, just ask. (This is 6.1-STABLE, cvsup very shortly before May 11 23:14 EDT) Program received signal SIGSEGV, Segmentation fault. 0x in ?? () (gdb) bt #0 0x in ?? () #1 0x294c0ad8 in __do_global_dtors_aux () from /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so #2 0x294c1d4c in _fini () from /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so #3 0x280b4c80 in ?? () #4 0x280aaab8 in ?? () from /libexec/ld-elf.so.1 #5 0xbfbfe6e8 in ?? () #6 0x2808dca6 in objlist_call_fini (list=0x280a96d8) at /usr/src/libexec/rtld-elf/rtld.c:1336 #7 0x2808e1d4 in rtld_exit () at /usr/src/libexec/rtld-elf/rtld.c:1528 #8 0x281d58ea in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:184 #9 0x281d55ba in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:69 #10 0x0805d0cb in clean_child_exit () #11 0x0805ea77 in just_die () #12 0x0805ea9a in usr1_handler () #13 0xbfbfffb4 in ?? () #14 0x001e in ?? () #15 0x in ?? () #16 0xbfbfe7c0 in ?? () #17 0x0002 in ?? 
()
#18 0x0805ea80 in just_die ()
#19 0x0806011e in child_main ()
#20 0x080607de in make_child ()
#21 0x08060868 in startup_children ()
#22 0x08060e81 in standalone_main ()
#23 0x08061702 in main ()

Could you please put somewhere:

1. /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so
2. the output of "lsof -p PID", where PID is the pid of some apache child process, for apache running in your usual configuration.

Also, could you run apache with LD_PRELOAD=/usr/lib/libstdc++.so.5 and report whether the problem persists?
Re: 4.11 snapshots?
On Sun, May 21, 2006 at 01:20:24PM -0600, Brett Glass wrote:
> Well, y'know, if they could release a FreeBSD 2.2.9 (as was done last month), it shouldn't be a problem to do a 4.12 release as a last gasp to tide us over until September. (Hopefully, Colin and the summer of code folks can work on performance enhancements to the network stack, UFS2, and the hard disk device drivers in time for the 6.2 release in September. I'm just a little scared of 6.1, given the known problems that couldn't be fixed in time and the slower performance we're seeing on databases, etc.)

release(7)

  FreeBSD provides a complete build environment suitable for users to make full releases of the FreeBSD operating system.

In the week that this email thread has been kicking around you could have made one yourself.

Andrew
Re: improper handling of dlopened's C++/atexit() code?
On Sun, 21 May 2006 13:13:35 -0400 m m [EMAIL PROTECTED] wrote: Any hints on this available? Suggestions, more info, anything else? On 5/15/06, m m [EMAIL PROTECTED] wrote: On 5/14/06, Alexander Kabaev [EMAIL PROTECTED] wrote: On Thu, 11 May 2006 20:57:20 -0400 m m [EMAIL PROTECTED] wrote: I am writing in regard to PR at http://www.freebsd.org/cgi/query-pr.cgi?pr=bin%2F59552 . I am experiencing behavior on 6.1-PRERELEASE FreeBSD 6.1-PRERELEASE #11: Sun Mar 26 00:03:52 EST 2006 which looks a lot like something that would be caused by this PR. This happens when apache-1.3 processes that run with Mason code receive a SIGUSR1 (when newsyslog does log rotation) and apache gracefully kills off all processes when restarting. The following is the stack trace that lead me to this PR: You'll need to build ld-elf.so.1 and libc.so.6 to get a sensible backtrace. Please find the new stack trace below. If there is more information I can provide, just ask. (This is 6.1-STABLE, cvsup very shortly before May 11 23:14 EDT) Program received signal SIGSEGV, Segmentation fault. 0x in ?? () (gdb) bt #0 0x in ?? () #1 0x294c0ad8 in __do_global_dtors_aux () from /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so #2 0x294c1d4c in _fini () from /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so #3 0x280b4c80 in ?? () #4 0x280aaab8 in ?? () from /libexec/ld-elf.so.1 #5 0xbfbfe6e8 in ?? () #6 0x2808dca6 in objlist_call_fini (list=0x280a96d8) at /usr/src/libexec/rtld-elf/rtld.c:1336 #7 0x2808e1d4 in rtld_exit () at /usr/src/libexec/rtld-elf/rtld.c:1528 #8 0x281d58ea in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:184 #9 0x281d55ba in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:69 #10 0x0805d0cb in clean_child_exit () #11 0x0805ea77 in just_die () #12 0x0805ea9a in usr1_handler () #13 0xbfbfffb4 in ?? () #14 0x001e in ?? () #15 0x in ?? () #16 0xbfbfe7c0 in ?? () #17 0x0002 in ?? 
()
#18 0x0805ea80 in just_die ()
#19 0x0806011e in child_main ()
#20 0x080607de in make_child ()
#21 0x08060868 in startup_children ()
#22 0x08060e81 in standalone_main ()
#23 0x08061702 in main ()

Looks like normal atexit path to me. At this point a close look at Syslog.so is needed IMHO. I do not see anything criminal implicating FreeBSD in this crash in any way.

--
Alexander Kabaev
6.1 stability (Re: 4.11 snapshots?)
On Mon, May 22, 2006 at 10:03:33AM +1200, Andrew Thompson wrote:
> On Sun, May 21, 2006 at 01:20:24PM -0600, Brett Glass wrote:
> > Well, y'know, if they could release a FreeBSD 2.2.9 (as was done last month), it shouldn't be a problem to do a 4.12 release as a last gasp to tide us over until September. (Hopefully, Colin and the summer of code folks can work on performance enhancements to the network stack, UFS2, and the hard disk device drivers in time for the 6.2 release in September. I'm just a little scared of 6.1, given the known problems that couldn't be fixed in time and the slower performance we're seeing on databases, etc.)

We did ourselves a big disservice by not pointing out clearly in the todo list that most of the listed problems are VERY RARE and are unlikely to affect most/all users. In future we're going to have to be clearer about that, because you're not the only user who was scared for no reason.

We really had to push FreeBSD hard to find those problems; 6.1 stands up to enormous test loads that all previous tested releases could not handle. I wouldn't be surprised if 4.x cannot handle the same workloads, simply because no-one may have ever attempted such loads on a 4.x system.

Kris

P.S. If you're willing to put some effort into analyzing and profiling it, I'd be willing to work with you on your performance problem.
Re: GEOM problems again...
Hi,

Sorry this is only a 'me too' message...

On Sun, 21 May 2006 11:16:14 +0200, Johan Ström [EMAIL PROTECTED] said:

> May 21 02:04:58 elfi kernel: ad6: FAILURE - device detached
> May 21 02:04:58 elfi kernel: subdisk6: detached
> May 21 02:04:58 elfi kernel: ad6: detached
> May 21 02:04:58 elfi kernel: GEOM_MIRROR: Device gm0s1: provider ad6s1 disconnected.

I have a similar problem on a different M/B (Intel D925XECV2). I'm not sure whether it is only a coincidence or somehow related.

May 21 07:43:49 elvenbow kernel: ad4: FAILURE - device detached
May 21 07:43:49 elvenbow kernel: subdisk4: detached
May 21 07:43:49 elvenbow kernel: ad4: detached
May 21 07:43:49 elvenbow kernel: GEOM_MIRROR: Device gm0s1: provider ad4s1 disconnected.

Excerpts from dmesg:

atapci0: Intel ICH6 UDMA100 controller port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0: ATA channel 0 on atapci0
ata1: ATA channel 1 on atapci0
atapci1: Intel ICH6 SATA150 controller port 0xec00-0xec07,0xe800-0xe803,0xe400-0xe407,0xe000-0xe003,0xdc00-0xdc0f mem 0xf3afbc00-0xf3afbfff irq 19 at device 31.2 on pci0
ata2: ATA channel 0 on atapci1
ata3: ATA channel 1 on atapci1
ata4: ATA channel 2 on atapci1
ata5: ATA channel 3 on atapci1
ad4: 239372MB Maxtor 7V250F0 VA111610 at ata2-master SATA150
ad6: 239372MB Maxtor 7V250F0 VA111610 at ata3-master SATA150

I purchased and started using this new PC last December, and the problem has occurred several times since then. Both ad4 and ad6 have been detached (though not at the same time). 'atacontrol reinit' paused the system for a second and returned without detecting the detached device. I need a complete power cycle, or the device won't be recognized by the BIOS again. There are no SMART errors recorded on these drives.

I'm considering changing the M/B, but that is difficult right now...

dmesg.boot is attached.

Ah, the system is running FreeBSD 6.1-STABLE amd64.
FreeBSD elvenbow.cc.kyushu-u.ac.jp 6.1-STABLE FreeBSD 6.1-STABLE #0: Mon May 8 16:54:22 JST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ELVENBOW amd64 -- Yoshiaki Kasahara Computing and Communications Center, Kyushu University [EMAIL PROTECTED] Copyright (c) 1992-2006 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 6.1-STABLE #0: Mon May 8 16:54:22 JST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ELVENBOW Timecounter i8254 frequency 1193182 Hz quality 0 CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (3000.10-MHz K8-class CPU) Origin = GenuineIntel Id = 0xf43 Stepping = 3 Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE Features2=0x649dSSE3,RSVD2,MON,DS_CPL,EST,CNTX-ID,CX16,b14 AMD Features=0x20100800SYSCALL,NX,LM Logical CPUs per core: 2 real memory = 2145579008 (2046 MB) avail memory = 2060705792 (1965 MB) ACPI APIC Table: INTEL D925CV2 ioapic0 Version 2.0 irqs 0-23 on motherboard kbd1 at kbdmux0 acpi0: INTEL D925CV2 on motherboard acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR acpi0: Power Button (fixed) acpi_bus_number: can't get _ADR acpi_bus_number: can't get _ADR Timecounter ACPI-fast frequency 3579545 Hz quality 1000 acpi_timer0: 24-bit timer at 3.579545MHz port 0x408-0x40b on acpi0 cpu0: ACPI CPU on acpi0 est0: Enhanced SpeedStep Frequency Control on cpu0 est: CPU supports Enhanced Speedstep, but is not recognized. est: Please update driver or contact the maintainer. 
est: cpu_vendor GenuineIntel, msr f2d0f2d, bus_clk, 64 device_attach: est0 attach returned 6 p4tcc0: CPU Frequency Thermal Control on cpu0 pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0 pci0: ACPI PCI bus on pcib0 pcib1: ACPI PCI-PCI bridge at device 1.0 on pci0 pci1: ACPI PCI bus on pcib1 pci1: display, VGA at device 0.0 (no driver attached) pci0: multimedia at device 27.0 (no driver attached) pcib2: ACPI PCI-PCI bridge at device 28.0 on pci0 pci5: ACPI PCI bus on pcib2 pcib3: ACPI PCI-PCI bridge at device 28.1 on pci0 pci4: ACPI PCI bus on pcib3 pcib4: ACPI PCI-PCI bridge at device 28.2 on pci0 pci3: ACPI PCI bus on pcib4 pcib5: ACPI PCI-PCI bridge at device 28.3 on pci0 pci2: ACPI PCI bus on pcib5 uhci0: Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-A port 0xcc00-0xcc1f irq 23 at device 29.0 on pci0 uhci0: [GIANT-LOCKED] usb0: Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-A on uhci0 usb0: USB revision 1.0 uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered uhci1: Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-B port 0xd000-0xd01f irq 19 at device 29.1 on pci0 uhci1: [GIANT-LOCKED] usb1: Intel 82801FB/FR/FW/FRW (ICH6) USB controller USB-B on
Re: improper handling of dlopened's C++/atexit() code?
On Sun, May 21, 2006 at 06:22:34PM -0400, m m wrote:
> On 5/21/06, Konstantin Belousov [EMAIL PROTECTED] wrote:
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x in ?? ()
> > > (gdb) bt
> > > #0 0x in ?? ()
> > > #1 0x294c0ad8 in __do_global_dtors_aux () from /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so
> > > #2 0x294c1d4c in _fini () from /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so
> > > #3 0x280b4c80 in ?? ()
> > > #4 0x280aaab8 in ?? () from /libexec/ld-elf.so.1
> > > #5 0xbfbfe6e8 in ?? ()
> > > #6 0x2808dca6 in objlist_call_fini (list=0x280a96d8) at /usr/src/libexec/rtld-elf/rtld.c:1336
> > > #7 0x2808e1d4 in rtld_exit () at /usr/src/libexec/rtld-elf/rtld.c:1528
> > > #8 0x281d58ea in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:184
> > > #9 0x281d55ba in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:69
> > > #10 0x0805d0cb in clean_child_exit ()
> > > #11 0x0805ea77 in just_die ()
> > > #12 0x0805ea9a in usr1_handler ()
> > > #13 0xbfbfffb4 in ?? ()
> > > #14 0x001e in ?? ()
> > > #15 0x in ?? ()
> > > #16 0xbfbfe7c0 in ?? ()
> > > #17 0x0002 in ?? ()
> > > #18 0x0805ea80 in just_die ()
> > > #19 0x0806011e in child_main ()
> > > #20 0x080607de in make_child ()
> > > #21 0x08060868 in startup_children ()
> > > #22 0x08060e81 in standalone_main ()
> > > #23 0x08061702 in main ()
> >
> > Could you please put somewhere:
> > 1. /usr/local/lib/perl5/5.8.8/mach/auto/Sys/Syslog/Syslog.so
> > 2. the output of "lsof -p PID", where PID is the pid of some apache child process, for apache running in your usual configuration.
> >
> > Also, could you run apache with LD_PRELOAD=/usr/lib/libstdc++.so.5 and report whether the problem persists?
>
> Konstantin,
>
> Thank you for looking into this.
>
> lsof: http://www.savefile.com/files/6494253
> Syslog.so: http://www.savefile.com/files/2163369
>
> Although it's not an indicator of certainty (I have had it exit cleanly in the past), it appears that running Apache with LD_PRELOAD of libstdc++ does allow it to exit cleanly.
>
> Please let me know how I can further assist. If it would make things easier - I can provide access to a jail on this machine which exhibits the same behavior.

Ok, I have a theory how it happens. Investigation of your instance of Syslog.so shows that the crash happens at the following code of /usr/lib/crtbeginS.o:

  282: if (__deregister_frame_info)
  283:   __deregister_frame_info (__EH_FRAME_BEGIN__);

(this comes in from contrib/gcc/crtstuff.c, lines 282-283).

The symbol __deregister_frame_info is weak and undefined in all your DSOs except libstdc++.so.5. This symbol provides part of the C++ runtime support for exception handling, and is reasonably included in the C++ runtime support library. Both lines 282 and 283 produce dynamic relocations in the final DSO, but line 282 implies R_386_GLOB_DAT, and line 283 implies R_386_JMP_SLOT (for the PLT). The first relocation is resolved immediately on DSO load; the second one is resolved on demand.

My theory is that, at the time Syslog.so is loaded, libstdc++.so.5 is loaded in the process, so the first relocation is satisfied by rtld immediately. But by the time exit() processing reaches the _fini() function of Syslog.so, libstdc++.so.5 has been unloaded, and the weak PLT relocation is resolved to 0. As a result we get frame #0 from your trace.

This theory is confirmed by the presence of libstdc++ in the lsof output. Please check that it does not show up at the time of the crash by using the "info sharedlibrary" gdb command on the crash dump.

The short-term fix is to use the LD_PRELOAD hack. The real solution would be to mark the libstdc++ DSO as not unloadable and to implement support for such never-unloaded DSOs in rtld (BTW, I think this is also needed for the threading libraries libpthread and libthr, for the same reason). I know that the glibc dynamic loader has support for this feature.

P.S. Apache seems to call exit(3) from the signal handler. This is wrong.
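On the P.S. about exit(3) in a signal handler: exit(3) is not on the POSIX list of async-signal-safe functions, because it runs the atexit() handlers and, as the backtrace in this thread shows, the _fini() functions of loaded shared objects. A minimal sketch of the usual alternative follows, in which the handler only records the request and the normal control flow exits later. This is an illustration under assumptions, not Apache's code: the names shutdown_requested, on_sigusr1, install_usr1_handler and should_shut_down are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag. volatile sig_atomic_t is the object type that may be
 * written safely from an asynchronous signal handler. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Handler: record the request and return. No exit(), no stdio, no malloc. */
void on_sigusr1(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Install the handler for SIGUSR1; returns 0 on success, -1 on error. */
int install_usr1_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Polled from normal (non-signal) context, e.g. between requests. */
int should_shut_down(void)
{
    return shutdown_requested != 0;
}
```

A worker loop would poll should_shut_down() between requests and call exit() from there, so that destructors and _fini() run in ordinary context. A handler that really must terminate at once can call _exit(2), which is async-signal-safe but skips atexit() processing.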
FreeBSD Security Survey
Dear FreeBSD users and system administrators,

While the FreeBSD Security Team has traditionally been very good at investigating and responding to security issues in FreeBSD, this only solves half of the security problem: Unless users and administrators of FreeBSD systems apply the security patches provided, the advisories issued accomplish little beyond alerting potential attackers to the presence of vulnerabilities.

The Security Team has been concerned for some time by anecdotal reports concerning the number of FreeBSD systems which are not being promptly updated or are running FreeBSD releases which have passed their End of Life dates and are no longer supported.

In order to better understand which FreeBSD versions are in use, how people are (or aren't) keeping them updated, and why it seems so many systems are not being updated, I have put together a short survey of 12 questions. The information gathered will inform the work done by the Security Team, as well as my own personal work on FreeBSD this summer.

If you administrate system(s) running FreeBSD (in the broad sense of "are responsible for keeping system(s) secure and up to date"), please visit
  http://people.freebsd.org/~cperciva/survey.html
and complete the survey below before May 31st, 2006.

Thanks,
Colin Percival
FreeBSD Security Officer
Re: FreeBSD Security Survey
On May 21, 2006, at 11:55, Colin Percival wrote:
> The Security Team has been concerned for some time by anecdotal reports concerning the number of FreeBSD systems which are not being promptly updated or are running FreeBSD releases which have passed their End of Life dates and are no longer supported.
>
> In order to better understand which FreeBSD versions are in use, how people are (or aren't) keeping them updated, and why it seems so many systems are not being updated, I

I have a 6-STABLE box that is not going to be updated to 6.1 any time soon, because my personal mail will have to be offline while I do so --- including nuking and rebuilding all ports because the ports tree has been thrashed by multiple low level updates that affect a large percentage of the tree --- and it's only a 600MHz box so it will be offline for most of a week during that upgrade. And I'm uncertain how downgrading it to 6.0-RELEASE+security patches will complicate things (downgrading via cvsup/buildworld is not a supported option, last I checked).

Granted, I probably should have stuck with 6.0-R --- but then, experience has shown me that the more reliable option is to wait a week or two after release and then install -STABLE.

In short: keeping FreeBSD up to date tends to be painful at best.

--
brandon s. allbery [linux,solaris,freebsd,perl] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university KF8NH
Re: FreeBSD Security Survey
On May 21, 2006, at 20:55, Colin Percival wrote:
> If you administrate system(s) running FreeBSD (in the broad sense of "are responsible for keeping system(s) secure and up to date"), please visit http://people.freebsd.org/~cperciva/survey.html and complete the survey below before May 31st, 2006.

What doesn't fit into the survey very well is that all my servers are production ones and it causes a lot of grief for users when I bring them down. I try to hold updates to once per year because of that. I am currently in the middle of upgrading from 5.3 to 6.0. The easy machines are done but there are still a few that will take considerable on-site time which is not easy to come by.
Re: FreeBSD Security Survey
On Sun, 21 May 2006, Colin Percival wrote:

In order to better understand which FreeBSD versions are in use, how people are (or aren't) keeping them updated, and why it seems so many systems are not being updated, I have put together a short survey of 12 questions.

I applaud this survey; however, question 9 missed an important point, at least to me. I was torn between answering "less than once a month" and "I never update".

While I find ports to be the single most useful feature of the FreeBSD experience, and can't thank contributors enough for the efforts, I on the other hand find updating my installed ports collection (for security reasons or otherwise) to be quite painful. I typically use portupgrade to perform this task. On several occasions I got bit by doing a portupgrade which wasn't able to completely upgrade all dependencies (particularly when X, GUIs, and desktops are in the mix -- though I always follow the special Gnome upgrade methods when appropriate). I can't rule out some form of pilot error, but the end result was pain.

After several instances of unsatisfactory portupgrades (mostly in the 5.2 through early 5.4 timeframe), I adopted the practice of either not upgrading ports at all for the life of a particular installation on a machine (typically about one year), or when necessary by removing *all* ports from the machine, cvsup'ing, and reinstalling. This has served me quite well, particularly considering the minimal threat profile these particular systems face.

So, in short, that's why *I* rarely update ports for security reasons. There are steps that could be taken at the port maintenance level that would work well for my particular case; however, that's beyond the scope of the survey.

Thanks for taking the time to put the survey together, I certainly hope it proves useful.

Thank you,
Brent Casavant
Re: FreeBSD Security Survey
Doug Hardie wrote:

On May 21, 2006, at 20:55, Colin Percival wrote:

If you administrate system(s) running FreeBSD (in the broad sense of "are responsible for keeping system(s) secure and up to date"), please visit http://people.freebsd.org/~cperciva/survey.html and complete the survey below before May 31st, 2006.

What doesn't fit into the survey very well is that all my servers are production ones and it causes a lot of grief for users when I bring them down. I try to hold updates to once per year because of that. I am currently in the middle of upgrading from 5.3 to 6.0. The easy machines are done but there are still a few that will take considerable on-site time which is not easy to come by.

A good failover strategy comes into play here. If you have one, then taking a single production machine off-line for a short period should be no big deal, even routine, and should not even be noticed by users if done correctly. This should be planned for and part of the network/system design.

Yes, it definitely requires more resources to support, but I'll rephrase the same problem: what happens when (and I mean *when* and not *if*) a motherboard or network card fries or you suffer a hard disk crash (even 2+ drives failing at the same time on a RAID array is not particularly unusual, considering that drives are quite often from the same manufactured batch)? Lack of a failover on mission-critical systems that *can't* be offline is like playing Russian roulette.
Re: FreeBSD Security Survey
Brent Casavant wrote:

On Sun, 21 May 2006, Colin Percival wrote:

In order to better understand which FreeBSD versions are in use, how people are (or aren't) keeping them updated, and why it seems so many systems are not being updated, I have put together a short survey of 12 questions.

I applaud this survey; however, question 9 missed an important point, at least to me. I was torn between answering "less than once a month" and "I never update".

While I find ports to be the single most useful feature of the FreeBSD experience, and can't thank contributors enough for the efforts, I on the other hand find updating my installed ports collection (for security reasons or otherwise) to be quite painful. I typically use portupgrade to perform this task. On several occasions I got bit by doing a portupgrade which wasn't able to completely upgrade all dependencies (particularly when X, GUIs, and desktops are in the mix -- though I always follow the special Gnome upgrade methods when appropriate). I can't rule out some form of pilot error, but the end result was pain.

After several instances of unsatisfactory portupgrades (mostly in the 5.2 through early 5.4 timeframe), I adopted the practice of either not upgrading ports at all for the life of a particular installation on a machine (typically about one year), or when necessary by removing *all* ports from the machine, cvsup'ing, and reinstalling. This has served me quite well, particularly considering the minimal threat profile these particular systems face.

So, in short, that's why *I* rarely update ports for security reasons. There are steps that could be taken at the port maintenance level that would work well for my particular case; however, that's beyond the scope of the survey.

Thanks for taking the time to put the survey together, I certainly hope it proves useful.

Thank you,
Brent Casavant

I share this frustration with you.
I was once told that the pain in upgrading is due largely to a somewhat invisible difference between installing a pre-compiled package, and building+installing a port. In theory, if you stick to one method or the other, things will stay mostly consistent. But if you mix them, and particularly if you update the ports tree in the process, the end result is a bit more undefined.

One thing that I wish for is that the ports tree would branch for releases, and that those branches would get security updates. I know that this would involve an exponentially larger amount of effort from the ports team, and I don't fault them for not doing it. Still, it would be nice to have.

Scott