Re: gptboot broken when compiled with clang 6 and WITHOUT_LOADER_GELI -- clang 5 is OK
On Thu, 26-Apr-2018 at 11:55:19 +0200, Dimitry Andric wrote: > On 25 Apr 2018, at 18:58, Andre Albsmeier <andre.albsme...@siemens.com> wrote: > > > > I have set up a new system disk for an i386 11.2-PRERELEASE box. I did the > > usual > > > > gpart create -s gpt $disk > > gpart add -t freebsd-boot -s 984 $disk > > gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 $disk > > ... > > > > thing, just to notice that the box wouldn't boot. It seems to hang where > > stage 2 should be run -- when the '\' should start spinning the screen > > turns white and the box hangs (tested on two machines, an Asus P5W and a > > Supermicro A2SAV). > > > > So I replaced gptboot on the new disk by the one from an older machine > > and everything was fine. I wanted to find out what is actually causing > > the issue and recompiled /usr/src/stand after updating the old sources > > in several steps. > > > > Eventually it turned out that it depends on the compiler. When compiling > > the latest /usr/src/stand with clang 5.0.1 the resulting gptboot works. > > When using 6.0.0 it doesn't. To be exact, it's gptboot.o which is causing > > the problems. When using a gptboot.o from a clang 5 system it is OK, when > > using a gptboot.o from a clang 6 system it fails. > > > > To add more confusion: I usually have WITHOUT_LOADER_GELI in my make.conf. > > When removing this, the resulting gptboot works even when compiled with > > clang 6... > > > > I can reproduce this in case s.o. wants me to do some tests... > > I can't reproduce it on my end, unfortunately. I downloaded the latest > FreeBSD-11.2-PRERELEASE-i386-20180420-r332802 snapshot, installed it to > a GPT partitioned disk, and it boots just fine. Out of curiosity I removed -O1 from stand/i386/gptboot/Makefile so the standard -O2 is used. Now the resulting gptboot is: -rw-rw 1 andre wheel 22023 30 Apr 11:51 /usr/obj/src/src-11/stand/i386/gptboot/gptboot and it works! 
So something is screwed up when compiling gptboot with -O1 (and WITHOUT_LOADER_GELI and with clang 6.0.0). ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
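As a quick sanity check on the two file sizes reported in this thread (20859 bytes for the failing -O1 build, 22023 bytes for the working -O2 build), the sketch below compares them against the 984-block freebsd-boot partition from the original gpart commands. The 512-byte block size is an assumption, not something stated in the thread. The check only suggests that running out of partition space is an unlikely explanation. It says nothing about what clang 6 does differently.

```python
# Assumption: gpart counts "-s 984" in 512-byte blocks on this disk.
SECTOR_SIZE = 512
BOOT_PARTITION_SECTORS = 984  # from "gpart add -t freebsd-boot -s 984 $disk"

# gptboot sizes as reported in the thread (bytes).
sizes = {
    "-O1 build (hangs)": 20859,
    "-O2 build (boots)": 22023,
}

partition_bytes = BOOT_PARTITION_SECTORS * SECTOR_SIZE
size_delta = sizes["-O2 build (boots)"] - sizes["-O1 build (hangs)"]

# True for every build whose image fits into the boot partition.
fits = {label: size <= partition_bytes for label, size in sizes.items()}

for label, size in sizes.items():
    print(f"{label}: {size} bytes, fits in {partition_bytes} bytes: {fits[label]}")
print(f"size difference between the two builds: {size_delta} bytes")
```

Under that assumption both images use well under a tenth of the partition, so the difference between the builds is more likely in the generated code than in its size.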
Re: gptboot broken when compiled with clang 6 and WITHOUT_LOADER_GELI -- clang 5 is OK
On Thu, 26-Apr-2018 at 11:55:19 +0200, Dimitry Andric wrote: > On 25 Apr 2018, at 18:58, Andre Albsmeier <andre.albsme...@siemens.com> wrote: > > > > I have set up a new system disk for an i386 11.2-PRERELEASE box. I did the > > usual > > > > gpart create -s gpt $disk > > gpart add -t freebsd-boot -s 984 $disk > > gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 $disk > > ... > > > > thing, just to notice that the box wouldn't boot. It seems to hang where > > stage 2 should be run -- when the '\' should start spinning the screen > > turns white and the box hangs (tested on two machines, an Asus P5W and a > > Supermicro A2SAV). > > > > So I replaced gptboot on the new disk by the one from an older machine > > and everything was fine. I wanted to find out what is actually causing > > the issue and recompiled /usr/src/stand after updating the old sources > > in several steps. > > > > Eventually it turned out that it depends on the compiler. When compiling > > the latest /usr/src/stand with clang 5.0.1 the resulting gptboot works. > > When using 6.0.0 it doesn't. To be exact, it's gptboot.o which is causing > > the problems. When using a gptboot.o from a clang 5 system it is OK, when > > using a gptboot.o from a clang 6 system it fails. > > > > To add more confusion: I usually have WITHOUT_LOADER_GELI in my make.conf. > > When removing this, the resulting gptboot works even when compiled with > > clang 6... > > > > I can reproduce this in case s.o. wants me to do some tests... > > I can't reproduce it on my end, unfortunately. I downloaded the latest > FreeBSD-11.2-PRERELEASE-i386-20180420-r332802 snapshot, installed it to > a GPT partitioned disk, and it boots just fine. It is clearly reproducible here with r333082 and with /etc/make.conf having only WITHOUT_LOADER_GELI. 
The resulting gptboot is:

-rw-rw 1 andre wheel 20859 30 Apr 11:46 /usr/obj/src/src-11/stand/i386/gptboot/gptboot

Here are the logs, maybe something is hidden here:

machine -> /src/src-11/sys/i386/include
x86 -> /src/src-11/sys/x86/include
gzip -cn /src/src-11/stand/i386/gptboot/gptboot.8 > gptboot.8.gz
cc -O2 -pipe -I/src/src-11/stand/i386/btx/lib -nostdinc -I/usr/obj/src/src-11/stand/libsa -I/src/src-11/stand/libsa -D_STANDALONE -I/src/src-11/sys -Ddouble=jagged-little-pill -Dfloat=floaty-mcfloatface -DLOADER_DISK_SUPPORT -ffreestanding -mno-mmx -mno-sse -mno-avx -msoft-float -march=i386 -I. -DBOOTPROG=\"gptboot\" -O1 -DGPT -DUFS1_AND_UFS2 -DSIOPRT=0x3f8 -DSIOFMT=0x3 -DSIOSPD=9600 -I/src/src-11/stand/common -I/src/src-11/stand/i386/common -I/src/src-11/stand/i386/boot2 -Wall -Waggregate-return -Wbad-function-cast -Wno-cast-align -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wpointer-arith -Wshadow -Wstrict-prototypes -Wwrite-strings -Winline -Wno-pointer-sign -g -std=gnu99 -Wsystem-headers -Werror -Wno-pointer-sign -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion -Wno-unused-local-typedef -Wno-address-of-packed-member -Wno-switch -Wno-switch-enum -Wno-knr-promoted-parameter -Wno-parentheses -Qunused-arguments -no-integrated-as -c /src/src-11/stand/i386/gptboot/gptldr.S -o gptldr.o
ld -static -N --gc-sections -e start -Ttext 0x7c00 -o gptldr.out gptldr.o
objcopy -S -O binary gptldr.out gptldr.bin
cc -O2 -pipe -I/src/src-11/stand/i386/btx/lib -nostdinc -I/usr/obj/src/src-11/stand/libsa -I/src/src-11/stand/libsa -D_STANDALONE -I/src/src-11/sys -Ddouble=jagged-little-pill -Dfloat=floaty-mcfloatface -DLOADER_DISK_SUPPORT -ffreestanding -mno-mmx -mno-sse -mno-avx -msoft-float -march=i386 -I. -DBOOTPROG=\"gptboot\" -O1 -DGPT -DUFS1_AND_UFS2 -DSIOPRT=0x3f8 -DSIOFMT=0x3 -DSIOSPD=9600 -I/src/src-11/stand/common -I/src/src-11/stand/i386/common -I/src/src-11/stand/i386/boot2 -Wall -Waggregate-return -Wbad-function-cast -Wno-cast-align -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wpointer-arith -Wshadow -Wstrict-prototypes -Wwrite-strings -Winline -Wno-pointer-sign -g -std=gnu99 -Wsystem-headers -Werror -Wno-pointer-sign -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable -Wno-tautological-compare -Wno-unused-value -Wno-parentheses-equality -Wno-unused-function -Wno-enum-conversion -Wno-unused-local-typedef -Wno-address-of-packed-member -Wno-switch -Wno-switch-enum -Wno-knr-promoted-parameter -Wno-parentheses -Qunused-arguments -c /src/src-11/stand/i386/gptboot/gptboot.c -o gptboot.o
cc -O2 -pipe -I/src/src-11/stand/i386/btx/lib -nostdinc -I/usr/obj/src/src-11/stand/libsa -I/src/src-11/stand/libsa -D_STANDALONE -I/src/src-11/sys -Ddouble=jagged-little-pill -Dfl
Re: gptboot broken when compiled with clang 6 and WITHOUT_LOADER_GELI -- clang 5 is OK
On Thu, 26-Apr-2018 at 12:06:21 +0200, Dimitry Andric wrote: > On 26 Apr 2018, at 06:17, Dewayne Geraghty >wrote: > > > > Andre, You're not alone. I think there's a problem with clang6 on i386 > > FreeBSD 11.1X, refer: > > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227552 > > https://forums.freebsd.org/threads/uptime-w-i386-breakage.65584/ > > and perhaps also on amd64, search for > > https://bugs.freebsd.org/bugzilla/buglist.cgi?quicksearch=clang_id=226390. > > > > Without time to investigate I've reverted userland > > FreeBSD 11.2-PRERELEASE r332843Mamd64 1101515 1101509 > > FreeBSD 11.2-PRERELEASE r332843Mi386 1101515 1101509 > > As noted in another post, I cannot reproduce the problems with gptboot, > but with FreeBSD-11.2-PRERELEASE-i386-20180420-r332802, I do indeed see > w and uptime crashing in libxo: Scary, seems to be OK here: buildbox:~>cat /usr/src/.svn_revision 333056 buildbox:~>uname -K -m i386 1101515 buildbox:~>w 19:51 up 4:41, 1 user, load averages: 0,25 0,26 0,21 USER TTY FROM LOGIN@ IDLE WHAT andre pts/0bali 19:42 - w buildbox:~>uptime 19:51 up 4:41, 1 user, load averages: 0,25 0,26 0,21 buildbox:~>ll /lib/libxo.so.0 -r--r--r-- 1 root wheel 97596 27 Apr 15:07 /lib/libxo.so.0 buildbox:~> -Andre > > (gdb) bt > #0 ifree (tsd=0x2800) at arena.h:799 > #1 0x2814b506 in __free (ptr=0x280601ef) at tsd.h:716 > #2 0x2808bb07 in xo_do_emit_fields () at > /usr/src/contrib/libxo/libxo/libxo.c:6419 > #3 0x28089a1c in xo_do_emit (xop=, flags= optimized out>, fmt=0x804ad4d "{:time-of-day/%s} ") at > /usr/src/contrib/libxo/libxo/libxo.c:6470 > #4 0x28089b61 in xo_emit (fmt=0x804ad4d "{:time-of-day/%s} ") at > /usr/src/contrib/libxo/libxo/libxo.c:6541 > #5 0x08049f50 in main (argc=, argv= out>) at /usr/src/usr.bin/w/w.c:475 > #6 0x080494cd in _start1 () > #7 0x08049448 in _start () > #8 0x in ?? () > > Not sure if this has anything to do with clang though, it looks more > like a double free to me, at first glance. I'll do some debugging. 
> > -Dimitry > -- Unix is very userfriendly. It's just picky who its friends are.
Re: gptboot broken when compiled with clang 6 and WITHOUT_LOADER_GELI -- clang 5 is OK
On Thu, 26-Apr-2018 at 11:55:19 +0200, Dimitry Andric wrote: > On 25 Apr 2018, at 18:58, Andre Albsmeier <andre.albsme...@siemens.com> wrote: > > > > I have set up a new system disk for an i386 11.2-PRERELEASE box. I did the > > usual > > > > gpart create -s gpt $disk > > gpart add -t freebsd-boot -s 984 $disk > > gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 $disk > > ... > > > > thing, just to notice that the box wouldn't boot. It seems to hang where > > stage 2 should be run -- when the '\' should start spinning the screen > > turns white and the box hangs (tested on two machines, an Asus P5W and a > > Supermicro A2SAV). > > > > So I replaced gptboot on the new disk by the one from an older machine > > and everything was fine. I wanted to find out what is actually causing > > the issue and recompiled /usr/src/stand after updating the old sources > > in several steps. > > > > Eventually it turned out that it depends on the compiler. When compiling > > the latest /usr/src/stand with clang 5.0.1 the resulting gptboot works. > > When using 6.0.0 it doesn't. To be exact, it's gptboot.o which is causing > > the problems. When using a gptboot.o from a clang 5 system it is OK, when > > using a gptboot.o from a clang 6 system it fails. > > > > To add more confusion: I usually have WITHOUT_LOADER_GELI in my make.conf. > > When removing this, the resulting gptboot works even when compiled with > > clang 6... > > > > I can reproduce this in case s.o. wants me to do some tests... > > I can't reproduce it on my end, unfortunately. I downloaded the latest > FreeBSD-11.2-PRERELEASE-i386-20180420-r332802 snapshot, installed it to > a GPT partitioned disk, and it boots just fine. > > This snapshot corresponds to r332802, which has clang 6.0.0, with the > EFLAGS change still reverted, so I assume its copy of gptboot is also > compiled with that. > > Which exact revision of the base system were you using? 
I have done some updates in the meantime and can't tell which revision it was (it was just a few days ago). Currently, the buildbox is at 333056 and I have just compiled gptboot (with WITHOUT_LOADER_GELI) and I am seeing its size is -rw-rw 1 andre wheel 20859 27 Apr 19:42 gptboot This is the size where it didn't work but I can't test it at the moment as the build machine and the test environment are @work. I am pretty sure it will fail as there weren't any changes to stand or clang in the last few days but let's wait until Monday when I can verify it... -Andre
gptboot broken when compiled with clang 6 and WITHOUT_LOADER_GELI -- clang 5 is OK
I have set up a new system disk for an i386 11.2-PRERELEASE box. I did the usual

  gpart create -s gpt $disk
  gpart add -t freebsd-boot -s 984 $disk
  gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 $disk
  ...

thing, just to notice that the box wouldn't boot. It seems to hang where stage 2 should be run -- when the '\' should start spinning the screen turns white and the box hangs (tested on two machines, an Asus P5W and a Supermicro A2SAV).

So I replaced gptboot on the new disk by the one from an older machine and everything was fine. I wanted to find out what is actually causing the issue and recompiled /usr/src/stand after updating the old sources in several steps.

Eventually it turned out that it depends on the compiler. When compiling the latest /usr/src/stand with clang 5.0.1 the resulting gptboot works. When using 6.0.0 it doesn't. To be exact, it's gptboot.o which is causing the problems. When using a gptboot.o from a clang 5 system it is OK, when using a gptboot.o from a clang 6 system it fails.

To add more confusion: I usually have WITHOUT_LOADER_GELI in my make.conf. When removing this, the resulting gptboot works even when compiled with clang 6...

I can reproduce this in case s.o. wants me to do some tests...
Re: i386 with 4GB RAM: less than 2GB available on A2SAV (Intel Atom E3940)
On Sun, 28-Jan-2018 at 10:32:44 -0600, Mike Karels wrote:
> > On 28 Jan 2018, at 15:57, Andre Albsmeier <andre.albsme...@siemens.com> wrote:
> > > I have a lot of machines running with 4 GB physical RAM and, for
> > > some reasons, I still have to use a 32 bits OS.
> > >
> > > All of them show something between 3 and 3.5 GB of RAM available
> > > in dmesg but the brand new Supermicro A2SAV really shocked me:
> > >
> > > FreeBSD 11.1-STABLE #0: Mon Jan 15 06:57:10 CET 2018
> > > ...
> > > real memory = 4294967296 (4096 MB)
> > > avail memory = 1939558400 (1849 MB)
> > > ...
> > >
> > > So do people have any ideas how I might get a bit closer to at least
> > > 3 GB? I assume there are no FreeBSD knobs which might help but hope
> > > dies last...
> >
> > This is a common problem on i386. Most likely some ranges are reserved
> > for I/O mappings, such as video cards. If you boot with -v, I think the
> > kernel prints an overview of the physical ram chunks available? I don't
> > know of any other way to get such an overview.
> >
> > Another option is to try PAE, but I have no idea how stable that is...
> >
> > -Dimitry
>
> I suspect that the unavailable RAM has been mapped above 4 GB by the BIOS.
>
> About PAE: at $JOB, we have a FreeBSD 8.2 system that has been running
> PAE reliably since 8.2 was new. Also, we ship amd64 systems that run
> mostly 32-bit binaries, which works well.

Finally I found some time to play with the PAE option. I added

  option PAE
  option KVA_PAGES=1024

and the A2SAV booted and gave me

  real memory = 4294967296 (4096 MB)
  avail memory = 4048207872 (3860 MB)

Very encouraging, this is double of what I had before!

So I decided to try this on my desktop machine which also boots but loading the i915kms and drm2 stuff fails with

  info: [drm] Initialized drm 1.1.0 20060810
  drmn0: on vgapci0
  error: [drm:pid949:i915_gem_gtt_init] *ERROR* Scratch setup failed
  device_attach: drmn0 attach returned 12

The KMS stuff doesn't support the A2SAV anyway so I can't test how it would behave there...
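To put the "double of what I had before" remark into numbers, here is a small sketch using only the dmesg figures quoted in this thread. It assumes dmesg truncates to whole MB (1 MB = 1048576 bytes), which matches the quoted values. The variable names are mine.

```python
MIB = 1024 * 1024

def to_mib(nbytes: int) -> int:
    """Convert bytes to whole MB, truncating (assumed dmesg behaviour)."""
    return nbytes // MIB

real_memory = 4294967296    # "real memory" from dmesg
avail_no_pae = 1939558400   # "avail memory" without PAE
avail_pae = 4048207872      # "avail memory" with PAE and KVA_PAGES=1024

# 4 GB is exactly the limit of a 32-bit physical address.
assert real_memory == 2 ** 32

gain_bytes = avail_pae - avail_no_pae
ratio = avail_pae / avail_no_pae

print(f"without PAE: {to_mib(avail_no_pae)} MB")
print(f"with PAE:    {to_mib(avail_pae)} MB")
print(f"gained:      {to_mib(gain_bytes)} MB (factor {ratio:.2f})")
```

The roughly 2 GB regained would be consistent with the suspicion voiced earlier that the BIOS had remapped that RAM above 4 GB, where a non-PAE i386 kernel cannot address it.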
Re: i386 with 4GB RAM: less than 2GB available on A2SAV (Intel Atom E3940)
On Sun, 28-Jan-2018 at 10:32:44 -0600, Mike Karels wrote:
> > On 28 Jan 2018, at 15:57, Andre Albsmeier <andre.albsme...@siemens.com> wrote:
> > > I have a lot of machines running with 4 GB physical RAM and, for
> > > some reasons, I still have to use a 32 bits OS.
> > >
> > > All of them show something between 3 and 3.5 GB of RAM available
> > > in dmesg but the brand new Supermicro A2SAV really shocked me:
> > >
> > > FreeBSD 11.1-STABLE #0: Mon Jan 15 06:57:10 CET 2018
> > > ...
> > > real memory = 4294967296 (4096 MB)
> > > avail memory = 1939558400 (1849 MB)
> > > ...
> > >
> > > So do people have any ideas how I might get a bit closer to at least
> > > 3 GB? I assume there are no FreeBSD knobs which might help but hope
> > > dies last...
> >
> > This is a common problem on i386. Most likely some ranges are reserved
> > for I/O mappings, such as video cards. If you boot with -v, I think the
> > kernel prints an overview of the physical ram chunks available? I don't
> > know of any other way to get such an overview.
> >
> > Another option is to try PAE, but I have no idea how stable that is...
> >
> > -Dimitry
>
> I suspect that the unavailable RAM has been mapped above 4 GB by the BIOS.
>
> About PAE: at $JOB, we have a FreeBSD 8.2 system that has been running
> PAE reliably since 8.2 was new. Also, we ship amd64 systems that run
> mostly 32-bit binaries, which works well.

But can the entire userland be 32 bit only? Maybe I'll try the PAE thing...

-Andre
Re: i386 with 4GB RAM: less than 2GB available on A2SAV (Intel Atom E3940)
On Sun, 28-Jan-2018 at 17:05:54 +0100, Dimitry Andric wrote: > On 28 Jan 2018, at 15:57, Andre Albsmeier <andre.albsme...@siemens.com> wrote: > > I have a lot of machines running with 4 GB physical RAM and, for > > some reasons, I still have to use a 32 bits OS. > > > > All of them show something between 3 and 3.5 GB of RAM available > > in dmesg but the brand new Supermicro A2SAV really shocked me: > > > > FreeBSD 11.1-STABLE #0: Mon Jan 15 06:57:10 CET 2018 > > ... > > real memory = 4294967296 (4096 MB) > > avail memory = 1939558400 (1849 MB) > > ... > > > > So do people have any ideas how I might get a bit closer to at least > > 3 GB? I assume there are no FreeBSD knobs which might help but hope > > dies last... > > This is a common problem on i386. Most likely some ranges are reserved > for I/O mappings, such as video cards. If you boot with -v, I think the > kernel prints an overview of the physical ram chunks available? I don't Yes, it does: real memory = 4294967296 (4096 MB) Physical memory chunk(s): 0x1000 - 0x0009afff, 630784 bytes (154 pages) 0x0010 - 0x003f, 3145728 bytes (768 pages) 0x00c28000 - 0x1fff, 524124160 bytes (127960 pages) 0x22151000 - 0x75733fff, 1398681600 bytes (341475 pages) 0x7998e000 - 0x79a5efff, 856064 bytes (209 pages) 0x7a151000 - 0x7a4b, 3600384 bytes (879 pages) 0x7a4eb000 - 0x7aae2fff, 6258688 bytes (1528 pages) 0x7aae5000 - 0x7afe, 5287936 bytes (1291 pages) avail memory = 1939800064 (1849 MB) -Andre > know of any other way to get such an overview. > > Another option is to try PAE, but I have no idea how stable that is... > > -Dimitry > -- Win98: useless extension to a minor patch release for 32-bit extensions and a graphical shell for a 16-bit patch to an 8-bit operating system originally coded for a 4-bit microprocessor, written by a 2-bit company that can't stand for 1 bit of competition. 
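The chunk list from the verbose boot can be cross-checked with a few lines of arithmetic. The sketch below uses the byte and page counts exactly as quoted, plus the start address of the last chunk (0x7aae5000). It assumes 4096-byte pages. The comparison with the 2 GB mark is my own reading of the numbers, not something the kernel reports.

```python
PAGE_SIZE = 4096  # assumption: standard i386 page size

# (bytes, pages) for each "Physical memory chunk" quoted above.
chunks = [
    (630784, 154),
    (3145728, 768),
    (524124160, 127960),
    (1398681600, 341475),
    (856064, 209),
    (3600384, 879),
    (6258688, 1528),
    (5287936, 1291),
]

# Each byte count should be its page count times the page size.
consistent = all(nbytes == pages * PAGE_SIZE for nbytes, pages in chunks)

total_bytes = sum(nbytes for nbytes, _ in chunks)
total_pages = sum(pages for _, pages in chunks)

avail_memory = 1939800064            # "avail memory" from the same boot
not_available = total_bytes - avail_memory

# End (exclusive) of the highest chunk: its start address plus its length.
top_of_ram = 0x7aae5000 + 5287936
below_2gb = top_of_ram < 2 ** 31

print(f"chunks total: {total_bytes} bytes in {total_pages} pages")
print(f"difference to avail memory: {not_available} bytes")
print(f"highest chunk ends at {top_of_ram:#x}, below 2 GB: {below_2gb}")
```

All usable chunks end below the 2 GB boundary, so nothing between 2 GB and 4 GB is visible to this kernel at all. The few MB between the chunk total and "avail memory" are presumably taken by the kernel itself before the figure is printed.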
Re: i386 with 4GB RAM: less than 2GB available on A2SAV (Intel Atom E3940)
On Sun, 28-Jan-2018 at 22:51:04 +0700, Eugene Grosbein wrote:
> 28.01.2018 21:57, Andre Albsmeier wrote:
>
> > I have a lot of machines running with 4 GB physical RAM and, for
> > some reasons, I still have to use a 32 bits OS.
> >
> > All of them show something between 3 and 3.5 GB of RAM available
> > in dmesg but the brand new Supermicro A2SAV really shocked me:
> >
> > FreeBSD 11.1-STABLE #0: Mon Jan 15 06:57:10 CET 2018
> > ...
> > real memory = 4294967296 (4096 MB)
> > avail memory = 1939558400 (1849 MB)
> > ...
> >
> > So do people have any ideas how I might get a bit closer to at least
> > 3 GB? I assume there are no FreeBSD knobs which might help but hope
> > dies last...
>
> First, try to decrease amount of RAM dedicated to integrated video, if any
> (BIOS Setup).

Done that. I have set everything as small as possible but this didn't help. After a BIOS upgrade, I found the promising option MAX TOLUD which was set to 2GB. I changed it to 3GB but nothing changed.

> Also, I'd like to know reasons that made you stick to 32 bit OS
> as we have pretty good support for 32 bit applications running under 64 bit
> system.

I (still) have 32 bit machines and don't want to maintain 2 userlands. Each machine has its own kernel but userland (updated via nfs) must remain 32 bit. Or is it possible to boot a 64 bit kernel and have everything else in 32 bit?
i386 with 4GB RAM: less than 2GB available on A2SAV (Intel Atom E3940)
I have a lot of machines running with 4 GB physical RAM and, for some reasons, I still have to use a 32 bits OS.

All of them show something between 3 and 3.5 GB of RAM available in dmesg but the brand new Supermicro A2SAV really shocked me:

  FreeBSD 11.1-STABLE #0: Mon Jan 15 06:57:10 CET 2018
  ...
  real memory = 4294967296 (4096 MB)
  avail memory = 1939558400 (1849 MB)
  ...

So do people have any ideas how I might get a bit closer to at least 3 GB? I assume there are no FreeBSD knobs which might help but hope dies last...
kernel build error on STABLE-11 with bktr
Is it just me seeing this or does nobody else use bktr compiled into the kernel anymore:

...
cc -target i386-unknown-freebsd11.0 --sysroot=/usr/obj/src/src-11/tmp -B/usr/obj/src/src-11/tmp/usr/bin -c -O -pipe -march=pentium3 -g -nostdinc -I. -I/src/src-11/sys -I/src/src-11/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -march=core2 -fno-ident -MD -MF.depend.msp34xx.o -MTmsp34xx.o -mno-mmx -mno-sse -msoft-float -ffreestanding -fwrapv -fstack-protector -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs -fdiagnostics-show-option -Wno-unknown-pragmas -Wno-error-tautological-compare -Wno-error-empty-body -Wno-error-parentheses-equality -Wno-error-unused-function -Wno-error-pointer-sign -Wno-error-shift-negative-value -mno-aes -mno-avx -std=iso9899:1999 -Werror /src/src-11/sys/dev/bktr/msp34xx.c
/src/src-11/sys/dev/bktr/msp34xx.c:112:18: error: unused variable 'bl_dfp' [-Werror,-Wunused-const-variable]
static const int bl_dfp[] = {
                 ^

Removing the bl_dfp definition fixes it...

-Andre
Re: Segmentation fault running ntpd
On Fri, 30-Oct-2015 at 19:47:59 +0100, Mark Martinec wrote: > Not sure if it's the same issue, but it sure looks like it is. > > I have upgraded a couple of hosts (amd64) from 10.2-RELEASE-p5 > to 10.2-RELEASE-p6, i.e. the freebsd-upgrade essentially just > replaced the /usr/sbin/ntpd with a new one; then I restarted > the ntpd. > > On all host but one this was successful: the new ntpd starts > fine and works normally. But on one of these machines the > ntpd process immediately crashes with SIGSEGV. That machine > has an Intel Xeon cpu. It is not apparent to me in what way > this machine differs from others, I'll add my observations here: I am using an ntp.conf with a single server entry: server ntp.some.domain.org ntp.some.domain.org is a CNAME pointing to gate.some.domain.org and the latter contains an A record pointing to 192.168.128.1. After updating 9.3-STABLE to the latest version (one which includes ntp 4.2.8p4), ntpd crashes: Nov 1 09:38:38 voyager kernel: pid 4443 (ntpd), uid 0: exited on signal 11 This happens in line 871 of ntpd.c where mlockall() is called: && 0 != mlockall(MCL_CURRENT|MCL_FUTURE)) It does NOT crash with MCL_FUTURE only. It does crash with MCL_CURRENT only. When adding rlimit memlock -1 to ntpd.conf it does NOT crash (as mlockall() won't be called anymore). When specifying the IP address (192.168.128.1) as the server it does NOT crash. When specifying gate.some.domain.org as the server it also does NOT crash. tcpdump shows in this case: 09:49:59.542310 IP 192.168.128.2.21102 > 192.168.128.1.53: 7639+ A? gate.some.domain.org. (41) 09:49:59.542578 IP 192.168.128.1.53 > 192.168.128.2.21102: 7639* 1/1/0 A 192.168.128.1 (71) 09:49:59.542612 IP 192.168.128.2.52455 > 192.168.128.1.53: 42047+ ? gate.some.domain.org. 
(41) 09:49:59.542792 IP 192.168.128.1.53 > 192.168.128.2.52455: 42047* 0/1/0 (88) When reverting the server entry back to ntp.some.domain.org it crashes and tcpdump shows: 09:36:05.172552 IP 192.168.128.2.17836 > 192.168.128.1.53: 49768+ A? ntp.some.domain.org. (40) 09:36:05.173320 IP 192.168.128.1.53 > 192.168.128.2.17836: 49768* 2/1/0 CNAME gate.some.domain.org., A 192.168.128.1 (89) 09:36:05.173361 IP 192.168.128.2.22611 > 192.168.128.1.53: 63808+ ? ntp.some.domain.org. (40) 09:36:05.173595 IP 192.168.128.1.53 > 192.168.128.2.22611: 63808* 1/1/0 CNAME gate.some.domain.org. (106) The probability for crashing increases with the speed and the number of cores of the machine: On my old single-core Pentiums it never crashes, on my quad-cores i7-3770K it always crashes. The (asynchronous) resolving of the names start in line 3876 of ntp_config.c: getaddrinfo_sometime(curr_peer->addr->address, If we put the mlockall() call directly before this line, the crash is gone. Maybe you want to play around with rlimit, CNAMES, IPs and so on... -Andre Anyone else seeing this? > 2015-10-30 12:34, je David Wolfskill napisal > > On Fri, Oct 30, 2015 at 09:42:07AM +0100, Dag-Erling Smørgrav wrote: > >> David Wolfskillwrites: > >> > ... > >> > bound to 172.17.1.245 -- renewal in 43200 seconds. > >> > pid 544 (ntpd), uid 0: exited on signal 11 (core dumped) > >> > Starting Network: lo0 em0 iwn0 lagg0. > >> > ... > >> > >> Did you find a solution? I'm wondering if the ntpd problems people > >> are > >> reporting on freebsd-security@ are related. I vaguely recall hearing > >> that this had been traced to a pthread bug, but can't find anything > >> about it in commit logs or mailing list archives. > >> > > > > I don't recall finding "a solution" per se; that said, I also don't > > recall seeing an occurrence of the above for enough time that I'm not > > sure when I sent that message. 
:-} > > > > As a reality check: > > > > g1-252(11.0-C)[1] ls -lT /*.core > > -rw-r--r-- 1 root wheel 13783040 Aug 18 04:19:03 2015 /ntpd.core > > g1-252(11.0-C)[2] > > > > So -- among other points -- my last sighting of whatever was causing > > that was the day I built: > > > > FreeBSD 11.0-CURRENT #157 r286880M/286880:1100079: Tue Aug 18 > > 04:45:25 PDT 2015 > > r...@g1-252.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY amd64 > > > > Note that the machines where I run head get updated daily (unless > > there's enough of a problem with head that I can't build it or can't > > boot it (and I'm unable to circumvent the issue within a reasonable > > time)) -- and while I do attempt to run ntpd on the machines, the above > > failure is more "annoying" than "crippling" in my particular case. > > > > And I'm presently running: > > > > FreeBSD 11.0-CURRENT #227 r290138M/290138:1100084: Thu Oct 29 > > 05:12:58 PDT 2015 > > r...@g1-252.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY amd64 > > > > and building head @r290190 as I type. > > > > And FWIW, I *suspect* that one of the issues involved (in my case) > > was a ... lack of determinism ... in events involving getting the > > (wireless)
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Thu, 04-Jul-2013 at 19:25:28 +0200, Konstantin Belousov wrote:
> On Thu, Jul 04, 2013 at 04:29:19PM +0200, Andre Albsmeier wrote:
> > OK, patch is applied. I will reboot the machine later and see what
> > happens tomorrow in the morning. However, it might take a few days
> > since the last 2 weeks all was fine.
> >
> > BTW, should this patch be used in general or is it just for debugging?
> > My understanding is that it is something which could stay in the code...
>
> Patch is to improve debugging.
>
> I probably commit it after the issue is closed. Arguments against
> the commit is that the change imposes small performance penalty
> due to save and restore of the %ebp (I doubt that this is measurable
> by any means). Also, arguably, such change should be done for all
> functions in support.s, but bcopy() is the hot spot.

Thanks to this patch, we (you ;-) were able to track down the problem. So how are we going to deal with this debugging patch itself? My suggestion would be to #ifdef it somehow, so that on the one hand it will be available in future (and with bcopy being used a lot, the probability is high it might help in other places), and on the other hand it won't steal cycles during normal use.

-Andre
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Fri, 12-Jul-2013 at 08:01:12 +0200, Konstantin Belousov wrote: On Fri, Jul 12, 2013 at 07:24:40AM +0200, Andre Albsmeier wrote: On Thu, 04-Jul-2013 at 19:25:28 +0200, Konstantin Belousov wrote: On Thu, Jul 04, 2013 at 04:29:19PM +0200, Andre Albsmeier wrote: OK, patch is applied. I will reboot the machine later and see what happens tomorrow in the morning. However, it might take a few days since the last 2 weeks all was fine. BTW, should this patch be used in general or is it just for debugging? My understanding is that it is something which could stay in the code... Patch is to improve debugging. I probably commit it after the issue is closed. Arguments against the commit is that the change imposes small performance penalty due to save and restore of the %ebp (I doubt that this is measureable by any means). Also, arguably, such change should be done for all functions in support.s, but bcopy() is the hot spot. Got a new one, 2 hours old ;-) GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-marcel-freebsd... Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode fault virtual address = 0xcd5ec000 fault code = supervisor write, page not present instruction pointer = 0x20:0xc07cb2fe stack pointer = 0x28:0xd82e45cc frame pointer = 0x28:0xd82e45d4 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 18714 (mksnap_ffs) trap number = 12 panic: page fault KDB: stack backtrace: db_trace_self_wrapper(c08207eb,d82e4418,c05fdfc9,c081df13,c08a82e0,...) 
at db_trace_self_wrapper+0x26/frame 0xd82e43e8 kdb_backtrace(c081df13,c08a82e0,c0801bfa,d82e4424,d82e4424,...) at kdb_backtrace+0x29/frame 0xd82e43f4 panic(c0801bfa,c0845a01,c2b067d4,1,1,...) at panic+0xc9/frame 0xd82e4418 trap_fatal(c0ff6000,cd5ec000,2,0,c08b6bf4,...) at trap_fatal+0x353/frame 0xd82e4458 trap_pfault(baa8454b,21510,0,c2b06620,c08b6bf0,...) at trap_pfault+0x2d7/frame 0xd82e44a0 trap(d82e458c) at trap+0x41a/frame 0xd82e4580 calltrap() at calltrap+0x6/frame 0xd82e4580 --- trap 0xc, eip = 0xc07cb2fe, esp = 0xd82e45cc, ebp = 0xd82e45d4 --- bcopy(c36ed000,cd5e6000,8000,8000,c281b980,...) at bcopy+0x1a/frame 0xd82e45d4 ffs_snapshot(c2b35a90,c2ed0400,0,0,0,...) at ffs_snapshot+0x2933/frame 0xd82e490c ffs_mount(c2b35a90,c322e200,ff,d82e4c08,c2ccbc8c,...) at ffs_mount+0x15ee/frame 0xd82e4a3c vfs_donmount(c2b06620,10313108,0,c2b74d80,c2b74d80,...) at vfs_donmount+0x196b/frame 0xd82e4c2c sys_nmount(c2b06620,d82e4ccc,c2b06908,d82e4c6c,c0605015,...) at sys_nmount+0x63/frame 0xd82e4c50 syscall(d82e4d08) at syscall+0x2ce/frame 0xd82e4cfc Xint0x80_syscall() at Xint0x80_syscall+0x21/frame 0xd82e4cfc --- syscall (378, FreeBSD ELF32, sys_nmount), eip = 0x180bdf37, esp = 0xbfbfd65c, ebp = 0xbfbfddd8 --- Uptime: 4d20h0m44s Physical memory: 503 MB Dumping 104 MB: 89 73 57 41 25 9 No symbol stopped_cpus in current context. No symbol stoppcbs in current context. #0 doadump (textdump=1) at pcpu.h:249 249 pcpu.h: No such file or directory. 
in pcpu.h (kgdb) where #0 doadump (textdump=1) at pcpu.h:249 #1 0xc05f in kern_reboot (howto=260) at /src/src-9/sys/kern/kern_shutdown.c:449 #2 0xc05fe028 in panic (fmt=value optimized out) at /src/src-9/sys/kern/kern_shutdown.c:637 #3 0xc07cd1d3 in trap_fatal (frame=0xd82e458c, eva=3445538816) at /src/src-9/sys/i386/i386/trap.c:1044 #4 0xc07cd4b7 in trap_pfault (frame=0xd82e458c, usermode=0, eva=3445538816) at /src/src-9/sys/i386/i386/trap.c:957 #5 0xc07ce05a in trap (frame=0xd82e458c) at /src/src-9/sys/i386/i386/trap.c:555 #6 0xc07ba88c in calltrap () at /src/src-9/sys/i386/i386/exception.s:170 #7 0xc07cb2fe in bcopy () at /src/src-9/sys/i386/i386/support.s:198 #8 0xc072be13 in ffs_snapshot (mp=0xc2b35a90, snapfile=0xc2ed0400 s5-2013.07.12-03.15.01) at /src/src-9/sys/ufs/ffs/ffs_snapshot.c:793 #9 0xc0748e8e in ffs_mount (mp=0xc2b35a90) at /src/src-9/sys/ufs/ffs/ffs_vfsops.c:483 #10 0xc068a72b in vfs_donmount (td=0xc2b06620, fsflags=271659272, fsoptions=0xc2b74d80) at /src/src-9/sys/kern/vfs_mount.c:948 #11 0xc068a8e3 in sys_nmount (td=0xc2b06620, uap=0xd82e4ccc) at /src/src-9/sys/kern/vfs_mount.c:417 #12 0xc07cd7ae in syscall (frame=0xd82e4d08) at subr_syscall.c:135 #13 0xc07ba8f1
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Fri, 12-Jul-2013 at 08:35:33 +0200, Konstantin Belousov wrote: On Fri, Jul 12, 2013 at 08:05:27AM +0200, Andre Albsmeier wrote: On Fri, 12-Jul-2013 at 08:01:12 +0200, Konstantin Belousov wrote: On Fri, Jul 12, 2013 at 07:24:40AM +0200, Andre Albsmeier wrote: On Thu, 04-Jul-2013 at 19:25:28 +0200, Konstantin Belousov wrote: On Thu, Jul 04, 2013 at 04:29:19PM +0200, Andre Albsmeier wrote: OK, patch is applied. I will reboot the machine later and see what happens tomorrow in the morning. However, it might take a few days since the last 2 weeks all was fine. BTW, should this patch be used in general or is it just for debugging? My understanding is that it is something which could stay in the code... Patch is to improve debugging. I probably commit it after the issue is closed. Arguments against the commit is that the change imposes small performance penalty due to save and restore of the %ebp (I doubt that this is measureable by any means). Also, arguably, such change should be done for all functions in support.s, but bcopy() is the hot spot. Got a new one, 2 hours old ;-) GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-marcel-freebsd... 
Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode fault virtual address = 0xcd5ec000 fault code = supervisor write, page not present instruction pointer = 0x20:0xc07cb2fe stack pointer = 0x28:0xd82e45cc frame pointer = 0x28:0xd82e45d4 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 18714 (mksnap_ffs) trap number = 12 panic: page fault KDB: stack backtrace: db_trace_self_wrapper(c08207eb,d82e4418,c05fdfc9,c081df13,c08a82e0,...) at db_trace_self_wrapper+0x26/frame 0xd82e43e8 kdb_backtrace(c081df13,c08a82e0,c0801bfa,d82e4424,d82e4424,...) at kdb_backtrace+0x29/frame 0xd82e43f4 panic(c0801bfa,c0845a01,c2b067d4,1,1,...) at panic+0xc9/frame 0xd82e4418 trap_fatal(c0ff6000,cd5ec000,2,0,c08b6bf4,...) at trap_fatal+0x353/frame 0xd82e4458 trap_pfault(baa8454b,21510,0,c2b06620,c08b6bf0,...) at trap_pfault+0x2d7/frame 0xd82e44a0 trap(d82e458c) at trap+0x41a/frame 0xd82e4580 calltrap() at calltrap+0x6/frame 0xd82e4580 --- trap 0xc, eip = 0xc07cb2fe, esp = 0xd82e45cc, ebp = 0xd82e45d4 --- bcopy(c36ed000,cd5e6000,8000,8000,c281b980,...) at bcopy+0x1a/frame 0xd82e45d4 ffs_snapshot(c2b35a90,c2ed0400,0,0,0,...) at ffs_snapshot+0x2933/frame 0xd82e490c ffs_mount(c2b35a90,c322e200,ff,d82e4c08,c2ccbc8c,...) at ffs_mount+0x15ee/frame 0xd82e4a3c vfs_donmount(c2b06620,10313108,0,c2b74d80,c2b74d80,...) at vfs_donmount+0x196b/frame 0xd82e4c2c sys_nmount(c2b06620,d82e4ccc,c2b06908,d82e4c6c,c0605015,...) at sys_nmount+0x63/frame 0xd82e4c50 syscall(d82e4d08) at syscall+0x2ce/frame 0xd82e4cfc Xint0x80_syscall() at Xint0x80_syscall+0x21/frame 0xd82e4cfc --- syscall (378, FreeBSD ELF32, sys_nmount), eip = 0x180bdf37, esp = 0xbfbfd65c, ebp = 0xbfbfddd8 --- Uptime: 4d20h0m44s Physical memory: 503 MB Dumping 104 MB: 89 73 57 41 25 9 No symbol stopped_cpus in current context. No symbol stoppcbs in current context. 
#0 doadump (textdump=1) at pcpu.h:249 249 pcpu.h: No such file or directory. in pcpu.h (kgdb) where #0 doadump (textdump=1) at pcpu.h:249 #1 0xc05f in kern_reboot (howto=260) at /src/src-9/sys/kern/kern_shutdown.c:449 #2 0xc05fe028 in panic (fmt=value optimized out) at /src/src-9/sys/kern/kern_shutdown.c:637 #3 0xc07cd1d3 in trap_fatal (frame=0xd82e458c, eva=3445538816) at /src/src-9/sys/i386/i386/trap.c:1044 #4 0xc07cd4b7 in trap_pfault (frame=0xd82e458c, usermode=0, eva=3445538816) at /src/src-9/sys/i386/i386/trap.c:957 #5 0xc07ce05a in trap (frame=0xd82e458c) at /src/src-9/sys/i386/i386/trap.c:555 #6 0xc07ba88c in calltrap () at /src/src-9/sys/i386/i386/exception.s:170 #7 0xc07cb2fe in bcopy () at /src/src-9/sys/i386/i386/support.s:198 #8 0xc072be13 in ffs_snapshot (mp=0xc2b35a90, snapfile=0xc2ed0400 s5-2013.07.12-03.15.01) at /src/src-9/sys/ufs/ffs/ffs_snapshot.c:793 #9 0xc0748e8e in ffs_mount (mp=0xc2b35a90) at /src/src-9
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Thu, 04-Jul-2013 at 19:25:28 +0200, Konstantin Belousov wrote:
> On Thu, Jul 04, 2013 at 04:29:19PM +0200, Andre Albsmeier wrote:
> > OK, patch is applied. I will reboot the machine later and see what
> > happens tomorrow in the morning. However, it might take a few days
> > since the last 2 weeks all was fine.
> >
> > BTW, should this patch be used in general or is it just for
> > debugging? My understanding is that it is something which could
> > stay in the code...
>
> Patch is to improve debugging. I probably commit it after the issue
> is closed. Arguments against the commit is that the change imposes
> small performance penalty due to save and restore of the %ebp (I
> doubt that this is measureable by any means). Also, arguably, such
> change should be done for all functions in support.s, but bcopy() is
> the hot spot.

Got a new one, 2 hours old ;-)

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as i386-marcel-freebsd...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xcd5ec000
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc07cb2fe
stack pointer           = 0x28:0xd82e45cc
frame pointer           = 0x28:0xd82e45d4
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 18714 (mksnap_ffs)
trap number             = 12
panic: page fault
KDB: stack backtrace:
db_trace_self_wrapper(c08207eb,d82e4418,c05fdfc9,c081df13,c08a82e0,...) at db_trace_self_wrapper+0x26/frame 0xd82e43e8
kdb_backtrace(c081df13,c08a82e0,c0801bfa,d82e4424,d82e4424,...) at kdb_backtrace+0x29/frame 0xd82e43f4
panic(c0801bfa,c0845a01,c2b067d4,1,1,...) at panic+0xc9/frame 0xd82e4418
trap_fatal(c0ff6000,cd5ec000,2,0,c08b6bf4,...) at trap_fatal+0x353/frame 0xd82e4458
trap_pfault(baa8454b,21510,0,c2b06620,c08b6bf0,...) at trap_pfault+0x2d7/frame 0xd82e44a0
trap(d82e458c) at trap+0x41a/frame 0xd82e4580
calltrap() at calltrap+0x6/frame 0xd82e4580
--- trap 0xc, eip = 0xc07cb2fe, esp = 0xd82e45cc, ebp = 0xd82e45d4 ---
bcopy(c36ed000,cd5e6000,8000,8000,c281b980,...) at bcopy+0x1a/frame 0xd82e45d4
ffs_snapshot(c2b35a90,c2ed0400,0,0,0,...) at ffs_snapshot+0x2933/frame 0xd82e490c
ffs_mount(c2b35a90,c322e200,ff,d82e4c08,c2ccbc8c,...) at ffs_mount+0x15ee/frame 0xd82e4a3c
vfs_donmount(c2b06620,10313108,0,c2b74d80,c2b74d80,...) at vfs_donmount+0x196b/frame 0xd82e4c2c
sys_nmount(c2b06620,d82e4ccc,c2b06908,d82e4c6c,c0605015,...) at sys_nmount+0x63/frame 0xd82e4c50
syscall(d82e4d08) at syscall+0x2ce/frame 0xd82e4cfc
Xint0x80_syscall() at Xint0x80_syscall+0x21/frame 0xd82e4cfc
--- syscall (378, FreeBSD ELF32, sys_nmount), eip = 0x180bdf37, esp = 0xbfbfd65c, ebp = 0xbfbfddd8 ---
Uptime: 4d20h0m44s
Physical memory: 503 MB
Dumping 104 MB: 89 73 57 41 25 9

No symbol "stopped_cpus" in current context.
No symbol "stoppcbs" in current context.
#0 doadump (textdump=1) at pcpu.h:249
249 pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0 doadump (textdump=1) at pcpu.h:249
#1 0xc05f in kern_reboot (howto=260) at /src/src-9/sys/kern/kern_shutdown.c:449
#2 0xc05fe028 in panic (fmt=value optimized out) at /src/src-9/sys/kern/kern_shutdown.c:637
#3 0xc07cd1d3 in trap_fatal (frame=0xd82e458c, eva=3445538816) at /src/src-9/sys/i386/i386/trap.c:1044
#4 0xc07cd4b7 in trap_pfault (frame=0xd82e458c, usermode=0, eva=3445538816) at /src/src-9/sys/i386/i386/trap.c:957
#5 0xc07ce05a in trap (frame=0xd82e458c) at /src/src-9/sys/i386/i386/trap.c:555
#6 0xc07ba88c in calltrap () at /src/src-9/sys/i386/i386/exception.s:170
#7 0xc07cb2fe in bcopy () at /src/src-9/sys/i386/i386/support.s:198
#8 0xc072be13 in ffs_snapshot (mp=0xc2b35a90, snapfile=0xc2ed0400 "s5-2013.07.12-03.15.01") at /src/src-9/sys/ufs/ffs/ffs_snapshot.c:793
#9 0xc0748e8e in ffs_mount (mp=0xc2b35a90) at /src/src-9/sys/ufs/ffs/ffs_vfsops.c:483
#10 0xc068a72b in vfs_donmount (td=0xc2b06620, fsflags=271659272, fsoptions=0xc2b74d80) at /src/src-9/sys/kern/vfs_mount.c:948
#11 0xc068a8e3 in sys_nmount (td=0xc2b06620, uap=0xd82e4ccc) at /src/src-9/sys/kern/vfs_mount.c:417
#12 0xc07cd7ae in syscall (frame=0xd82e4d08) at subr_syscall.c:135
#13 0xc07ba8f1 in Xint0x80_syscall () at /src/src-9/sys/i386/i386/exception.s:270
#14 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

Hth,

-Andre
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail
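For anyone reading the trace above, it may help to see where inside the copy the fault hit. The following is only a rough sketch: it assumes the first three words DDB prints for bcopy() are really (src, dst, len), which matches bcopy()'s prototype but is not guaranteed for stack-dumped arguments, and it assumes the usual 4 KB i386 page size. The variable names are made up for illustration.

```python
# Values copied from the trace: bcopy(c36ed000,cd5e6000,8000,...) and
# "fault virtual address = 0xcd5ec000". Reading them as (src, dst, len)
# is an assumption, not something the trace itself states.
src = 0xC36ED000
dst = 0xCD5E6000
length = 0x8000
fault_va = 0xCD5EC000

PAGE_SIZE = 4096  # assumed i386 base page size

# kgdb prints the same address in decimal as eva=3445538816.
eva_from_kgdb = 3445538816

offset = fault_va - dst              # how far into the destination buffer
pages_before_fault = offset // PAGE_SIZE
total_pages = length // PAGE_SIZE

print("fault offset into dst:", hex(offset))
print("pages copied before the fault:", pages_before_fault, "of", total_pages)
print("eva matches fault address:", eva_from_kgdb == fault_va)
print("fault lies inside dst buffer:", dst <= fault_va < dst + length)
```

Under those assumptions the write faulted on a page boundary partway through the 32 KB destination buffer, not past its end, which fits the "supervisor write, page not present" fault code.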
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Sun, 07-Jul-2013 at 14:32:17 +0200, Jeremy Chadwick wrote: On Sun, Jul 07, 2013 at 02:13:54PM +0200, Andre Albsmeier wrote: On Sun, 07-Jul-2013 at 09:41:12 +0200, Konstantin Belousov wrote: On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote: OK, here we go (looks better now): GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-marcel-freebsd... Unread portion of the kernel message buffer: dev = stripe/p, block = 592, fs = /palveli panic: ffs_blkfree_cg: freeing free block KDB: stack backtrace: db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) at db_trace_self_wrapper+0x26/frame 0xd70fc8f4 kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at kdb_backtrace+0x29/frame 0xd70fc900 panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 0xd70fc924 ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at ffs_blkfree_cg+0x399/frame 0xd70fc9c8 ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at ffs_blkfree+0xad/frame 0xd70fca00 indir_trunc(fffa3ff4,,0,8000,0,...) at indir_trunc+0x658/frame 0xd70fcae0 indir_trunc(dff3,,c072df0a,c2d68d00,c087abd8,...) at indir_trunc+0x514/frame 0xd70fcbc0 handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24 process_worklist_item(0,0,0,c086ae78,0,...) at process_worklist_item+0x27a/frame 0xd70fcc6c softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at softdep_process_worklist+0x91/frame 0xd70fcc9c softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) 
at softdep_flush+0x3e4/frame 0xd70f fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4 fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4 --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 --- Uptime: 2d16h29m37s Physical memory: 503 MB Dumping 95 MB: 80 64 48 32 16 No symbol stopped_cpus in current context. No symbol stoppcbs in current context. #0 doadump (textdump=1) at pcpu.h:249 249 pcpu.h: No such file or directory. in pcpu.h (kgdb) where #0 doadump (textdump=1) at pcpu.h:249 #1 0xc05f in kern_reboot (howto=260) at /src/src-9/sys/kern/kern_shutdown.c:449 #2 0xc05fe028 in panic (fmt=value optimized out) at /src/src-9/sys/kern/kern_shutdown.c:637 #3 0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, devvp=0xc2b0d470, bno=592, size=32768, inum=1183, dephd=0xd70fcad0) at /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151 #4 0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, devvp=0xc2b0d470, bno=592, size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280 #5 0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, lbn=-376844) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965 #6 0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, lbn=-8205) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946 #7 0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, flags=512) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588 #8 0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, flags=512) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774 #9 0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558 #10 0xc0738f94 in softdep_flush () at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414 #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 softdep_flush, arg=0x0, frame=0xd70fcd08) at /src/src-9/sys/kern/kern_fork.c:988 #12 0xc07ba904 in fork_trampoline () at /src/src-9/sys/i386/i386/exception.s:279 (kgdb) up 10 #10 0xc0738f94 in softdep_flush () at 
/src/src-9/sys/ufs/ffs/ffs_softdep.c:1414 1414progress += softdep_process_worklist(mp, 0); -Andre This looks unrelated, and exactly this panic is usually has one of two causes: - corrupted filesystem, run fsck to recheck it; root@palveli:~fsck /dev/stripe/p ** /dev/stripe/p ** Last Mounted on /palveli ** Phase 1 - Check Blocks and Sizes ** Phase 2 - Check Pathnames ** Phase 3 - Check Connectivity ** Phase 4 - Check Reference Counts ** Phase 5 - Check Cyl groups 9895 files, 2039706 used, 15697693 free (5397 frags, 1961537 blocks, 0.0% fragmentation) * FILE SYSTEM IS CLEAN * Taken from your previous mail (showing
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Thu, 04-Jul-2013 at 08:15:50 +0200, Konstantin Belousov wrote: On Thu, Jul 04, 2013 at 07:27:00AM +0200, Andre Albsmeier wrote: On Thu, 04-Jul-2013 at 07:24:40 +0200, Konstantin Belousov wrote: On Thu, Jul 04, 2013 at 07:14:09AM +0200, Andre Albsmeier wrote: On Mon, 17-Jun-2013 at 21:30:31 +0200, John Baldwin wrote: On Sunday, June 16, 2013 2:39:42 am Andre Albsmeier wrote: On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 kern.crit palveli kernel: lock order reversal: May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 kern.crit palveli kernel: lock order reversal: May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ 
/src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on an UFS2 on another machine and on a third one when running dump -L on a root fs. Any hints of how to proceed? Would it be possible to setup a serial console that is logged on this machine to see if it is panic'ing but failing to write out a crashdump? Couldn't attach the serial console yet ;-(. But I had people attach a KVMoverIP switch and enabled the various KDB options in the kernel. Now we can see a bit more (see below) -- no crashdump is being generated though. :( Unfortunately these LORs don't really help with discerning the cause of the reboot. If you have remote power access (and still wanted to test this) one option would be to change KDB to drop into the debugger on a panic. Then you could connect over the KVM and take images of the original panic along with a stack trace. After a few days of no problems, the box decided to crash during mksnap_ffs today ;-(. But now I have a crashdump, see below. Unfortunatley, I cannot upload the dump somewhere but if you ask me check whatever things I'll be happy to help. kgdb /usr/obj/src/src-9/sys/palveli/kernel.debug vmcore.4 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-marcel-freebsd... 
Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode fault virtual address = 0xcfb5e000 fault code = supervisor write, page not present instruction pointer = 0x20:0xc07cb2fe stack pointer = 0x28:0xd83545d0 frame pointer = 0x28:0xd835490c code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 12929 (mksnap_ffs) trap number = 12 panic: page fault KDB: stack backtrace: db_trace_self_wrapper(c08207eb,d835441c
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Sun, 07-Jul-2013 at 09:41:12 +0200, Konstantin Belousov wrote:
> On Sun, Jul 07, 2013 at 09:25:53AM +0200, Andre Albsmeier wrote:
> > OK, here we go (looks better now):
> >
> > GNU gdb 6.1.1 [FreeBSD]
> > Copyright 2004 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > This GDB was configured as i386-marcel-freebsd...
> >
> > Unread portion of the kernel message buffer:
> > dev = stripe/p, block = 592, fs = /palveli
> > panic: ffs_blkfree_cg: freeing free block
> > KDB: stack backtrace:
> > db_trace_self_wrapper(c08207eb,d70fc924,c05fdfc9,c081df13,c08a82e0,...) at db_trace_self_wrapper+0x26/frame 0xd70fc8f4
> > kdb_backtrace(c081df13,c08a82e0,c0833a0b,d70fc930,d70fc930,...) at kdb_backtrace+0x29/frame 0xd70fc900
> > panic(c0833a0b,c2aae178,250,0,c2af80d4,...) at panic+0xc9/frame 0xd70fc924
> > ffs_blkfree_cg(250,0,8000,49f,d70fcad0,...) at ffs_blkfree_cg+0x399/frame 0xd70fc9c8
> > ffs_blkfree(c2b35100,c2af8000,c2b0d470,250,0,...) at ffs_blkfree+0xad/frame 0xd70fca00
> > indir_trunc(fffa3ff4,,0,8000,0,...) at indir_trunc+0x658/frame 0xd70fcae0
> > indir_trunc(dff3,,c072df0a,c2d68d00,c087abd8,...) at indir_trunc+0x514/frame 0xd70fcbc0
> > handle_workitem_freeblocks(0,d70fcc4c,2,246,c2ab1000,...) at handle_workitem_freeblocks+0x2dc/frame 0xd70fcc24
> > process_worklist_item(0,0,0,c086ae78,0,...) at process_worklist_item+0x27a/frame 0xd70fcc6c
> > softdep_process_worklist(c2b36548,0,54,c0835825,64,...) at softdep_process_worklist+0x91/frame 0xd70fcc9c
> > softdep_flush(0,d70fcd08,0,c2aac2f0,0,...) at softdep_flush+0x3e4/frame 0xd70f
> > fork_exit(c0738bb0,0,d70fcd08) at fork_exit+0xa2/frame 0xd70fccf4
> > fork_trampoline() at fork_trampoline+0x8/frame 0xd70fccf4
> > --- trap 0, eip = 0, esp = 0xd70fcd40, ebp = 0 ---
> > Uptime: 2d16h29m37s
> > Physical memory: 503 MB
> > Dumping 95 MB: 80 64 48 32 16
> >
> > No symbol "stopped_cpus" in current context.
> > No symbol "stoppcbs" in current context.
> > #0 doadump (textdump=1) at pcpu.h:249
> > 249 pcpu.h: No such file or directory.
> >         in pcpu.h
> > (kgdb) where
> > #0 doadump (textdump=1) at pcpu.h:249
> > #1 0xc05f in kern_reboot (howto=260) at /src/src-9/sys/kern/kern_shutdown.c:449
> > #2 0xc05fe028 in panic (fmt=value optimized out) at /src/src-9/sys/kern/kern_shutdown.c:637
> > #3 0xc0717899 in ffs_blkfree_cg (ump=0xc2b35100, fs=0xc2af8000, devvp=0xc2b0d470, bno=592, size=32768, inum=1183, dephd=0xd70fcad0) at /src/src-9/sys/ufs/ffs/ffs_alloc.c:2151
> > #4 0xc0717c8d in ffs_blkfree (ump=0xc2b35100, fs=0xc2af8000, devvp=0xc2b0d470, bno=592, size=32768, inum=1183, vtype=VREG, dephd=0xd70fcad0) at /src/src-9/sys/ufs/ffs/ffs_alloc.c:2280
> > #5 0xc0730348 in indir_trunc (freework=0xc2f99100, dbn=1642816, lbn=-376844) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7965
> > #6 0xc0730204 in indir_trunc (freework=0xc2f99100, dbn=1639680, lbn=-8205) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7946
> > #7 0xc07324bc in handle_workitem_freeblocks (freeblks=0xc2fc1e00, flags=512) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:7588
> > #8 0xc0730dfa in process_worklist_item (mp=0xc2b36548, target=10, flags=512) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1774
> > #9 0xc07360c1 in softdep_process_worklist (mp=0xc2b36548, full=0) at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1558
> > #10 0xc0738f94 in softdep_flush () at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > #11 0xc05d1b82 in fork_exit (callout=0xc0738bb0 softdep_flush, arg=0x0, frame=0xd70fcd08) at /src/src-9/sys/kern/kern_fork.c:988
> > #12 0xc07ba904 in fork_trampoline () at /src/src-9/sys/i386/i386/exception.s:279
> > (kgdb) up 10
> > #10 0xc0738f94 in softdep_flush () at /src/src-9/sys/ufs/ffs/ffs_softdep.c:1414
> > 1414            progress += softdep_process_worklist(mp, 0);
> >
> > -Andre
>
> This looks unrelated, and exactly this panic is usually has one of two
> causes:
> - corrupted filesystem, run fsck to recheck it;

root@palveli:~fsck /dev/stripe/p
** /dev/stripe/p
** Last Mounted on /palveli
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
9895 files, 2039706 used, 15697693 free (5397 frags, 1961537 blocks, 0.0% fragmentation)

* FILE SYSTEM IS CLEAN *

> - faulty hardware, most likely RAM, but might be CPU/CPU cache/bus.

Well, of course I cannot prove that this is not the case. But the box
runs flawlessly otherwise. RAM is ECC monitored, PSU is OK and airflow
is OK. Sure, I can't look inside of CPU etc.

> Is it the same machine where the bcopy panic occured ?

Yes. Let's see what it does the next days...

-Andre
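As a side note for matching the two traces in this mail: DDB prints raw stack words in hex, while kgdb prints typed arguments in decimal. The sketch below lines up the ffs_blkfree_cg() words with the kgdb frame. It assumes (my reading, not stated anywhere in the thread) that the 64-bit block number shows up as two 32-bit words, low word first, followed by size and inum.

```python
# Leading words DDB printed for ffs_blkfree_cg(250,0,8000,49f,...).
ddb_words = ["250", "0", "8000", "49f"]

# Arguments kgdb printed for the same frame (#3).
kgdb_args = {"bno": 592, "size": 32768, "inum": 1183}

words = [int(w, 16) for w in ddb_words]

# Assumption: bno is a 64-bit value passed as (low word, high word).
bno = (words[1] << 32) | words[0]
size = words[2]
inum = words[3]

decoded = {"bno": bno, "size": size, "inum": inum}
for name, value in decoded.items():
    status = "matches" if value == kgdb_args[name] else "differs from"
    print(f"{name}: 0x{value:x} = {value} ({status} kgdb)")
```

With that reading, block 0x250 is the same "block = 592" that the panic message reports for /palveli.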
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Thu, 04-Jul-2013 at 19:25:28 +0200, Konstantin Belousov wrote:
> On Thu, Jul 04, 2013 at 04:29:19PM +0200, Andre Albsmeier wrote:
> > OK, patch is applied. I will reboot the machine later and see what
> > happens tomorrow in the morning. However, it might take a few days
> > since the last 2 weeks all was fine.
> >
> > BTW, should this patch be used in general or is it just for
> > debugging? My understanding is that it is something which could
> > stay in the code...
>
> Patch is to improve debugging.

That's what I suspected.

> I probably commit it after the issue is closed. Arguments against the
> commit is that the change imposes small performance penalty due to
> save and restore of the %ebp (I doubt that this is measureable by any
> means). Also, arguably, such change should be done for all functions
> in support.s, but bcopy() is the hot spot.

Hmm, maybe it could simply get enabled conditionally through some
kernel compile option which is off by default...

-Andre
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Mon, 17-Jun-2013 at 21:30:31 +0200, John Baldwin wrote:
> On Sunday, June 16, 2013 2:39:42 am Andre Albsmeier wrote:
> > On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote:
> > > On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote:
> > > > Each day at 5:15 we are generating snapshots on various machines.
> > > > This used to work perfectly under 7-STABLE for years but since
> > > > we started to use 9.1-STABLE the machine reboots in about 10%
> > > > of all cases.
> > > >
> > > > After rebooting we find a new snapshot file which is a bit
> > > > smaller than the good ones and with different permissions
> > > > It does not succeed a fsck. In this example it is the one whose
> > > > name is beginning with s3:
> > > >
> > > > -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04
> > > > -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03
> > > > -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44
> > > > -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03
> > > > -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03
> > > >
> > > > After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel
> > > > I see the following LORs (mksnap_ffs starts exactly at 5:15):
> > > >
> > > > May 29 05:15:00 kern.crit palveli kernel: lock order reversal:
> > > > May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
> > > > May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
> > > > May 29 05:15:04 kern.crit palveli kernel: lock order reversal:
> > > > May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
> > > > May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626
> > > >
> > > > Unfortunatley no corefiles are being generated ;-(.
> > > >
> > > > I have checked and even rebuilt the (UFS1) fs in question
> > > > from scratch. I have also seen this happen on an UFS2 on
> > > > another machine and on a third one when running dump -L
> > > > on a root fs.
> > > >
> > > > Any hints of how to proceed?
> > >
> > > Would it be possible to setup a serial console that is logged on
> > > this machine to see if it is panic'ing but failing to write out a
> > > crashdump?
> >
> > Couldn't attach the serial console yet ;-(. But I had people attach
> > a KVMoverIP switch and enabled the various KDB options in the
> > kernel. Now we can see a bit more (see below) -- no crashdump is
> > being generated though.
>
> :(
>
> Unfortunately these LORs don't really help with discerning the cause
> of the reboot. If you have remote power access (and still wanted to
> test this) one option would be to change KDB to drop into the
> debugger on a panic. Then you could connect over the KVM and take
> images of the original panic along with a stack trace.

After making core dumps work actually (dump device was stopped and
FreeBSD-9 doesn't start it automatically) and upgrading to a recent
version of 9.1-STABLE it _seems_ that the troubles are gone. In case
the problem reappears I'll come back ;-).

Thanks,

-Andre

> --
> John Baldwin

--
FreeBSD is the most powerful OS. NetBSD is the most portable OS.
OpenBSD is the most secure OS. Windoze is the most popular OS.
Linux is no OS.
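In case someone wants to collect the lock order reversals from a longer syslog, here is a small sketch that pairs up the "1st"/"2nd" lines quoted in this thread. The regular expression is only fitted to the line layout shown above and is an assumption about the general format; parse_lors and LOCK_RE are names invented for this example.

```python
import re

# Log lines copied from the original report.
LOG = """\
May 29 05:15:00 kern.crit palveli kernel: lock order reversal:
May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
May 29 05:15:04 kern.crit palveli kernel: lock order reversal:
May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626
"""

# Matches e.g. "kernel: 1st 0xc2371da8 ufs (ufs) @ /path/file.c:1240".
LOCK_RE = re.compile(
    r"kernel: (1st|2nd) (0x[0-9a-f]+) (\S+) \((\S+)\) @ (\S+):(\d+)$"
)


def parse_lors(text):
    """Return a list of ((name, file, line), (name, file, line)) pairs."""
    reversals = []
    first = None
    for line in text.splitlines():
        m = LOCK_RE.search(line)
        if m is None:
            continue  # header lines and unrelated messages
        order, _addr, name, _lock_class, path, lineno = m.groups()
        entry = (name, path, int(lineno))
        if order == "1st":
            first = entry
        elif first is not None:
            reversals.append((first, entry))
            first = None
    return reversals


for first, second in parse_lors(LOG):
    print(f"{first[0]} -> {second[0]}: "
          f"{first[1]}:{first[2]} then {second[1]}:{second[2]}")
```

This only groups the messages; as noted in the thread, the LORs by themselves did not explain the reboot.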
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Mon, 17-Jun-2013 at 21:30:31 +0200, John Baldwin wrote: On Sunday, June 16, 2013 2:39:42 am Andre Albsmeier wrote: On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 kern.crit palveli kernel: lock order reversal: May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 kern.crit palveli kernel: lock order reversal: May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on an UFS2 on another machine and on a third one when running dump -L on a root fs. 
Any hints of how to proceed?

Would it be possible to set up a serial console that is logged on this machine to see if it is panicking but failing to write out a crashdump?

Couldn't attach the serial console yet ;-(. But I had people attach a KVM-over-IP switch and enabled the various KDB options in the kernel. Now we can see a bit more (see below) -- no crashdump is being generated though.

:( Unfortunately these LORs don't really help with discerning the cause of the reboot. If you have remote power access (and still wanted to test this) one option would be to change KDB to drop into the debugger on a panic. Then you could connect over the KVM and take images of the original panic along with a stack trace.

After a few days of no problems, the box decided to crash during mksnap_ffs today ;-(. But now I have a crashdump, see below. Unfortunately, I cannot upload the dump anywhere, but if you ask me to check anything I'll be happy to help.

kgdb /usr/obj/src/src-9/sys/palveli/kernel.debug vmcore.4
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as i386-marcel-freebsd...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xcfb5e000
fault code = supervisor write, page not present
instruction pointer = 0x20:0xc07cb2fe
stack pointer = 0x28:0xd83545d0
frame pointer = 0x28:0xd835490c
code segment = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12929 (mksnap_ffs)
trap number = 12
panic: page fault
KDB: stack backtrace:
db_trace_self_wrapper(c08207eb,d835441c,c05fdfc9,c081df13,c08a82e0,...) at db_trace_self_wrapper+0x26/frame 0xd83543ec
kdb_backtrace(c081df13,c08a82e0,c0801bfa,d8354428,d8354428,...) at kdb_backtrace+0x29/frame 0xd83543f8
panic(c0801bfa,c0845a01,c2bafae4,1,1,...) at panic+0xc9/frame 0xd835441c
trap_fatal(c0ff6000,cfb5e000,2,0,265abf,...) at trap_fatal+0x353/frame 0xd835445c
trap_pfault(140da,0,c2baf930,c08b6a40,c282145c,...) at trap_pfault+0x2d7/frame 0xd83544a4
trap(d8354590) at trap+0x41a/frame 0xd8354584
calltrap() at calltrap+0x6/frame 0xd8354584
--- trap 0xc, eip = 0xc07cb2fe, esp = 0xd83545d0, ebp = 0xd835490c ---
bcopy(c2b36548,c2f194e0,0,0,0,...) at bcopy+0x1a/frame 0xd835490c
ffs_mount(c2b36548,c2db9000,ff
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Thu, 04-Jul-2013 at 07:24:40 +0200, Konstantin Belousov wrote: On Thu, Jul 04, 2013 at 07:14:09AM +0200, Andre Albsmeier wrote: On Mon, 17-Jun-2013 at 21:30:31 +0200, John Baldwin wrote: On Sunday, June 16, 2013 2:39:42 am Andre Albsmeier wrote: On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 kern.crit palveli kernel: lock order reversal: May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 kern.crit palveli kernel: lock order reversal: May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. I have checked and even rebuilt the (UFS1) fs in question from scratch. 
I have also seen this happen on an UFS2 on another machine and on a third one when running dump -L on a root fs. Any hints of how to proceed? Would it be possible to setup a serial console that is logged on this machine to see if it is panic'ing but failing to write out a crashdump? Couldn't attach the serial console yet ;-(. But I had people attach a KVMoverIP switch and enabled the various KDB options in the kernel. Now we can see a bit more (see below) -- no crashdump is being generated though. :( Unfortunately these LORs don't really help with discerning the cause of the reboot. If you have remote power access (and still wanted to test this) one option would be to change KDB to drop into the debugger on a panic. Then you could connect over the KVM and take images of the original panic along with a stack trace. After a few days of no problems, the box decided to crash during mksnap_ffs today ;-(. But now I have a crashdump, see below. Unfortunatley, I cannot upload the dump somewhere but if you ask me check whatever things I'll be happy to help. kgdb /usr/obj/src/src-9/sys/palveli/kernel.debug vmcore.4 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as i386-marcel-freebsd... 
Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode fault virtual address = 0xcfb5e000 fault code = supervisor write, page not present instruction pointer = 0x20:0xc07cb2fe stack pointer = 0x28:0xd83545d0 frame pointer = 0x28:0xd835490c code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 12929 (mksnap_ffs) trap number = 12 panic: page fault KDB: stack backtrace: db_trace_self_wrapper(c08207eb,d835441c,c05fdfc9,c081df13,c08a82e0,...) at db_trace_self_wrapper+0x26/frame 0xd83543ec kdb_backtrace(c081df13,c08a82e0,c0801bfa,d8354428,d8354428,...) at kdb_backtrace+0x29/frame 0xd83543f8 panic(c0801bfa,c0845a01,c2bafae4,1,1,...) at panic+0xc9/frame 0xd835441c trap_fatal(c0ff6000,cfb5e000,2,0,265abf,...) at trap_fatal+0x353/frame 0xd835445c
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Sun, 16-Jun-2013 at 12:30:07 +0200, Jeremy Chadwick wrote: On Sun, Jun 16, 2013 at 11:55:38AM +0200, Andre Albsmeier wrote: On Sun, 16-Jun-2013 at 10:49:37 +0200, Jeremy Chadwick wrote: On Sun, Jun 16, 2013 at 10:02:39AM +0200, Andre Albsmeier wrote: On Sun, 16-Jun-2013 at 08:54:41 +0200, Jeremy Chadwick wrote: On Fri, May 31, 2013 at 07:25:23PM +0200, Andre Albsmeier wrote: On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 kern.crit palveli kernel: lock order reversal: May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 kern.crit palveli kernel: lock order reversal: May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ 
/src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on an UFS2 on another machine and on a third one when running dump -L on a root fs. Any hints of how to proceed? Would it be possible to setup a serial console that is logged on this machine to see if it is panic'ing but failing to write out a crashdump? I'll try to arrange that. It'll take a bit since this box is 200 km away... Maybe I'll find another one nearby to reproduce it... SPECIFICALLY regarding lack of crash dumps: I need to see the following: * cat /etc/rc.conf * cat /etc/fstab I may need output from other commands, but shall deal with that when I see output from the above. Thanks. No problem, see below... To make a long story short, the machine dumps core perfectly (tested that a while ago), but not when dealing with _this_ issue... I dump on da1s1b and savecore fetches it from there and puts it on /var (sitting on da0), that's faster. 
rc.conf (beware, rc.conf.local exists): --- rcshutdown_timeout=180 tmpmfs=YES tmpsize=$(( `/sbin/sysctl -n hw.usermem` / 300 ))m tmpmfs_flags=$tmpmfs_flags -v 1 -n background_fsck=NO nisdomainname=ofw.tld pflog_flags=-S syslogd_flags=-svv inetd_enable=YES inetd_flags=-l named_flags=-S 1000 named_chrootdir= rwhod_enable=YES sshd_enable=YES amd_enable=YES amd_flags=-F /etc/amd.conf nfs_client_enable=YES nfs_access_cache=2 mountd_flags=-n rpcbind_enable=YES ntpdate_enable=YES ntpdate_hosts=ntp ntpd_enable=YES ntpd_flags=-p /var/run/ntpd.pid nis_client_enable=YES nis_client_flags=-s -S ofw.tld,nis-16-1,nis-16-2 nis_server_flags=-n nis_yppasswdd_flags=-t /var/yp/src/master.passwd -f -v defaultrouter=192.168.16.2 keyrate=fast sendmail_flags=-bd -q5m sendmail_submit_flags=$sendmail_flags -ODaemonPortOptions=Addr=localhost sendmail_msp_queue_flags=-Ac -q30m sendmail_rebuild_aliases=NO lpd_enable=YES lpd_flags=-s chkprintcap_enable=YES dumpdev=AUTO clear_tmp_X=NO ldconfig_paths=/usr/local/lib ldconfig_paths_aout= entropy_file=/boot/entropy-file rc.conf.local: -- hostname
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Mon, 17-Jun-2013 at 21:30:31 +0200, John Baldwin wrote: On Sunday, June 16, 2013 2:39:42 am Andre Albsmeier wrote: On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 kern.crit palveli kernel: lock order reversal: May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 kern.crit palveli kernel: lock order reversal: May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on an UFS2 on another machine and on a third one when running dump -L on a root fs. 
Any hints of how to proceed?

Would it be possible to set up a serial console that is logged on this machine to see if it is panicking but failing to write out a crashdump?

Couldn't attach the serial console yet ;-(. But I had people attach a KVM-over-IP switch and enabled the various KDB options in the kernel. Now we can see a bit more (see below) -- no crashdump is being generated though.

:( Unfortunately these LORs don't really help with discerning the cause of the reboot. If you have remote power access (and still wanted to test this) one option would be to change KDB to drop into the debugger on a panic. Then you could connect over the KVM and take images of the original panic along with a stack trace.

As described yesterday, I think I know why we don't get dumps: I dump on da1 and da1 is spun down. On FreeBSD-7 da1 started automatically in this case, on FreeBSD-9 it doesn't. I now dump on da0, which is running already...

My suggestion is that I will try to get a dump now -- however, I have to arrange it with the people using the machine. I'll come back when I have a dump ready...

Thanks,

-Andre

--
John Baldwin

--
Win98: useless extension to a minor patch release for 32-bit extensions and a graphical shell for a 16-bit patch to an 8-bit operating system originally coded for a 4-bit microprocessor, written by a 2-bit company that can't stand for 1 bit of competition.
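As a rough illustration of the dump-device change described in this message, the fragment below shows what the rc.conf.local entry might look like. The partition name /dev/da0s1b is an assumption (the message only says "da0"; da0s1b is the swap entry in the fstab quoted elsewhere in the thread), and the commented commands are only a sketch of the runtime equivalents.

```shell
# /etc/rc.conf.local (sketch; rc.conf files are plain sh variable assignments)
# Assumption: /dev/da0s1b is the swap partition on da0, the disk that is
# always spinning. The thread itself only says "I now dump on da0".
dumpdev="/dev/da0s1b"

# Runtime equivalents, shown as comments because they need root on FreeBSD:
#   dumpon /dev/da0s1b                 # select the dump device without rebooting
#   sysctl debug.debugger_on_panic=1   # stop in the debugger on panic (needs KDB/DDB)
```

After the next reboot, savecore(8) should then pick the dump up from da0 rather than from the spun-down da1.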
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 kern.crit palveli kernel: lock order reversal: May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 kern.crit palveli kernel: lock order reversal: May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on an UFS2 on another machine and on a third one when running dump -L on a root fs. Any hints of how to proceed? 
Would it be possible to set up a serial console that is logged on this machine to see if it is panicking but failing to write out a crashdump?

Couldn't attach the serial console yet ;-(. But I had people attach a KVM-over-IP switch and enabled the various KDB options in the kernel. Now we can see a bit more (see below) -- no crashdump is being generated though.

Some comments on what the crontab script does at 5:01 (I switched it from 5:15 to 5:01 for some reason):

1. Unmount all snapshots
2. Remove all /dev/md devices
3. Delete the oldest snapshot
4. Generate a new snapshot
5. mdconfig and mount all snapshots

I assume the first LOR (sys_unmount) is related to the unmount and the second one (sys_unlink) to the rm. I have added some sleep(1) and sync(1) commands between the different steps but this didn't help.

Now the log of three days; we can see another LOR after booting:

--- cronjob start, day 1 ---
Jun 11 05:01:00 kern.crit typhon kernel: lock order reversal:
Jun 11 05:01:00 kern.crit typhon kernel: 1st 0xc53644c8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
Jun 11 05:01:00 kern.crit typhon kernel: 2nd 0xc5361290 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
Jun 11 05:01:00 kern.crit typhon kernel: KDB: stack backtrace:
Jun 11 05:01:00 kern.crit typhon kernel: db_trace_self_wrapper(c0815bdd,662f7366,765f7366,706f7366,3a632e73,...) at db_trace_self_wrapper+0x26/frame 0xec3c98d4
Jun 11 05:01:00 kern.crit typhon kernel: kdb_backtrace(c063659b,c08198b4,c09e8bb0,586,ec3c99a0,...) at kdb_backtrace+0x2a/frame 0xec3c9930
Jun 11 05:01:00 kern.crit typhon kernel: _witness_debugger(c08198b4,c5361290,c080d6e7,c4c29338,c0835390,...) at _witness_debugger+0x25/frame 0xec3c9948
Jun 11 05:01:00 kern.crit typhon kernel: witness_checkorder(c5361290,9,c0835390,586,c53612b0,...) at witness_checkorder+0x86f/frame 0xec3c99a0
Jun 11 05:01:00 kern.crit typhon kernel: __lockmgr_args(c5361290,80400,c53612b0,0,0,...) at __lockmgr_args+0x829/frame 0xec3c9a58
Jun 11 05:01:00 kern.crit typhon kernel: vop_stdlock(ec3c9abc,246,c08bcd9c,80400,c5361238,...) at vop_stdlock+0x62/frame 0xec3c9a8c
Jun 11 05:01:00 kern.crit typhon kernel: VOP_LOCK1_APV(c08611e0,ec3c9abc,c09e8bb4,c0890120,c5361238,...) at VOP_LOCK1_APV+0xb5/frame 0xec3c9aa8
Jun 11 05:01:00 kern.crit typhon kernel: _vn_lock(c5361238,80400,c0835390,586,ec3c9b14,...) at _vn_lock+0x5e/frame 0xec3c9adc
Jun 11 05:01:00 kern.crit typhon kernel: ffs_flushfiles(c5365d34,0,c67ec600,0,c5365d34,...) at ffs_flushfiles+0x133/frame 0xec3c9b1c
Jun 11 05:01:00 kern.crit typhon kernel: ffs_unmount(c5365d34,800,c0821043,513,c4c00c08,...) at ffs_unmount+0x180/frame 0xec3c9b5c
Jun 11 05:01:00 kern.crit typhon kernel: dounmount(c5365d34,800,c67ec600,494,c67e8378,...) at dounmount+0x423/frame 0xec3c9bac
Jun 11 05:01:00 kern.crit typhon kernel: sys_unmount(c67ec600,ec3c9ccc,c0846650,c081a478,206,...) at sys_unmount+0x3d1/frame
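The five cron steps listed in this message could look roughly like the sketch below. This is not the script used on the machine in question: every path, the snapshot naming scheme, the "oldest by modification time" rule and the DRYRUN switch are assumptions made for illustration, and the two-argument mksnap_ffs form should be checked against mksnap_ffs(8) on the release in use.

```shell
#!/bin/sh
# Sketch of the five cron steps. All paths and names are hypothetical.
FS=${FS:-/share2}               # file system being snapshotted (assumed)
SNAPDIR=${SNAPDIR:-$FS/.snap}   # directory holding the snapshot files (assumed)
MNTBASE=${MNTBASE:-/snapshots}  # one mount point per snapshot (assumed)
DEVDIR=${DEVDIR:-/dev}          # where the md(4) device nodes appear
DRYRUN=${DRYRUN:-1}             # 1: only print the commands, 0: execute them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRYRUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

rotate_snapshots() {
    # 1. Unmount all snapshots.
    for mnt in "$MNTBASE"/*; do
        [ -d "$mnt" ] && run umount "$mnt"
    done

    # 2. Remove all md devices.
    for md in "$DEVDIR"/md[0-9]*; do
        [ -e "$md" ] && run mdconfig -d -u "${md##*/md}"
    done

    # 3. Delete the oldest snapshot (oldest modification time, an assumption).
    oldest=$(ls -tr "$SNAPDIR" 2>/dev/null | head -n 1)
    [ -n "$oldest" ] && run rm -f "$SNAPDIR/$oldest"

    # 4. Generate a new snapshot (argument form assumed, see mksnap_ffs(8)).
    new="$SNAPDIR/snap-$(date +%Y.%m.%d-%H.%M.%S)"
    run mksnap_ffs "$FS" "$new"

    # 5. Attach every snapshot file with mdconfig and mount it read-only.
    unit=0
    for snap in "$SNAPDIR"/*; do
        [ -f "$snap" ] || continue
        mp="$MNTBASE/$(basename "$snap")"
        run mkdir -p "$mp"
        run mdconfig -a -t vnode -o readonly -f "$snap" -u "$unit"
        run mount -r "$DEVDIR/md$unit" "$mp"
        unit=$((unit + 1))
    done
}
```

Calling rotate_snapshots with the default DRYRUN=1 only prints the commands, so in that mode step 5 still lists the snapshot that step 3 would have removed and does not list the new one. Setting DRYRUN=0 would run the commands for real and needs root on FreeBSD.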
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Sun, 16-Jun-2013 at 08:54:41 +0200, Jeremy Chadwick wrote: On Fri, May 31, 2013 at 07:25:23PM +0200, Andre Albsmeier wrote: On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 kern.crit palveli kernel: lock order reversal: May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 kern.crit palveli kernel: lock order reversal: May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on an UFS2 on another machine and on a third one when running dump -L on a root fs. 
Any hints of how to proceed? Would it be possible to setup a serial console that is logged on this machine to see if it is panic'ing but failing to write out a crashdump? I'll try to arrange that. It'll take a bit since this box is 200 km away... Maybe I'll find another one nearby to reproduce it... SPECIFICALLY regarding lack of crash dumps: I need to see the following: * cat /etc/rc.conf * cat /etc/fstab I may need output from other commands, but shall deal with that when I see output from the above. Thanks. No problem, see below... To make a long story short, the machine dumps core perfectly (tested that a while ago), but not when dealing with _this_ issue... I dump on da1s1b and savecore fetches it from there and puts it on /var (sitting on da0), that's faster. rc.conf (beware, rc.conf.local exists): --- rcshutdown_timeout=180 tmpmfs=YES tmpsize=$(( `/sbin/sysctl -n hw.usermem` / 300 ))m tmpmfs_flags=$tmpmfs_flags -v 1 -n background_fsck=NO nisdomainname=ofw.tld pflog_flags=-S syslogd_flags=-svv inetd_enable=YES inetd_flags=-l named_flags=-S 1000 named_chrootdir= rwhod_enable=YES sshd_enable=YES amd_enable=YES amd_flags=-F /etc/amd.conf nfs_client_enable=YES nfs_access_cache=2 mountd_flags=-n rpcbind_enable=YES ntpdate_enable=YES ntpdate_hosts=ntp ntpd_enable=YES ntpd_flags=-p /var/run/ntpd.pid nis_client_enable=YES nis_client_flags=-s -S ofw.tld,nis-16-1,nis-16-2 nis_server_flags=-n nis_yppasswdd_flags=-t /var/yp/src/master.passwd -f -v defaultrouter=192.168.16.2 keyrate=fast sendmail_flags=-bd -q5m sendmail_submit_flags=$sendmail_flags -ODaemonPortOptions=Addr=localhost sendmail_msp_queue_flags=-Ac -q30m sendmail_rebuild_aliases=NO lpd_enable=YES lpd_flags=-s chkprintcap_enable=YES dumpdev=AUTO clear_tmp_X=NO ldconfig_paths=/usr/local/lib ldconfig_paths_aout= entropy_file=/boot/entropy-file rc.conf.local: -- hostname=typhon.ofw.tld ifconfig_msk0=inet 192.168.24.1/21 ifconfig_msk0_alias0=inet 192.168.24.10/32 named_enable=YES nfs_server_enable=YES 
nis_client_flags=-s -S ofw.tld,nis-24-1,nis-24-2 nis_server_enable=YES defaultrouter=192.168.24.2 lpd_flags=-l dumpdev=/dev/da1s1b quota_enable=YES fstab: -- /dev/da0s1a / ufs noatime,rw 0 1 /dev/da0s1b noneswapsw 0 0 proc/proc procfs rw 0 0 /dev/da0s1d /usrufs noatime,rw 0 2 /dev/da0s1e /varufs noatime,nosuid,rw 0 2 /dev/da10p1 /share2 ufs suiddir,groupquota,noatime,nosuid,rw 0 2
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Sun, 16-Jun-2013 at 10:49:37 +0200, Jeremy Chadwick wrote: On Sun, Jun 16, 2013 at 10:02:39AM +0200, Andre Albsmeier wrote: On Sun, 16-Jun-2013 at 08:54:41 +0200, Jeremy Chadwick wrote: On Fri, May 31, 2013 at 07:25:23PM +0200, Andre Albsmeier wrote: On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases. After rebooting we find a new snapshot file which is a bit smaller than the good ones and with different permissions It does not succeed a fsck. In this example it is the one whose name is beginning with s3: -r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 -r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 -r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15): May 29 05:15:00 kern.crit palveli kernel: lock order reversal: May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 May 29 05:15:04 kern.crit palveli kernel: lock order reversal: May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 Unfortunatley no corefiles are being generated ;-(. 
I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on an UFS2 on another machine and on a third one when running dump -L on a root fs. Any hints of how to proceed? Would it be possible to setup a serial console that is logged on this machine to see if it is panic'ing but failing to write out a crashdump? I'll try to arrange that. It'll take a bit since this box is 200 km away... Maybe I'll find another one nearby to reproduce it... SPECIFICALLY regarding lack of crash dumps: I need to see the following: * cat /etc/rc.conf * cat /etc/fstab I may need output from other commands, but shall deal with that when I see output from the above. Thanks. No problem, see below... To make a long story short, the machine dumps core perfectly (tested that a while ago), but not when dealing with _this_ issue... I dump on da1s1b and savecore fetches it from there and puts it on /var (sitting on da0), that's faster. rc.conf (beware, rc.conf.local exists): --- rcshutdown_timeout=180 tmpmfs=YES tmpsize=$(( `/sbin/sysctl -n hw.usermem` / 300 ))m tmpmfs_flags=$tmpmfs_flags -v 1 -n background_fsck=NO nisdomainname=ofw.tld pflog_flags=-S syslogd_flags=-svv inetd_enable=YES inetd_flags=-l named_flags=-S 1000 named_chrootdir= rwhod_enable=YES sshd_enable=YES amd_enable=YES amd_flags=-F /etc/amd.conf nfs_client_enable=YES nfs_access_cache=2 mountd_flags=-n rpcbind_enable=YES ntpdate_enable=YES ntpdate_hosts=ntp ntpd_enable=YES ntpd_flags=-p /var/run/ntpd.pid nis_client_enable=YES nis_client_flags=-s -S ofw.tld,nis-16-1,nis-16-2 nis_server_flags=-n nis_yppasswdd_flags=-t /var/yp/src/master.passwd -f -v defaultrouter=192.168.16.2 keyrate=fast sendmail_flags=-bd -q5m sendmail_submit_flags=$sendmail_flags -ODaemonPortOptions=Addr=localhost sendmail_msp_queue_flags=-Ac -q30m sendmail_rebuild_aliases=NO lpd_enable=YES lpd_flags=-s chkprintcap_enable=YES dumpdev=AUTO clear_tmp_X=NO ldconfig_paths=/usr/local/lib ldconfig_paths_aout= 
entropy_file=/boot/entropy-file rc.conf.local: -- hostname=typhon.ofw.tld ifconfig_msk0=inet 192.168.24.1/21 ifconfig_msk0_alias0=inet 192.168.24.10/32 named_enable=YES nfs_server_enable=YES nis_client_flags=-s -S ofw.tld,nis-24-1,nis-24-2 nis_server_enable=YES defaultrouter=192.168.24.2 lpd_flags=-l dumpdev=/dev/da1s1b quota_enable=YES fstab: -- /dev/da0s1a / ufs noatime,rw
FreeBSD-9.1: machine reboots during snapshot creation, LORs found
Each day at 5:15 we are generating snapshots on various machines. This used to work perfectly under 7-STABLE for years, but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases.

After rebooting we find a new snapshot file which is a bit smaller than the good ones and has different permissions. It does not pass fsck. In this example it is the one whose name begins with s3:

-r--r- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04
-r 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03
-r--r- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44
-r--r- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03
-r--r- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03

After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15):

May 29 05:15:00 kern.crit palveli kernel: lock order reversal:
May 29 05:15:00 kern.crit palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
May 29 05:15:00 kern.crit palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
May 29 05:15:04 kern.crit palveli kernel: lock order reversal:
May 29 05:15:04 kern.crit palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
May 29 05:15:04 kern.crit palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626

Unfortunately, no corefiles are being generated ;-(.

I have checked and even rebuilt the (UFS1) fs in question from scratch. I have also seen this happen on a UFS2 on another machine and on a third one when running dump -L on a root fs.

Any hints of how to proceed?

-Andre
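When comparing LOR reports across several days or machines, it can help to reduce each report to one line. The helper below is a minimal sketch: the function name lor_summary is made up, and it assumes the syslog layout quoted in the post (a "1st 0x..." line followed by a "2nd 0x..." line, each ending in "name (type) @ file:line").

```shell
#!/bin/sh
# Summarise witness(4) lock order reversals from syslog files or stdin.
# Assumes the "1st"/"2nd" line layout quoted in the post; lor_summary is a
# hypothetical helper name.
lor_summary() {
    awk '
        # Remember lock name and source location of the first lock.
        / 1st 0x/ { first = $(NF-3) " @ " $NF }
        # On the second lock, print the pair on one line.
        / 2nd 0x/ { print first " -> " $(NF-3) " @ " $NF }
    ' "$@"
}
```

For example, lor_summary /var/log/messages would print one "first -> second" line per reversal, which can then be piped through sort and uniq -c to see which pairs repeat.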
Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found
On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote:
> On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote:
> > [...]
> Would it be possible to set up a serial console that is logged on this machine to see if it is panic'ing but failing to write out a crashdump?

I'll try to arrange that. It'll take a bit since this box is 200 km away... Maybe I'll find another one nearby to reproduce it...

-Andre

--
This email has been checked as virus-free. It may still be full of nonsense however.
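A minimal sketch of the setup asked about above, assuming a stock FreeBSD layout: console="comconsole" in /boot/loader.conf moves the console to the first serial port, and dumpdev="AUTO" in /etc/rc.conf asks rc to configure a dump device at boot. The helper names append_once and enable_panic_capture are made up for this illustration, and the serial line still has to be logged on a second machine.

```sh
#!/bin/sh
# Sketch only: enable a serial console and automatic crash dumps.
# Function names are hypothetical; paths are the usual FreeBSD defaults
# and can be overridden for a dry run.

# Append a line to a config file unless that exact line is already there.
append_once() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Write both settings; arguments default to the system files.
enable_panic_capture() {
    loader_conf=${1:-/boot/loader.conf}
    rc_conf=${2:-/etc/rc.conf}
    append_once "$loader_conf" 'console="comconsole"'  # console on the first serial port
    append_once "$rc_conf" 'dumpdev="AUTO"'            # configure a crash dump device at boot
}

# Example (as root, followed by a reboot):
#   enable_panic_capture
```

Calling the function twice leaves each file with a single copy of the setting, so it is safe to rerun.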
Re: Lost CDROM on 9.1 with ATA_CAM on Promise controller
On Thu, 18-Apr-2013 at 13:53:14 +0200, Alexander Motin wrote: On 17.04.2013 12:47, Andre Albsmeier wrote: On Wed, 17-Apr-2013 at 10:53:54 +0200, Jeremy Chadwick wrote: On Wed, Apr 17, 2013 at 08:26:00AM +0200, Andre Albsmeier wrote: On Tue, 16-Apr-2013 at 21:38:22 +0200, Jeremy Chadwick wrote: On Tue, Apr 16, 2013 at 07:55:20PM +0200, Andre Albsmeier wrote: I have lost one of my CDROM drives (HL-DT-STDVD-RAM GH22LP20/2.00) after going from 7.4 to 9.1 when using ATA_CAM. It is attached to a Promise PDC20268 UDMA100 controller. A standard harddisk drive attached to this controller works well. Cables, controller and drive where replaced already. Kernel gives me: atapci1: Promise PDC20268 UDMA100 controller port 0xb000-0xb007,0xa800-0xa803,0xa400-0xa407,0xa000-0xa003,0x9800-0x980f mem 0xdf80-0xdf803fff irq 11 at device 12.0 on pci0 ata2: ATA channel at channel 0 on atapci1 ata3: ATA channel at channel 1 on atapci1 ... ada0 at ata2 bus 0 scbus2 target 0 lun 0 ada0: Maxtor 7B300R0 BAH41G10 ATA-7 device ada0: 100.000MB/s transfers (UDMA5, PIO 8192bytes) ada0: 286188MB (586114704 512 byte sectors: 16H 63S/T 16383C) ... (cd2:ata3:0:0:0): got CAM status 0x50 (cd2:ata3:0:0:0): fatal error, failed to attach to device (cd2:ata3:0:0:0): lost device, 4 refs (cd2:ata3:0:0:0): removing device entry ... Attaching the CDROM drive to the controller that is integrated on the mainboard (Intel PIIX4 UDMA33 controller) does not show this problem (but here I don't have UDMA66). It also works when not using ATA_CAM: ... acd0: DVDR HL-DT-STDVD-RAM GH22LP20/2.00 at ata3-master UDMA66 ... So this semes to be a problem with the Promise controller and ATA_CAM. Any ideas? Or should I file PR? The controller in question is a Promise Ultra100 TX2. Right. Tried with an Ultra133, same effect. The error message comes from sys/cam/scsi/scsi_cd.c, in function cddone(). The logic is a little hard for me to follow (I understand about 70% of it). Look at lines 1724 to 1877 for stable/9. 1. 
Can you provide full output from a verbose boot when the CD/DVD drive is attached to the Promise controller? Attached below. I have just filtered out some ahc cruft... Later I will try to boot a -current kernel -- just to see how this behaves... 2. What firmware version the card is using? The PDC20268 had many, many firmware problems relating to ATAPI devices. It is the latest BIOS: 2.20.0.15. 3. I wouldn't worry about ATA66 vs. ATA33; this drive can only support up to about 22MBytes/second so ATA66 isn't going to get you anything, so as a workaround, using the PIIX4 for it would not hurt you. Probably. But I already had cdrecord complain when it came to the funky DMA speed test it is doing. It went away when using the UDMA66 port. And on the other hand I sometimes use the PIIX4 port for other stuff and I do not want to attach the cdrom to the slave port. 4. ONLY if this turns out to be a controller thing: I'm not sure how much effort should be spent trying to make this work, as the PDC20268 is legacy/deprecated hardware (made/released 13 years ago). The whole box is more than 13 years old (good old Asus BX board) ;-) But since it worked in 7.4-STABLE I feel that this is some kind of regression. I do not want to waste anyone's resources in fixing it -- just if someone is curious and/or has an idea how to fix it... And here is the dmesg: {snipping for mail brevity} Thanks. CC'd ken@ and mav@ for advice on this. Here's the dmesg: http://lists.freebsd.org/pipermail/freebsd-stable/2013-April/073131.html Short details: The device under scrutiny here is cd2 on ata3, which is an ATAPI IDE-based optical drive. The drive works when either: a) Connected to a different IDE controller (atapci0), or, b) When ATA_CAM is removed (i.e. use ata(4) exclusively). And just as a note: The -current kernel from https://snapshots.glenbarber.us/Latest/FreeBSD-10.0-CURRENT-i386-20130316-r248381-bootonly.iso shows the same problem... 
> Some Promise controllers are known to have problems with ATAPI DMA. Have you tried to disable DMA on that channel or device with a loader tunable like hint.ata.3.mode=PIO4 ?

Interestingly, now it attaches properly:

cd2 at ata3 bus 0 scbus3 target 0 lun 0
cd2: HL-DT-ST DVD-RAM GH22LP20 2.00 Removable CD-ROM SCSI-0 device
cd2: 16.700MB/s transfers (PIO4, ATAPI 12bytes, PIO 65534bytes)
cd2: Attempt to query device size failed: NOT READY, Medium not present - tray closed

Anyway, this thing worked until 7.4-STABLE in DMA66 mode and burned quite a few DVDs. And it attaches under 9.1 without ATA_CAM and even with atapicam (although here it probes with 3.300MB/s transfers and I didn't actually try to burn a DVD). So while I cannot prove that this controller doesn't have problems with ATAPI DMA, especially 7.4 found
Re: Lost CDROM on 9.1 with ATA_CAM on Promise controller
On Tue, 16-Apr-2013 at 21:38:22 +0200, Jeremy Chadwick wrote: On Tue, Apr 16, 2013 at 07:55:20PM +0200, Andre Albsmeier wrote: I have lost one of my CDROM drives (HL-DT-STDVD-RAM GH22LP20/2.00) after going from 7.4 to 9.1 when using ATA_CAM. It is attached to a Promise PDC20268 UDMA100 controller. A standard harddisk drive attached to this controller works well. Cables, controller and drive where replaced already. Kernel gives me: atapci1: Promise PDC20268 UDMA100 controller port 0xb000-0xb007,0xa800-0xa803,0xa400-0xa407,0xa000-0xa003,0x9800-0x980f mem 0xdf80-0xdf803fff irq 11 at device 12.0 on pci0 ata2: ATA channel at channel 0 on atapci1 ata3: ATA channel at channel 1 on atapci1 ... ada0 at ata2 bus 0 scbus2 target 0 lun 0 ada0: Maxtor 7B300R0 BAH41G10 ATA-7 device ada0: 100.000MB/s transfers (UDMA5, PIO 8192bytes) ada0: 286188MB (586114704 512 byte sectors: 16H 63S/T 16383C) ... (cd2:ata3:0:0:0): got CAM status 0x50 (cd2:ata3:0:0:0): fatal error, failed to attach to device (cd2:ata3:0:0:0): lost device, 4 refs (cd2:ata3:0:0:0): removing device entry ... Attaching the CDROM drive to the controller that is integrated on the mainboard (Intel PIIX4 UDMA33 controller) does not show this problem (but here I don't have UDMA66). It also works when not using ATA_CAM: ... acd0: DVDR HL-DT-STDVD-RAM GH22LP20/2.00 at ata3-master UDMA66 ... So this semes to be a problem with the Promise controller and ATA_CAM. Any ideas? Or should I file PR? The controller in question is a Promise Ultra100 TX2. Right. Tried with an Ultra133, same effect. The error message comes from sys/cam/scsi/scsi_cd.c, in function cddone(). The logic is a little hard for me to follow (I understand about 70% of it). Look at lines 1724 to 1877 for stable/9. 1. Can you provide full output from a verbose boot when the CD/DVD drive is attached to the Promise controller? Attached below. I have just filtered out some ahc cruft... Later I will try to boot a -current kernel -- just to see how this behaves... 2. 
What firmware version the card is using? The PDC20268 had many, many firmware problems relating to ATAPI devices. It is the latest BIOS: 2.20.0.15. 3. I wouldn't worry about ATA66 vs. ATA33; this drive can only support up to about 22MBytes/second so ATA66 isn't going to get you anything, so as a workaround, using the PIIX4 for it would not hurt you. Probably. But I already had cdrecord complain when it came to the funky DMA speed test it is doing. It went away when using the UDMA66 port. And on the other hand I sometimes use the PIIX4 port for other stuff and I do not want to attach the cdrom to the slave port. 4. ONLY if this turns out to be a controller thing: I'm not sure how much effort should be spent trying to make this work, as the PDC20268 is legacy/deprecated hardware (made/released 13 years ago). The whole box is more than 13 years old (good old Asus BX board) ;-) But since it worked in 7.4-STABLE I feel that this is some kind of regression. I do not want to waste anyone's resources in fixing it -- just if someone is curious and/or has an idea how to fix it... And here is the dmesg: Copyright (c) 1992-2013 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 9.1-STABLE #6: Wed Apr 17 07:56:57 CEST 2013 r...@server.ofw.tld:/usr/obj/src/src-9/sys/bratfix i386 gcc version 4.2.1 20070831 patched [FreeBSD] Preloaded elf kernel /boot/kernel/kernel at 0xc097d000. Calibrating TSC clock ... 
TSC clock: 1405298309 Hz
CPU: Intel(R) Celeron(TM) CPU1400MHz (1405.30-MHz 686-class CPU)
  Origin = GenuineIntel Id = 0x6b1 Family = 0x6 Model = 0xb Stepping = 1
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
Instruction TLB: 4 KB pages, 4-way set associative, 32 entries
Instruction TLB: 4 MB pages, fully associative, 2 entries
Data TLB: 4 KB pages, 4-way set associative, 64 entries
2nd-level cache: 256 KB, 8-way set associative, 32 byte line size
1st-level instruction cache: 16 KB, 4-way set associative, 32 byte line size
Data TLB: 4 MB Pages, 4-way set associative, 8 entries
1st-level data cache: 16 KB, 4-way set associative, 32 byte line size
real memory = 268435456 (256 MB)
Physical memory chunk(s):
0x1000 - 0x0009dfff, 643072 bytes (157 pages)
0x0010 - 0x003f, 3145728 bytes (768 pages)
0x00c26000 - 0x0fb18fff, 250556416 bytes (61171 pages)
avail memory = 253022208 (241 MB)
bios32: Found BIOS32 Service Directory header at 0xc00f92a0
bios32: Entry = 0xf06c0 (c00f06c0) Rev = 0 Len = 1
pcibios: PCI BIOS entry at 0xf+0x8c0
pnpbios: Found PnP BIOS data at 0xc00fc240
pnpbios: Entry = f:c270 Rev = 1.0
Re: Lost CDROM on 9.1 with ATA_CAM on Promise controller
On Wed, 17-Apr-2013 at 10:53:54 +0200, Jeremy Chadwick wrote: On Wed, Apr 17, 2013 at 08:26:00AM +0200, Andre Albsmeier wrote: On Tue, 16-Apr-2013 at 21:38:22 +0200, Jeremy Chadwick wrote: On Tue, Apr 16, 2013 at 07:55:20PM +0200, Andre Albsmeier wrote: I have lost one of my CDROM drives (HL-DT-STDVD-RAM GH22LP20/2.00) after going from 7.4 to 9.1 when using ATA_CAM. It is attached to a Promise PDC20268 UDMA100 controller. A standard harddisk drive attached to this controller works well. Cables, controller and drive where replaced already. Kernel gives me: atapci1: Promise PDC20268 UDMA100 controller port 0xb000-0xb007,0xa800-0xa803,0xa400-0xa407,0xa000-0xa003,0x9800-0x980f mem 0xdf80-0xdf803fff irq 11 at device 12.0 on pci0 ata2: ATA channel at channel 0 on atapci1 ata3: ATA channel at channel 1 on atapci1 ... ada0 at ata2 bus 0 scbus2 target 0 lun 0 ada0: Maxtor 7B300R0 BAH41G10 ATA-7 device ada0: 100.000MB/s transfers (UDMA5, PIO 8192bytes) ada0: 286188MB (586114704 512 byte sectors: 16H 63S/T 16383C) ... (cd2:ata3:0:0:0): got CAM status 0x50 (cd2:ata3:0:0:0): fatal error, failed to attach to device (cd2:ata3:0:0:0): lost device, 4 refs (cd2:ata3:0:0:0): removing device entry ... Attaching the CDROM drive to the controller that is integrated on the mainboard (Intel PIIX4 UDMA33 controller) does not show this problem (but here I don't have UDMA66). It also works when not using ATA_CAM: ... acd0: DVDR HL-DT-STDVD-RAM GH22LP20/2.00 at ata3-master UDMA66 ... So this semes to be a problem with the Promise controller and ATA_CAM. Any ideas? Or should I file PR? The controller in question is a Promise Ultra100 TX2. Right. Tried with an Ultra133, same effect. The error message comes from sys/cam/scsi/scsi_cd.c, in function cddone(). The logic is a little hard for me to follow (I understand about 70% of it). Look at lines 1724 to 1877 for stable/9. 1. Can you provide full output from a verbose boot when the CD/DVD drive is attached to the Promise controller? 
Attached below. I have just filtered out some ahc cruft... Later I will try to boot a -current kernel -- just to see how this behaves... 2. What firmware version the card is using? The PDC20268 had many, many firmware problems relating to ATAPI devices. It is the latest BIOS: 2.20.0.15. 3. I wouldn't worry about ATA66 vs. ATA33; this drive can only support up to about 22MBytes/second so ATA66 isn't going to get you anything, so as a workaround, using the PIIX4 for it would not hurt you. Probably. But I already had cdrecord complain when it came to the funky DMA speed test it is doing. It went away when using the UDMA66 port. And on the other hand I sometimes use the PIIX4 port for other stuff and I do not want to attach the cdrom to the slave port. 4. ONLY if this turns out to be a controller thing: I'm not sure how much effort should be spent trying to make this work, as the PDC20268 is legacy/deprecated hardware (made/released 13 years ago). The whole box is more than 13 years old (good old Asus BX board) ;-) But since it worked in 7.4-STABLE I feel that this is some kind of regression. I do not want to waste anyone's resources in fixing it -- just if someone is curious and/or has an idea how to fix it... And here is the dmesg: {snipping for mail brevity} Thanks. CC'd ken@ and mav@ for advice on this. Here's the dmesg: Thanks for trying to help. http://lists.freebsd.org/pipermail/freebsd-stable/2013-April/073131.html Short details: The device under scrutiny here is cd2 on ata3, which is an ATAPI IDE-based optical drive. The drive works when either: a) Connected to a different IDE controller (atapci0), or, b) When ATA_CAM is removed (i.e. use ata(4) exclusively). No idea if atapicam(4) during scenario (b) works or not. :-) Ha, atapicam... Forgot about this ;-). atapicam works: FreeBSD 9.1-STABLE #8: Wed Apr 17 11:12:17 CEST 2013 ... 
atapci1: Promise PDC20268 UDMA100 controller port 0xb000-0xb007,0xa800-0xa803,0xa400-0xa407,0xa000-0xa003,0x9800-0x980f mem 0xdf80-0xdf803fff irq 11 at device 12.0 on pci0
ata2: ATA channel at channel 0 on atapci1
ata3: ATA channel at channel 1 on atapci1
...
cd2 at ata3 bus 0 scbus3 target 0 lun 0
cd2: HL-DT-ST DVD-RAM GH22LP20 2.00 Removable CD-ROM SCSI-0 device
cd2: 3.300MB/s transfers
cd2: Attempt to query device size failed: NOT READY, Medium not present - tray closed
...

The only visible difference to 7.4-STABLE: there it got 66 MB/s, but I think the 3.3 above is just made up:

FreeBSD 7.4-STABLE #0: Tue Aug 14 11:28:27 CEST 2012
...
cd2 at ata2 bus 0 target 0 lun 0
cd2: HL-DT-ST DVD-RAM GH22LP20 2.00 Removable CD-ROM SCSI-0 device
cd2: 66.000MB/s transfers
cd2: Attempt to query
Re: Lost CDROM on 9.1 with ATA_CAM on Promise controller
On Wed, 17-Apr-2013 at 10:53:54 +0200, Jeremy Chadwick wrote: On Wed, Apr 17, 2013 at 08:26:00AM +0200, Andre Albsmeier wrote: On Tue, 16-Apr-2013 at 21:38:22 +0200, Jeremy Chadwick wrote: On Tue, Apr 16, 2013 at 07:55:20PM +0200, Andre Albsmeier wrote: I have lost one of my CDROM drives (HL-DT-STDVD-RAM GH22LP20/2.00) after going from 7.4 to 9.1 when using ATA_CAM. It is attached to a Promise PDC20268 UDMA100 controller. A standard harddisk drive attached to this controller works well. Cables, controller and drive where replaced already. Kernel gives me: atapci1: Promise PDC20268 UDMA100 controller port 0xb000-0xb007,0xa800-0xa803,0xa400-0xa407,0xa000-0xa003,0x9800-0x980f mem 0xdf80-0xdf803fff irq 11 at device 12.0 on pci0 ata2: ATA channel at channel 0 on atapci1 ata3: ATA channel at channel 1 on atapci1 ... ada0 at ata2 bus 0 scbus2 target 0 lun 0 ada0: Maxtor 7B300R0 BAH41G10 ATA-7 device ada0: 100.000MB/s transfers (UDMA5, PIO 8192bytes) ada0: 286188MB (586114704 512 byte sectors: 16H 63S/T 16383C) ... (cd2:ata3:0:0:0): got CAM status 0x50 (cd2:ata3:0:0:0): fatal error, failed to attach to device (cd2:ata3:0:0:0): lost device, 4 refs (cd2:ata3:0:0:0): removing device entry ... Attaching the CDROM drive to the controller that is integrated on the mainboard (Intel PIIX4 UDMA33 controller) does not show this problem (but here I don't have UDMA66). It also works when not using ATA_CAM: ... acd0: DVDR HL-DT-STDVD-RAM GH22LP20/2.00 at ata3-master UDMA66 ... So this semes to be a problem with the Promise controller and ATA_CAM. Any ideas? Or should I file PR? The controller in question is a Promise Ultra100 TX2. Right. Tried with an Ultra133, same effect. The error message comes from sys/cam/scsi/scsi_cd.c, in function cddone(). The logic is a little hard for me to follow (I understand about 70% of it). Look at lines 1724 to 1877 for stable/9. 1. Can you provide full output from a verbose boot when the CD/DVD drive is attached to the Promise controller? 
Attached below. I have just filtered out some ahc cruft... Later I will try to boot a -current kernel -- just to see how this behaves... 2. What firmware version the card is using? The PDC20268 had many, many firmware problems relating to ATAPI devices. It is the latest BIOS: 2.20.0.15. 3. I wouldn't worry about ATA66 vs. ATA33; this drive can only support up to about 22MBytes/second so ATA66 isn't going to get you anything, so as a workaround, using the PIIX4 for it would not hurt you. Probably. But I already had cdrecord complain when it came to the funky DMA speed test it is doing. It went away when using the UDMA66 port. And on the other hand I sometimes use the PIIX4 port for other stuff and I do not want to attach the cdrom to the slave port. 4. ONLY if this turns out to be a controller thing: I'm not sure how much effort should be spent trying to make this work, as the PDC20268 is legacy/deprecated hardware (made/released 13 years ago). The whole box is more than 13 years old (good old Asus BX board) ;-) But since it worked in 7.4-STABLE I feel that this is some kind of regression. I do not want to waste anyone's resources in fixing it -- just if someone is curious and/or has an idea how to fix it... And here is the dmesg: {snipping for mail brevity} Thanks. CC'd ken@ and mav@ for advice on this. Here's the dmesg: http://lists.freebsd.org/pipermail/freebsd-stable/2013-April/073131.html Short details: The device under scrutiny here is cd2 on ata3, which is an ATAPI IDE-based optical drive. The drive works when either: a) Connected to a different IDE controller (atapci0), or, b) When ATA_CAM is removed (i.e. use ata(4) exclusively). And just as a note: The -current kernel from https://snapshots.glenbarber.us/Latest/FreeBSD-10.0-CURRENT-i386-20130316-r248381-bootonly.iso shows the same problem... 
-Andre
Lost CDROM on 9.1 with ATA_CAM on Promise controller
I have lost one of my CDROM drives (HL-DT-STDVD-RAM GH22LP20/2.00) after going from 7.4 to 9.1 when using ATA_CAM. It is attached to a Promise PDC20268 UDMA100 controller. A standard hard disk drive attached to this controller works well. Cables, controller and drive were replaced already. The kernel gives me:

atapci1: Promise PDC20268 UDMA100 controller port 0xb000-0xb007,0xa800-0xa803,0xa400-0xa407,0xa000-0xa003,0x9800-0x980f mem 0xdf80-0xdf803fff irq 11 at device 12.0 on pci0
ata2: ATA channel at channel 0 on atapci1
ata3: ATA channel at channel 1 on atapci1
...
ada0 at ata2 bus 0 scbus2 target 0 lun 0
ada0: Maxtor 7B300R0 BAH41G10 ATA-7 device
ada0: 100.000MB/s transfers (UDMA5, PIO 8192bytes)
ada0: 286188MB (586114704 512 byte sectors: 16H 63S/T 16383C)
...
(cd2:ata3:0:0:0): got CAM status 0x50
(cd2:ata3:0:0:0): fatal error, failed to attach to device
(cd2:ata3:0:0:0): lost device, 4 refs
(cd2:ata3:0:0:0): removing device entry
...

Attaching the CDROM drive to the controller that is integrated on the mainboard (Intel PIIX4 UDMA33 controller) does not show this problem (but there I don't have UDMA66). It also works when not using ATA_CAM:

...
acd0: DVDR HL-DT-STDVD-RAM GH22LP20/2.00 at ata3-master UDMA66
...

So this seems to be a problem with the Promise controller and ATA_CAM. Any ideas? Or should I file a PR?

Thanks,

-Andre
ppc fails to attach to puc on 9.1-STABLE, 7.4-STABLE works
I want my printer port back on 9.1 ;-(

I have this card:

puc0@pci0:4:1:0: class=0x078000 card=0x00121000 chip=0x98359710 rev=0x01 hdr=0x00
    vendor = 'NetMos Technology'
    device = 'PCI 9835 Multi-I/O Controller'
    class = simple comms

It attached and worked under 7.4-STABLE (as long as I disabled the interrupt using hint.ppc.0.irq=):

puc0: NetMos NM9835 Dual UART and 1284 Printer port port 0xdf00-0xdf07,0xde00-0xde07,0xdd00-0xdd07,0xdc00-0xdc07,0xdb00-0xdb07,0xda00-0xda0f irq 17 at device 1.0 on pci4
puc0: [FILTER]
uart0: Non-standard ns8250 class UART with FIFOs on puc0
uart0: [FILTER]
uart1: Non-standard ns8250 class UART with FIFOs on puc0
uart1: [FILTER]
ppc0: Parallel port on puc0
ppc0: Generic chipset (ECP/EPP/PS2/NIBBLE) in ECP+EPP mode (EPP 1.9)
ppbus0: Parallel port bus on ppc0
lpt0: Printer on ppbus0
lpt0: Polled port

Under 9.1 the card does not attach the ppc anymore. The hint entries

hint.ppc.0.at=puc0
hint.ppc.0.irq=
hint.ppc.0.flags=0x2F

get ignored and so it probes as ppc1 (failing due to the interrupt problem, as it did in 7.4 without hints):

puc0: NetMos NM9835 Dual UART and 1284 Printer port port 0xdf00-0xdf07,0xde00-0xde07,0xdd00-0xdd07,0xdc00-0xdc07,0xdb00-0xdb07,0xda00-0xda0f irq 17 at device 1.0 on pci4
uart2: Non-standard ns8250 class UART with FIFOs at port 1 on puc0
uart3: 16550 or compatible at port 2 on puc0
ppc1: Parallel port at port 3 on puc0
ppc1: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
ppc1: failed to register interrupt handler: 6
device_attach: ppc1 attach returned 6

Any ideas? How do I construct the hint entries under 9.1 so that

1. it does not want to use the interrupt (which made it attach under 7.4)
2. it takes the flags 0x2F as it did before.

I have also never understood whether ppc itself needs to attach to the irq as well (I thought all of this would be handled by puc).
Thanks,

-Andre
Re: bug with some em nics on RELENG_7
On Thu, 19-Nov-2009 at 09:27:36 -0800, Jack Vogel wrote:
> Cool, so stable/7 will just need to be updated :) I need to catch up all the

Hi Jack,

has this been done already? I am asking since I have a problem which might be related to this:

The machine is FreeBSD-7.2 (Sat Nov 28 12:28:42 CET 2009) with an 'Intel PRO/1000 PL Network Adaptor (82573L)' NIC that is connected directly (through a crossover cable) to a satellite dish receiver. When the receiver has booted, I can't ping it from the FreeBSD box:

gate:~ping db
PING dreambox.home.albsmeier.net (192.168.128.66): 56 data bytes
ping: sendto: No buffer space available
ping: sendto: No buffer space available
ping: sendto: No buffer space available
^C

I have to do an

ifconfig em0 down up

on the FreeBSD box and all is well (pinging and transferring files) until the receiver has been shut down and rebooted again.

This setup worked without problems under 6.4-STABLE so I am pretty sure it has something to do with 7-STABLE's em drivers.

Thanks,

-Andre

> drivers in that stream actually. Thanks for testing!!
>
> Jack
>
> On Thu, Nov 19, 2009 at 8:58 AM, Mike Tancsa m...@sentex.net wrote:
> > At 07:29 PM 11/18/2009, Jack Vogel wrote:
> > > Hey Mike,
> > > Can you check if you see the same behavior on RELENG 8?
> >
> > For RELENG_8. I installed an fxp card and netbooted off it. I assigned an IP address to the onboard nic (em5). Pinged itself, got a MAC and response. Plugged in the cable, pinged the other side, all ok
> >
> > Did the same, but pinged the other side, and plugged the cable in, all worked as expected.
> > # ping 10.177.194.18
> > PING 10.177.194.18 (10.177.194.18): 56 data bytes
> > ping: sendto: Host is down
> > ping: sendto: Host is down
> > ping: sendto: Host is down
> > 64 bytes from 10.177.194.18: icmp_seq=30 ttl=64 time=1329.918 ms
> > 64 bytes from 10.177.194.18: icmp_seq=31 ttl=64 time=324.925 ms
> > 64 bytes from 10.177.194.18: icmp_seq=32 ttl=64 time=0.054 ms
> > 64 bytes from 10.177.194.18: icmp_seq=33 ttl=64 time=0.055 ms
> > 64 bytes from 10.177.194.18: icmp_seq=34 ttl=64 time=0.047 ms
> > 64 bytes from 10.177.194.18: icmp_seq=35 ttl=64 time=0.050 ms
> > 64 bytes from 10.177.194.18: icmp_seq=36 ttl=64 time=0.047 ms
> > 64 bytes from 10.177.194.18: icmp_seq=37 ttl=64 time=0.049 ms
> > 64 bytes from 10.177.194.18: icmp_seq=38 ttl=64 time=0.043 ms
> >
> > em4: Intel(R) PRO/1000 Network Connection 6.9.14 port 0xcc00-0xcc1f mem 0xfaee-0xfaef,0xfaedc000-0xfaed irq 16 at device 0.0 on pci6
> > em4: Using MSIX interrupts
> > em4: [ITHREAD]
> > em4: [ITHREAD]
> > em4: [ITHREAD]
> > em4: Ethernet address: 00:30:48:d6:ef:12
> > pcib7: ACPI PCI-PCI bridge irq 16 at device 28.1 on pci0
> > pci7: ACPI PCI bus on pcib7
> > em5: Intel(R) PRO/1000 Network Connection 6.9.14 port 0xdc00-0xdc1f mem 0xfafe-0xfaff,0xfafdc000-0xfafd irq 17 at device 0.0 on pci7
> > em5: Using MSIX interrupts
> > em5: [ITHREAD]
> > em5: [ITHREAD]
> > em5: [ITHREAD]
> > em5: Ethernet address: 00:30:48:d6:ef:13
> >
> > So the problem is _not_ there under RELENG_8. I also tested the 2 PCIe nics to make sure they are still working, and they are. Full dmesg below
> >
> > Copyright (c) 1992-2009 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > The Regents of the University of California. All rights reserved.
> > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > FreeBSD 8.0-RC2 #0: Wed Nov 11 09:54:52 EST 2009 mdtan...@ich10.sentex.ca:/usr/obj/usr/src/sys/alix i386
> > Timecounter i8254 frequency 1193182 Hz quality 0
> > CPU: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz (2660.00-MHz 686-class CPU)
> >   Origin = GenuineIntel Id = 0x106a5 Stepping = 5
> >   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> >   Features2=0x98e3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
> >   AMD Features=0x2810<NX,RDTSCP,LM>
> >   AMD Features2=0x1<LAHF>
> >   TSC: P-state invariant
> > real memory = 6446645248 (6148 MB)
> > avail memory = 3137355776 (2992 MB)
> > ACPI APIC Table: 011209 APIC2037
> > FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> > FreeBSD/SMP: 1 package(s) x 4 core(s)
> >  cpu0 (BSP): APIC ID: 0
> >  cpu1 (AP): APIC ID: 2
> >  cpu2 (AP): APIC ID: 4
> >  cpu3 (AP): APIC ID: 6
> > ioapic0: Changing APIC ID to 1
> > ioapic0 Version 2.0 irqs 0-23 on motherboard
> > kbd1 at kbdmux0
> > cryptosoft0: software crypto on motherboard
> > acpi0: 011209 XSDT2037 on motherboard
> > acpi0: [ITHREAD]
> > acpi0: Power Button (fixed)
> > acpi0: reservation of 0, a (3) failed
> > acpi0: reservation of 10, bff0 (3) failed
> > Timecounter ACPI-safe frequency 3579545 Hz quality 850
> > acpi_timer0: 24-bit timer at 3.579545MHz port 0x808-0x80b on acpi0
> > pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
> > pci0: ACPI PCI bus on pcib0
> > pcib1: ACPI PCI-PCI bridge at device 1.0 on pci0
> > pci1: ACPI PCI bus on pcib1
> > em0: Intel(R) PRO/1000 Network
Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)
On Sat, 03-Oct-2009 at 22:27:39 +, Bjoern A. Zeeb wrote:
> On Sat, 3 Oct 2009, Andre Albsmeier wrote:
>
> Hi,
>
> > On Sat, 03-Oct-2009 at 16:27:32 -0400, jhell wrote:
> > > On Sat, 3 Oct 2009 14:42 -, Andre.Albsmeier wrote:
> > > > FYI, after setting security.bsd.map_at_zero to 0 on 7.2-STABLE all samba33 programmes did abort() immediately after start. The solution was to use
> > > >
> > > > CONFIGURE_ARGS+= --disable-pie
> > > >
> > > > -Andre
> > >
> > > To add an additional note samba33 even when not running (not enabled by a rcvar) also runs a tdbcleanup routine on shutdown and/or start that also does abort().
> >
> > Yes, every samba programme is linked with -pie per default (so all abort()).
>
> Thanks for reporting the issue. People are aware of the problem now and we'll try to present a solution within the next days for better position-independent executable (PIE) handling.
>
> Meanwhile there are multiple solutions for people affected:
>
> (1) recompile the port; but as more than just samba might be affected and we generally do not want to flip the pie switch everywhere that's probably only a temporary, private solution.

I'll stick to this since I am happy about having the map_at_zero option and want to continue to try it out on 7.2-STABLE. And I see no reason why samba has to be linked with -pie (without -pie it is also 4% smaller).

-Andre

--
I think there is a world market for maybe five computers.
 - Thomas Watson, chairman of IBM, 1943
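One way to confirm that a rebuilt binary is no longer position-independent is to look at the ELF header type. The following is a sketch, not part of the thread: it assumes readelf is available, the function name elf_kind is made up here, and the smbd path in the usage comment is an assumption about the local install. Note that shared libraries also report DYN, so the check is only meaningful for executables.

```sh
#!/bin/sh
# Sketch: classify an ELF file from the "Type:" line of `readelf -h` output
# read on stdin. DYN means position-independent (a PIE executable or a
# shared library); EXEC means a conventional fixed-address executable.
elf_kind() {
    awk '/^[ \t]*Type:/ {
        if ($2 == "DYN")       print "position-independent"
        else if ($2 == "EXEC") print "fixed-address"
        else                   print "other"
        exit
    }'
}

# Example (path is an assumption):
#   readelf -h /usr/local/sbin/smbd | elf_kind
```

A binary built with --disable-pie would be expected to report fixed-address.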
security.bsd.map_at_zero=0 problem with samba33 (including solution)
FYI, after setting security.bsd.map_at_zero to 0 on 7.2-STABLE all samba33 programmes did abort() immediately after start. The solution was to use

CONFIGURE_ARGS+= --disable-pie

-Andre

--
BSD, from the people who brought you TCP/IP.
Re: security.bsd.map_at_zero=0 problem with samba33 (including solution)
On Sat, 03-Oct-2009 at 16:27:32 -0400, jhell wrote:
> On Sat, 3 Oct 2009 14:42 -, Andre.Albsmeier wrote:
> > FYI, after setting security.bsd.map_at_zero to 0 on 7.2-STABLE all samba33 programmes did abort() immediately after start. The solution was to use
> >
> > CONFIGURE_ARGS+= --disable-pie
> >
> > -Andre
>
> To add an additional note samba33 even when not running (not enabled by a rcvar) also runs a tdbcleanup routine on shutdown and/or start that also does abort().

Yes, every samba programme is linked with -pie per default (so all abort()).

-Andre
Bug in 7.2-STABLE's /bin/sh?
Hello all,

is it correct to print OK here?

-- snip --
#!/bin/sh

if false || ! echo bla | grep -q bla; then
  echo OK
fi
-- snap ---

7.2-STABLE (can't check others at the moment) does which I think is wrong...

Thanks,

-Andre
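For anyone who wants to check their own /bin/sh against this case, here is a minimal sketch. The function name check_negated_pipeline is made up for illustration. The reasoning, assuming standard sh semantics: the pipeline `echo bla | grep -q bla` exits 0, `!` negates that to non-zero, and `false || <non-zero>` is non-zero, so the `then` branch should not be taken.

```shell
#!/bin/sh
# check_negated_pipeline is a hypothetical helper name, not part of any tool.
# It runs the construct from the report and says which branch the shell took.
check_negated_pipeline() {
    # '!' negates the exit status of the whole pipeline 'echo bla | grep -q bla'.
    if false || ! echo bla | grep -q bla; then
        echo BUGGY      # the shell took the branch it should have skipped
    else
        echo CORRECT    # negation was applied as expected
    fi
}

check_negated_pipeline
```

A shell that behaves like the 7.2-STABLE one described above would print BUGGY here.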
Re: Bug in 7.2-STABLE's /bin/sh?
On Thu, 01-Oct-2009 at 18:44:08 +0300, Andriy Gapon wrote:
> on 01/10/2009 17:49 Andre Albsmeier said the following:
> > Hello all,
> >
> > is it correct to print OK here?
> >
> > -- snip --
> > #!/bin/sh
> >
> > if false || ! echo bla | grep -q bla; then
> >   echo OK
> > fi
> > -- snap ---
> >
> > 7.2-STABLE (can't check others at the moment) does which I think
> > is wrong...
>
> This looks like a bug and it seems to be fixed in head.
> Forgotten MFC?

Have you got a PR# handy?

-Andre
Re: Bug in 7.2-STABLE's /bin/sh?
On Thu, 01-Oct-2009 at 20:31:01 +0200, Andre Albsmeier wrote:
> On Thu, 01-Oct-2009 at 18:44:08 +0300, Andriy Gapon wrote:
> > on 01/10/2009 17:49 Andre Albsmeier said the following:
> > > Hello all,
> > >
> > > is it correct to print OK here?
> > >
> > > -- snip --
> > > #!/bin/sh
> > >
> > > if false || ! echo bla | grep -q bla; then
> > >   echo OK
> > > fi
> > > -- snap ---
> > >
> > > 7.2-STABLE (can't check others at the moment) does which I think
> > > is wrong...
> >
> > This looks like a bug and it seems to be fixed in head.
> > Forgotten MFC?
>
> Have you got a PR# handy?

Found it myself:

http://www.freebsd.org/cgi/cvsweb.cgi/src/bin/sh/parser.c.diff?r1=1.60;r2=1.61

seems to be it. Could someone please MFC that to 7.2?

Thanks,

-Andre
7.2-STABLE: Wiring down umass devices to uhubs
[This is on 7.2-STABLE]

I have a USB card reader which gives me 4 da drives. I was able to wire them down so they appear as:

scbus3 on umass-sim0 bus 0:
Generic USB SD Reader 1.00 at scbus3 target 0 lun 0 (da30,pass30)
Generic USB CF Reader 1.01 at scbus3 target 0 lun 1 (da31,pass31)
Generic USB SM Reader 1.02 at scbus3 target 0 lun 2 (da32,pass32)
Generic USB MS Reader 1.03 at scbus3 target 0 lun 3 (da33,pass33)

The relevant parts of dmesg look like this (indented for readability):

usb4: EHCI version 1.0
 uhub4: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1 on usb4
  uhub5: vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/7.02, addr 2 on uhub4
   umass0: Generic Mass Storage Device, class 0/0, rev 2.00/1.00, addr 3 on uhub5
    da30 at umass-sim0 bus 0 target 0 lun 0
    da31 at umass-sim0 bus 0 target 0 lun 1
    da32 at umass-sim0 bus 0 target 0 lun 2
    da33 at umass-sim0 bus 0 target 0 lun 3

However, if I insert a USB stick into one of the remaining ports, umass0 will get attached to it and umass1 to my quad USB card reader device:

usb4: EHCI version 1.0
 uhub4: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1 on usb4
  umass0: vendor 0x13fe Patriot Memory, class 0/0, rev 2.00/1.10, addr 2 on uhub4
   da30 at umass-sim0 bus 0 target 0 lun 0
  uhub5: vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/7.02, addr 3 on uhub4
   umass1: Generic Mass Storage Device, class 0/0, rev 2.00/1.00, addr 4 on uhub5
    da7 at umass-sim1 bus 1 target 0 lun 0
    da8 at umass-sim1 bus 1 target 0 lun 1
    da9 at umass-sim1 bus 1 target 0 lun 2
    da14 at umass-sim1 bus 1 target 0 lun 3

How can I wire down umass0 so that it always gets attached to uhub5?

Thanks,

-Andre
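Not an answer to the wiring question, but a small sketch that may help when comparing the two situations above: it lists which da device ended up on which umass-sim, based only on "daN at umass-simM ..." lines as they appear in the dmesg excerpts. The function name da_by_sim is invented for this example.

```shell
#!/bin/sh
# da_by_sim is a hypothetical helper: it reads dmesg-style text on stdin and
# prints one "umass-simM daN" pair per attached disk, sorted and de-duplicated.
da_by_sim() {
    awk '$1 ~ /^da[0-9]+$/ && $2 == "at" && $3 ~ /^umass-sim[0-9]+$/ {
        print $3, $1
    }' | sort -u
}

# Typical use on a live system (assumption: dmesg still holds the attach lines):
#   dmesg | da_by_sim
```

Running it before and after plugging in the USB stick makes it easy to see when the card reader has moved from umass-sim0 to umass-sim1.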
Re: net/iwi-firmware refuses to build on 7.2-STABLE
On Thu, 02-Jul-2009 at 11:08:09 +0200, Bartosz Fabianowski wrote:
> iwicontrol is not used on 7 any more. You can use sysctl to query the
> hardware switch instead:
>
> sysctl dev.iwi.0.radio

That did it, thanks a lot,

-Andre
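A minimal sketch of how the sysctl could be wrapped in a script. The helper name radio_state is invented, and the mapping of values is an assumption (1 meaning the transmitter switch is on, 0 meaning off); check iwi(4) on the system in question before relying on it.

```shell
#!/bin/sh
# radio_state is a hypothetical helper. It translates the numeric value of
# dev.iwi.N.radio into a word.
# Assumption: 1 = transmitter enabled, 0 = transmitter switched off.
radio_state() {
    case "$1" in
        1) echo on ;;
        0) echo off ;;
        *) echo unknown ;;
    esac
}

# On a FreeBSD machine with an iwi(4) device one would pass the live value:
#   radio_state "$(sysctl -n dev.iwi.0.radio)"
```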
net/iwi-firmware refuses to build on 7.2-STABLE
Today I wanted to build net/iwi-firmware on 7.2-STABLE and I got

=== iwi-firmware-2.4_8 is configured with iwicontrol(8) which you don't need, use 'make rmconfig' and uncheck CONTROL.

While the firmware itself is in the base system, iwicontrol(8) is not. The port should be modified to cope with this but I have no idea in what way. Some ideas:

1. Remove the build restrictions
2. Just build iwicontrol on systems where the fw is not needed
3. Make an extra port just to build iwicontrol
4. ???

Any suggestions?

Thanks,

-Andre
Re: net/iwi-firmware refuses to build on 7.2-STABLE
On Wed, 01-Jul-2009 at 13:28:57 -0700, Sam Leffler wrote:
> Andre Albsmeier wrote:
> > Today I wanted to build net/iwi-firmware on 7.2-STABLE and I got
> >
> > === iwi-firmware-2.4_8 is configured with iwicontrol(8) which you
> > don't need, use 'make rmconfig' and uncheck CONTROL.
> >
> > While the firmware itself is in the base system, iwicontrol(8) is
> > not. The port should be modified to cope with this but I have no
> > idea in what way. Some ideas:
> >
> > 1. Remove the build restrictions
> > 2. Just build iwicontrol on systems where the fw is not needed
> > 3. Make an extra port just to build iwicontrol
> > 4. ???
> >
> > Any suggestions?
> >
> > Thanks,
>
> man iwi; the firmware is automatically loaded by the driver

Yes, I know. I don't want to load the firmware, I want to query the state of the hardware switch which enables the transmitter (iwicontrol iwi0 -r).

Thanks,

-Andre

--
Never argue with an idiot. They drag you down to their level, then beat you with their experience.
Re: Sensorsd framework in 7.X ?
On Mon, 28-Jul-2008 at 15:46:55 +0100, Vincent Hoffman wrote:
> Cristiano Deana wrote:
> > Hi,
> >
> > any news about a MFC from -current to -stable (or 7.1) for the
> > sensorsd framework? I find it very useful in openbsd, so i hoped
> > to have it soon in free.
> >
> > tnx
>
> As far as i can understand it was backed out from -CURRENT soon after
> the initial commit and so will not be coming to 7.x any time soon.
> see the (rather long) thread here for details.
> http://lists.freebsd.org/pipermail/cvs-src/2007-October/082398.html

Since people who complained about this failed to come up with something better, I have (partly) backported it into 6-STABLE and I am happy with it.

While one can surely argue about this festering junkpile that does not belong in the kernel until the day hell freezes, I am glad to have something that actually works.

It's no big deal (at least not for 6-STABLE)...

-Andre
Re: Call for Testers: ncurses 5.6 update
On Tue, 13-Mar-2007 at 00:45:24 +0800, Rong-en Fan wrote:
> On 3/12/07, Stefan Lambrev [EMAIL PROTECTED] wrote:
> > Rong-en Fan wrote:
> > > Hi folks,
> > >
> > > ncurses in 6.x is pretty old. We have update-to-date ncurses in
> > > 7.x with wide character support now. The patch at
> > >
> > > http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070310.diff.gz
> > >
> > > gives you ncurses 5.6 and wide character support in 6.x. Please
> > > apply with 'patch -p0' under /usr/src. For more information,
> > > please visit
> > >
> > > http://people.freebsd.org/~rafan/ncurses/
> > >
> > > You can also find individual patches, say ncurses update and wide
> > > character support, there.
> > >
> > > Feedbacks and suggestions are welcome.
> > >
> > > P.S. Due to some lib32 issues, the patch above contains changes
> > > made by ru@ recently for src/Makefile.inc1.
> >
> > make installworld failed:
> >
> > cd /usr/src; /usr/obj/usr/src/make.amd64/make -f Makefile.inc1 install32
> > mkdir -p /usr/lib32 # XXX add to mtree
> > [...]
>
> Sorry about this. I messed up the lib32 changes in the all-in-one
> patch. Could you please use this one instead?
>
> http://people.freebsd.org/~rafan/ncurses/ncursesw-5.6-all-fbsd6-20070312.diff.gz

I am running this patch on 6.2-STABLE (i386). I have rebuilt world and ports/mail/mutt-devel (using WITH_MUTT_NCURSES=1). Everything seems to work and mutt also runs perfectly with UTF8. Before applying this patch I had to use WITH_MUTT_NCURSES_PORT=1 :-).

Thanks,

-Andre
Re: CF card and /dev filesystem entries
On Mon, 21-Nov-2005 at 09:11:51 +, Brian Candler wrote:
> On Sun, Nov 20, 2005 at 02:11:56PM -1000, Robert Marella wrote:
> > > Now, that does the job:
> > >
> > > [EMAIL PROTECTED] brian# dd if=/dev/null of=/dev/da0 count=0
> > > 0+0 records in
> > > 0+0 records out
> > > 0 bytes transferred in 0.28 secs (0 bytes/sec)
> > > [EMAIL PROTECTED] brian# ls -l /dev/da0*
> > > crw-r-----  1 root  operator    0, 107 Nov 20 21:00 /dev/da0
> > > crw-r-----  1 root  operator    0, 115 Nov 20 21:00 /dev/da0s1
> > > [EMAIL PROTECTED] brian#
> > >
> > > However, I'm not sure I actually *like* opening my CF card in
> > > such a way that I would be likely to overwrite the partition
> > > table if I hit a wrong key...
> > >
> > > Regards,
> > >
> > > Brian.
> >
> > I am not sure if this is any different but when placing or changing
> > a card in my card reader I run cat /dev/null > /dev/daX before
> > mounting. I have cards of different sizes and it would fail to
> > mount a different size card without doing the above first.
>
> Yeah, that's the same thing: open the device for write and then close
> it again.
>
> I still wouldn't be happy that in a haze I might type
> cat /dev/zero > /dev/daX
> instead of
> cat /dev/null > /dev/daX
> but I guess I can write a script.

FYI, http://www.freebsd.org/cgi/query-pr.cgi?85975

-Andre
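The script mentioned above could look roughly like the following sketch. The function name reopen_for_write is made up; the idea is only what the thread describes, namely opening the device for writing and closing it again, without any command that could put data on the card. Append mode (`>>`) is used so that nothing is truncated, and `true` produces no output, so no bytes are written.

```shell
#!/bin/sh
# reopen_for_write is a hypothetical helper name.
# It opens the given device node for writing and closes it again
# without writing anything to it.
reopen_for_write() {
    dev=$1
    # Refuse anything that is not a character device node.
    if [ ! -c "$dev" ]; then
        echo "reopen_for_write: $dev is not a character device" >&2
        return 1
    fi
    # 'true' writes nothing; the redirection alone opens and closes the node.
    true >> "$dev"
}

# Example (device name taken from the thread above):
#   reopen_for_write /dev/da0
```

Since the device name is the only argument, there is no place left to mistype /dev/zero for /dev/null.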
Re: MNT_USER?
On Tue, 03-May-2005 at 04:10:18 -0700, Colin Percival wrote:
> Danny Braniss wrote:
> > BTW, this, the MNT_NOEXEC, uncovered, IMHO, a bug in
> > libexec/rtld-elf/rtld.c where it's now checking for MNT_NOEXEC,
> > but only if LD_LIBRARY_PATH is set!
>
> This is not a bug. Checking for MNT_NOEXEC adds a cost in performance,
> and it is not necessary if LD_LIBRARY_PATH, LD_PRELOAD, and LD_LIBMAP*
> are not set -- based on the assumption, that is, that no (sane)
> sysadmin would ever put a MNT_NOEXEC-mounted filesystem into the
> default library path.
>
> I agree that it's a bit counter-intuitive, but it's really just a case
> of saving time by not checking for something which should Never
> Happen. :-)
>
> Colin Percival
>
> PS. Bravo to Ian for tracking down the bug in NFS -- I spent a while
> looking

You may want to look at the PR mentioned in the commit message to see who did this initially. I just changed it at a different place of the kernel (the same way as it was done in 4.x).

-Andre

> for this, but got hopelessly lost.

--
An NT server can be run by an idiot, and usually is.
Re: Flash player sound solution
On Sun, 13-Mar-2005 at 17:12:43 -0500, Scott Robbins wrote:
> On Sun, Mar 13, 2005 at 06:50:01PM +, Chris Hodgins wrote:
> > Scott Robbins wrote:
> > > There are a few pages out there that will crash FreeBSD native
> > > firefox with linuxpluginwrapper, at least. The problem doesn't
> > > occur in Linux, and most of the time, these sites will work with
> > > FreeBSD's linux-opera (and the linuxpluginwrapper).
> > >
> > > http://www.tvguide.com
> > > http://www.espn.com
> > >
> > > are two examples that always crash it for me, and apparently for
> > > most people on BSD forums.
> >
> > Interesting. Both of those sites hang but don't crash as such.
> > Weird.
>
> Chris, and others, I apologize, hang is what they do and I shouldn't
> have loosely used the word crash. What happens is (again, judging from
> some threads on freebsdforums, to everyone who tries) is that if one
> opens the site, it'll begin to load and finally freeze, to the point,
> at least on my machine, where the only way to stop it is to find the
> PID and kill it.

What happens if you (temporarily) remove /dev/dsp and try again? You will get no sound but I assume firefox won't hang anymore...

-Andre

--
Win98: useless extension to a minor patch release for 32-bit extensions and a graphical shell for a 16-bit patch to an 8-bit operating system originally coded for a 4-bit microprocessor, written by a 2-bit company that can't stand for 1 bit of competition.
Re: Flash player sound solution
On Sat, 12-Mar-2005 at 22:21:35 -0800, Joshua Tinnin wrote:
> On Saturday 12 March 2005 05:42 am, James McNaughton [EMAIL PROTECTED] wrote:
> > The Problem:
> > Using native Mozilla and linuxpluginwrapper/linux-flashplugin (on
> > 4.10-stable et al) to view flash content results in no sound and
> > occasional Mozilla freezes.
> >
> > The Solution:
> > Run esd.
> >
> > How:
> > I was searching the web for the solution, and got nowhere. There
> > didn't seem to be anyone who had gotten it to work. Linux mailing
> > lists noted a problem with file permissions on /dev/snd or
> > /dev/pcm* depending on the sound system drivers installed. My
> > /dev/pcm* file permissions were all rw to begin with, so this
> > didn't help. I wondered what device the plugin was actually trying
> > to access, so I did
> > strings /usr/local/lib/linux-flashplugin6/libflashplayer.so
> > and found /dev/dsp and /dev/mixer. Nearby in the list I noticed a
> > few lines relating to esd. From a command line I started esd and
> > went back to view some content that had previously frozen Mozilla,
> > and there was sound coming out my speakers and the browser did not
> > hang. I had long suspected the browser hangs were related to sound
> > in the flash content. My results seem to confirm that suspicion.
>
> Do you have an example (or two or more) of a page that causes the
> crash? I'd like to test this on 5.3-R. I know I used to have an email
> around here somewhere with examples ...

Speaking of flashplugin crashes: There is some flash content out there which makes my firefox crash (not due to sound but for some other reasons). One example is:

http://cgi.ebay.de/ws/eBayISAPI.dll?ViewItemitem=5961030392

I'd like to know if other people also experience crashes when viewing this page. The flash content isn't obvious to find: When the page has opened search for the string Setzen Sie die Andale Gallery. Directly above is the flash part.

I can only view this page when disabling the linux-flashplugin :-(

Thanks,

-Andre
Re: Flash player sound solution
On Sun, 13-Mar-2005 at 17:18:13 +, Chris Hodgins wrote:
> Andre Albsmeier wrote:
> > On Sat, 12-Mar-2005 at 22:21:35 -0800, Joshua Tinnin wrote:
> > > On Saturday 12 March 2005 05:42 am, James McNaughton [EMAIL PROTECTED] wrote:
> > > > The Problem:
> > > > Using native Mozilla and linuxpluginwrapper/linux-flashplugin
> > > > (on 4.10-stable et al) to view flash content results in no
> > > > sound and occasional Mozilla freezes.
> > > >
> > > > The Solution:
> > > > Run esd.
> > > >
> > > > How:
> > > > I was searching the web for the solution, and got nowhere.
> > > > There didn't seem to be anyone who had gotten it to work. Linux
> > > > mailing lists noted a problem with file permissions on /dev/snd
> > > > or /dev/pcm* depending on the sound system drivers installed.
> > > > My /dev/pcm* file permissions were all rw to begin with, so
> > > > this didn't help. I wondered what device the plugin was
> > > > actually trying to access, so I did
> > > > strings /usr/local/lib/linux-flashplugin6/libflashplayer.so
> > > > and found /dev/dsp and /dev/mixer. Nearby in the list I noticed
> > > > a few lines relating to esd. From a command line I started esd
> > > > and went back to view some content that had previously frozen
> > > > Mozilla, and there was sound coming out my speakers and the
> > > > browser did not hang. I had long suspected the browser hangs
> > > > were related to sound in the flash content. My results seem to
> > > > confirm that suspicion.
> > >
> > > Do you have an example (or two or more) of a page that causes the
> > > crash? I'd like to test this on 5.3-R. I know I used to have an
> > > email around here somewhere with examples ...
> >
> > Speaking of flashplugin crashes: There is some flash content out
> > there which makes my firefox crash (not due to sound but for some
> > other reasons). One example is:
> >
> > http://cgi.ebay.de/ws/eBayISAPI.dll?ViewItemitem=5961030392
> >
> > I'd like to know if other people also experience crashes when
> > viewing this page. The flash content isn't obvious to find: When
> > the page has opened search for the string Setzen Sie die Andale
> > Gallery. Directly above is the flash part.
> >
> > I can only view this page when disabling the linux-flashplugin :-(
> >
> > Thanks,
> >
> > -Andre
>
> That worked ok for me without crashing. One site that I know always

Hmm, I think the ebay stuff is very dynamic. Maybe there are some local browser settings which influence the output. Did you find the flash content in the page?

> crashes my browser is this one:
> http://www.flickr.com/photos/
>
> Just click on any of the photos there and the new page will load up
> and crash. If anyone cares I have attached the backtrace from the
> crash below.

Same here :-(.

-Andre
Re: problem with ipfilter and todays -stable
On Fri, 13-Aug-2004 at 19:19:02 +, Michael Handler wrote:
> On 2004-08-13, Bernhard Valenti [EMAIL PROTECTED] wrote:
> > i just updated from 4.8 to 4.10-stable(from today). i noticed that
> > i can't ping the machine. [...]
>
> I just did the same upgrade last night, and am experiencing similar
> troubles. (block in quick log on dc0 isn't actually blocking
> anything.) Someone on freebsd-net just noticed this as well:
>
> http://lists.freebsd.org/pipermail/freebsd-net/2004-August/004675.html
>
> Darren Reed MFCed IPFilter 3.4.35 in early July, and I don't think
> that ipfilter was updated completely in both of the relevant places
> (src/contrib/ipfilter and src/sys/contrib/ipfilter). If you diff

Yes, he forgot to MFC ipl.h into src/contrib/ipfilter, see PR# 70492.

> the files that exist in both locations, there are some troubling
> differences, especially the missing member of the qif structure in
> ip_compat.h, etc.

Well, it seems that src/contrib/ipfilter/ip_compat.h simply isn't used by the userland parts of ipfilter (only by the kernel stuff in src/sys/contrib/ipfilter where the file is up to date).

However, since there have always been confusing discrepancies (at least for me) between the files in src/sys/contrib/ipfilter and src/contrib/ipfilter, I have replaced src/contrib/ipfilter by the official ip_filter-3.4.35 package and made src/sys/contrib/ipfilter/netinet a symlink to this location just to be sure to use consistent versions of all files. (I have done this several times before when I wanted to test a not yet committed version of ipfilter).

However, this does not fix my problem which can be found at

http://marc.theaimsgroup.com/?l=ipfilterm=109259371522385

When looking at HISTORY, we find a lot of changes w.r.t. checksum corrections in ICMP packets so I assume there are still some bugs in there.

> I'm seeing the same problem that the freebsd-net poster did:
>
> [EMAIL PROTECTED]:~# ipf -V
> ipf: IP Filter: v3.4.31 (336)
> Kernel: IP Filter: v3.4.35

Same here (before replacing src/contrib/ipfilter as described above) due to the missing MFC of ipl.h.

-Andre
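The userland/kernel mismatch shown in the ipf -V output above can be spotted mechanically. The following is only a sketch: the function name ipf_versions_match is invented, and it assumes that every relevant line of the output contains the text "IP Filter: v" followed by the version, as in the two lines quoted in this thread.

```shell
#!/bin/sh
# ipf_versions_match is a hypothetical helper. It reads 'ipf -V' style
# output on stdin and succeeds only if exactly one distinct version
# string follows "IP Filter: v" across all lines.
ipf_versions_match() {
    awk '
        /IP Filter: v/ {
            sub(/.*IP Filter: v/, "")   # drop everything up to the version
            sub(/[^0-9.].*/, "")        # drop anything after the version
            seen[$0] = 1                # remember each distinct version
        }
        END {
            n = 0
            for (v in seen) n++
            exit (n == 1) ? 0 : 1
        }
    '
}

# Possible use on a machine with ipfilter installed:
#   ipf -V | ipf_versions_match || echo "userland and kernel ipfilter differ"
```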
Re: FreeBSD 4.9 RC1 (i386) now available
[CC'ed Martin Blapp since he was the last one who touched amd]

On Mon, 29-Sep-2003 at 08:19:05 -0700, Murray Stokely wrote:
> Not all FTP sites have the first release candidate, but it is at least
> available from ftp.freebsd.org. Please download and install this
> candidate and help us find bugs BEFORE we call it 4.9-RELEASE.

Just tracked down a nasty bug when using amd to mount msdos filesystems.

http://www.freebsd.org/cgi/query-pr.cgi?pr=57401

-Andre
Re: (da0:ahc0:0:0:0): Unexpected busfree in Data-in phase and other weirdness
On Sat, 01-Mar-2003 at 10:49:34 +0100, Francesco Casadei wrote:
> On Sat, Mar 01, 2003 at 12:50:31AM +0100, Joan Picanyol i Puig wrote:
> > [reposted from -scsi@, maybe that's not the right place]
> >
> > Hi,
> >
> > On an Adaptec 2940 I have an IBM DNES-309170W and a SEAGATE
> > ST318438LW, soft-raided with vinum. Lately it seems that the
> > Seagate disc has become 'unstable', and I don't know how to
> > diagnose any further. I've checked the cabling and I've tried the
> > SeaTools floppy disk from Seagate but it hangs on my system :(
> >
> > Please have a look at the excerpt of kernel logs at
> > http://biaix.org/pk/debug/. messages.1.kernel shows what happened
> > (look for Feb 4) while recording a cd with my SCSI cd-writer. The
> > system appeared to hang for anything between 3 and 20 minutes while
> > I was getting those. messages.0.kernel shows what happened today
> > (Feb 27) for no apparent reason. Problems persisted across reboots,
> > even though some of them were not logged (could not fsck).
> >
> > For further reference, please look at this thread from two months
> > ago:
> > http://www.FreeBSD.org/cgi/getmsg.cgi?fetch=2329637+0+/usr/local/www/db/text/2002/freebsd-questions/20021222.freebsd-questions
> >
> > I'm really stumped so I'd appreciate any help in the lines of:
> >
> > 1.- What's causing these problems?
> > 2.- How can I solve them?
> >
> > tks
> > --
> > pica
> >
> > To Unsubscribe: send mail to [EMAIL PROTECTED]
> > with unsubscribe freebsd-stable in the body of the message
> end of the original message
>
> I'm having this problem too. When this happens the SCSI bus is reset.
> Here's the error message (not wrapped):
>
> ...
>
> Here's the system configuration:
>
> # camcontrol devlist
> IBM DDRS-34560W S92A at scbus0 target 0 lun 0 (pass0,da0)
> IBM DNES-318350W SA30 at scbus0 target 1 lun 0 (pass1,da1)
> TEAC CD-R55S 1.0R at scbus0 target 2 lun 0 (pass2,cd0)
> PLEXTOR CD-ROM PX-32TS 1.02 at scbus0 target 3 lun 0 (pass3,cd1)

(I assume the cabling/termination has been checked already.)

I had bus problems with my DNES until I upgraded the firmware to:

da2: IBM DNES-318350W SAH0 Fixed Direct Access SCSI-3 device

In general, I have often seen bus problems when a lot of different devices are hanging on the same SCSI bus and some drives are being hit really hard. They always went away with a new fw, especially on my IBM DDYS drives.

The Plextor fw seems a bit old as well (at least compared to my PLEXTOR CD-ROM PX-40TS 1.13) but I don't know if the 32TS uses the same as the 40TS.

I have written a program to upgrade the firmware on IBM and Plextor (and some other devices) under FreeBSD in case you are interested... However, IMO one should play with the fw only in case of problems and not just to get the latest version...

-Andre

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-stable in the body of the message
4.7-STABLE: panic: ffs_blkfree: freeing free block, softupdates related?
In case anyone wants to dig into this...

4.7-STABLE #0: Mon Dec 23 15:41:17 CET 2002 using softupdates.

/dev/ad2a is a 120GB hard drive, 96GB were in use. 75 of these 96GB were rm -rf'ed. The rm command returned quickly since the 75GB were stored in few big files. The box crashed during the delayed softupdates writes.

The kernel and core are available for debugging... I can also do simple debugging if someone tells me what to do :-)

root@romfix:/var/crash> gdb -k /usr/obj/src/src-4/sys/romfix/kernel.debug vmcore.0
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as i386-unknown-freebsd...
Deprecated bfd_read called at /src/src-4/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
Deprecated bfd_read called at /src/src-4/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf
IdlePTD at phsyical address 0x002dd000
initial pcb at physical address 0x002606e0
panicstr: ffs_blkfree: freeing free block
panic messages:
---
panic: ffs_blkfree: freeing free block
syncing disks... 7 5 4 4 2 2 2 2 2 2 2
dev = #ad/16, block = 94920, fs = /cdroms
panic: ffs_blkfree: freeing free block
Uptime: 4d22h14m1s
dumping to dev #ad/1, offset 99280
dump ata0: resetting devices ..
done 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 --- #0 dumpsys () at /src/src-4/sys/kern/kern_shutdown.c:487 487 if (dumping++) { (kgdb) where #0 dumpsys () at /src/src-4/sys/kern/kern_shutdown.c:487 #1 0xc01454df in boot (howto=260) at /src/src-4/sys/kern/kern_shutdown.c:316 #2 0xc0145904 in poweroff_wait (junk=0xc02360c0, howto=-1071423328) at /src/src-4/sys/kern/kern_shutdown.c:595 #3 0xc01d6d56 in ffs_blkfree (ip=0xc83d1bb8, bno=94920, size=16384) at /src/src-4/sys/ufs/ffs/ffs_alloc.c:1444 #4 0xc01db68b in indir_trunc (ip=0xc83d1bb8, dbn=125477504, level=0, lbn=12, countp=0xc83d1ba8) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:2233 #5 0xc01db445 in handle_workitem_freeblocks (freeblks=0xc1149400) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:2133 #6 0xc01d994b in process_worklist_item (matchmnt=0x0, flags=0) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:723 #7 0xc01d97de in softdep_process_worklist (matchmnt=0x0) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:622 #8 0xc014533d in boot (howto=256) at /src/src-4/sys/kern/kern_shutdown.c:261 #9 0xc0145904 in poweroff_wait (junk=0xc02360c0, howto=-1071423328) at /src/src-4/sys/kern/kern_shutdown.c:595 #10 0xc01d6d56 in ffs_blkfree (ip=0xc83d1e0c, bno=94984, size=16384) at /src/src-4/sys/ufs/ffs/ffs_alloc.c:1444 #11 0xc01db68b in indir_trunc (ip=0xc83d1e0c, dbn=127975424, level=0, lbn=12, countp=0xc83d1dfc) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:2233 #12 0xc01db445 in handle_workitem_freeblocks (freeblks=0xc1149380) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:2133 #13 0xc01d994b in process_worklist_item (matchmnt=0x0, flags=0) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:723 #14 0xc01d97de in 
softdep_process_worklist (matchmnt=0x0) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:622 #15 0xc01729bf in sched_sync () at /src/src-4/sys/kern/vfs_subr.c:1177 (kgdb) And the dmesg: Copyright (c) 1992-2002 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 4.7-STABLE #0: Mon Dec 23 15:41:17 CET 2002 [EMAIL PROTECTED]:/src/obj-4/src/src-4/sys/romfix Timecounter i8254 frequency 1192977 Hz Timecounter TSC frequency 501050245 Hz CPU: AMD-K6(tm) 3D processor (501.05-MHz 586-class CPU) Origin = AuthenticAMD Id = 0x58c Stepping = 12 Features=0x8021bfFPU,VME,DE,PSE,TSC,MSR,MCE,CX8,PGE,MMX AMD Features=0x8800SYSCALL,3DNow! real memory = 134217728 (131072K bytes) avail memory = 127889408 (124892K bytes) Preloaded elf kernel kernel at 0xc02be000. K6-family MTRR support enabled (2 registers) Using $PIR table, 6 entries at 0xc00fddf0 npx0: math processor on motherboard npx0: INT 16 interface pcib0: Host to PCI bridge on motherboard pci0: PCI bus on pcib0 pcib2: VIA 82C598MVP (Apollo MVP3) PCI-PCI (AGP) bridge at device 1.0 on pci0 pci1: PCI bus on pcib2 isab0: VIA 82C586 PCI-ISA bridge at device 7.0 on pci0 isa0: ISA bus on isab0 atapci0: VIA 82C586 ATA33 controller port 0xe000-0xe00f at device
Re: Matrox G550 tv-out
On Sun, 10-Nov-2002 at 22:42:28 -0500, Rob Babb wrote:
> I have a G550 running on 4.7. I've installed the ports collection for
> the Matrox Power Desk, which seems to be working in KDE 3, but the
> tv-out settings are grayed out. I've tried getting help from Matrox's
> Linux help, but they say it's not supported on the G550. Does anyone
> know how to make the tv-out for the G550 work on freebsd 4.7?

TV-Out on the G450 and G550 is not supported under X11. Try to get a G400 DH if you need TV-Out. That's what I did.

Go to the Matrox forum and complain there but be assured it won't help. They'll tell you that TV-Out under X is not supported on the G450 and G550 but they won't tell you why. W.r.t. the G400 they are quite helpful.

The Matrox driver for the G400 is really good. It supports two real screens :0.0 and :0.1 so I can use :0.1 as TV-Out screen with mplayer.

Another possibility would be to check the TV-Out features of the new NVidia driver released recently.

-Andre
Re: cvs commit: src/usr.sbin/ppp command.c (fwd)
On Tue, 17-Sep-2002 at 04:17:38 +0400, Dmitry Morozovsky wrote:
> Colleagues,
>
> the following commit brings up a thought: has anyone an experience of
> narrowing the system when there is no chances to use IPv6? AFAICS,
> NOINET6 is not mentioned anywhere in share/mk, sys/conf/*,
> sys/*/conf/*, neither for -STABLE nor for -CURRENT?
>
> Or, is this commit simply a bandaid?

I have the following lines in /etc/make.conf for years now without problems (well, apart from small things like the one below):

CFLAGS=-O -pipe -fno-ident -UINET6 -UIPSEC
NOINET6=true

-Andre

> Thanks in advance.
>
> Sincerely,
> D.Marck [DM5020, DM268-RIPE, DM3-RIPN]
> *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- [EMAIL PROTECTED] ***
>
> -- Forwarded message --
> Date: Sun, 15 Sep 2002 18:41:05 -0700 (PDT)
> From: Brian Somers [EMAIL PROTECTED]
> To: [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: cvs commit: src/usr.sbin/ppp command.c
>
> brian 2002/09/15 18:41:04 PDT
>
> Modified files: (Branch: RELENG_4)
>   usr.sbin/ppp command.c
> Log:
>   MFC: Unbreak things when NOINET6 is defined
>
>   Approved by: re (jhb)
>
> Revision    Changes  Path
> 1.230.2.17  +2 -0    src/usr.sbin/ppp/command.c

--
Linux: Sozialismus, der nicht funktioniert
Re: sys/modules/aic7xxx/aicasm breaks buildworld with MODULES_WITH_WORLD=true
On Tue, 03-Sep-2002 at 11:56:42 +0900, Yoshihiko SARUMARU wrote:
> Hi all,
>
> Today, I buildworld with RELENG_4 branch (4.7-PRERELEASE) as usual,
> but it failed with sys/modules/aic7xxx/aicasm:
>
> (snip)
> cd /usr/src/sys; make buildincludes; make installincludes
> (snip)
> === sys/modules/aic7xxx
> === sys/modules/aic7xxx/aicasm
> make: don't know how to make buildincludes. Stop
> *** Error code 2
>
> Stop in /usr/src/sys/modules/aic7xxx.
> *** Error code 1
> (snip)
>
> I repeated buildworld with some situation and I found disabling
> MODULES_WITH_WORLD=true in my /etc/make.conf lead buildworld and
> buildkernel to be succeeded. But I see no one has trouble with this
> so far. Does MODULES_WITH_WORLD is obsoleted or my /usr/src is
> something wrong?

I have the same problem here. I hope it gets fixed since I like to use MODULES_WITH_WORLD=true.

-Andre
panic: ffs_vfree: freeing free inode
One of my 4.5-STABLE servers crashed today with this message:

panic: ffs_vfree: freeing free inode

I have the kernel.debug and the dump. I don't know how to debug ffs stuff, but I am happy if someone instructs me what to do. Since the filesystems are mounted with softupdates, it might be a bug in the softupdates code. The kernel (and userland) were compiled Thu Mar 28 14:37:35 CET 2002. I can't remember any commits to the fs code, so I assume the problem is in 4.6-PRERELEASE as well.

This is what I got so far:

root@server:/var/crash>gdb -k /usr/obj/src/src-4/sys/server/kernel.debug vmcore.10
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
IdlePTD at phsyical address 0x00303000
initial pcb at physical address 0x00282ee0
panicstr: ffs_vfree: freeing free inode
panic messages:
---
panic: ffs_vfree: freeing free inode

syncing disks... 31 5 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 giving up on 1 buffers
Uptime: 18d6h24m37s

dumping to dev #da/9, offset 276456
dump 511 510 509 508 [... countdown continues ...] 4 3 2 [CTRL-C to abort] [CTRL-C to abort] [CTRL-C to abort] 1 0 [CTRL-C to abort] [CTRL-C to abort] [CTRL-C to abor
---
#0 dumpsys () at /src/src-4/sys/kern/kern_shutdown.c:487
487 if (dumping++) {
(kgdb) where
#0 dumpsys () at /src/src-4/sys/kern/kern_shutdown.c:487
#1 0xc0161443 in boot (howto=256) at /src/src-4/sys/kern/kern_shutdown.c:316
#2 0xc0161868 in poweroff_wait (junk=0xc0255591, howto=0) at /src/src-4/sys/kern/kern_shutdown.c:595
#3 0xc01f0781 in ffs_freefile (pvp=0xd5321e64, ino=3, mode=17407) at /src/src-4/sys/ufs/ffs/ffs_alloc.c:1611
#4 0xc01f58a4 in handle_workitem_freefile (freefile=0xc49793a0) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:2913
#5 0xc01f2e43 in process_worklist_item (matchmnt=0x0, flags=0) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:737
#6 0xc01f2cae in softdep_process_worklist (matchmnt=0x0) at /src/src-4/sys/ufs/ffs/ffs_softdep.c:622
#7 0xc018e72f in sched_sync () at /src/src-4/sys/kern/vfs_subr.c:1177
(kgdb) up 3
#3 0xc01f0781 in ffs_freefile (pvp=0xd5321e64, ino=3, mode=17407) at /src/src-4/sys/ufs/ffs/ffs_alloc.c:1611
1611 panic("ffs_vfree: freeing free inode");
(kgdb)

Now I wanted to look at fs->fs_fsmnt and found strange stuff in there. Here is the complete fs struct:

(kgdb) print *fs
$1 = {fs_firstfield = 1601398374, fs_unused_1 = 1701996150, fs_sblkno = 1713388133, fs_cblkno = 1768252786, fs_iblkno = 1713399662, fs_dblkno = 543516018, fs_cgoffset = 1685024361, fs_cgmask = 101, fs_time = 0, fs_size = 0, fs_dsize = 0, fs_ncg = 1929379840, fs_bsize = 1953653108, fs_fsize = 622869792, fs_frag = 1814047844, fs_minfree = 1025535589, fs_rotdelay = 744760608, fs_rps = 544433696, fs_bmask =
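One way to make sense of the odd values in the struct printed above is to read them as raw bytes instead of numbers. The sketch below is a minimal illustration, not part of the original report: it assumes the fields are 32-bit little-endian words (the i386 byte order) and uses only the first eight values from the kgdb output; the helper name `words_to_text` is made up for this example.

```python
import struct

# First eight fields of the kgdb "print *fs" output, in the order printed.
fs_words = [
    1601398374,  # fs_firstfield
    1701996150,  # fs_unused_1
    1713388133,  # fs_sblkno
    1768252786,  # fs_cblkno
    1713399662,  # fs_iblkno
    543516018,   # fs_dblkno
    1685024361,  # fs_cgoffset
    101,         # fs_cgmask
]


def words_to_text(words):
    """Pack 32-bit words as little-endian bytes and decode up to the first NUL."""
    raw = b"".join(struct.pack("<I", w) for w in words)
    return raw.split(b"\0", 1)[0].decode("ascii")


print(words_to_text(fs_words))
```

Under these assumptions the bytes spell out the panic message itself, which suggests that `fs`, as gdb resolves it in this frame, points at string data rather than at a real superblock. That is an interpretation of the printed values, not something the dump confirms.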
Re: Another possible solution for non-sendmail users
On Thu, 28-Mar-2002 at 14:49:49 -0600, Scot W. Hetzel wrote:

> From: Coleman Kane [EMAIL PROTECTED]
> > Another thing to look at is the /usr/sbin/sendmail -> mailwrapper link that is produced from installworld. In current it seems to have been linking that, even when NO_SENDMAIL=yes in make.conf.
>
> Stable creates the same links to mailwrapper.
>
> > Qmail et al. overwrite this with their own workalike (since /usr/sbin/sendmail is a 'standard' these days) local mailer. I dunno if -stable has this problem too.
>
> You want to set NO_MAILWRAPPER in make.conf to prevent the linking to mailwrapper.

This might give you more problems: http://www.freebsd.org/cgi/query-pr.cgi?pr=29699
Nobody seems to have agreed to a solution yet... :-)

-Andre

> But if you do this, you'll lose the configurability that mailwrapper provides for alternate MTAs via /etc/mail/mailer.conf. The Qmail install shouldn't need to install anything into the /usr/[sbin,bin] directories with mailwrapper properly configured (see `man mailer.conf` and `man mailwrapper`).
>
> Mailwrapper was designed so that you don't need to re-create the links to your personal MTA (in /usr/[bin,sbin]) when upgrading FreeBSD.
>
> Scot
Re: pcm problem in 4.4-PRERELEASE
On Tue, 07-Aug-2001 at 07:25:36 -, Conrad Sabatier wrote:

> Since upgrading to 4.4-PRERELEASE, I can't get mtv to do audio anymore. The following message appears in the console window:
>
> pcm0: play interrupt timeout, channel dead
>
> I've tried both with and without esd enabled. Same results. Strangely enough, mpg123 works just fine.

I see exactly the same effects here. Additionally, mxaudio (ftp://ftp.xaudio.com/pub/xaudio/players/unix-motif/x86-unknown-linux-glibc/xaudio.x86-unknown-linux-glibc.tar.gz) does funny things as well: when playing a song, the progress slider at the bottom runs like hell; a 4-minute song is processed in 4 seconds. No audio appears...

-Andre

> Any clues as to what's going on here?
>
> --
> Conrad Sabatier [EMAIL PROTECTED]
>
> To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-multimedia in the body of the message

--
Unix is very userfriendly. It's just picky who its friends are.
Re: pcm problem in 4.4-PRERELEASE
On Tue, 07-Aug-2001 at 07:25:36 -, Conrad Sabatier wrote:

> Since upgrading to 4.4-PRERELEASE, I can't get mtv to do audio anymore. The following message appears in the console window:
>
> pcm0: play interrupt timeout, channel dead
>
> I've tried both with and without esd enabled. Same results. Strangely enough, mpg123 works just fine.

One thing I forgot in my other mail regarding this. I am using:

andre@bali:~>cat /dev/sndstat
FreeBSD Audio Driver (newpcm) Aug 3 2001 12:12:07
Installed devices:
pcm0: SB16 DSP 4.16 at io 0x220 irq 5 drq 1:5 (1p/1r/0v channels duplex)

Conrad, what hardware have you got?

-Andre
Re: NIS/YP still broken!
On Sat, 02-Jun-2001 at 15:22:37 -0700, John Polstra wrote:

> In article [EMAIL PROTECTED], Hartmann, O. [EMAIL PROTECTED] wrote:
> > FreeBSD 4.3-STABLE still has a broken NIS/YP! If there is more than one slave server that ypxfrd should spread its tables to, the push seems to lock up and gets a timeout. This was reported here earlier and I got a 'fix' for it, but the fix hasn't been merged because it targets a symptom, not the cause itself.

I will happily jump in here since I can easily reproduce it.

> We would love to fix this, but unfortunately the people who can debug it have not been able to reproduce the problem. If you are willing to help, maybe you can debug it by remote control. :-)

Well, if these people would like a step-by-step guide on how to reproduce it, I can try...

> Currently, my best hypothesis about the cause of this problem is that yppush is reading from an invalid memory address which happens to fall into the region occupied by the dynamic linker. Thus making small changes to the dynamic linker causes the behavior of yppush to change.
>
> To test this hypothesis, let's try an experiment. Please apply the patch below to /usr/src/usr.sbin/yppush/yppush_main.c:
>
> Index: yppush_main.c
> ===
> RCS file: /home/ncvs/src/usr.sbin/yppush/yppush_main.c,v
> retrieving revision 1.11
> diff -u -r1.11 yppush_main.c
> --- yppush_main.c 1999/08/28 01:21:09 1.11
> +++ yppush_main.c 2001/06/02 21:35:11
> @@ -545,6 +545,11 @@
>   struct hostlist *tmp;
>   struct sigaction sa;
>
> + static char *rtld_base = (char *)0; /* Patch me */
> + static char *rtld_limit = (char *)0; /* Patch me too */
> + if (rtld_base != NULL && rtld_limit > rtld_base)
> +  munmap(rtld_base, rtld_limit - rtld_base);
> +
>   while ((ch = getopt(argc, argv, "d:j:p:h:t:v")) != -1) {
>    switch(ch) {
>    case 'd':
>
> Then rebuild and reinstall yppush like this:
>
> make clean
> make obj
> make depend
> DEBUG_FLAGS=-g make
> STRIP= make install
>
> and verify that the program is still failing. I hope it will still fail, or we are out of luck.

Ack, the program still fails as before.
> As it is shown here, the patch should do nothing. Next you must determine where the dynamic linker is loaded, and patch the low and high limits into the two lines labeled "Patch me" and "Patch me too". You can do this as follows. Run yppush manually and see what its process ID is. While the program is still running, display its map file /proc/PID/map. For example, if the process ID is 12345 you would want to see /proc/12345/map. I recommend that you look at the file like this:
>
> dd bs=64k < /proc/12345/map
>
> since cat often doesn't work on these kinds of files. I hope that yppush will run long enough for you to snare this information. If it finishes too quickly, try adding a call ``sleep(30)'' just after the added lines in yppush_main.c.
>
> The map file will resemble this:
>
> 0x8048000 0x8049000 1 0 0xcb8a78a0 r-x 1 0 0x0 COW NC vnode
> 0x8049000 0x804a000 1 0 0xcb79d1e0 rw- 1 0 0x2180 NCOW NNC default
>
> 0x28049000 0x2805a000 17 0 0xcb55a120 r-x 38 19 0x4 COW NC vnode
> 0x2805a000 0x2805b000 1 0 0xcb39b120 rw- 1 0 0x2180 COW NNC vnode
> 0x2805b000 0x2805d000 2 0 0xcb5c6a20 rw- 2 0 0x2180 NCOW NNC default
> 0x2805d000 0x28065000 6 0 0xcb5c6a20 rwx 2 0 0x2180 NCOW NNC default
>
> 0x28065000 0x280e2000 44 0 0xc0355a00 r-x 46 23 0x4 COW NC vnode
> 0x280e2000 0x280e7000 5 0 0xcb34f120 rwx 1 0 0x2180 COW NNC vnode
> 0x280e7000 0x280fb000 2 0 0xcb3c2240 rwx 1 0 0x2180 NCOW NNC default
>
> 0xbfbe 0xbfc0 4 0 0xcb45b600 rwx 1 0 0x2180 NCOW NNC default

The map here looks slightly different:

0x8048000 0x804d000 5 0 0xd6927ea0 r-x 1 0 0x0 COW NC vnode
0x804d000 0x804f000 2 0 0xd6894d20 rw- 2 0 0x2180 NCOW NNC default
0x804f000 0x8066000 16 0 0xd6894d20 rwx 2 0 0x2180 NCOW NNC default <--- additional
0x1804d000 0x1805e000 17 0 0xd73f4d80 r-x 10 5 0x0 COW NC vnode
0x1805e000 0x1805f000 1 0 0xd7246ba0 rw- 1 0 0x2180 COW NNC vnode
0x1805f000 0x18061000 2 0 0xd72ec540 rw- 2 0 0x2180 NCOW NNC default
0x18061000 0x18069000 5 0 0xd72ec540 rwx 2 0 0x2180 NCOW NNC default
0x18069000 0x180e6000 103 0 0xc0280300 r-x 104 45 0x0 COW NC vnode
0x180e6000 0x180eb000 5 0 0xd71fba20 rwx 1 0 0x2180 COW NNC vnode
0x180eb000 0x180ff000 7 0 0xd7b96c60 rwx 1 0 0x2180 NCOW NNC default
0xbfbe 0xbfc0 4 0 0xd79b18a0 rwx 1 0 0x2180 NCOW NNC default

> except that I have added some blank lines to make it easier to explain. The first 3 groups of lines above correspond to (1) the program itself, (2) the dynamic linker, and (3) the shared library libc.so.4. The final line is the runtime stack. Except for the stack, each group begins with one or two vnode lines. That's how you can recognize where each group starts. The first two numbers in each line are the start and end+1 addresses of a region of memory.
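The grouping rule described in the mail (a group starts at the first vnode line after a non-vnode line, and the first two columns are the start and end+1 addresses) can be applied mechanically. The following is a rough sketch under stated assumptions: it uses the example map quoted in the mail, leaves out the stack line because its addresses are truncated in the archive, and the function name `group_regions` is invented for illustration.

```python
# Example map lines as quoted in the mail; the stack line is omitted because
# its addresses are truncated in the archived copy.
MAP_TEXT = """\
0x8048000 0x8049000 1 0 0xcb8a78a0 r-x 1 0 0x0 COW NC vnode
0x8049000 0x804a000 1 0 0xcb79d1e0 rw- 1 0 0x2180 NCOW NNC default
0x28049000 0x2805a000 17 0 0xcb55a120 r-x 38 19 0x4 COW NC vnode
0x2805a000 0x2805b000 1 0 0xcb39b120 rw- 1 0 0x2180 COW NNC vnode
0x2805b000 0x2805d000 2 0 0xcb5c6a20 rw- 2 0 0x2180 NCOW NNC default
0x2805d000 0x28065000 6 0 0xcb5c6a20 rwx 2 0 0x2180 NCOW NNC default
0x28065000 0x280e2000 44 0 0xc0355a00 r-x 46 23 0x4 COW NC vnode
0x280e2000 0x280e7000 5 0 0xcb34f120 rwx 1 0 0x2180 COW NNC vnode
0x280e7000 0x280fb000 2 0 0xcb3c2240 rwx 1 0 0x2180 NCOW NNC default
"""


def group_regions(map_text):
    """Return one (start, end) span per group of map lines.

    A new group begins at a vnode line that follows a non-vnode line
    (or at the very first vnode line); every other line extends the
    current group up to its end address.
    """
    groups = []
    prev_was_vnode = False
    for line in map_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start, end = int(fields[0], 16), int(fields[1], 16)
        is_vnode = fields[-1] == "vnode"
        if is_vnode and not prev_was_vnode:
            groups.append([start, end])
        elif groups:
            groups[-1][1] = end
        prev_was_vnode = is_vnode
    return [tuple(g) for g in groups]


names = ("program", "dynamic linker", "libc")
for name, (start, end) in zip(names, group_regions(MAP_TEXT)):
    print(f"{name}: {start:#x} - {end:#x}")
```

With this input the second span covers the four lines the mail attributes to the dynamic linker. If the stack line were included, this simple rule would fold it into the libc group, so it has to be dropped or handled separately.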
Re: NIS/YP problems after cvsupdate
On Wed, 16-May-2001 at 20:24:35 +0200, Hartmann, O. wrote:

> Hello.
>
> Three weeks ago I did the last cvsupdate on our FBSD 4.3 boxes. On our master NIS/YP server rpc.ypxfrd is running, because all of our boxes in this area run FreeBSD 4.3. Until last weekend a cron-driven make in /var/yp worked well. It propagated the maps to the slave servers every hour.
>
> Last weekend I did the first cvsupdate, yesterday the last one. The first thing I noticed was that fxp now needs miibus code (??).
>
> Well, now our master NIS server is stuck when sending to the slaves. It gets timeouts and messages about pending transaction processes.
>
> What happened? Why is there such an up and down with NIS/YP in FreeBSD? Nothing has been changed in the configuration during the last month, so I'm sure this problem is triggered either by 'obsolete configs' (but nothing has been changed that way) or by changes in networking or NIS/YP related things.
>
> Has anybody noticed the same?

Can you send me the output of a

yppush -vvv some_map_name

? How many slave servers are you using?

-Andre
Re: NIS/YP problems after cvsupdate
On Wed, 16-May-2001 at 21:30:36 +0200, Hartmann, O. wrote:

> On Wed, 16 May 2001, John Polstra wrote:
>
> Hello.
>
> Well, all of our FBSD boxes are equipped with Intel NICs (fxp) and run 100MBit/full-duplex (the switches are also full-duplex types). All other network facilities work well; only the master server isn't able to send out a transfer initiation to its slaves. When doing a 'ypinit -s MASTER-SERVER' on each client, the client polls its maps successfully. This seems to be a problem of the ypxfrd daemon running on our master ...
>
> In the past I have had a similar, very strange problem because I compiled each part of the kernel and of the base sources of the base operating system with the option -march=i686 as a compiler option in CFLAGS and COPTFLAGS in make.conf. This triggered very strange behaviour. This time the problem occurred after a cvsupdate without changes in config matters ... :-( (I did make world and mergemaster only ...).

Try this patch (no joke):

--- libexec/rtld-elf/rtld.c.ORI Fri May 18 08:05:01 2001
+++ libexec/rtld-elf/rtld.c Fri May 18 08:03:24 2001
@@ -386,6 +386,8 @@
 dbg("initializing key program variables");
 set_program_var("__progname", argv[0] != NULL ? basename(argv[0]) : "");
+set_program_var("environ", dummy);
+set_program_var("environ", dummy);
 set_program_var("environ", env);

 dbg("initializing thread locks");

Maybe we should really reopen http://www.freebsd.org/cgi/query-pr.cgi?pr=12496

John, what do you think?

-Andre

> :In article [EMAIL PROTECTED],
> :Hartmann, O. [EMAIL PROTECTED] wrote:
> :
> : Last weekend I did the first cvsupdate, yesterday the last one.
> : First thing I realized was that fxp now needs miibus code (??).
> :
> : well, now our master NIS server is stuck in sending to the slaves.
> : It gets timeouts and messages about pending transaction processes.
> :
> :Peter Wemm just fixed a similar problem on ref4.freebsd.org. He said
> :the cause of the problem was a full/half duplex mismatch between the
> :NIC and the switch.
> :
> :John
> :--
> : John Polstra [EMAIL PROTECTED]
> : John D. Polstra & Co., Inc. Seattle, Washington USA
> : Disappointment is a good sign of basic intelligence. -- Chögyam Trungpa
>
> --
> MfG O. Hartmann [EMAIL PROTECTED]
> IT-Administration des Institut fuer Physik der Atmosphaere (IPA)
> Johannes Gutenberg Universitaet Mainz
> Becherweg 21
> 55099 Mainz
> Tel: +496131/3924662 (Maschinensaal)
> Tel: +496131/3924144
> FAX: +496131/3923532

--
Division by zero error -- multiplying by zero to recover...
Re: No /boot/loader (dangerously dedicated)
On Sun, 23-Jul-2000 at 15:33:43 -0400, Mikhail Teterin wrote:

> Doug White once stated:
>
> = Wait! Smarter then what? So it can boot NT and Win98 for some
> = weenies, or, actually do something useful (not sure what, though)?
> = Why am I to waste space (even so little) "to be compatible with other
> = OSes", if there will never be any other OSes?
> =
> =So you'll be compatible with your BIOS as well. Many BIOSen get really,
> =really torqued if your partition table isn't normal.
>
> I'm yet to see a BIOS, for which this is true. May be, I'm just lucky...

You are lucky. Try some Siemens crap with their Phoenix BIOS. They simply say "Read error" if you want to use dangerously dedicated mode. I have been bitten by this a lot of times. Normally, I don't use Siemens machines for things other than Win* crap, but sometimes I have (had) to.

-Andre
Re: No /boot/loader (dangerously dedicated)
On Sun, 23-Jul-2000 at 18:57:12 -0400, Adrian Filipi-Martin wrote:

> On Sun, 23 Jul 2000, John Baldwin wrote:
>
> Patrick M. Hausen wrote:
>
> Hello all!
>
> Mikhail Teterin wrote:
>
> John Baldwin once stated:
>
> Folks, geometries are for brain damaged PC operating systems. All the box needs to boot is a proper MBR. BIOSes that don't boot from a dedicated disk are _broken_.
>
> No, they are actually smart in that they attempt to use a geometry that matches the MBR so that you can move disks around. As a result, when we try to fake it, it confuses them.
>
> Hmmm. Perhaps my memory is failing me, but I've been using "dangerously dedicated" disks exclusively for the last few years, because it was supposed to insulate me from the silliness of BIOS geometry translation. By insulate, I mean that a disk formatted on one system was always usable on another even if it decided to have a different geometry translation.
>
> I don't shuttle disks around between systems as much as I used to, but I do recall dedicated mode helping. The only systems that had problems booting were old and are long gone.
>
> I haven't seen or bought anything in the last three years that won't boot a "dangerously dedicated" disk.

Buy a (brand-new) Siemens machine and you will see one :-(.

-Andre
Re: Acroread4
On Wed, 12-Apr-2000 at 13:12:15 -0700, Cy Schubert - ITSD Open Systems Group wrote:

In message [EMAIL PROTECTED], Wilko Bulte writes:

On Wed, Apr 12, 2000 at 10:38:04AM -0400, Vivek Khera wrote:

"A" == Asmodai Jeroen writes:

I don't think this is a 4.0 issue. I get the same on a couple of 3.4R systems, minus the locale message.

A> I can, on my 3.4-STABLE, get acroread4 to coredump time and again.

I've never had acroread version 4 croak on my 3.4-STABLE system. It works just perfectly fine.

Well... (just installed): acroread-4.05 gives:

WKB ~>acroread4
Floating point exception (core dumped)

on 3.4-stable. Is this what Vivek is seeing?

Exactly what I get. 4.00 works, 4.05 not. Can be fixed for 3.4-STABLE by applying

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/include/npx.h.diff?r1=1.17&r2=1.18

-Andre
Re: Acroread4
On Wed, 12-Apr-2000 at 14:44:21 -0400, Vivek Khera wrote:

> "WB" == Wilko Bulte [EMAIL PROTECTED] writes:
>
> WB> WKB ~>acroread4
> WB> Floating point exception (core dumped)
>
> WB> on 3.4-stable. Is this what Vivek is seeing?
>
> I see no errors. It just works perfectly fine. On my 3.4-stable, I have no "acroread4" executable, just "acroread", which is a symlink to /usr/local/Acrobat4/bin/acroread which in turn is a shell script that does the right thing for me. I have the linux_base-6.1 port installed on my system.
>
> I don't think I used ports to install Acrobat 4. I manually installed it.

Did you use linux-ar-40.tar.gz or linux-ar-405.tar.gz? I am sure you took 40 ...

-Andre
Re: ASUS P2B-S and FreeBSD 3.3-RELEASE
On Tue, 16-Nov-1999 at 11:01:13 -0500, Thomas David Rivers wrote:

> > On Tue, 16-Nov-1999 at 09:19:48 -0500, Thomas David Rivers wrote:
> >
> > > Is anyone using FreeBSD 3.3 with an ASUS P2B SCSI? I believe this is an AHA 2940U2W "clone" - is that right?
> > >
> > > If you have been successful with a P2B-DS or P2B-LS, let me know, I _think_ I'm having a difficult time getting SCSI termination correct (but...)
> >
> > I got a P2B-LS. Had a lot of problems (kernel stopped booting when it came to the SCSI controller) until I upgraded the BIOS.
>
> Sounds interesting - what version of the BIOS did you upgrade to?

OK, will try to remember. I currently use 1.11 and used 1.10 before. I think I got it with something like 1.08, which didn't work. The SCSI BIOS shows something about Version 2.11, iirc...

However, I suggest you upgrade it to 1.11 using

ftp://ftp.asus.com.tw/pub/asus/mb/flash/aflash.exe

and

ftp://ftp.asuscom.de/pub/ASUS/mb/slot1/440bx/p2b-s/bx2s1011.zip

The above file is for a P2B-S ...

> - Dave R. -

-Andre