Re: Multiple serial consoles via null modem cable
In message 717f7a3e1001210137p7884adcbxc66a4f7fff928...@mail.gmail.com, Marin Atanasov (dna...@gmail.com) wrote:

Hello Jeremy,

Now I'm a little confused :) I've made some tests with my machines and a couple of null modem cables, and here's what I've got.

On Wed, Jan 20, 2010 at 9:46 AM, Jeremy Chadwick free...@jdc.parodius.com wrote:

On Wed, Jan 20, 2010 at 08:46:48AM +0200, Marin Atanasov wrote:

Hello, Using `cu' only works with COM1 for me. Currently I have two serial ports on the system, and only the first is able to make the connection - the serial consoles are enabled in /etc/tty, but as I said only COM1 is able to make the connection.

I'm a little confused by this statement, so I'll add some clarification: /etc/ttys is for configuring a machine to tie getty (think login prompt) to a device (in this case, a serial port). Meaning: the device on the other end of the serial cable will start seeing login: and so on, assuming you attach to the serial port there.

For example:

box1 COM1/ttyu0 is wired to box2 COM3/ttyu2 using a null modem cable.
box1 COM2/ttyu1 is wired to box2 COM4/ttyu3 using a null modem cable.

On box1, you'd have something like this in /etc/ttys:

ttyu0 "/usr/libexec/getty std.9600" vt100 on secure
ttyu1 "/usr/libexec/getty std.9600" vt100 on secure

Here's what I did:

box1 COM1/ttyd0 - box2 COM1/ttyd0 - using null modem cable
box1 COM2/ttyd1 - box3 COM1/ttyd0 - using null modem cable

On box1 I have this in /etc/ttys:

ttyd0 "/usr/libexec/getty std.9600" vt100 on secure
ttyd1 "/usr/libexec/getty std.9600" vt100 on secure

Now if I want to connect to box1 from box2 or box3 through the serial connection it should work, right? But I can only connect to box1 from box2, because box2's COM port is connected to box1's COM1 port. From box2 I can get a login prompt:

box2# cu -l /dev/cuad0 -s 9600
Connected
login: ) (host.domain) (ttyd0)
login: ~
[EOT]

But if I try to connect to box1 from box3 - no success there.

box3# cu -l /dev/cuad0 -s 9600
Connected
~
[EOT]

You need to reduce the number of unknowns, e.g. where is the problem: box1, box3 or in between. So, swap the cables on box1 so that you now have box1:COM1 - box3:COM1 and box1:COM2 - box2:COM1. Now repeat the tests above and post your results.

Cheers,
Nick.
--
___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available
On 1/21/10 9:15 PM, John Baldwin wrote: On Thursday 21 January 2010 2:09:34 pm Florian Smeets wrote: On 1/21/10 8:05 PM, John Baldwin wrote: On Thursday 21 January 2010 1:33:35 pm Florian Smeets wrote: On 1/21/10 6:58 PM, John Baldwin wrote: On Thursday 21 January 2010 8:25:22 am Florian Smeets wrote: (kgdb) frame 8 #8 0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at /usr/src/sys/netinet/ip_input.c:1307 1307m_copydata(m, 0, mcopy-m_len, mtod(mcopy, caddr_t)); (kgdb) l 1302mcopy = NULL; 1303} 1304if (mcopy != NULL) { 1305mcopy-m_len = min(ip-ip_len, M_TRAILINGSPACE(mcopy)); 1306mcopy-m_pkthdr.len = mcopy-m_len; 1307m_copydata(m, 0, mcopy-m_len, mtod(mcopy, caddr_t)); 1308} 1309 1310#ifdef IPSTEALTH 1311if (!ipstealth) { (kgdb) p *m $1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc271e80e E\020, mh_len = 164, mh_flags = 3, mh_type = 1, pad = \000}, M_dat = {MH = {MH_pkthdr = {rcvif = 0xc20a4800, header = 0x0, len = 164, csum_flags = 3072, csum_data = 65535, tso_segsz = 0, ether_vtag = 0, tags = {slh_first = 0xc35bc380}}, MH_dat = {MH_ext = {ext_buf = 0xc271e800 , ext_free = 0, ext_args = 0x0, ext_size = 2048, ref_cnt = 0xc2703ab4, ext_type = 6}, MH_databuf = \000?q?\000\000\000\000\000\000\000\000\000\b\000\000?:p? \006\000\000\000dL?\t+?\202\200\020 O/\207\000\000\001\001\b\n-?b\230qms?\000\000\004\001?l? \000\000\001%r??? \200\000\034?Ot?\b?{sr\000\034org.jboss.mq.ConnectionToken?\b? \237N\002\000\005I\000\004hashZ\000\asameJVML\000\bclientIDt\000\022Ljava/l\000\220\032Ae\207\000\002? 3...@\210d\021\000\001? \001b\000!e\000\...@bv\000\000@2\032$W\213\n\034...}}, M_databuf = \000H\n?\000\000\000\000?\000\000\000\000\f\000\000?? \000\000\000\000\000\000\200?[?\000?q? \000\000\000\000\000\000\000\000\000\b\000\000?:p?\006\000\000\000dL? \t+? \202\200\020 O/\207\000\000\001\001\b\n-?b\230qms?\000\000\004\001?l? \000\000\001%r??? \200\000\034?Ot?\b?{sr\000\034org.jboss.mq.ConnectionToken?\b? 
\237N\002\000\005I\000\004hashZ\000\asameJVML\000\bclientIDt\000\022Ljava/l\000\220\032Ae\207\000\002? 3...}}

Ok, can you do 'p *m_copy'?

Whatever you want :-)

(kgdb) p *m_copy
No symbol m_copy in current context.
(kgdb) p *m_copydata
$2 = {void (const struct mbuf *, int, int, caddr_t)} 0xc0572e10 <m_copydata>
(kgdb) p *mcopy
$1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc23cce34 E\020, mh_len = 204, mh_flags = 2, mh_type = 1, pad = \000}, M_dat = {MH = {MH_pkthdr = {rcvif = 0xc20a4800, header = 0x0, len = 204, csum_flags = 3072, csum_data = 65535, tso_segsz = 0, ether_vtag = 0, tags = {slh_first = 0xc23c3e00}}, MH_dat = {MH_ext = {ext_buf = 0x84001045 <Address 0x84001045 out of bounds>,

Hmm, ok. Can you do 'p *ip'? mcopy->m_len (204) is larger than m->m_len (164). That shouldn't be the case unless ip->ip_len is somehow larger than m->m_len.

(kgdb) p *ip
$3 = {ip_hl = 5, ip_v = 4, ip_tos = 16 '\020', ip_len = 33792, ip_id = 61492, ip_off = 64, ip_ttl = 64 '@', ip_p = 6 '\006', ip_sum = 34849, ip_src = {s_addr = 355576000}, ip_dst = {s_addr = 2251401408}}

Looks like ip_len is in network byte order instead of host byte order and that is causing the problem. 33792 == 0x8400. Swapping that gives 0x84 == 132, which would be a reasonable length. Are you using any firewall rules that would rewrite packets? I wonder if you are having a packet rewritten and the new IP header is written in network byte order, but we swap the IP header len field to host byte order earlier in ip_input(). Luigi Rizzo may have some insight into this.

Well, when looking at MH_databuf I see JBoss MQ traffic, which would mean that this traffic was coming from or going to an IPsec tunnel. I could say for sure if I had a clue how to get an IP address from something like ip_src = {s_addr = 355576000}. If it really is IPsec traffic then there are no rewrite rules, only 10 pf pass rules on the enc0 interface and a scrub in all rule.
Perhaps it matters that I have these set:

net.enc.out.ipsec_bpf_mask=0x0001
net.enc.out.ipsec_filter_mask=0x0001
net.enc.in.ipsec_bpf_mask=0x0002
net.enc.in.ipsec_filter_mask=0x0002

so that I can filter the encapsulated traffic.

Thanks,
Florian
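The byte-order arithmetic in this thread can be checked with a few lines of Python. This is only a sketch: it assumes the dump comes from a little-endian i386 host (consistent with the 32-bit kernel addresses shown), so that kgdb prints fields stored in network byte order as byte-swapped integers. The helper names swap16 and s_addr_to_dotted are made up for illustration.

```python
import socket
import struct


def swap16(value):
    """Swap the two bytes of a 16-bit integer (what ntohs() does on a little-endian host)."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def s_addr_to_dotted(s_addr):
    """Turn an s_addr integer, as printed by kgdb on a little-endian host, into dotted-quad form.

    Assumption: the address sits in memory in network byte order, so packing
    the integer little-endian recovers the original byte sequence.
    """
    return socket.inet_ntoa(struct.pack("<I", s_addr))


if __name__ == "__main__":
    print("ip_len:", hex(33792), "->", swap16(33792))
    print("ip_src:", s_addr_to_dotted(355576000))
    print("ip_dst:", s_addr_to_dotted(2251401408))
```

Under that little-endian assumption, ip_len 33792 (0x8400) swaps to 132, and the two s_addr values decode to 192.168.49.21 and 192.168.49.134.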
Re: device.hints isn't setting what I want
On Thu, Jan 21, 2010 at 08:23:23PM -0500, Dan Langille wrote:

First, see also my post: do I want ch0 or pass1?

I have an external tape library and an external tape drive. They are not always powered up. My goal: always get the same devices regardless of whether or not the tape library is powered on at boot.

After booting, with the tape library powered on, I have these devices:

# camcontrol devlist
<QUANTUM DLT7000 1E48> at scbus0 target 5 lun 0 (sa0,pass0)
<DEC TL800(C) DEC 0326> at scbus1 target 0 lun 0 (ch0,pass1)
<DEC TZ89 (C) DEC 1837> at scbus1 target 5 lun 0 (sa1,pass2)
<HL-DT-ST DVDRAM GSA-H10A JL02> at scbus2 target 0 lun 0 (cd0,pass3)
<USB 2.0 Storage Device 0100> at scbus5 target 0 lun 0 (da0,pass4)

In /boot/devices, I have added these entries:

hint.scbus.1.at=ahc0
hint.scbus.0.at=ahc1
hint.scbus.2.at=acd0
hint.scbus.5.at=umass0

I think that this is wrong. I had a similar issue (multiple tape drives and changer devices that needed to stay at the same ids). Your device.hints entries should look something like this:

hint.sa.0.at=scbus0
hint.sa.0.target=5
hint.sa.0.unit=0
hint.sa.1.at=scbus0
hint.sa.1.target=3
hint.sa.1.unit=0
hint.sa.2.at=scbus0
hint.sa.2.target=1
hint.sa.2.unit=0
hint.ch.0.at=scbus0
hint.ch.0.target=4
hint.ch.0.unit=0
hint.ch.1.at=scbus0
hint.ch.1.target=2
hint.ch.1.unit=0
hint.ch.2.at=scbus0
hint.ch.2.target=0
hint.ch.2.unit=0

Which I use to get this:

# camcontrol devlist
<SONY LIB-162 0208> at scbus0 target 0 lun 0 (pass0,ch2)
<SONY SDX-1100 0102> at scbus0 target 1 lun 0 (sa2,pass1)
<SONY LIB-162 0203> at scbus0 target 2 lun 0 (pass2,ch1)
<SONY SDX-900V 0102> at scbus0 target 3 lun 0 (sa1,pass3)
#

(Currently the first changer is not powered up.)

So I think that what you want is something like:

hint.sa.0.at=scbus0
hint.sa.0.target=5
hint.sa.0.unit=0
hint.sa.1.at=scbus1
hint.sa.1.target=5
hint.sa.1.unit=0
hint.ch.0.at=scbus1
hint.ch.0.target=0
hint.ch.0.unit=0

[...]
--
greg byshenk - gbysh...@byshenk.net - Leiden, NL
Re: device.hints isn't setting what I want
On Fri, Jan 22, 2010 at 10:01:02AM +0100, Greg Byshenk wrote: On Thu, Jan 21, 2010 at 08:23:23PM -0500, Dan Langille wrote: First, see also my post: do I want ch0 or pass1? I have an external tape library and an external tape drive. They are not always powered up. My goal: always get the same devices regardless of whether or not the tape library is powered on at boot. After booting, with the tape library powered on, I have these devices: # camcontrol devlist QUANTUM DLT7000 1E48 at scbus0 target 5 lun 0 (sa0,pass0) DEC TL800(C) DEC 0326at scbus1 target 0 lun 0 (ch0,pass1) DEC TZ89 (C) DEC 1837at scbus1 target 5 lun 0 (sa1,pass2) HL-DT-ST DVDRAM GSA-H10A JL02at scbus2 target 0 lun 0 (cd0,pass3) USB 2.0 Storage Device 0100 at scbus5 target 0 lun 0 (da0,pass4) In /boot/devices, I have added these entries: hint.scbus.1.at=ahc0 hint.scbus.0.at=ahc1 hint.scbus.2.at=acd0 hint.scbus.5.at=umass0 I think that this is wrong. I had a similar issue (multiple tape drives and changer devices that needed to stay at the same ids). Your device.hints entries should look something like this: hint.sa.0.at=scbus0 hint.sa.0.target=5 hint.sa.0.unit=0 hint.sa.1.at=scbus0 hint.sa.1.target=3 hint.sa.1.unit=0 hint.sa.2.at=scbus0 hint.sa.2.target=1 hint.sa.2.unit=0 hint.ch.0.at=scbus0 hint.ch.0.target=4 hint.ch.0.unit=0 hint.ch.1.at=scbus0 hint.ch.1.target=2 hint.ch.1.unit=0 hint.ch.2.at=scbus0 hint.ch.2.target=0 hint.ch.2.unit=0 Which I use to get this: # camcontrol devlist SONY LIB-162 0208at scbus0 target 0 lun 0 (pass0,ch2) SONY SDX-1100 0102 at scbus0 target 1 lun 0 (sa2,pass1) SONY LIB-162 0203at scbus0 target 2 lun 0 (pass2,ch1) SONY SDX-900V 0102 at scbus0 target 3 lun 0 (sa1,pass3) # (Currently the first changer is not powered up.) So I think that what you want is something like: hint.sa.0.at=scbus0 hint.sa.0.target=5 hint.sa.0.unit=0 hint.sa.1.at=scbus1 hint.sa.1.target=5 hint.sa.1.unit=0 hint.ch.0.at=scbus1 hint.ch.0.target=0 hint.ch.0.unit=0 [...] Just saw your second message. 
I don't know if you can wire down 'pass?' the same way, but if you can, I would assume that you need to set it the same way as the 'sa?' and other devices. That is, if you want:

QUANTUM DLT7000 1E48 at scbus0 target 5 lun 0 (sa0,pass0)

Then the device.hints entry would look like:

hint.pass.0.at=scbus0
hint.pass.0.target=5
hint.pass.0.unit=0

(If you can do that.)

-greg
--
greg byshenk - gbysh...@byshenk.net - Leiden, NL
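The wiring pattern in this thread is regular enough to generate. The following is a sketch, not a FreeBSD tool: wire_hints is a hypothetical helper that only formats text, using the same three keys (at, target, unit) as the entries quoted above. The tuples are read off the camcontrol devlist output in the original question, and the last field is simply the value written to the unit key, which is 0 in every entry in this thread. Shipped device.hints files normally quote the values (for example at="scbus0"); the quotes are left out here to match the entries as they appear in the thread.

```python
def wire_hints(devices):
    """Render device.hints lines for (driver, unit, scbus, target, unit_value) tuples.

    Hypothetical helper: it does not probe hardware, it only prints the
    at/target/unit triple in the form used in this thread.
    """
    lines = []
    for driver, unit, bus, target, unit_value in devices:
        prefix = f"hint.{driver}.{unit}"
        lines.append(f"{prefix}.at=scbus{bus}")
        lines.append(f"{prefix}.target={target}")
        lines.append(f"{prefix}.unit={unit_value}")
    return lines


if __name__ == "__main__":
    # Devices taken from the camcontrol devlist output in the original question.
    wanted = [
        ("sa", 0, 0, 5, 0),  # QUANTUM DLT7000 on scbus0 target 5
        ("sa", 1, 1, 5, 0),  # DEC TZ89 on scbus1 target 5
        ("ch", 0, 1, 0, 0),  # DEC TL800 changer on scbus1 target 0
    ]
    print("\n".join(wire_hints(wanted)))
```

Whether 'pass?' devices accept the same wiring remains unconfirmed, as noted above; the helper would format such entries but says nothing about whether the kernel honours them.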
Re: Multiple serial consoles via null modem cable
On Fri, Jan 22, 2010 at 08:36:51AM +0200, Marin Atanasov wrote:

On Thu, Jan 21, 2010 at 6:26 PM, Ulrich Spörlein u...@spoerlein.net wrote:

On Thu, 21.01.2010 at 11:37:06 +0200, Marin Atanasov wrote:

Here's what I did:

box1 COM1/ttyd0 - box2 COM1/ttyd0 - using null modem cable
box1 COM2/ttyd1 - box3 COM1/ttyd0 - using null modem cable

On box1 I have this in /etc/ttys:

ttyd0 "/usr/libexec/getty std.9600" vt100 on secure
ttyd1 "/usr/libexec/getty std.9600" vt100 on secure

Now if I want to connect to box1 from box2 or box3 through the serial connection it should work, right? But I only can connect to box1 from box2, because box2's COM port is connected to box1's COM1 port.

Are there actually two gettys running on the serial ports? Did you do kill -1 1 after the changes to /etc/ttys? On box1, what do the following commands produce?

egrep 'uart|sio' /var/run/dmesg.boot
pgrep -fl getty

Regards,
Uli

Hi,

This is the output from the requested commands:

box1# egrep 'uart|sio' /var/run/dmesg.boot
usb0: USB revision 1.0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A

box1# pgrep -fl getty
3066 /usr/libexec/getty std.9600 ttyd1
3065 /usr/libexec/getty std.9600 ttyd0
534 /usr/libexec/getty Pc ttyv7
533 /usr/libexec/getty Pc ttyv6
532 /usr/libexec/getty Pc ttyv5
531 /usr/libexec/getty Pc ttyv4
530 /usr/libexec/getty Pc ttyv3
529 /usr/libexec/getty Pc ttyv2
528 /usr/libexec/getty Pc ttyv1
527 /usr/libexec/getty Pc ttyv0

Can you run the same commands on box2 please?

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
make buildkernel failing on zfs
Hi Folks,

I'm having a problem since I upgraded from 7.0 to the latest stable over Christmas with quotas and exim not rebuilding. I'm told that exim requires NIS and WITHOUT_NIS=yes was defined in /usr/src.conf

My /usr/src.conf for that build is as follows, though I've removed it for a test build:

WITHOUT_ATM=yes
WITHOUT_BLUETOOTH=yes
WITHOUT_GAMES=yes
WITHOUT_I4B=yes
WITHOUT_IPX=yes
WITHOUT_NCP=yes
WITHOUT_NIS=yes
WITHOUT_SENDMAIL=yes
WITHOUT_INET6=yes
WITHOUT_PROFILE=yes

I've been building with the following commands (this worked with the sources I used at Christmas, but I updated yesterday and now it won't):

cd /usr/src
make buildworld | tee /root/build
make kernel-toolchain | tee /root/build2
make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=TED | tee /root/build3

It goes for a while and then the buildkernel fails with this:

cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc -I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common -I/usr/src/sys/modules/zfs/../.. -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common -I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS -include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas -Wno-missing-prototypes -Wno-undef -Wno-strict-prototypes -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline -Wno-switch -Wno-pointer-arith -c /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
*** Error code 1
Stop in /usr/src/sys/modules/zfs.
*** Error code 1
Stop in /usr/src/sys/modules.
*** Error code 1
Stop in /usr/obj/usr/src/sys/TED.
*** Error code 1
Stop in /usr/src.
*** Error code 1
Stop in /usr/src.

It is a custom kernel and the conf file is here: http://www.pastebin.org/80156

For the first build I used the old conf file, and that failed. I then compared the generic and the custom conf files before attempting this rebuild and copied in a couple of new lines (options P1003_1B_SEMAPHORES and device vlan), but they have nothing to do with it from what I can tell. I don't even need zfs, seeing as this is a ufs machine, but I do need some of the other options in the kernel, hence the custom conf.

I am currently running another csup and will try building a generic kernel to see if that works, but I can't really install a generic.

Anyone got any pointers?

Cheers,
Colin.
Re: make buildkernel failing on zfs
On Fri, Jan 22, 2010 at 9:56 AM, Colin free...@southportcomputers.co.uk wrote:

Anyone got any pointers?

Could you post your /etc/make.conf? That said, I reckon you build your kernel in a rather weird way. Delete /usr/obj/* and run make cleandir && make cleandir in /usr/src. Then build your world and kernel like this: make buildworld buildkernel KERNCONF=TED. If that goes well, run make installkernel KERNCONF=TED, reboot, make installworld, run mergemaster and reboot again.

--
chs
Re: make buildkernel failing on zfs
In message 4b596838.9020...@southportcomputers.co.uk, Colin (free...@southportcomputers.co.uk) wrote: [snip] It goes for a while and then the buildkernel fails with this: cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc -I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/commo n/fs/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common -I/usr/src/sys/modules/zfs/../.. -I/usr/src/sys/modules/zfs/../../cddl/contrib/openso laris/common/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common -I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS -include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq -finline-l imit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding - Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas -Wno-missing-prototypes -Wno-undef -Wno-strict-pro totypes -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline -Wno-switch -Wno-pointer-arith -c /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_z node.c *** Error code 1 I think this was a temporary problem that has already been fixed. Try updating to the latest version and see if that builds okay. Cheers, Nick. -- ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: make buildkernel failing on zfs
Christer Solskogen wrote:

On Fri, Jan 22, 2010 at 9:56 AM, Colin free...@southportcomputers.co.uk wrote:

Anyone got any pointers?

Could you post your /etc/make.conf? That said, I reckon you build your kernel in a rather weird way. Delete /usr/obj/* and run make cleandir && make cleandir in /usr/src. Then build your world and kernel like this: make buildworld buildkernel KERNCONF=TED. If that goes well, run make installkernel KERNCONF=TED, reboot, make installworld, run mergemaster and reboot again.

Thanks for the reply. I have deleted /usr/obj in between builds (forgot to mention that) but didn't do make cleandir. I think the first way I did the build when I started was similar to what you suggested, but I was having problems with installworld, so after various reading and suggestions I dropped back to that format, which I read was a more paranoid way of doing it.

make.conf as below:

SUP=/usr/local/bin/cvsup
SUPFLAGS= -g -L 2
SUPHOST=cvsup.FreeBSD.org
SUPFILE=/root/standard-supfile
PORTSSUPFILE= /root/ports-supfile
#*REMOVE* OPENSSL_OVERWRITE_BASE=NO
# added by use.perl 2009-06-14 11:10:18
#PERL_VERSION=5.8.9
#BATCH=YES
#CRYPT_DES=0
#WITHOUT_ALT_CONFIG_PREFIX=YES
#CFLAGS= -O -pipe
NO_FORTRAN= true
NO_OBJC=true
NO_X= true
NO_GAMES=true
NO_PROFILE= true
BATCH=YES
WITHOUT_X11=YES
SKIP_DNS_CHECK=YES
CRYPT_DES=0
WITH_PORT_REPLACES_BASE_BIND8=YES
WITH_PORT_REPLACES_BASE_BIND9=YES
WITHOUT_ALT_CONFIG_PREFIX=YES
WITH_OPENSSL_PORT=YES
X11BASE=${LOCALBASE}
Re: make buildkernel failing on zfs
N.J. Mann wrote: In message 4b596838.9020...@southportcomputers.co.uk, Colin (free...@southportcomputers.co.uk) wrote: [snip] It goes for a while and then the buildkernel fails with this: cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc -I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/commo n/fs/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common -I/usr/src/sys/modules/zfs/../.. -I/usr/src/sys/modules/zfs/../../cddl/contrib/openso laris/common/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common -I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS -include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq -finline-l imit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding - Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas -Wno-missing-prototypes -Wno-undef -Wno-strict-pro totypes -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline -Wno-switch -Wno-pointer-arith -c /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_z node.c *** Error code 1 I think this was a temporary problem that has already been fixed. Try updating to the latest version and see if that builds okay. Cheers, Nick. Ahh if that is the case then the build I currently have running should work seen as I ran a csup less than an hour ago. Fingers crossed! Regards, Colin. 
Re: Pack of CAM improvements
Alexander Motin wrote on 19.01.2010 17:12 (localtime):

...
Patch can be found here: http://people.freebsd.org/~mav/cam-ata.20100119.patch
Feedback as always welcome.

Again, thanks a lot for your ongoing great work! The patch doesn't cleanly apply with vpo, but I don't use vpo so I didn't care. Otherwise I couldn't find any problems. The system detects reinserted SATA drives on ICH9 fine. This was tested on a zfs backup server which went to the backbone yesterday, so I can't physically remove any devices any more for testing...

But I had some questions about zfs raidz states. I think this isn't a matter of atacam, but when I removed one disk, zpool status still showed me the ada3 device as online. After reinserting (with proper detection/initialisation by cam, ada3 was present again) and zpool clean, it set the device as UNAVAIL since there were I/O errors. I couldn't get the device into the pool again, no matter what I tried. Only rebooting the machine helped. Then I could clean and scrub.

What are the needed steps to provide a reinserted hard disk to geom? With the latest patches I don't need to issue any reset/rescan command, right? So it's a zfs problem, right? Or my mistake in understanding?

Thanks,
-Harry
Re: make buildkernel failing on zfs
On Fri, Jan 22, 2010 at 10:40:17AM +0100, Christer Solskogen typed:

On Fri, Jan 22, 2010 at 9:56 AM, Colin free...@southportcomputers.co.uk wrote:

Anyone got any pointers?

Could you post your /etc/make.conf? That said, I reckon you build your kernel in a rather weird way. Delete /usr/obj/* and run make cleandir && make cleandir in /usr/src. Then

Bit redundant ;) cleandir only affects /usr/obj, which you just blew away.

Ruben
Re: make buildkernel failing on zfs
On Fri, Jan 22, 2010 at 09:45:46AM +, N.J. Mann wrote: In message 4b596838.9020...@southportcomputers.co.uk, Colin (free...@southportcomputers.co.uk) wrote: [snip] It goes for a while and then the buildkernel fails with this: cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc -I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/commo n/fs/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common -I/usr/src/sys/modules/zfs/../.. -I/usr/src/sys/modules/zfs/../../cddl/contrib/openso laris/common/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common -I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS -include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq -finline-l imit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding - Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas -Wno-missing-prototypes -Wno-undef -Wno-strict-pro totypes -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline -Wno-switch -Wno-pointer-arith -c /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_z node.c *** Error code 1 I think this was a temporary problem that has already been fixed. Try updating to the latest version and see if that builds okay. The FreeBSD tinderbox build system noticed this problem as well, dying in the same piece of code. It's not you. 
738 01/21 17:04 FreeBSD Tinderbox (9.0K) [releng_7 tinderbox] failure on amd64/amd64
739 01/21 17:44 FreeBSD Tinderbox (8.7K) [releng_7 tinderbox] failure on i386/i386

Normally I'd shake my finger at the committer for committing code to stable branches without testing, but I hold the committer (jhb) in very high regard and he has a very established history of not breaking things + doing excellent work. Mistakes happen, we're all human. :-)

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: make buildkernel failing on zfs
On Fri, 22 Jan 2010, Ruben de Groot wrote:

RdG Could you post your /etc/make.conf?
RdG That said, I reckon you build your kernel in a rather weird way. Delete
RdG /usr/obj/* and run make cleandir && make cleandir in /usr/src. Then
RdG
RdG Bit redundant ;)
RdG cleandir only affects /usr/obj, which you just blew away.

Not exactly: without objdir, cleandir removes built objects from the source directory (where they may accidentally reside if one types 'make all' without running 'make obj' first).

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***
Re: ZFS performance degradation over time
Quoting Jeremy Chadwick free...@jdc.parodius.com (from Tue, 19 Jan 2010 09:01:01 -0800):

On Tue, Jan 19, 2010 at 11:40:50AM -0500, Garrett Moore wrote:

I've been watching my memory usage and I have no idea what is consuming memory as 'Active'. Last night I had around 6500MB 'Active' again, 1500MB Wired, no inact, ~30MB buf, no free, and ~100MB swap used. My performance copying ZFS-ZFS was again slow (1MB/s). I tried killing rTorrent and no significant amount of memory was reclaimed - maybe 100MB. `ps aux` showed no processes using any significant amount of memory, and I was definitely nowhere near 6500MB usage.

I tried running the perl oneliner again to hog a bunch of memory, and almost all of the Active memory was IMMEDIATELY marked as Free, and my performance was excellent again.

I'm not sure what in userland could be causing the issue. The only things I've installed are rTorrent, lighttpd, samba, smartmontools, vim, bash, Python, Perl, and SABNZBd. There is nothing that *should* be consuming any serious amount of memory.

I've two recommendations:

1) Have you considered upgrading to RELENG_8 (e.g. 8.0-STABLE) instead of sticking with 8.0-RELEASE? There's been a recent MFC to RELENG_8 which pertains to ARC drainage. I'm referring to the commit labelled revision 1.22.2.2 (RELENG_8):

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c

This patch can be merged stand-alone if necessary, no need to go to RELENG_8 if there are reservations.

2) Have you tried using vfs.zfs.arc_max in loader.conf to limit the ARC size? I'd recommend picking something like 1GB as a cap (your machine

Or even less... to be determined by experimenting.

has 8GB total at present, if I remember right). I believe long ago someone said this isn't an explicit hard limit on the maximum size of the ARC, but I believe this was during the RELENG_7 days and the ARC stuff on FreeBSD has changed since then.

I wish the tunables were better documented, or at least explained in detail (hello Wiki!).

The commit you refer to above does just this: it limits the ARC to arc_max more strictly than was the case before. This patch is in 7-stable too (in case someone is interested).

Bye,
Alexander.

--
Johnson's First Law: When any mechanical contrivance fails, it will do so at the most inconvenient possible time.

http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137
Re: IPSec NAT-T in transport mode
Hi.

On Thu, Jan 21, 2010 at 04:36:12PM +, David Murray wrote:

[...]

On 2010-01-20 Wed 1:22 pm, Crest wrote:

Yes, the NAT-T patch has been integrated into FreeBSD 8.0. Just rebuild your kernel with these options:

device crypto # IPsec depends on this
options IPSEC
options IPSEC_DEBUG
options IPSEC_NAT_T

I'm trying to do the same thing as the OP, so thanks for these replies. However, they seem to be at odds. Are we saying that the NAT-T patch is there, but is missing checksum re-calculation, so MPD's packets are going to be discarded?

Yes, see my other mail in this thread.

(FWIW, this seems to be what happens. All the negotiation to set up IPSEC SAs happens, but MPD's log never shows a single entry. I hadn't got as far as packet dumps when this thread popped up.)

And if you have a look at system stats, you'll see lots of UDP packets dropped because of invalid checksums.

Yvan.
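Why a missing checksum recalculation leads to dropped packets can be sketched in a few lines. The UDP checksum covers a pseudo-header that includes the source and destination IP addresses (RFC 768), so once a NAT device has rewritten the source address of a transport-mode packet, the checksum computed by the sender no longer matches what the receiver sees, unless the receiver fixes it up or skips verification. The addresses, ports and payload below are made up for illustration, and this is not the kernel's code.

```python
import socket
import struct


def internet_checksum(data):
    """16-bit one's-complement sum of the data, complemented (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload):
    """UDP checksum over the IPv4 pseudo-header, the UDP header and the payload."""
    length = 8 + len(payload)
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_UDP, length))
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = internet_checksum(pseudo + header + payload)
    return checksum or 0xFFFF  # a computed zero is sent as all ones


if __name__ == "__main__":
    # Hypothetical addresses: private sender, its public NAT address, remote peer.
    sent = udp_checksum("192.168.1.10", "203.0.113.5", 1701, 1701, b"demo")
    seen = udp_checksum("198.51.100.7", "203.0.113.5", 1701, 1701, b"demo")
    print(f"sender computed {sent:#06x}, receiver expects {seen:#06x}")
```

The two values differ because only the source address in the pseudo-header changed, which is consistent with the invalid-checksum drops described above.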
Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available
On Friday 22 January 2010 3:08:45 am Florian Smeets wrote: On 1/21/10 9:15 PM, John Baldwin wrote: On Thursday 21 January 2010 2:09:34 pm Florian Smeets wrote: On 1/21/10 8:05 PM, John Baldwin wrote: On Thursday 21 January 2010 1:33:35 pm Florian Smeets wrote: On 1/21/10 6:58 PM, John Baldwin wrote: On Thursday 21 January 2010 8:25:22 am Florian Smeets wrote: (kgdb) frame 8 #8 0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at /usr/src/sys/netinet/ip_input.c:1307 1307 m_copydata(m, 0, mcopy-m_len, mtod(mcopy, caddr_t)); (kgdb) l 1302 mcopy = NULL; 1303 } 1304 if (mcopy != NULL) { 1305 mcopy-m_len = min(ip-ip_len, M_TRAILINGSPACE(mcopy)); 1306 mcopy-m_pkthdr.len = mcopy-m_len; 1307 m_copydata(m, 0, mcopy-m_len, mtod(mcopy, caddr_t)); 1308 } 1309 1310 #ifdef IPSTEALTH 1311 if (!ipstealth) { (kgdb) p *m $1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc271e80e E\020, mh_len = 164, mh_flags = 3, mh_type = 1, pad = \000}, M_dat = {MH = {MH_pkthdr = {rcvif = 0xc20a4800, header = 0x0, len = 164, csum_flags = 3072, csum_data = 65535, tso_segsz = 0, ether_vtag = 0, tags = {slh_first = 0xc35bc380}}, MH_dat = {MH_ext = {ext_buf = 0xc271e800 , ext_free = 0, ext_args = 0x0, ext_size = 2048, ref_cnt = 0xc2703ab4, ext_type = 6}, MH_databuf = \000?q?\000\000\000\000\000\000\000\000\000\b\000\000?:p? \006\000\000\000dL?\t+?\202\200\020 O/\207\000\000\001\001\b\n-?b\230qms?\000\000\004\001?l? \000\000\001%r??? \200\000\034?Ot?\b?{sr\000\034org.jboss.mq.ConnectionToken?\b? \237N\002\000\005I\000\004hashZ\000\asameJVML\000\bclientIDt\000\022Ljava/l\000\220\032Ae\207\000\002? 3...@\210d\021\000\001? \001b\000!e\000\...@bv\000\000@2\032$W\213\n\034...}}, M_databuf = \000H\n?\000\000\000\000?\000\000\000\000\f\000\000?? \000\000\000\000\000\000\200?[?\000?q? \000\000\000\000\000\000\000\000\000\b\000\000?:p?\006\000\000\000dL? \t+? \202\200\020 O/\207\000\000\001\001\b\n-?b\230qms?\000\000\004\001?l? \000\000\001%r??? 
\200\000\034?Ot?\b?{sr\000\034org.jboss.mq.ConnectionToken?\b? \237N\002\000\005I\000\004hashZ\000\asameJVML\000\bclientIDt\000\022Ljava/l\000\220\032Ae\207\000\002? 3...}} Ok, can you do 'p *m_copy'? What ever you want :-) (kgdb) p *m_copy No symbol m_copy in current context. (kgdb) p *m_copydata $2 = {void (const struct mbuf *, int, int, caddr_t)} 0xc0572e10m_copydata (kgdb) p *mcopy $1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc23cce34 E\020, mh_len = 204, mh_flags = 2, mh_type = 1, pad = \000}, M_dat = {MH = {MH_pkthdr = {rcvif = 0xc20a4800, header = 0x0, len = 204, csum_flags = 3072, csum_data = 65535, tso_segsz = 0, ether_vtag = 0, tags = {slh_first = 0xc23c3e00}}, MH_dat = {MH_ext = {ext_buf = 0x84001045Address 0x84001045 out of bounds, Hmm, ok. Can you do 'p *ip'? mcopy-m_len (204) is larger than m-m_len (164). That shouldn't be the case unless ip-ip_len is somehow larger than m- m_len. (kgdb) p *ip $3 = {ip_hl = 5, ip_v = 4, ip_tos = 16 '\020', ip_len = 33792, ip_id = 61492, ip_off = 64, ip_ttl = 64 '@', ip_p = 6 '\006', ip_sum = 34849, ip_src = {s_addr = 355576000}, ip_dst = { s_addr = 2251401408}} Looks like ip_len is in network byte order instead of host byte order and that is causing the problem. 33792 == 0x8400. Swapping that gives 0x84 == 132 which would be a reasonable length. Are you using any firewall rules that would rewrite packets? I wonder if you are having a packet rewritten and the new IP header is written in network byte order, but we swap the IP header len field to host byte order earlier in ip_input(). Luigi Rizzo may have some insight into this. Well, when looking at MH_databuf i see Jboss MQ traffic that would mean that this traffic was coming from or going to an IPsec tunnel, i could say for sure when i would have a clue how to get an IP address from something like ip_src = {s_addr = 355576000}. 
Something like this should show you the IP:

(gdb) set $i = 355576000
(gdb) printf "%d.%d.%d.%d\n", $i >> 24, $i >> 16 & 0xff, $i >> 8 & 0xff, $i & 0xff
21.49.168.192

In this case I probably printed it backwards, so it is probably 192.168.49.21.

If it really is IPsec traffic then there are no rewrite rules, only 10 pf pass rules on the enc0 interface and a "scrub in all" rule. Perhaps it matters that I have these set:

net.enc.out.ipsec_bpf_mask=0x0001
net.enc.out.ipsec_filter_mask=0x0001
net.enc.in.ipsec_bpf_mask=0x0002
net.enc.in.ipsec_filter_mask=0x0002

so that I can filter the encapsulated traffic.

I have no idea, I've cc'd mlaier@ (pf) and bz@ (ipsec) to see if they have any ideas.

-- 
John Baldwin
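The byte-order reasoning in this exchange can also be checked outside the debugger. The following is a minimal sketch in Python, assuming the values were read on a little-endian (i386) machine, as the paths in the backtrace suggest; the helper names are made up for illustration and are not part of any FreeBSD tool.

```python
import socket
import struct


def s_addr_to_dotted(s_addr):
    """Turn an s_addr integer, as kgdb prints it on a little-endian box,
    into dotted-quad notation."""
    # The field holds the address in network byte order; the debugger shows
    # the little-endian reading of those bytes, so pack it back little-endian
    # to recover the original byte sequence.
    return socket.inet_ntoa(struct.pack("<I", s_addr))


def swap16(value):
    """Swap the two bytes of a 16-bit value (ntohs() on little-endian)."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


if __name__ == "__main__":
    print(s_addr_to_dotted(355576000))   # ip_src from the kgdb output
    print(s_addr_to_dotted(2251401408))  # ip_dst from the kgdb output
    print(hex(33792), swap16(33792))     # ip_len as stored, then byte-swapped
```

The swapped length (132) is smaller than m_len (164), which is consistent with the reading that ip_len was left in network byte order. The sketch says nothing about where the missing swap happened.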
Re: make buildkernel failing on zfs
On Friday 22 January 2010 6:17:08 am Jeremy Chadwick wrote: On Fri, Jan 22, 2010 at 09:45:46AM +, N.J. Mann wrote: In message 4b596838.9020...@southportcomputers.co.uk, Colin (free...@southportcomputers.co.uk) wrote: [snip] It goes for a while and then the buildkernel fails with this: cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc -I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/commo n/fs/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common -I/usr/src/sys/modules/zfs/../.. -I/usr/src/sys/modules/zfs/../../cddl/contrib/openso laris/common/zfs -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common -I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS -include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq -finline-l imit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding - Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas -Wno-missing-prototypes -Wno-undef -Wno-strict-pro totypes -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline -Wno-switch -Wno-pointer-arith -c /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_z node.c *** Error code 1 I think this was a temporary problem that has already been fixed. Try updating to the latest version and see if that builds okay. The FreeBSD tinderbox build system noticed this problem as well, dying in the same piece of code. It's not you. 
738 01/21 17:04 FreeBSD Tinderbox (9.0K) [releng_7 tinderbox] failure on amd64/amd64
739 01/21 17:44 FreeBSD Tinderbox (8.7K) [releng_7 tinderbox] failure on i386/i386

Normally I'd shake my finger at the committer for committing code to stable branches without testing, but I hold the committer (jhb) in very high regard and he has a very established history of not breaking things + doing excellent work. Mistakes happen, we're all human. :-)

Kind words aside, in this case the testing wasn't quite adequate. While I did run-test it under UFS, I only built a GENERIC kernel w/o modules, which is how I missed ZFS. Given that the patch touched ZFS I really should have built the module.

-- 
John Baldwin
Re: FreeBSD NFS client/Linux NFS server issue
On Tue, 19 Jan 2010 10:02:57 +0200 Mikolaj Golub wrote: So, on some of our freebsd7.1 nfs clients (and it looks like we have had similar case with 6.3), which have several nfs mounts to the same CentOS 5.3 NFS server (mount options: rw,-3,-T,-s,-i,-r=32768,-w=32768,-o=noinet6), at some moment the access to one of the NFS mount gets stuck, while the access to the other mounts works ok. In all cases we have been observed so far the first gotten stuck process was php script (or two) that was (were) writing to logs file (appending). In tcpdump we see that every write to the file causes the sequence of the following rpc: ACCESS - READ - WRITE - COMMIT. And at some moment this stops after READ rpc call and successful reply. After this in tcpdump successful readdir/access/lookup/fstat calls are observed from our other utilities, which just check the presence of some files and they work ok (df also works). The php process at this state is in bo_wwait invalidating buffer cache [1]. If at this time we try accessing the share with mc then it hangs acquiring the vn_lock held by php process [2] and after this any operations with this NFS share hang (df hangs too). If instead some other process is started that writes to some other file on this share (append) then the first process unfreezes too (starting from WRITE rpc, so there is no any retransmits). 
So it looks for me that the problem here is that eventually problem nfsmount ends up in this state: (kgdb) p *nmp $1 = {nm_mtx = {lock_object = {lo_name = 0xc0b808ee NFSmount lock, lo_type = 0xc0b808ee NFSmount lock, lo_flags = 16973824, lo_witness_data = {lod_list = { stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, nm_flag = 35399, nm_state = 1310720, nm_mountp = 0xc6b472cc, nm_numgrps = 16, nm_fh = \001\000\000\000\000\223\000\000\...@\003\n, '\0' repeats 115 times, nm_fhsize = 12, nm_rpcclnt = {rc_flag = 0, rc_wsize = 0, rc_rsize = 0, rc_name = 0x0, rc_so = 0x0, rc_sotype = 0, rc_soproto = 0, rc_soflags = 0, rc_timeo = 0, rc_retry = 0, rc_srtt = {0, 0, 0, 0}, rc_sdrtt = {0, 0, 0, 0}, rc_sent = 0, rc_cwnd = 0, rc_timeouts = 0, rc_deadthresh = 0, rc_authtype = 0, rc_auth = 0x0, rc_prog = 0x0, rc_proctlen = 0, rc_proct = 0x0}, nm_so = 0xc6e81d00, nm_sotype = 1, nm_soproto = 0, nm_soflags = 44, nm_nam = 0xc6948640, nm_timeo = 6000, nm_retry = 2, nm_srtt = {15, 15, 31, 52}, nm_sdrtt = {3, 3, 15, 15}, nm_sent = 0, nm_cwnd = 4096, nm_timeouts = 0, nm_deadthresh = 9, nm_rsize = 32768, nm_wsize = 32768, nm_readdirsize = 4096, nm_readahead = 1, nm_wcommitsize = 1177026, nm_acdirmin = 30, nm_acdirmax = 60, nm_acregmin = 3, nm_acregmax = 60, nm_verf = JК╬W\000\004oМ, nm_bufq = {tqh_first = 0xda82dc70, tqh_last = 0xda8058e0}, nm_bufqlen = 2, nm_bufqwant = 0, nm_bufqiods = 1, nm_maxfilesize = 1099511627775, nm_rpcops = 0xc0c2b5bc, nm_tprintf_initial_delay = 12, nm_tprintf_delay = 30, nm_nfstcpstate = { rpcresid = 0, flags = 1, sock_send_inprog = 0}, nm_hostname = 172.30.10.92\000/var/www/app31, '\0' repeats 60 times, nm_clientid = 0, nm_fsid = { val = {0, 0}}, nm_lease_time = 0, nm_last_renewal = 0} We have nonempty nm_bufq, nm_bufqiods = 1, but actually there is no nfsiod thread run for this mount, which is wrong -- nm_bufq will not be emptied until some other process starts writing to the nfsmount and starts nfsiod thread for this mount. 
Reviewing the code how it could happen I see the following path. Could someone confirm or disprove me? in nfs_bio.c:nfs_asyncio() we have: 1363 mtx_lock(nfs_iod_mtx); ... 1374 /* 1375 * Find a free iod to process this request. 1376 */ 1377 for (iod = 0; iod nfs_numasync; iod++) 1378 if (nfs_iodwant[iod]) { 1379 gotiod = TRUE; 1380 break; 1381 } 1382 1383 /* 1384 * Try to create one if none are free. 1385 */ 1386 if (!gotiod) { 1387 iod = nfs_nfsiodnew(); 1388 if (iod != -1) 1389 gotiod = TRUE; 1390 } Let's consider situation when new nfsiod is created. nfs_nfsiod.c:nfs_nfsiodnew() before creating nfssvc_iod thread unlocks nfs_iod_mtx: 179 mtx_unlock(nfs_iod_mtx); 180 error = kthread_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID, 181 0, nfsiod %d, newiod); 182 mtx_lock(nfs_iod_mtx); And nfs_nfsiod.c:nfssvc_iod() do the followin: 226 mtx_lock(nfs_iod_mtx); ... 238 nfs_iodwant[myiod] = curthread-td_proc; 239 nfs_iodmount[myiod] = NULL; ... 244 error = msleep(nfs_iodwant[myiod], nfs_iod_mtx, PWAIT | PCATCH, 245 -, timo); Let's at this moment another nfs_asyncio()
Re: Multiple serial consoles via null modem cable
On Fri, Jan 22, 2010 at 10:02 AM, N.J. Mann n...@njm.me.uk wrote: In message 717f7a3e1001210137p7884adcbxc66a4f7fff928...@mail.gmail.com, Marin Atanasov (dna...@gmail.com) wrote: Hello Jeremy, Now I'm a little confused :) I've made some tests with my machines and a couple of null modem cables, and here's what I've got. On Wed, Jan 20, 2010 at 9:46 AM, Jeremy Chadwick free...@jdc.parodius.comwrote: On Wed, Jan 20, 2010 at 08:46:48AM +0200, Marin Atanasov wrote: Hello, Using `cu' only works with COM1 for me. Currently I have two serial ports on the system, and only the first is able to make the connection - the serial consoles are enabled in /etc/tty, but as I said only COM1 is able to make the connection. I'm a little confused by this statement, so I'll add some clarify: /etc/ttys is for configuring a machine to tie getty (think login prompt) to a device (in this case, a serial port). Meaning: the device on the other end of the serial cable will start seeing login: and so on assuming you attach to the serial port there. For example: box1 COM1/ttyu0 is wired to box2 COM3/ttyu2 using a null modem cable. box1 COM2/ttyu1 is wired to box2 COM4/ttyu3 using a null modem cable. On box1, you'd have something like this in /etc/ttys: ttyu0 /usr/libexec/getty std.9600 vt100 on secure ttyu1 /usr/libexec/getty std.9600 vt100 on secure Here's what I did: box1 COM1/ttyd0 - box2 COM1/ttyd0 - using null modem cable box1 COM2/ttyd1 - box3 COM1/ttyd0 - using null modem cable On box1 I have this in /etc/ttys: ttyd0 /usr/libexec/getty std.9600 vt100 on secure ttyd1 /usr/libexec/getty std.9600 vt100 on secure Now if I want to connect to box1 from box2 or box3 through the serial connection it should work, right? But I only can connect to box1 from box2, because box2's COM port is connected to box1's COM1 port. 
From box2 I can get a login prompt:

box2# cu -l /dev/cuad0 -s 9600
Connected
login: ) (host.domain) (ttyd0)
login: ~
[EOT]

But if I try to connect to box1 from box3 - no success there.

box3# cu -l /dev/cuad0 -s 9600
Connected
~
[EOT]

You need to reduce the number of unknowns, e.g. where is the problem: box1, box3 or in between. So, swap the cables on box1 so that you now have box1:COM1 - box3:COM1 and box1:COM2 - box2:COM1. Now repeat the tests above and post your results.

Cheers, Nick.

Seems I've found the issue I'm having - a broken null modem cable :(

The last time I used that cable it was working fine. Now that I connected a second one to the machine, it seemed that only the one connected to COM1 was actually working, and I was left with the impression from the documentation that only COM1 is able to do a serial console connection.

I'm very sorry to bother you like that. I'll continue setting up the servers once I get a new null modem cable.

Thanks and regards, Marin

-- 
Marin Atanasov Nikolov
dnaeon AT gmail DOT com
daemon AT unix-heaven DOT org
8.0-RELEASE - -STABLE and size of /
Hi,

I just noticed something: I set up an 8.0-RELEASE amd64 box, / is default 512M. First step after setup was to csup to RELENG_8 and buildkernel and buildworld (no custom kernel, no make.conf).

Installing the new kernel failed, since /boot/kernel/ is already well over 230 MBytes in size. Moving that to kernel.old and writing a new one with about the same size fails due to no space left on device.

This is not a question; I do know how to get around this and how to configure custom kernels so they are a fraction of that size afterwards. However, I think this is a clear POLA violation. So, either GENERIC with less debugging information (symbols and stuff), which makes debugging harder, or setting a higher default for / would be options, unless anyone else has better ideas.

- Oliver

-- 
| Oliver Brandmueller http://sysadm.in/ o...@sysadm.in |
| I am the Internet. So help me God. |
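The space problem comes down to simple arithmetic: installkernel keeps the previous kernel as kernel.old, so / has to hold two kernel directories at once. The following is a rough sketch in Python using only the sizes quoted in this mail (512 MB root, roughly 230 MB per kernel directory); the function name and the 1024 MB comparison value are assumptions for illustration.

```python
def headroom_after_installkernel(root_mb, kernel_dir_mb):
    """MB left on / for everything else once both /boot/kernel and
    /boot/kernel.old (assumed to be the same size) are present."""
    return root_mb - 2 * kernel_dir_mb


if __name__ == "__main__":
    # Figures from the mail: 512 MB root, /boot/kernel around 230 MB.
    print(headroom_after_installkernel(512, 230))
    # A hypothetical larger root, for comparison only.
    print(headroom_after_installkernel(1024, 230))
```

With the quoted sizes this leaves at most about 52 MB for the rest of the root filesystem, and less in practice, since the mail says the kernel directory is "well over" 230 MB.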
Re: 8.0-RELEASE - -STABLE and size of /
Hi All, On Fri, 22 Jan 2010 17:21:56 +0100, Oliver Brandmueller o...@e-gitt.net wrote: Hi, I just noticed somthing: I setup an 8.0-RELEASE amd64 box, / is default 512M. First step after setup was to csup to RELENG_8 and buildkernel and buildworld (no custom kernel, no make.conf). Instaling the new kernel failed, since /boot/kernel/ is already well over 230 MBytes in size. moving that to kernel.old and writing a new one with about the same size fails due to no space left on device. This is not a question; I do know how to get around this and how to configure custom kernels so they are a fragment of that size afterwards. However, I think this is a clear POLA violation. So, either GENERIC with less debugging information (symbols and stuff), which makes debugging harder or setting a higher default for / would be options, if not anyone else has better ideas. +1 vote for making / bigger. At least a size where a make installkernel runs through. I like FreeBSD because it honors POLA. And as Oliver stated, this is a clear POLA violation. Cheers, Marian ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 8.0-RELEASE - -STABLE and size of /
On Fri, 22 Jan 2010 17:21:56 +0100 Oliver Brandmueller o...@e-gitt.net wrote:

Installing the new kernel failed, since /boot/kernel/ is already well over 230 MBytes in size. Moving that to kernel.old and writing a new one with about the same size fails due to no space left on device.

This is not a question; I do know how to get around this and how to configure custom kernels so they are a fraction of that size afterwards.

It would also be nice if we knew how to configure the whole make world procedure[1] to make a new kernel and modules without symbols. The FAQ doesn't seem to have that answer either.

References:
1) http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html

-- 
Regards, Torfinn Ingolfsen
Re: Pack of CAM improvements
On Fri, Jan 22, 2010 at 2:23 AM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote:

Alexander Motin wrote on 19.01.2010 17:12 (localtime):

...

Patch can be found here: http://people.freebsd.org/~mav/cam-ata.20100119.patch

Feedback as always welcome.

Again, thanks a lot for your ongoing great work! The patch doesn't apply cleanly with vpo, but I don't use vpo so I didn't care. Otherwise I couldn't find any problems. The system detects reinserted SATA drives on ICH9 fine. This was tested on a zfs backup server which went to the backbone yesterday, so I can't physically remove any devices any more for testing...

But I had some questions about zfs raidz states. I think that isn't a matter of atacam, but if I removed one disk, zpool status still showed me the ada3 device online. After reinserting (and proper detection/initialisation with cam, ada3 was present again) and zpool clear, it set the device as UNAVAIL since I/O errors. I couldn't get the device into the pool again, no matter what I tried. Only rebooting the machine helped. Then I could clear and scrub.

What are the needed steps to provide a reinserted hard disk to geom? With the latest patches I don't need to issue any reset/rescan command, right? So it's a zfs problem, right? My mistake in understanding?

In my testing of pulling drives at random (using a 3Ware 9550SXU or 9650SE controller), you have to "zpool offline <pool> <device>" while the drive is unplugged, before you can re-insert the same disk or a different disk. Without doing that step, it's very hard to re-insert the same disk, or replace it with a new one, without rebooting.

Took me a couple of reboots and drive replacements before I figured that one out. :)

-- 
Freddie Cash fjwc...@gmail.com
Re: device.hints isn't setting what I want
Just as a side note: does mergemaster or installworld handle the installation of /boot/device.hints? If it's mergemaster, then everything is fine, it'll detect your changes. If it's installworld, you'll lose your changes at the next update. Either way, I find it nicer/simpler to use /boot/loader.conf for this, as nothing in the build/update process touches it, and it keeps all boot/kernel options in one file. And, it overrides any settings in device.hints. -- Freddie Cash fjwc...@gmail.com
Re: 8.0-RELEASE - -STABLE and size of /
On Fri, Jan 22, 2010 at 05:27:52PM +0100, Marian Hettwer wrote:

Hi All,

On Fri, 22 Jan 2010 17:21:56 +0100, Oliver Brandmueller o...@e-gitt.net wrote:

Hi, I just noticed something: I set up an 8.0-RELEASE amd64 box, / is default 512M. First step after setup was to csup to RELENG_8 and buildkernel and buildworld (no custom kernel, no make.conf). Installing the new kernel failed, since /boot/kernel/ is already well over 230 MBytes in size. Moving that to kernel.old and writing a new one with about the same size fails due to no space left on device.

This is not a question; I do know how to get around this and how to configure custom kernels so they are a fraction of that size afterwards. However, I think this is a clear POLA violation. So, either GENERIC with less debugging information (symbols and stuff), which makes debugging harder, or setting a higher default for / would be options, unless anyone else has better ideas.

+1 vote for making / bigger. At least a size where a make installkernel runs through. I like FreeBSD because it honors POLA. And as Oliver stated, this is a clear POLA violation.

I'd like to see the root filesystem size default to 1GB. For most folks this works well. If people are paranoid, 2GB should be more than sufficient.

While I'm here, I figure I'd share how I end up partitioning most of the server systems I maintain. I use this general formula when building a new system, unless it's a 4-disk box (see bottom of mail):

ad4s1a = /    = UFS2    = 1GB
ad4s1b = swap = (2*RAM) or (2*MaxRAMPossible)
ad4s1d = /var = UFS2+SU = 16GB (mandatory: must be >= 2*RAM)
ad4s1e = /tmp = UFS2+SU = (2*RAM)
ad4s1f = /usr = UFS2+SU = 16GB

There's lots of leftover space on the disk of course -- for either ad4s1g or ad4s2.

For 1-disk boxes, I add ad8s1g = /home = UFS2+SU = remaining space, or sometimes name it /storage (depends on the role of the box).

For 2-disk boxes, I almost always go with disks that are identical in size, use the above formula, and add ZFS mirroring like so:

ad4s2 = ZFS mirror pool = remaining space
ad6   = ZFS mirror pool = entire disk

Then /home or /storage are ZFS filesystems in that pool.

Folks will say "but that means you're losing/wasting gigs of space on ad6, since the mirror size is based on the smallest pool member!" Yep, but I consider the trade-off easily worth it. Given the size of disks today (500GB to 2TB), I really don't stress about it:

Wasted space for 4GB RAM systems: 1 + 2*4 + 16 + 2*4 + 16 = 49GB
Wasted space for 8GB RAM systems: 1 + 2*8 + 16 + 2*8 + 16 = 65GB

If the machine is 4-disk, I use a slightly modified formula:

ad4s1a = /      = UFS2    = 1GB
ad4s1b = swap   = (2*RAM) or (2*MaxRAMPossible)
ad4s1d = /var   = UFS2+SU = 16GB (mandatory: must be >= 2*RAM)
ad4s1e = /tmp   = UFS2+SU = (2*RAM)
ad4s1f = /usr   = UFS2+SU = 16GB
ad4s1g = /spare = UFS2+SU = remaining space
ad6    = ZFS raidz1 pool  = entire disk
ad8    = ZFS raidz1 pool  = entire disk
ad10   = ZFS raidz1 pool  = entire disk

The ad4s1g part might seem silly, but I've found it useful. If a filesystem like /var goes awry (usually if bad blocks exist on the disk where that filesystem lies), you can temporarily work around it by rsync'ing as much data as possible over to /spare, then remounting /spare as /var to avoid use of the sectors involved in ad4s1d. I've had to do this on two separate occasions.

There are network backups for all the boxes, so I don't OCD about it all too much. :-)

-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
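The "wasted space" sums in this mail follow directly from the partitioning formula, so they are easy to recompute for other RAM sizes. The following is a small sketch in Python, assuming the fixed sizes given in the mail (1GB /, 16GB /var, 16GB /usr, swap and /tmp at twice RAM); the function name is hypothetical.

```python
def ufs_slice_total_gb(ram_gb, root_gb=1, var_gb=16, usr_gb=16):
    """Total size in GB of the UFS slice (/, swap, /var, /tmp, /usr)
    under the formula from the mail: swap and /tmp are each 2*RAM."""
    swap_gb = 2 * ram_gb
    tmp_gb = 2 * ram_gb
    return root_gb + swap_gb + var_gb + tmp_gb + usr_gb


if __name__ == "__main__":
    for ram_gb in (4, 8):
        total = ufs_slice_total_gb(ram_gb)
        print(f"{ram_gb}GB RAM: {total}GB unused on the second mirror disk")
```

On a two-disk mirror this total is the amount left unused on the second disk, because the pool is limited by the smaller member (the ad4s2 slice).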
Re: IPSec NAT-T in transport mode
Hi Yvan,

On 10-01-22 Fri 1:19 pm, VANHULLEBUS Yvan wrote:

On Thu, Jan 21, 2010 at 04:36:12PM +, David Murray wrote:

On 2010-01-20 Wed 1:22 pm, Crest wrote:

Yes the NAT-T Patch has been integrated into FreeBSD 8.0.

Are we saying that the NAT-T patch is there, but is missing checksum re-calculation, so MPD's packets are going to be discarded?

Yes, see my other mail in this thread.

(FWIW, this seems to be what happens. All the negotiation to set up IPSEC SAs happens, but MPD's log never shows a single entry. I hadn't got as far as packet dumps when this thread popped up.)

And if you have a look at system stats, you'll see lots of UDP packets dropped because of invalid checksums

Thanks for taking the time to reply. Actually, I find that each attempt to connect causes netstat -s -p udp to show a few UDP packets arriving and being dropped due to no socket, rather than bad checksums, so maybe I've got some other sort of problem with my mpd config, which I'll look into.

-- 
David Murray
Re: Pack of CAM improvements
On Fri, Jan 22, 2010 at 11:23:55AM +0100, Harald Schmalzbauer wrote:

But I had some questions about zfs raidz states. I think that isn't a matter of atacam, but if I removed one disk, zpool status still showed me the ada3 device online. After reinserting (and proper detection/initialisation with cam, ada3 was present again) and zpool clear, it set the device as UNAVAIL since I/O errors. I couldn't get the device into the pool again, no matter what I tried. Only rebooting the machine helped. Then I could clear and scrub.

What are the needed steps to provide a reinserted hard disk to geom? With the latest patches I don't need to issue any reset/rescan command, right? So it's a zfs problem, right? My mistake in understanding?

I can't speak with regards to the new ATA-via-CAM stuff, but with the classic AHCI (meaning ataahci(4)), the procedure I've used reliably for quite some time on Intel ICHx controllers is this:

For SATA disks that are purely UFS/UFS2:

- Single-user mode might be required here; it varies
- Terminate any processes which rely on filesystems on that disk
- umount /filesystem
- atacontrol detach ataX (where X = channel associated with disk)
- Physically remove bad disk
- Physically insert new disk
- Wait 15 seconds for stuff to settle
- atacontrol attach ataX (where X = previous channel detached)
- sade / sysinstall / gpart / whatever you like
- Restore data... :-)

For SATA disks part of a ZFS mirror or raidz[123] pool:

- zpool offline pool disk
- atacontrol detach ataX (where X = channel associated with disk)
- Physically remove bad disk
- Physically insert new disk
- Wait 15 seconds for stuff to settle
- atacontrol attach ataX (where X = previous channel detached)
- zpool replace pool disk
- zpool online pool disk

-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
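The ZFS disk-swap procedure from this mail can be written down as an ordered command list, which helps avoid skipping the offline step. The following is a dry-run sketch in Python: it only builds and prints the commands named in the mail and never executes them. The function name and the example values "tank", "ad6" and "ata3" are placeholders, not taken from the thread.

```python
def zfs_disk_swap_plan(pool, disk, channel):
    """Return the ordered steps for swapping one disk in a ZFS mirror/raidz
    pool on a classic ata(4) controller, following the procedure in the mail.
    Each step is a command (list of argv strings) or a manual action (str)."""
    return [
        ["zpool", "offline", pool, disk],
        ["atacontrol", "detach", channel],
        "physically remove the bad disk",
        "physically insert the new disk",
        "wait about 15 seconds for things to settle",
        ["atacontrol", "attach", channel],
        ["zpool", "replace", pool, disk],
        ["zpool", "online", pool, disk],
    ]


if __name__ == "__main__":
    # Placeholder names; substitute the real pool, disk and ATA channel.
    for step in zfs_disk_swap_plan("tank", "ad6", "ata3"):
        print(" ".join(step) if isinstance(step, list) else f"# {step}")
```

Keeping the plan as data rather than running it directly leaves the physical steps visible and lets the operator review the sequence before typing anything as root.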
Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available
On Friday 22 January 2010 15:20:13 John Baldwin wrote: On Friday 22 January 2010 3:08:45 am Florian Smeets wrote: ... If it really is IPsec traffic then there are no rewrite rules only 10 pf pass rules on the enc0 interface and a scrub in all rule. Perhaps it matters that i have these set: net.enc.out.ipsec_bpf_mask=0x0001 net.enc.out.ipsec_filter_mask=0x0001 net.enc.in.ipsec_bpf_mask=0x0002 net.enc.in.ipsec_filter_mask=0x0002 so that i can filter the encapsulated traffic. I have no idea, I've cc'd mlaier@ (pf) and bz@ (ipsec) to see if they have any ideas. pf could be the culprit if it were present in the trace, but I don't see any sign of it: On Thursday 21 January 2010 11:10:20 Florian Smeets wrote: #7 0xc0572e48 in m_copydata (m=0x0, off=0, len=40, cp=0xc23cced8 \203??b??\237\f)h?M\220\224?\023?\205K(e??s?\???k?oQ?~\223\020g\030) at /usr/src/sys/kern/uipc_mbuf.c:815 #8 0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at /usr/src/sys/netinet/ip_input.c:1307 #9 0xc05fa30c in ip_input (m=0xc23dc900) at /usr/src/sys/netinet/ip_input.c:609 #10 0xc05c83d5 in netisr_dispatch (num=2, m=0xc23dc900) at /usr/src/sys/net/netisr.c:185 #11 0xc05bf581 in ether_demux (ifp=0xc20a4800, m=0xc23dc900) at /usr/src/sys/net/if_ethersubr.c:834 #12 0xc05bf973 in ether_input (ifp=0xc20a4800, m=0xc23dc900) at /usr/src/sys/net/if_ethersubr.c:692 #13 0xc04b8749 in sis_rxeof (sc=0xc2093800) at /usr/src/sys/dev/sis/if_sis.c:1476 #14 0xc04b8973 in sis_intr (arg=0xc2093800) at /usr/src/sys/dev/sis/if_sis.c:1667 #15 0xc050344b in ithread_loop (arg=0xc20ab410) at /usr/src/sys/kern/kern_intr.c:1126 #16 0xc04ffe36 in fork_exit (callout=0xc05032a0 ithread_loop, arg=0xc20ab410, frame=0xc1f15d38) at /usr/src/sys/kern/kern_fork.c:811 #17 0xc06d9180 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:271 pf does change the byte order in the pfil hook, but changes it back on return to the stack either when returning from the hook or when calling back into the stack. 
There have been some issues where we missed returns to the stack that would result in this situation, but since pf is not in the trace, this is clearly not the case here.

It might indeed be related to enc(4). I remember there have been some issues in IPSEC where it failed to properly copy a packet before modifying it. Maybe this is what is happening. Details escape me at the moment.

Can you also make sure that your if_enc.c has revision 174978:
http://svn.freebsd.org/viewvc/base/release/7.2.0/sys/net/if_enc.c?view=diff&r1=174977&r2=174978

Regards,
-- 
Max
Re: do I want ch0 or pass1?
In the last episode (Jan 21), Dan Langille said:

Please CC me on replies.

I'm running into issues with hard-coding some devices (see recent post titled 'device.hints isn't setting what I want'). Associated with this issue is confusion over whether I want to use ch0 or pass1. I have these devices:

DEC TL800(C) DEC 0326 at scbus1 target 0 lun 0 (ch0,pass1)
DEC TZ89 (C) DEC 1837 at scbus1 target 5 lun 0 (sa1,pass2)

My understanding: chio(1) will work with ch0, whereas mtx(1) will work with pass1. Is this correct? More information/elaboration will help I'm sure.

Why do I ask? I can get the tape changer and tape drive hardwired to ch0 and sa1 respectively. I cannot [yet] do the same with pass1.

You can try wiring them down the same way you wire down regular devices, but if they're created sequentially in probe order, that won't work. Ideally, mtx should use cam_open_spec_device() which, when given a device name, will automatically open the matching pass device.

-- 
Dan Nelson dnel...@allantgroup.com
posting coding bounties, appropriate money amounts?
Hello

I am curious about posting some coding bounties. My current interest revolves around improving the ZVOL functionality in FreeBSD: fixing the known ZVOL SWAP reliability/stability problems as well as making ZVOLs work as a dumpon device (as is already the case in OpenSolaris) for crash dumps. I am a private individual and not some huge Fortune 100 and while I am not exactly rich, I am willing to put some of my personal money towards this.

I am curious though, what would be the best way to approach this: directly approaching committer(s) with the know-how-and-why of the areas involved or through the FreeBSD Foundation? And how would one go about calculating the appropriate amount of money for such a thing?

Thanks.

- Sincerely,
Dan Naumov
Re: IPSec NAT-T in transport mode
Hi Yvan,

On 10-01-22 Fri 5:15 pm, David Murray wrote:
> On 10-01-22 Fri 1:19 pm, VANHULLEBUS Yvan wrote:
> > On Thu, Jan 21, 2010 at 04:36:12PM +, David Murray wrote:
> > > On 2010-01-20 Wed 1:22 pm, Crest wrote:
> > > > Yes the NAT-T Patch has been integrated into FreeBSD 8.0.
> > >
> > > Are we saying that the NAT-T patch is there, but is missing checksum
> > > re-calculation, so MPD's packets are going to be discarded?
> >
> > Yes, see my other mail in this thread.
> >
> > > (FWIW, this seems to be what happens. All the negotiation to set up
> > > IPSEC SAs happens, but MPD's log never shows a single entry. I hadn't
> > > got as far as packet dumps when this thread popped up.)
> >
> > And if you have a look at system stats, you'll see lots of UDP packets
> > dropped because of invalid checksums
>
> Actually, I find that each attempt to connect causes netstat -s -p udp to
> show a few UDP packets arriving and being dropped due to no socket, rather
> than bad checksums, so maybe I've got some other sort of problem with my
> mpd config, which I'll look into.

Ah, yes, I'd forgotten that my external IP address had changed since I last tried this, so I needed to restart racoon and ipsec. So now, like you say, I see UDP packets dropped due to bad checksums.

I'll have a look at the NAT-T RFCs just in case support for NAT-OA payloads is something I could help with, but I suspect it'll need an in-depth knowledge of the IP stack.

Thanks!

-- 
David Murray
Re: Pack of CAM improvements
On 01/22/10 11:48, Freddie Cash wrote:
> On Fri, Jan 22, 2010 at 2:23 AM, Harald Schmalzbauer h.schmalzba...@omnilan.de wrote:
> > Alexander Motin wrote on 19.01.2010 17:12 (localtime):
> > > ...
> > > Patch can be found here:
> > > http://people.freebsd.org/~mav/cam-ata.20100119.patch
> > >
> > > Feedback as always welcome.
> >
> > Again, thanks a lot for your ongoing great work!
> >
> > The patch doesn't cleanly apply with vpo, but I don't use vpo so I
> > didn't care.
> > Otherwise I couldn't find any problems. The system detects reinserted
> > SATA drives on ICH9 fine. This was tested on a zfs backup server which
> > went to the backbone yesterday, so I can't physically remove any devices
> > any more for testing...
> >
> > But I had some questions about zfs raidz states.
> > I think that isn't a matter of atacam but if I removed one disk, zpool
> > status still showed me the ada3 device online. After reinserting (and
> > proper detection/initialization with cam, ada3 was present again) and
> > zpool clean, it set the device as UNAVAIL since I/O errors. I couldn't
> > get the device into the pool again, no matter what I tried. Only
> > rebooting the machine helped. Then I could clean and scrub.
> > What are the needed steps to provide a reinserted hard disk to geom?
> > With the latest patches I don't need to issue any reset/rescan command,
> > right? So it's a zfs problem, right? My mistake in understanding?
>
> In my testing of pulling drives at random (using a 3Ware 9550SXU or 9650SE
> controller), you have to zpool offline <pool> <device> while the drive is
> unplugged, before you can re-insert the same disk or a different disk.
> Without doing that step, it's very hard to re-insert the same disk, or
> replace it with a new one, without rebooting.
>
> Took me a couple of reboots and drive replacements before I figured that
> one out. :)

I think you can do it without the 'zpool offline <pool> <device>' command; I may be wrong, but I believe you can use 'zpool replace' to accomplish what you're trying to do. i.e. if you have a bad drive ada3, and take it out, then replace it with a new disk, you can issue a 'zpool replace <pool> /dev/ada3 /dev/ada3' (yes, the same device is specified twice). ZFS should recognize that it's a different disk and/or that it is lacking ZFS metadata and begin to resilver the pool onto the new device. If you watch 'zpool status' in the process you'll see something like:

    raidz1              DEGRADED     0     0     0
      label/ada4        ONLINE       0     0     0  12.4M resilvered
      label/ada5        ONLINE       0     0     0  12.4M resilvered
      label/ada6        ONLINE       0     0     0  12.3M resilvered
      replacing         DEGRADED     0     0     0
        label/ada3/old  UNAVAIL      0   595     0  cannot open
        label/ada3      ONLINE       0     0     0  9.74G resilvered

Try it out and let me know if it works for you.
Re: do I want ch0 or pass1?
On Thu, 21 Jan 2010 20:23:43 -0500 Dan Langille d...@langille.org wrote:
> Please CC me on replies.
>
> I'm running into issues with hard-coding some devices (see recent post
> titled 'device.hints isn't setting what I want'). Associated with this
> issue is confusion over whether I want to use ch0 or pass1.
>
> I have these devices:
>
> <DEC TL800 (C) DEC 0326> at scbus1 target 0 lun 0 (ch0,pass1)
> <DEC TZ89 (C) DEC 1837> at scbus1 target 5 lun 0 (sa1,pass2)
>
> My understanding: chio(1) will work with ch0, whereas mtx(1) will work
> with pass1. Is this correct? More information/elaboration will help I'm
> sure.
>
> Why do I ask? I can get the tape changer and tape drive hardwired to ch0
> and sa1 respectively. I cannot [yet] do the same with pass1.
>
> Thanks folks.

Hi Dan

You might take a look at this thread. It looks like what you want to do.

http://lists.freebsd.org/pipermail/freebsd-questions/2007-April/146738.html

HTH

Robert
Re: Pack of CAM improvements
On Fri, Jan 22, 2010 at 10:28 AM, Steve Polyack kor...@comcast.net wrote:
> On 01/22/10 11:48, Freddie Cash wrote:
> > In my testing of pulling drives at random (using a 3Ware 9550SXU or
> > 9650SE controller), you have to zpool offline <pool> <device> while the
> > drive is unplugged, before you can re-insert the same disk or a
> > different disk. Without doing that step, it's very hard to re-insert the
> > same disk, or replace it with a new one, without rebooting.
> >
> > Took me a couple of reboots and drive replacements before I figured
> > that one out. :)
>
> I think you can do it without the 'zpool offline <pool> <device>' command;
> I may be wrong, but I believe you can use 'zpool replace' to accomplish
> what you're trying to do. i.e. if you have a bad drive ada3, and take it
> out, then replace it with a new disk, you can issue a 'zpool replace
> <pool> /dev/ada3 /dev/ada3' (yes, the same device is specified twice).
> ZFS should recognize that it's a different disk and/or that it is lacking
> ZFS metadata and begin to resilver the pool onto the new device. If you
> watch 'zpool status' in the process you'll see something like:

Yes, that does work ... but it's not nearly as reliable as doing the offline first.

If you do things in the right order, drives can be replaced and resilvering started within minutes (our process takes a little less than 5 minutes, but the bulk of that is removing the dead drive from the caddy, and adding the new drive to the caddy).

Do things in the wrong order, and it can take 15 minutes or more, and may require rebooting the system (as our manager discovered trying to replace a drive while I was away). :)

Just because there are shortcuts available ... doesn't mean you should always take them. :D

-- 
Freddie Cash
fjwc...@gmail.com
Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available
On Friday 22 January 2010 12:18:20 pm Max Laier wrote:
> On Friday 22 January 2010 15:20:13 John Baldwin wrote:
> > On Friday 22 January 2010 3:08:45 am Florian Smeets wrote:
> > > ...
> > > If it really is IPsec traffic then there are no rewrite rules only 10
> > > pf pass rules on the enc0 interface and a scrub in all rule.
> > >
> > > Perhaps it matters that i have these set:
> > > net.enc.out.ipsec_bpf_mask=0x0001
> > > net.enc.out.ipsec_filter_mask=0x0001
> > > net.enc.in.ipsec_bpf_mask=0x0002
> > > net.enc.in.ipsec_filter_mask=0x0002
> > >
> > > so that i can filter the encapsulated traffic.
> >
> > I have no idea, I've cc'd mlaier@ (pf) and bz@ (ipsec) to see if they
> > have any ideas.
>
> pf could be the culprit if it were present in the trace, but I don't see
> any sign of it:
>
> On Thursday 21 January 2010 11:10:20 Florian Smeets wrote:
> > #7  0xc0572e48 in m_copydata (m=0x0, off=0, len=40, cp=0xc23cced8 "\203??b??\237\f)h?M\220\224?\023?\205K(e??s?\???k?oQ?~\223\020g\030") at /usr/src/sys/kern/uipc_mbuf.c:815
> > #8  0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at /usr/src/sys/netinet/ip_input.c:1307
> > #9  0xc05fa30c in ip_input (m=0xc23dc900) at /usr/src/sys/netinet/ip_input.c:609
> > #10 0xc05c83d5 in netisr_dispatch (num=2, m=0xc23dc900) at /usr/src/sys/net/netisr.c:185
> > #11 0xc05bf581 in ether_demux (ifp=0xc20a4800, m=0xc23dc900) at /usr/src/sys/net/if_ethersubr.c:834
> > #12 0xc05bf973 in ether_input (ifp=0xc20a4800, m=0xc23dc900) at /usr/src/sys/net/if_ethersubr.c:692
> > #13 0xc04b8749 in sis_rxeof (sc=0xc2093800) at /usr/src/sys/dev/sis/if_sis.c:1476
> > #14 0xc04b8973 in sis_intr (arg=0xc2093800) at /usr/src/sys/dev/sis/if_sis.c:1667
> > #15 0xc050344b in ithread_loop (arg=0xc20ab410) at /usr/src/sys/kern/kern_intr.c:1126
> > #16 0xc04ffe36 in fork_exit (callout=0xc05032a0 <ithread_loop>, arg=0xc20ab410, frame=0xc1f15d38) at /usr/src/sys/kern/kern_fork.c:811
> > #17 0xc06d9180 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:271
>
> pf does change the byte order in the pfil hook, but changes it back on
> return to the stack either when returning from the hook or when calling
> back into the stack. There have been some issues where we missed returns
> to the stack that would result in this situation, but since pf is not in
> the trace, this is clearly not the case here.

That isn't necessarily the case. ip_input() invokes the PFIL hooks which then return after possibly modifying the packet. The (possibly modified) packet is then passed to ip_forward() from ip_input(). If the PFIL hook modified the packet and returned ip_len in network byte order then it would cause this breakage without showing up in the stack trace.

-- 
John Baldwin
Re: FreeBSD NFS client/Linux NFS server issue
On Fri, 22 Jan 2010, Mikolaj Golub wrote:
> We have nonempty nm_bufq, nm_bufqiods = 1, but actually there is no nfsiod
> thread run for this mount, which is wrong -- nm_bufq will not be emptied
> until some other process starts writing to the nfsmount and starts nfsiod
> thread for this mount.
>
> Reviewing the code how it could happen I see the following path. Could
> someone confirm or disprove me?
>
> in nfs_bio.c:nfs_asyncio() we have:
>
> 1363         mtx_lock(&nfs_iod_mtx);
> ...
> 1374         /*
> 1375          * Find a free iod to process this request.
> 1376          */
> 1377         for (iod = 0; iod < nfs_numasync; iod++)
> 1378                 if (nfs_iodwant[iod]) {
> 1379                         gotiod = TRUE;
> 1380                         break;
> 1381                 }
> 1382
> 1383         /*
> 1384          * Try to create one if none are free.
> 1385          */
> 1386         if (!gotiod) {
> 1387                 iod = nfs_nfsiodnew();
> 1388                 if (iod != -1)
> 1389                         gotiod = TRUE;
> 1390         }
>
> Let's consider the situation when a new nfsiod is created.
>
> nfs_nfsiod.c:nfs_nfsiodnew() unlocks nfs_iod_mtx before creating the
> nfssvc_iod thread:
>
> 179         mtx_unlock(&nfs_iod_mtx);
> 180         error = kthread_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID,
> 181             0, "nfsiod %d", newiod);
> 182         mtx_lock(&nfs_iod_mtx);
>
> And nfs_nfsiod.c:nfssvc_iod() does the following:
>
> 226         mtx_lock(&nfs_iod_mtx);
> ...
> 238                 nfs_iodwant[myiod] = curthread->td_proc;
> 239                 nfs_iodmount[myiod] = NULL;
> ...
> 244                 error = msleep(&nfs_iodwant[myiod], &nfs_iod_mtx, PWAIT | PCATCH,
> 245                     "-", timo);
>
> Suppose that at this moment another nfs_asyncio() request for another
> nfsmount arrives and its thread locks nfs_iod_mtx. That thread will then
> find nfs_iodwant[iod] in the for loop and use it. When the first thread
> actually returns from nfs_nfsiodnew(), it will insert the buffer into
> nmp->nm_bufq, but the nfsiod will process the other nmp.

Ok, good catch, I think you've found the problem (or at least a race that might have caused it).
> It looks like the fix for this situation would be to check
> nfs_iodwant[iod] after nfs_nfsiodnew():
>
> --- nfs_bio.c.orig 2010-01-22 15:38:02.0 +
> +++ nfs_bio.c 2010-01-22 15:39:58.0 +
> @@ -1385,7 +1385,7 @@ again:
>           */
>          if (!gotiod) {
>                  iod = nfs_nfsiodnew();
> -                if (iod != -1)
> +                if ((iod != -1) && (nfs_iodwant[iod] == NULL))
>                          gotiod = TRUE;
>          }

Unfortunately, I don't think the above fixes the problem. If another thread that called nfs_asyncio() has stolen this iod, it will have set nfs_iodwant[iod] == NULL (set non-NULL at #238) and it will remain NULL until the other thread is done with it.

If you instead make it:

        if (iod != -1 && nfs_iodwant[iod] != NULL)
                gotiod = TRUE;

then I think it fixes your scenario above, but will break for the case where the mtx_lock(&nfs_iod_mtx) call in nfs_nfsiodnew() (#182) wins out over the one near the beginning of nfssvc_iod() (#226), since in that case, nfs_iodwant[iod] will still be NULL because it hasn't yet been set by nfssvc_iod() (#238).

There should probably be some sort of 3 way handshake between the code in nfs_asyncio() after calling nfs_nfsiodnew() and the code near the beginning of nfssvc_iod(), but I think the following somewhat cheesy fix might do the trick:

        if (!gotiod) {
                iod = nfs_nfsiodnew();
                if (iod != -1) {
                        if (nfs_iodwant[iod] == NULL) {
                                /*
                                 * Either another thread has acquired this
                                 * iod or I acquired the nfs_iod_mtx mutex
                                 * before the new iod thread did in
                                 * nfssvc_iod(). To be safe, go back and
                                 * try again after allowing another thread
                                 * to acquire the nfs_iod_mtx mutex.
                                 */
                                mtx_unlock(&nfs_iod_mtx);
                                /*
                                 * So long as mtx_lock() implements some
                                 * sort of fairness, nfssvc_iod() should
                                 * get nfs_iod_mtx here and set
                                 * nfs_iodwant[iod] != NULL for the case
                                 * where the iod has not been stolen by
                                 * another thread for a different mount
                                 * point.
                                 */
                                mtx_lock(&nfs_iod_mtx);
                                goto again;
                        }
                        gotiod = TRUE;
                }
Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available
On Friday 22 January 2010 19:49:19 John Baldwin wrote:
> On Friday 22 January 2010 12:18:20 pm Max Laier wrote:
> > On Friday 22 January 2010 15:20:13 John Baldwin wrote:
> > > On Friday 22 January 2010 3:08:45 am Florian Smeets wrote:
> > > > ...
> > > > If it really is IPsec traffic then there are no rewrite rules only
> > > > 10 pf pass rules on the enc0 interface and a scrub in all rule.
> > > >
> > > > Perhaps it matters that i have these set:
> > > > net.enc.out.ipsec_bpf_mask=0x0001
> > > > net.enc.out.ipsec_filter_mask=0x0001
> > > > net.enc.in.ipsec_bpf_mask=0x0002
> > > > net.enc.in.ipsec_filter_mask=0x0002
> > > >
> > > > so that i can filter the encapsulated traffic.
> > >
> > > I have no idea, I've cc'd mlaier@ (pf) and bz@ (ipsec) to see if they
> > > have any ideas.
> >
> > pf could be the culprit if it were present in the trace, but I don't
> > see any sign of it:
> >
> > On Thursday 21 January 2010 11:10:20 Florian Smeets wrote:
> > > #7  0xc0572e48 in m_copydata (m=0x0, off=0, len=40, cp=0xc23cced8 "\203??b??\237\f)h?M\220\224?\023?\205K(e??s?\???k?oQ?~\223\020g\030") at /usr/src/sys/kern/uipc_mbuf.c:815
> > > #8  0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at /usr/src/sys/netinet/ip_input.c:1307
> > > #9  0xc05fa30c in ip_input (m=0xc23dc900) at /usr/src/sys/netinet/ip_input.c:609
> > > #10 0xc05c83d5 in netisr_dispatch (num=2, m=0xc23dc900) at /usr/src/sys/net/netisr.c:185
> > > #11 0xc05bf581 in ether_demux (ifp=0xc20a4800, m=0xc23dc900) at /usr/src/sys/net/if_ethersubr.c:834
> > > #12 0xc05bf973 in ether_input (ifp=0xc20a4800, m=0xc23dc900) at /usr/src/sys/net/if_ethersubr.c:692
> > > #13 0xc04b8749 in sis_rxeof (sc=0xc2093800) at /usr/src/sys/dev/sis/if_sis.c:1476
> > > #14 0xc04b8973 in sis_intr (arg=0xc2093800) at /usr/src/sys/dev/sis/if_sis.c:1667
> > > #15 0xc050344b in ithread_loop (arg=0xc20ab410) at /usr/src/sys/kern/kern_intr.c:1126
> > > #16 0xc04ffe36 in fork_exit (callout=0xc05032a0 <ithread_loop>, arg=0xc20ab410, frame=0xc1f15d38) at /usr/src/sys/kern/kern_fork.c:811
> > > #17 0xc06d9180 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:271
> >
> > pf does change the byte order in the pfil hook, but changes it back on
> > return to the stack either when returning from the hook or when calling
> > back into the stack. There have been some issues where we missed
> > returns to the stack that would result in this situation, but since pf
> > is not in the trace, this is clearly not the case here.
>
> That isn't necessarily the case. ip_input() invokes the PFIL hooks which
> then return after possibly modifying the packet. The (possibly modified)
> packet is then passed to ip_forward() from ip_input(). If the PFIL hook
> modified the packet and returned ip_len in network byte order then it
> would cause this breakage without showing up in the stack trace.

What I meant to say was: if we return from the pfil hook we either report error (and/or consume the mbuf) or switch back to network byte order:

http://fxr.watson.org/fxr/source/contrib/pf/net/pf_ioctl.c?v=FREEBSD72#L3655

While I can't completely rule out that there is a double flip happening in some obscure path through pf, I very much doubt this is what is going on (or there would be more reports and it would happen straight away, not only after passing some data). A quick search through the sources also didn't turn up any red flags. All byte order operations inside pf are either temporary or performed on a properly copied packet that is sent back through the stack (icmp error, tcp packet, ...).

Depending on how easily this can be reproduced, my money is on modifying a shared mbuf (possibly inside enc(4)).

Regards,
-- 
Max
Re: FreeBSD NFS client/Linux NFS server issue
On Fri, 22 Jan 2010 14:37:48 -0500 (EST) Rick Macklem wrote:
> > --- nfs_bio.c.orig 2010-01-22 15:38:02.0 +
> > +++ nfs_bio.c 2010-01-22 15:39:58.0 +
> > @@ -1385,7 +1385,7 @@ again:
> >           */
> >          if (!gotiod) {
> >                  iod = nfs_nfsiodnew();
> > -                if (iod != -1)
> > +                if ((iod != -1) && (nfs_iodwant[iod] == NULL))
> >                          gotiod = TRUE;
> >          }
>
> Unfortunately, I don't think the above fixes the problem.
> If another thread that called nfs_asyncio() has stolen this iod,
> it will have set nfs_iodwant[iod] == NULL (set non-NULL at #238)
> and it will remain NULL until the other thread is done with it.

I see. I have missed this. Thanks.

> There should probably be some sort of 3 way handshake between
> the code in nfs_asyncio() after calling nfs_nfsiodnew() and the
> code near the beginning of nfssvc_iod(), but I think the following
> somewhat cheesy fix might do the trick:
>
>         if (!gotiod) {
>                 iod = nfs_nfsiodnew();
>                 if (iod != -1) {
>                         if (nfs_iodwant[iod] == NULL) {
>                                 /*
>                                  * Either another thread has acquired this
>                                  * iod or I acquired the nfs_iod_mtx mutex
>                                  * before the new iod thread did in
>                                  * nfssvc_iod(). To be safe, go back and
>                                  * try again after allowing another thread
>                                  * to acquire the nfs_iod_mtx mutex.
>                                  */
>                                 mtx_unlock(&nfs_iod_mtx);
>                                 /*
>                                  * So long as mtx_lock() implements some
>                                  * sort of fairness, nfssvc_iod() should
>                                  * get nfs_iod_mtx here and set
>                                  * nfs_iodwant[iod] != NULL for the case
>                                  * where the iod has not been stolen by
>                                  * another thread for a different mount
>                                  * point.
>                                  */
>                                 mtx_lock(&nfs_iod_mtx);
>                                 goto again;
>                         }
>                         gotiod = TRUE;
>                 }
>         }
>
> Does anyone else have a better solution?
> (Mikolaj, could you by any chance test this? You can test yours, but I
> think it breaks.)

Unfortunately we observed this only on our production servers. A week ago we made some configuration changes as a workaround: we reconfigured cron not to run scripts simultaneously, and added cron scripts that periodically write a line to a file on the NFS share (to unlock it if it is locked). We have not observed problems since then and we would not like to experiment in production. If I manage to produce a good test case in a test environment I will be able to test the patch, but I am not sure...

-- 
Mikolaj Golub
Re: 8.0-RELEASE - -STABLE and size of /
On Friday 22 January 2010 11:46:01 am Torfinn Ingolfsen wrote:
> On Fri, 22 Jan 2010 17:21:56 +0100 Oliver Brandmueller o...@e-gitt.net wrote:
> > Installing the new kernel failed, since /boot/kernel/ is already well
> > over 230 MBytes in size. Moving that to kernel.old and writing a new one
> > with about the same size fails due to no space left on device.
> >
> > This is not a question; I do know how to get around this and how to
> > configure custom kernels so they are a fragment of that size afterwards.
>
> It would also be nice if we knew how to configure the whole make world
> procedure[1] to make a new kernel and modules without symbols.
> The FAQ doesn't seem to have that answer either.
>
> References:
> 1) http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html

In your /etc/make.conf, do you have a line like:

makeoptions DEBUG=-g

If so, comment it out.
Re: Pack of CAM improvements
2010/1/22 Harald Schmalzbauer h.schmalzba...@omnilan.de:
> Alexander Motin wrote on 19.01.2010 17:12 (localtime):
> > ...
> > Patch can be found here:
> > http://people.freebsd.org/~mav/cam-ata.20100119.patch
> >
> > Feedback as always welcome.
>
> Again, thanks a lot for your ongoing great work!
>
> The patch doesn't cleanly apply with vpo, but I don't use vpo so I didn't
> care.

Since r202799 it applies cleanly to 8-STABLE.

> Otherwise I couldn't find any problems. The system detects reinserted SATA
> drives on ICH9 fine. This was tested on a zfs backup server which went to
> the backbone yesterday, so I can't physically remove any devices any more
> for testing...
>
> But I had some questions about zfs raidz states.
> I think that isn't a matter of atacam but if I removed one disk, zpool
> status still showed me the ada3 device online. After reinserting (and
> proper detection/initialization with cam, ada3 was present again) and zpool
> clean, it set the device as UNAVAIL since I/O errors. I couldn't get the
> device into the pool again, no matter what I tried. Only rebooting the
> machine helped. Then I could clean and scrub.
> What are the needed steps to provide a reinserted hard disk to geom? With
> the latest patches I don't need to issue any reset/rescan command, right?
> So it's a zfs problem, right? My mistake in understanding?
>
> Thanks,
>
> -Harry

-- 
Olivier Smedts                                                 _
                                        ASCII ribbon campaign ( )
e-mail: oliv...@gid0.org        - against HTML email & vCards  X
www: http://www.gid0.org    - against proprietary attachments / \

  "There are only 10 kinds of people in the world:
  those who understand binary,
  and those who don't."
Re: Multiple serial consoles via null modem cable
* Jeremy Chadwick free...@jdc.parodius.com [2010-01-19 23:46 -0800]:
> You cannot do something like where box1 COM1 is wired to box2 COM1, and
> depending on what box you're on doing the cu -l ttyu0 from, get a login
> prompt on the other. It doesn't work like that. :-)

Isn't the reason for different dial-in and dial-out devices that this should work? Or does that only work with a modem?

http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/serial.html#ACCESS-SERIAL-PORTS

Nicolas

-- 
http://www.rachinsky.de/nicolas
Re: Multiple serial consoles via null modem cable
--On Friday, January 22, 2010 10:05 PM +0100 Nicolas Rachinsky fbsd-stabl...@ml.turing-complete.org wrote:

> * Jeremy Chadwick free...@jdc.parodius.com [2010-01-19 23:46 -0800]:
> > You cannot do something like where box1 COM1 is wired to box2 COM1, and
> > depending on what box you're on doing the cu -l ttyu0 from, get a login
> > prompt on the other. It doesn't work like that. :-)
>
> Isn't the reason for different dial-in and dial-out devices that this
> should work? Or does that only work with modem?

You can't with two directly connected machines. When the two are physically wired together, and getty is configured (via ttys) to fire up on the port, it takes over the port. If you connect two machines via a null modem cable, both with getty on the same port, the gettys will be chatting with each other. The locking mechanism will break the chat loop when you try to use the dialout device on one end or the other, but you may have to wait some time before the other end restarts getty (because it previously would have been dying very rapidly due to login failures).

A NULL modem connection is ALWAYS active. A regular modem is NOT. It has a state of 'inactive' or 'waiting for ring' if you will.

The correct way to do what you want is, as others have suggested, two serial null modem cables and two com ports on each machine.
Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available
On 1/22/10 6:18 PM, Max Laier wrote:
> pf does change the byte order in the pfil hook, but changes it back on
> return to the stack either when returning from the hook or when calling
> back into the stack. There have been some issues where we missed returns
> to the stack that would result in this situation, but since pf is not in
> the trace, this is clearly not the case here.
>
> It might indeed be related to enc(4). I remember there have been some
> issues in IPSEC where it failed to properly copy a packet before modifying
> it. Maybe this is what is happening. Details escape me at the moment.
>
> Can you also make sure that your if_enc.c has revision 174978:
> http://svn.freebsd.org/viewvc/base/release/7.2.0/sys/net/if_enc.c?view=diff&r1=174977&r2=174978

Yes, I have the latest if_enc.c available on stable/7:

cvs 1.6.2.3, svn r183630

World and kernel were compiled from sources csup'ed on 12.01.2010.

Thanks,
Florian
Re: FreeBSD NFS client/Linux NFS server issue
On Fri, 22 Jan 2010, Rick Macklem wrote:
> There should probably be some sort of 3 way handshake between
> the code in nfs_asyncio() after calling nfs_nfsiodnew() and the
> code near the beginning of nfssvc_iod(), but I think the following
> somewhat cheesy fix might do the trick:
>
[stuff deleted]

I know it's a little weird to reply to my own posting, but I think this might be a reasonable patch (I have only tested it for a few minutes at this point).

I basically redefined nfs_iodwant[] as a tri-state variable (although it was a struct proc *, it was only tested NULL/non-NULL).

 0 - was NULL
 1 - was non-NULL
-1 - just created by nfs_asyncio() and will be used by it

I'll keep testing it, but hopefully someone else can test and/or review it... rick

ps: Mikolaj, I'm a sysadmin so I understand the problems with production systems, but if you do get a chance to test it somehow, that would be great.
pss: This is against -current, but hopefully stable/7 can be patched about the same.

--- patch for nfsiod race against -current ---
--- nfsclient/nfs.h.sav 2010-01-22 16:21:53.0 -0500
+++ nfsclient/nfs.h 2010-01-22 16:22:04.0 -0500
@@ -252,7 +252,7 @@
 int    nfs_commit(struct vnode *vp, u_quad_t offset, int cnt,
            struct ucred *cred, struct thread *td);
 int    nfs_readdirrpc(struct vnode *, struct uio *, struct ucred *);
-int    nfs_nfsiodnew(void);
+int    nfs_nfsiodnew(int);
 int    nfs_asyncio(struct nfsmount *, struct buf *, struct ucred *, struct thread *);
 int    nfs_doio(struct vnode *, struct buf *, struct ucred *, struct thread *);
 void   nfs_doio_directwrite (struct buf *);
--- nfsclient/nfsnode.h.sav 2010-01-22 14:56:34.0 -0500
+++ nfsclient/nfsnode.h 2010-01-22 14:56:52.0 -0500
@@ -180,7 +180,7 @@
  * Queue head for nfsiod's
  */
 extern TAILQ_HEAD(nfs_bufq, buf) nfs_bufq;
-extern struct proc *nfs_iodwant[NFS_MAXASYNCDAEMON];
+extern int nfs_iodwant[NFS_MAXASYNCDAEMON];
 extern struct nfsmount *nfs_iodmount[NFS_MAXASYNCDAEMON];
 
 #if defined(_KERNEL)
--- nfsclient/nfs_bio.c.sav 2010-01-22 14:57:28.0 -0500
+++ nfsclient/nfs_bio.c 2010-01-22 16:17:24.0 -0500
@@ -1377,7 +1377,7 @@
          * Find a free iod to process this request.
          */
         for (iod = 0; iod < nfs_numasync; iod++)
-                if (nfs_iodwant[iod]) {
+                if (nfs_iodwant[iod] > 0) {
                         gotiod = TRUE;
                         break;
                 }
@@ -1386,7 +1386,7 @@
          * Try to create one if none are free.
          */
         if (!gotiod) {
-                iod = nfs_nfsiodnew();
+                iod = nfs_nfsiodnew(1);
                 if (iod != -1)
                         gotiod = TRUE;
         }
@@ -1398,7 +1398,7 @@
                  */
                 NFS_DPF(ASYNCIO, ("nfs_asyncio: waking iod %d for mount %p\n",
                     iod, nmp));
-                nfs_iodwant[iod] = NULL;
+                nfs_iodwant[iod] = 0;
                 nfs_iodmount[iod] = nmp;
                 nmp->nm_bufqiods++;
                 wakeup(&nfs_iodwant[iod]);
--- nfsclient/nfs_nfsiod.c.sav 2010-01-22 14:57:28.0 -0500
+++ nfsclient/nfs_nfsiod.c 2010-01-22 16:32:31.0 -0500
@@ -113,7 +113,7 @@
          * than the new minimum, create some more.
          */
         for (i = nfs_iodmin - nfs_numasync; i > 0; i--)
-                nfs_nfsiodnew();
+                nfs_nfsiodnew(0);
 out:
         mtx_unlock(&nfs_iod_mtx);
         return (0);
@@ -147,7 +147,7 @@
          */
         iod = nfs_numasync - 1;
         for (i = 0; i < nfs_numasync - nfs_iodmax; i++) {
-                if (nfs_iodwant[iod])
+                if (nfs_iodwant[iod] > 0)
                         wakeup(&nfs_iodwant[iod]);
                 iod--;
         }
@@ -160,7 +160,7 @@
     "Max number of nfsiod kthreads");
 
 int
-nfs_nfsiodnew(void)
+nfs_nfsiodnew(int set_iodwant)
 {
         int error, i;
         int newiod;
@@ -176,12 +176,17 @@
         }
         if (newiod == -1)
                 return (-1);
+        if (set_iodwant > 0)
+                nfs_iodwant[i] = -1;
         mtx_unlock(&nfs_iod_mtx);
         error = kproc_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID,
             0, "nfsiod %d", newiod);
         mtx_lock(&nfs_iod_mtx);
-        if (error)
+        if (error) {
+                if (set_iodwant > 0)
+                        nfs_iodwant[i] = 0;
                 return (-1);
+        }
         nfs_numasync++;
         return (newiod);
 }
@@ -199,7 +204,7 @@
                 nfs_iodmin = NFS_MAXASYNCDAEMON;
 
         for (i = 0; i < nfs_iodmin; i++) {
-                error = nfs_nfsiodnew();
+                error = nfs_nfsiodnew(0);
                 if (error == -1)
                         panic("nfsiod_setup: nfs_nfsiodnew failed");
         }
@@ -236,7 +241,8 @@
                         goto finish;
                 if (nmp)
                         nmp->nm_bufqiods--;
-                nfs_iodwant[myiod] = curthread->td_proc;
+                if
Re: top Segmentation faulting on 8.0p2 amd64
On Wed, 20 Jan 2010 08:06:23 +0100 Harald Schmalzbauer wrote:
> Dear all,
>
> I have no idea why top crashes with segmentation fault on my amd64 machine
> running FreeBSD 8.0-RELEASE-p2.
> If someone wants to have a look at the core dump:
> http://www.schmalzbauer.de/downloads/top.core

A core file is useless without the binary and libraries. So it is better to run gdb on your host, produce a backtrace and post it here:

gdb /usr/bin/top top.core
bt

And of course a backtrace from a top built with -g would be much better:

cd /usr/src/usr.bin/top
CFLAGS=-g make

-- 
Mikolaj Golub
Re: posting coding bounties, appropriate money amounts?
Dan Naumov wrote:
> Hello
>
> I am curious about posting some coding bounties, my current interest
> revolves around improving the ZVOL functionality in FreeBSD: fixing
> the known ZVOL SWAP reliability/stability problems as well as making
> ZVOLs work as a dumpon device (as is already the case in OpenSolaris)
> for crash dumps. I am a private individual and not some huge Fortune
> 100 and while I am not exactly rich, I am willing to put some of my
> personal money towards this. I am curious though, what would be the
> best way to approach this: directly approaching committer(s) with the
> know-how-and-why of the areas involved or through the FreeBSD
> Foundation? And how would one go about calculating the appropriate
> amount of money for such a thing?

Hi,

This idea (bounties) appears approximately every 6 months and it appears there is no better way than contacting the developers directly. AFAIK all attempts to conglomerate such an effort have failed. One important conclusion is that it cannot go through the Foundation since they cannot accept targeted donations.
Re: 8.0-RELEASE - -STABLE and size of /
Hi,

On Fri, Jan 22, 2010 at 03:56:31PM -0500, Steven Friedrich wrote:

in your /etc/make.conf, do you have a line like: makeoptions DEBUG=-g if so, comment it out.

The GENERIC kernel by default has the following config:

makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols

You don't need anything special in your make.conf. In fact, having the debug symbols is useful in many cases. So raising the default size for the / partition might be the better option (OK, that doesn't help for already installed systems, of course).

- Oliver

-- | Oliver Brandmueller http://sysadm.in/ o...@sysadm.in | |Ich bin das Internet. Sowahr ich Gott helfe. |
IPSec NAT-T in transport mode
I'm very interested in this problem -- I want to run an L2TP server myself. Is anyone actually working on this? I might be able to chip in a few bucks... But I'm not seeing bad checksums. Here's my setup: L2tp server AB Freebsd NAT box C ---internal network---D my mac Where should I be seeing the bad checksums? A, B, C, or D? Looking only at B, I don't see any bad udp checksums, but I'm seeing a bunch of these (IP numbers changed to bracketed names): 23:49:48.004107 IP (tos 0x0, ttl 64, id 52328, offset 0, flags [none], proto ICMP (1), length 56) [NAT Box] [External Server] ICMP [NAT Box] udp port 58660 unreachable, length 36 IP (tos 0x20, ttl 59, id 36320, offset 0, flags [none], proto UDP (17), length 143) [External Server].1701 [NAT Box].58660: [|l2tp] ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: posting coding bounties, appropriate money amounts?
On Fri, Jan 22, 2010 at 3:06 PM, Ivan Voras ivo...@freebsd.org wrote:

Dan Naumov wrote:

Hello, I am curious about posting some coding bounties. My current interest revolves around improving the ZVOL functionality in FreeBSD: fixing the known ZVOL SWAP reliability/stability problems as well as making ZVOLs work as a dumpon device (as is already the case in OpenSolaris) for crash dumps. I am a private individual and not some huge Fortune 100 company, and while I am not exactly rich, I am willing to put some of my personal money towards this. I am curious though, what would be the best way to approach this: directly approaching committer(s) with the know-how-and-why of the areas involved, or through the FreeBSD Foundation? And how would one go about calculating the appropriate amount of money for such a thing?

Hi, this idea (bounties) comes up approximately every 6 months, and it appears there is no better way than contacting the developers directly. AFAIK all attempts to organize such an effort centrally have failed. One important conclusion is that it cannot go through the Foundation, since they cannot accept targeted donations.

A while back, we built a simple app for posting bounties, getting devs and sponsors on board, posting the committed code in a browser-viewable format, and then handling the final payout upon completion. iXsystems is more than willing to handle the financial details, and I would gladly be the first to sponsor this project on the site. http://www.sponsorbsd.org

We would need a team leader *cough* Ivan *cough* who could make sure contributing developers are actually involved, so that the final payoff can be shared accordingly. It's a CakePHP app and I'm sure it needs a bit more polish, but we could do that on the fly and it shouldn't be too hard :) Any CakePHP or PHP devs interested in helping with testing and launch, let me know. I just haven't had much time to spend on launching it, although I still think it's a great idea.

If somebody would like to spearhead this effort, that would be great. For companies wishing to sponsor non-community code, it also has the option of hiding the community-committed code.

best, -matt
Re: posting coding bounties, appropriate money amounts?
On Fri, Jan 22, 2010 at 07:49:46PM +0200, Dan Naumov wrote: I am curious about posting some coding bounties, my current interest revolves around improving the ZVOL functionality in FreeBSD: fixing the known ZVOL SWAP reliability/stability problems as well as making ZVOLs work as a dumpon device (as is already the case in OpenSolaris) for crash dumps. I am a private individual and not some huge Fortune 100 and while I am not exactly rich, I am willing to put some of my personal money towards this. I am curious though, what would be the best way to approach this: directly approaching committer(s) with the know-how-and-why of the areas involved or through the FreeBSD Foundation? And how would one go about calculating the appropriate amount of money for such a thing? For what it's worth: count me in here, and not just with regards to zvol. I'd be more than happy to donate money to a pool (pun intended) to get some of the ZFS-centric issues looked at / focused on, and possibly fixed. I'd be willing to put up a thousand USD or possibly more depending on what sort of work was being considered. I suppose a better choice would be for someone here to make a list of issues which the community feels need attention, and put the pooled donations to whatever things had highest priority -- or, if that isn't plausible, then to what interested developers wanted to work on. -- | Jeremy Chadwick j...@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: top Segmentation faulting on 8.0p2 amd64 (nss_ldapd problem?)
Mikolaj Golub wrote on 22.01.2010 23:26 (localtime):

On Wed, 20 Jan 2010 08:06:23 +0100 Harald Schmalzbauer wrote:

Dear all, I have no idea why top crashes with a segmentation fault on my amd64 machine running FreeBSD 8.0-RELEASE-p2. If someone wants to have a look at the core dump: http://www.schmalzbauer.de/downloads/top.core

A core file is useless without the binary and libraries. So it is better to run gdb on your host, produce a backtrace and post it here:

gdb /usr/bin/top top.core
bt

And of course a backtrace from a top built with -g would be much better:

cd /usr/src/usr.bin/top
CFLAGS=-g make

Unfortunately nss_ldap seems to be the culprit.

gdb /usr/bin/top top.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...
Core was generated by `top'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libncurses.so.8...done.
Loaded symbols for /lib/libncurses.so.8
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libkvm.so.5...done.
Loaded symbols for /lib/libkvm.so.5
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /usr/local/lib/nss_ldap.so.1...done.
Loaded symbols for /usr/local/lib/nss_ldap.so.1
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1

bt:
#0 0x000800d08403 in __nss_compat_gethostbyname () from /usr/local/lib/nss_ldap.so.1
#0 0x000800d08403 in __nss_compat_gethostbyname () from /usr/local/lib/nss_ldap.so.1
#1 0x000800d0606f in _nss_ldap_getpwent_r () from /usr/local/lib/nss_ldap.so.1
#2 0x0008009ffc54 in __nss_compat_getpwent_r () from /lib/libc.so.7
#3 0x000800a84a3d in nsdispatch () from /lib/libc.so.7
#4 0x000800a50976 in getpwent_r () from /lib/libc.so.7
#5 0x000800a50596 in sysctlbyname () from /lib/libc.so.7
#6 0x00406c6d in machine_init (statics=0x7fffea30, do_unames=1 '\001') at /usr/src/usr.bin/top/machine.c:257
#7 0x00407a10 in main (argc=1, argv=0x7fffeb08) at /usr/src/usr.bin/top/../../contrib/top/top.c:458

I'm using nss_ldapd-0.7.2 and there's no way to live without LDAP... Any help is highly appreciated!

Thanks, -Harry
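To see at a glance which library each frame of a backtrace like this one belongs to, a small filter may help. This is only a sketch: the function name frames_by_lib is made up for illustration, and it assumes the frame format shown in this message ("#N ... from /path/to/lib.so"). Frames without a "from" clause, such as those that carry a source location, are reported as belonging to the main binary.

```shell
#!/bin/sh
# frames_by_lib: read a gdb backtrace on stdin and print
# "<frame number> <library>" for every frame line.
# Hypothetical helper; it is not part of gdb or FreeBSD.
frames_by_lib() {
    awk '/^#[0-9]+ / {
        lib = "(main binary)"           # default when no "from" clause exists
        for (i = 1; i < NF; i++)
            if ($i == "from") lib = $(i + 1)
        print $1, lib
    }'
}
```

With the backtrace saved to a file (bt.txt is an assumed name), "frames_by_lib < bt.txt" would list frames #0 and #1 under /usr/local/lib/nss_ldap.so.1, which matches the suspicion that the crash happens inside the NSS LDAP module rather than in top itself.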
Re: posting coding bounties, appropriate money amounts?
On Fri, Jan 22, 2010 at 6:40 PM, Jeremy Chadwick free...@jdc.parodius.com wrote:

On Fri, Jan 22, 2010 at 07:49:46PM +0200, Dan Naumov wrote:

I am curious about posting some coding bounties, my current interest revolves around improving the ZVOL functionality in FreeBSD: fixing the known ZVOL SWAP reliability/stability problems as well as making ZVOLs work as a dumpon device (as is already the case in OpenSolaris) for crash dumps. I am a private individual and not some huge Fortune 100 and while I am not exactly rich, I am willing to put some of my personal money towards this. I am curious though, what would be the best way to approach this: directly approaching committer(s) with the know-how-and-why of the areas involved or through the FreeBSD Foundation? And how would one go about calculating the appropriate amount of money for such a thing?

For what it's worth: count me in here, and not just with regards to zvol. I'd be more than happy to donate money to a pool (pun intended) to get some of the ZFS-centric issues looked at / focused on, and possibly fixed. I'd be willing to put up a thousand USD or possibly more depending on what sort of work was being considered. I suppose a better choice would be for someone here to make a list of issues which the community feels need attention, and put the pooled donations to whatever things had highest priority -- or, if that isn't plausible, then to what interested developers wanted to work on.

-- | Jeremy Chadwick j...@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB |

To the best of my understanding, that is basically what donating to the FreeBSD Foundation accomplishes, although it would be nice to see some more transparency in their decision-making process.

-- Adam Vande More
Re: 8.0-RELEASE - -STABLE and size of /
On Fri, Jan 22, 2010 at 05:21:56PM +0100, Oliver Brandmueller wrote:

I just noticed something: I set up an 8.0-RELEASE amd64 box, where / is 512M by default. The first step after setup was to csup to RELENG_8 and run buildkernel and buildworld (no custom kernel, no make.conf). Installing the new kernel failed, since /boot/kernel/ is already well over 230 MBytes in size. Moving that to kernel.old and writing a new one of about the same size fails with "no space left on device". This is not a question; I do know how to get around this and how to configure custom kernels so that they are a fraction of that size afterwards. However, I think this is a clear POLA violation. So the options would be either a GENERIC with less debugging information (symbols and stuff), which makes debugging harder, or a higher default size for /, unless anyone else has better ideas.

/usr/src/UPDATING has this entry, which will allow you to leave out the symbols when installing a kernel:

20060118: This actually occured some time ago, but installing the kernel now also installs a bunch of symbol files for the kernel modules. This increases the size of /boot/kernel to about 67Mbytes. You will need twice this if you will eventually back this up to kernel.old on your next install. If you have a shortage of room in your root partition, you should add -DINSTALL_NODEBUG to your make arguments or add INSTALL_NODEBUG=yes to your /etc/make.conf.

I concur that the 235 MB size of an amd64 8.0 kernel is a bit of a surprise. An i386 kernel is a mere 135 MB. IMO increasing the sysinstall default root slice size, for amd64 at least, would be a good thing.

-- Adrian Wontroba
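The space arithmetic behind this can be sketched in a few lines of shell. The helper name root_fits is made up for illustration, and the 100 MB used for "everything else on /" is an assumed figure, not a measurement from this thread; the 512 MB root size and the 235 MB and 67 MB kernel sizes are the numbers quoted above.

```shell
#!/bin/sh
# root_fits ROOT_MB OTHER_MB KERNEL_MB
# Prints "yes" if / can hold its other contents plus two kernel
# directories (/boot/kernel and /boot/kernel.old), otherwise "no".
# Hypothetical helper for illustration only.
root_fits() {
    root_mb=$1; other_mb=$2; kernel_mb=$3
    needed_mb=$((other_mb + 2 * kernel_mb))
    if [ "$needed_mb" -le "$root_mb" ]; then
        echo yes
    else
        echo no
    fi
}

# 512 MB root, assumed 100 MB of other files on /:
root_fits 512 100 235   # no:  100 + 2 * 235 = 570 MB
root_fits 512 100 67    # yes: 100 + 2 * 67  = 234 MB
```

Under these assumptions a default 512 MB root cannot hold two 235 MB kernel directories, which is why installing without symbol files (the INSTALL_NODEBUG option quoted from UPDATING) or a larger root slice is needed.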
Re: 8.0-RELEASE - -STABLE and size of /
On Friday 22 January 2010 06:32:02 pm Oliver Brandmueller wrote:

Hi,

On Fri, Jan 22, 2010 at 03:56:31PM -0500, Steven Friedrich wrote:

in your /etc/make.conf, do you have a line like: makeoptions DEBUG=-g if so, comment it out.

The GENERIC kernel by default has the following config:

makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols

You don't need anything special in your make.conf. In fact, having the debug symbols is useful in many cases. So raising the default size for the / partition might be the better option (OK, that doesn't help for already installed systems, of course).

- Oliver

I'm sorry. My response to him should have been more precise. I was trying to clue him in on how to build a non-debug kernel, but my answer was in fact wrong. I said he may have a line in make.conf, but that was a mistake: I pulled the line from a kernel config file. If he wants to build a kernel with no symbols, as he stated he does, he needs to comment out that line in the kernel config and build a kernel. He could buildworld and installworld, too.

But he and I went off topic. I should have changed the subject line to start a new thread to discuss building without symbols. He was complaining that it wasn't in the FAQ or the handbook. It's in GENERIC, which is required reading if you're ever going to build custom kernels.

As for the main topic, I have been making 4GB root partitions for some time. Our disk requirements have been soaring over the last decade, while the cost per MB has plummeted. I don't want to have to guess what size each partition should be.
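As a rough sketch of "comment out the line and build a kernel": the filter below copies a kernel config and puts a "#" in front of the makeoptions DEBUG=-g line. The function name comment_out_debug and the config name MYKERNEL are made up for illustration; the build commands in the comments are the usual ones but are shown here as an assumption about the setup, not as something tested in this thread.

```shell
#!/bin/sh
# comment_out_debug: read a kernel config on stdin and write it to stdout
# with any "makeoptions DEBUG=-g" line commented out.
# Hypothetical helper for illustration only.
comment_out_debug() {
    sed 's/^makeoptions[[:space:]]\{1,\}DEBUG=-g/#&/'
}

# Possible usage (MYKERNEL is an assumed config name):
#   cd /usr/src/sys/amd64/conf
#   comment_out_debug < GENERIC > MYKERNEL
#   cd /usr/src && make buildkernel KERNCONF=MYKERNEL
```

Working on a copy keeps GENERIC itself untouched, so a later source update does not conflict with local changes.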