Re: Multiple serial consoles via null modem cable

2010-01-22 Thread N.J. Mann
In message 717f7a3e1001210137p7884adcbxc66a4f7fff928...@mail.gmail.com,
Marin Atanasov (dna...@gmail.com) wrote:
 Hello Jeremy,
 
 Now I'm a little confused :)
 
 I've made some tests with my machines and a couple of null modem cables, and
 here's what I've got.
 
 On Wed, Jan 20, 2010 at 9:46 AM, Jeremy Chadwick
 free...@jdc.parodius.com wrote:
 
  On Wed, Jan 20, 2010 at 08:46:48AM +0200, Marin Atanasov wrote:
   Hello,
  
   Using `cu' only works with COM1 for me.
  
   Currently I have two serial ports on the system, and only the first is
   able to make the connection - the serial consoles are enabled in
   /etc/ttys, but as I said only COM1 is able to make the connection.
 
  I'm a little confused by this statement, so I'll add some clarification:
 
  /etc/ttys is for configuring a machine to tie getty (think login prompt)
  to a device (in this case, a serial port).  Meaning: the device on the
  other end of the serial cable will start seeing login: and so on
  assuming you attach to the serial port there.
 
  For example:
 
  box1 COM1/ttyu0 is wired to box2 COM3/ttyu2 using a null modem cable.
  box1 COM2/ttyu1 is wired to box2 COM4/ttyu3 using a null modem cable.
 
  On box1, you'd have something like this in /etc/ttys:
 
  ttyu0   /usr/libexec/getty std.9600   vt100  on secure
  ttyu1   /usr/libexec/getty std.9600   vt100  on secure
 
 
 Here's what I did:
 
 box1 COM1/ttyd0 - box2 COM1/ttyd0 - using null modem cable
 box1 COM2/ttyd1 - box3 COM1/ttyd0 - using null modem cable
 
 On box1 I have this in /etc/ttys:
 
 ttyd0   /usr/libexec/getty std.9600   vt100   on secure
 ttyd1   /usr/libexec/getty std.9600   vt100   on secure
 
 Now if I want to connect to box1 from box2 or box3 through the serial
 connection it should work, right?
 But I can only connect to box1 from box2, because box2's COM port is
 connected to box1's COM1 port.
 
 From box2 I can get a login prompt
 box2# cu -l /dev/cuad0 -s 9600
 Connected
 
 login:
 )
 (host.domain) (ttyd0)
 
 login: ~
 [EOT]
 
 But if I try to connect to box1 from box3 - no success there.
 box3# cu -l /dev/cuad0 -s 9600
 Connected
 ~
 [EOT]

You need to reduce the number of unknowns, e.g. where is the problem:
box1, box3 or in between.  So, swap the cables on box1 so that you now
have box1:COM1 - box3:COM1 and box1:COM2 - box2:COM1.  Now repeat the
tests above and post your results.


Cheers,
   Nick.
-- 

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available

2010-01-22 Thread Florian Smeets

On 1/21/10 9:15 PM, John Baldwin wrote:

On Thursday 21 January 2010 2:09:34 pm Florian Smeets wrote:

On 1/21/10 8:05 PM, John Baldwin wrote:

On Thursday 21 January 2010 1:33:35 pm Florian Smeets wrote:

On 1/21/10 6:58 PM, John Baldwin wrote:

On Thursday 21 January 2010 8:25:22 am Florian Smeets wrote:

(kgdb) frame 8
#8  0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at
/usr/src/sys/netinet/ip_input.c:1307
1307            m_copydata(m, 0, mcopy->m_len, mtod(mcopy, caddr_t));
(kgdb) l
1302                    mcopy = NULL;
1303            }
1304            if (mcopy != NULL) {
1305                    mcopy->m_len = min(ip->ip_len, M_TRAILINGSPACE(mcopy));
1306                    mcopy->m_pkthdr.len = mcopy->m_len;
1307                    m_copydata(m, 0, mcopy->m_len, mtod(mcopy, caddr_t));
1308            }
1309
1310    #ifdef IPSTEALTH
1311            if (!ipstealth) {
(kgdb) p *m
$1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc271e80e
E\020, mh_len = 164, mh_flags = 3, mh_type = 1, pad = \000}, M_dat =
{MH = {MH_pkthdr = {rcvif = 0xc20a4800, header = 0x0, len = 164,
csum_flags = 3072,
csum_data = 65535, tso_segsz = 0, ether_vtag = 0, tags =
{slh_first = 0xc35bc380}}, MH_dat = {MH_ext = {ext_buf = 0xc271e800,
ext_free = 0, ext_args = 0x0, ext_size = 2048, ref_cnt = 0xc2703ab4,
ext_type = 6},
MH_databuf =
\000?q?\000\000\000\000\000\000\000\000\000\b\000\000?:p?
\006\000\000\000dL?\t+?\202\200\020
O/\207\000\000\001\001\b\n-?b\230qms?\000\000\004\001?l?
\000\000\001%r???
\200\000\034?Ot?\b?{sr\000\034org.jboss.mq.ConnectionToken?\b߼?
\237N\002\000\005I\000\004hashZ\000\asameJVML\000\bclientIDt\000\022Ljava/l\000\220\032Ae\207\000\002?
3...@\210d\021\000\001?
\001b\000!e\000\...@bv\000\000@2\032$W\213\n\034...}},
M_databuf =
\000H\n?\000\000\000\000?\000\000\000\000\f\000\000??
\000\000\000\000\000\000\200?[?\000?q?
\000\000\000\000\000\000\000\000\000\b\000\000?:p?\006\000\000\000dL?
\t+?
\202\200\020
O/\207\000\000\001\001\b\n-?b\230qms?\000\000\004\001?l?
\000\000\001%r???
\200\000\034?Ot?\b?{sr\000\034org.jboss.mq.ConnectionToken?\b߼?
\237N\002\000\005I\000\004hashZ\000\asameJVML\000\bclientIDt\000\022Ljava/l\000\220\032Ae\207\000\002?
3...}}

Ok, can you do 'p *m_copy'?



Whatever you want :-)

(kgdb) p *m_copy
No symbol m_copy in current context.
(kgdb) p *m_copydata
$2 = {void (const struct mbuf *, int, int, caddr_t)} 0xc0572e10 <m_copydata>

(kgdb) p *mcopy
$1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc23cce34
E\020, mh_len = 204, mh_flags = 2, mh_type = 1, pad = \000}, M_dat =
{MH = {MH_pkthdr = {rcvif = 0xc20a4800, header = 0x0,
   len = 204, csum_flags = 3072, csum_data = 65535, tso_segsz = 0,
ether_vtag = 0, tags = {slh_first = 0xc23c3e00}}, MH_dat = {MH_ext =
{ext_buf = 0x84001045 <Address 0x84001045 out of bounds>,


Hmm, ok.  Can you do 'p *ip'?  mcopy->m_len (204) is larger than m->m_len
(164).  That shouldn't be the case unless ip->ip_len is somehow larger
than m->m_len.




(kgdb) p *ip
$3 = {ip_hl = 5, ip_v = 4, ip_tos = 16 '\020', ip_len = 33792, ip_id =
61492, ip_off = 64, ip_ttl = 64 '@', ip_p = 6 '\006', ip_sum = 34849,
ip_src = {s_addr = 355576000}, ip_dst = {
  s_addr = 2251401408}}


Looks like ip_len is in network byte order instead of host byte order and that
is causing the problem.  33792 == 0x8400.  Swapping that gives 0x84 == 132
which would be a reasonable length.  Are you using any firewall rules that
would rewrite packets?  I wonder if you are having a packet rewritten and the
new IP header is written in network byte order, but we swap the IP header len
field to host byte order earlier in ip_input().  Luigi Rizzo may have some
insight into this.
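
The arithmetic behind that diagnosis can be checked with a short Python sketch. The values come from the kgdb output above; reading 204 as the trailing space available in mcopy is an inference from mcopy->m_len, not something the dump states directly.

```python
def swap16(value):
    """Swap the two bytes of a 16-bit value (the effect of ntohs() on little-endian i386)."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)

ip_len_seen = 33792      # ip->ip_len as printed by kgdb
m_len = 164              # m->m_len: bytes actually present in the source mbuf
trailing_space = 204     # assumed M_TRAILINGSPACE(mcopy), inferred from mcopy->m_len

ip_len_swapped = swap16(ip_len_seen)

print(hex(ip_len_seen))                      # 0x8400
print(ip_len_swapped)                        # 132
# Copy length chosen in ip_forward(): min(ip->ip_len, M_TRAILINGSPACE(mcopy))
print(min(ip_len_seen, trailing_space))      # 204, larger than m_len
print(min(ip_len_swapped, trailing_space))   # 132, fits within m_len
```

With the unswapped length the copy size exceeds the 164 bytes held in m, which is consistent with a fault inside m_copydata().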



Well, looking at MH_databuf I see JBoss MQ traffic, which would mean
that this traffic was coming from or going to an IPsec tunnel. I could
say for sure if I knew how to get an IP address from something like
ip_src = {s_addr = 355576000}.
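
On the s_addr question: the field holds the address in network byte order, so on a little-endian machine the integer kgdb prints can be turned back into dotted-quad form by packing it little-endian. A minimal Python sketch, assuming the dump came from a little-endian host such as i386 (the helper name is made up for illustration):

```python
import socket
import struct

def s_addr_to_dotted(s_addr):
    """Convert an in_addr.s_addr integer, as printed on a little-endian host, to dotted quad."""
    # "<I" lays the integer out as it sits in memory on a little-endian CPU,
    # which is already network byte order for an in_addr.
    return socket.inet_ntoa(struct.pack("<I", s_addr))

print(s_addr_to_dotted(355576000))    # ip_src -> 192.168.49.21
print(s_addr_to_dotted(2251401408))   # ip_dst -> 192.168.49.134
```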


If it really is IPsec traffic, then there are no rewrite rules, only 10 pf
pass rules on the enc0 interface and a "scrub in all" rule.


Perhaps it matters that i have these set:

net.enc.out.ipsec_bpf_mask=0x0001
net.enc.out.ipsec_filter_mask=0x0001
net.enc.in.ipsec_bpf_mask=0x0002
net.enc.in.ipsec_filter_mask=0x0002

so that I can filter the encapsulated traffic.

Thanks,
Florian


Re: device.hints isn't setting what I want

2010-01-22 Thread Greg Byshenk
On Thu, Jan 21, 2010 at 08:23:23PM -0500, Dan Langille wrote:
 
 First, see also my post: do I want ch0 or pass1?
 
 I have an external tape library and an external tape drive.  They are
 not always powered up.  My goal: always get the same devices regardless
 of whether or not the tape library is powered on at boot.
 
 After booting, with the tape library powered on, I have these devices:
 
 # camcontrol devlist
 <QUANTUM DLT7000 1E48>             at scbus0 target 5 lun 0 (sa0,pass0)
 <DEC TL800(C) DEC 0326>            at scbus1 target 0 lun 0 (ch0,pass1)
 <DEC TZ89 (C) DEC 1837>            at scbus1 target 5 lun 0 (sa1,pass2)
 <HL-DT-ST DVDRAM GSA-H10A JL02>    at scbus2 target 0 lun 0 (cd0,pass3)
 <USB 2.0 Storage Device 0100>      at scbus5 target 0 lun 0 (da0,pass4)
 
 In /boot/device.hints, I have added these entries:
 
 hint.scbus.1.at=ahc0
 hint.scbus.0.at=ahc1
 hint.scbus.2.at=acd0
 hint.scbus.5.at=umass0

I think that this is wrong.

I had a similar issue (multiple tape drives and changer devices that 
needed to stay at the same ids).

Your device.hints entries should look something like this:

   hint.sa.0.at=scbus0
   hint.sa.0.target=5
   hint.sa.0.unit=0
   hint.sa.1.at=scbus0
   hint.sa.1.target=3
   hint.sa.1.unit=0
   hint.sa.2.at=scbus0
   hint.sa.2.target=1
   hint.sa.2.unit=0
   hint.ch.0.at=scbus0
   hint.ch.0.target=4
   hint.ch.0.unit=0
   hint.ch.1.at=scbus0
   hint.ch.1.target=2
   hint.ch.1.unit=0
   hint.ch.2.at=scbus0
   hint.ch.2.target=0
   hint.ch.2.unit=0

Which I use to get this:

   # camcontrol devlist
   <SONY LIB-162 0208>     at scbus0 target 0 lun 0 (pass0,ch2)
   <SONY SDX-1100 0102>    at scbus0 target 1 lun 0 (sa2,pass1)
   <SONY LIB-162 0203>     at scbus0 target 2 lun 0 (pass2,ch1)
   <SONY SDX-900V 0102>    at scbus0 target 3 lun 0 (sa1,pass3)
   # 

(Currently the first changer is not powered up.)


So I think that what you want is something like:

   hint.sa.0.at=scbus0
   hint.sa.0.target=5
   hint.sa.0.unit=0
   hint.sa.1.at=scbus1
   hint.sa.1.target=5
   hint.sa.1.unit=0
   hint.ch.0.at=scbus1
   hint.ch.0.target=0
   hint.ch.0.unit=0
   [...]
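
The mapping from a camcontrol devlist line to wired-down hints is mechanical, so it can be sketched in a few lines of Python. This is only an illustration: the regular expression assumes the line layout shown in this thread, hints_from_devlist is a made-up helper name, the pass devices are skipped on purpose, and the lun is written to the unit hint in the same way as in the examples above.

```python
import re

# Matches e.g. "at scbus1 target 5 lun 0 (sa1,pass2)"; the description
# in front of "at" is ignored, so it may or may not carry <...> brackets.
DEVLIST_RE = re.compile(r"at scbus(\d+) target (\d+) lun (\d+) \(([^)]*)\)")
PERIPH_RE = re.compile(r"([a-z]+)(\d+)$")

def hints_from_devlist(text):
    """Return device.hints lines for every non-pass peripheral in devlist output."""
    hints = []
    for line in text.splitlines():
        match = DEVLIST_RE.search(line)
        if not match:
            continue
        bus, target, lun, periphs = match.groups()
        for periph in periphs.split(","):
            name, unit = PERIPH_RE.match(periph.strip()).groups()
            if name == "pass":
                continue
            prefix = f"hint.{name}.{unit}"
            hints += [f"{prefix}.at=scbus{bus}",
                      f"{prefix}.target={target}",
                      f"{prefix}.unit={lun}"]
    return hints

sample = """\
<QUANTUM DLT7000 1E48>   at scbus0 target 5 lun 0 (sa0,pass0)
<DEC TL800(C) DEC 0326>  at scbus1 target 0 lun 0 (ch0,pass1)
<DEC TZ89 (C) DEC 1837>  at scbus1 target 5 lun 0 (sa1,pass2)
"""
print("\n".join(hints_from_devlist(sample)))
```

Run against the three tape-related lines from the original post, this prints the same nine hint lines suggested above.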


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: device.hints isn't setting what I want

2010-01-22 Thread Greg Byshenk
On Fri, Jan 22, 2010 at 10:01:02AM +0100, Greg Byshenk wrote:
 On Thu, Jan 21, 2010 at 08:23:23PM -0500, Dan Langille wrote:
  
  First, see also my post: do I want ch0 or pass1?
  
  I have an external tape library and an external tape drive.  They are
  not always powered up.  My goal: always get the same devices regardless
  of whether or not the tape library is powered on at boot.
  
  After booting, with the tape library powered on, I have these devices:
  
  # camcontrol devlist
  <QUANTUM DLT7000 1E48>             at scbus0 target 5 lun 0 (sa0,pass0)
  <DEC TL800(C) DEC 0326>            at scbus1 target 0 lun 0 (ch0,pass1)
  <DEC TZ89 (C) DEC 1837>            at scbus1 target 5 lun 0 (sa1,pass2)
  <HL-DT-ST DVDRAM GSA-H10A JL02>    at scbus2 target 0 lun 0 (cd0,pass3)
  <USB 2.0 Storage Device 0100>      at scbus5 target 0 lun 0 (da0,pass4)
  
  In /boot/device.hints, I have added these entries:
  
  hint.scbus.1.at=ahc0
  hint.scbus.0.at=ahc1
  hint.scbus.2.at=acd0
  hint.scbus.5.at=umass0
 
 I think that this is wrong.
 
 I had a similar issue (multiple tape drives and changer devices that 
 needed to stay at the same ids).
 
 Your device.hints entries should look something like this:
 
hint.sa.0.at=scbus0
hint.sa.0.target=5
hint.sa.0.unit=0
hint.sa.1.at=scbus0
hint.sa.1.target=3
hint.sa.1.unit=0
hint.sa.2.at=scbus0
hint.sa.2.target=1
hint.sa.2.unit=0
hint.ch.0.at=scbus0
hint.ch.0.target=4
hint.ch.0.unit=0
hint.ch.1.at=scbus0
hint.ch.1.target=2
hint.ch.1.unit=0
hint.ch.2.at=scbus0
hint.ch.2.target=0
hint.ch.2.unit=0
 
 Which I use to get this:
 
# camcontrol devlist
<SONY LIB-162 0208>     at scbus0 target 0 lun 0 (pass0,ch2)
<SONY SDX-1100 0102>    at scbus0 target 1 lun 0 (sa2,pass1)
<SONY LIB-162 0203>     at scbus0 target 2 lun 0 (pass2,ch1)
<SONY SDX-900V 0102>    at scbus0 target 3 lun 0 (sa1,pass3)
# 
 
 (Currently the first changer is not powered up.)
 
 
 So I think that what you want is something like:
 
hint.sa.0.at=scbus0
hint.sa.0.target=5
hint.sa.0.unit=0
hint.sa.1.at=scbus1
hint.sa.1.target=5
hint.sa.1.unit=0
hint.ch.0.at=scbus1
hint.ch.0.target=0
hint.ch.0.unit=0
[...]


Just saw your second message.

I don't know if you can wire down 'pass?' the same way, but if you can,
I would assume that you need to set it the same way as the 'sa?' and 
other devices.

That is, if you want:

  <QUANTUM DLT7000 1E48>  at scbus0 target 5 lun 0 (sa0,pass0)

Then the device.hints entry would look like:

   hint.pass.0.at=scbus0
   hint.pass.0.target=5
   hint.pass.0.unit=0

(If you can do that.)

-greg

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: Multiple serial consoles via null modem cable

2010-01-22 Thread Jeremy Chadwick
On Fri, Jan 22, 2010 at 08:36:51AM +0200, Marin Atanasov wrote:
 On Thu, Jan 21, 2010 at 6:26 PM, Ulrich Spörlein u...@spoerlein.net wrote:
 
  On Thu, 21.01.2010 at 11:37:06 +0200, Marin Atanasov wrote:
   Here's what I did:
  
   box1 COM1/ttyd0 - box2 COM1/ttyd0 - using null modem cable
   box1 COM2/ttyd1 - box3 COM1/ttyd0 - using null modem cable
  
   On box1 I have this in /etc/ttys:
  
   ttyd0   /usr/libexec/getty std.9600   vt100   on secure
   ttyd1   /usr/libexec/getty std.9600   vt100   on secure
  
   Now if I want to connect to box1 from box2 or box3 through the serial
   connection it should work, right?
   But I can only connect to box1 from box2, because box2's COM port is
   connected to box1's COM1 port.
 
  Are there actually two gettys running on the serial ports? Did you do
  kill -1 1 after the changes to /etc/ttys?
 
  On box1, what do the following commands produce
 
  egrep 'uart|sio' /var/run/dmesg.boot
  pgrep -fl getty
 
  Regards,
  Uli
 
 Hi,
 
 This is the output from the requested commands:
 
 box1# egrep 'uart|sio' /var/run/dmesg.boot
 usb0: USB revision 1.0
 sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
 sio0: type 16550A
 sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
 sio1: type 16550A
 
 box1# pgrep -fl getty
 3066 /usr/libexec/getty std.9600 ttyd1
 3065 /usr/libexec/getty std.9600 ttyd0
 534 /usr/libexec/getty Pc ttyv7
 533 /usr/libexec/getty Pc ttyv6
 532 /usr/libexec/getty Pc ttyv5
 531 /usr/libexec/getty Pc ttyv4
 530 /usr/libexec/getty Pc ttyv3
 529 /usr/libexec/getty Pc ttyv2
 528 /usr/libexec/getty Pc ttyv1
 527 /usr/libexec/getty Pc ttyv0

Can you run the same commands on box2 please?
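
The check asked for earlier in the thread, that every serial line switched on in /etc/ttys really has a getty attached, can be sketched as a small Python helper. It works on text captured from the file and from pgrep rather than running anything; the function names are made up for illustration, and the parsing assumes the simple line layouts shown in this thread.

```python
def enabled_ttys(ttys_text):
    """Return tty names whose /etc/ttys entry carries the 'on' status field."""
    names = []
    for line in ttys_text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments, split on whitespace
        if len(fields) >= 2 and "on" in fields[1:]:
            names.append(fields[0])
    return names

def ttys_with_getty(pgrep_text):
    """Return the tty names found as the last argument of each getty in `pgrep -fl getty` output."""
    return {line.split()[-1] for line in pgrep_text.splitlines() if "getty" in line}

def missing_gettys(ttys_text, pgrep_text):
    """List ttys that are enabled in /etc/ttys but have no getty running."""
    running = ttys_with_getty(pgrep_text)
    return [tty for tty in enabled_ttys(ttys_text) if tty not in running]

ttys = """\
ttyd0   /usr/libexec/getty std.9600   vt100   on secure
ttyd1   /usr/libexec/getty std.9600   vt100   on secure
"""
pgrep = """\
3066 /usr/libexec/getty std.9600 ttyd1
3065 /usr/libexec/getty std.9600 ttyd0
"""
print(missing_gettys(ttys, pgrep))   # [] means both serial lines have a getty
```

For box1 the list is empty, so the getty side looks fine there and the cabling or the remote end becomes the more likely suspect.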

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



make buildkernel failing on zfs

2010-01-22 Thread Colin

Hi Folks,
Since I upgraded from 7.0 to the latest stable over Christmas, I've had a
problem with quotas and with exim not rebuilding.
I'm told that exim requires NIS, and WITHOUT_NIS=yes was defined in
/usr/src.conf.
My /usr/src.conf for that build is as follows, though I've removed it
for a test build:


WITHOUT_ATM=yes
WITHOUT_BLUETOOTH=yes
WITHOUT_GAMES=yes
WITHOUT_I4B=yes
WITHOUT_IPX=yes
WITHOUT_NCP=yes
WITHOUT_NIS=yes
WITHOUT_SENDMAIL=yes
WITHOUT_INET6=yes
WITHOUT_PROFILE=yes

I've been building with the following command (this worked with the 
sources I used at Christmas, but I updated yesterday and now it won't):


cd /usr/src && make buildworld | tee /root/build && make
kernel-toolchain | tee /root/build2 && make -DALWAYS_CHECK_MAKE
buildkernel KERNCONF=TED | tee /root/build3


It goes for a while and then the buildkernel fails with this:

cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS
-D_KERNEL -DKLD_MODULE -std=c99 -nostdinc
-I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common
-I/usr/src/sys/modules/zfs/../..
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common/zfs
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common
-I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS
-include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq
-finline-limit=8000 --param inline-unit-growth=100
--param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED
-mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx
-mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Wall
-Wredundant-decls -Wnested-externs -Wstrict-prototypes
-Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef
-Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas
-Wno-missing-prototypes -Wno-undef -Wno-strict-prototypes
-Wno-cast-qual -Wno-parentheses -Wno-redundant-decls
-Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline
-Wno-switch -Wno-pointer-arith -c
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
*** Error code 1

Stop in /usr/src/sys/modules/zfs.
*** Error code 1

Stop in /usr/src/sys/modules.
*** Error code 1

Stop in /usr/obj/usr/src/sys/TED.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.


It is a custom kernel and the conf file is here: 
http://www.pastebin.org/80156
For the first build I used the old conf file and that failed. Then I
compared the generic and the custom configs before attempting this rebuild
and copied in a couple of new lines (options P1003_1B_SEMAPHORES and device
vlan), but they have nothing to do with it from what I can tell.
I don't even need zfs, seeing as this is a ufs machine, but I do need some
of the other options in the kernel, hence the custom config.
I am currently running another csup and will try building a generic
kernel to see if that works, but I can't really install a generic.


Anyone got any pointers?
Cheers,
Colin.


Re: make buildkernel failing on zfs

2010-01-22 Thread Christer Solskogen
On Fri, Jan 22, 2010 at 9:56 AM, Colin free...@southportcomputers.co.uk wrote:

 Anyone got any pointers?

Could you post your /etc/make.conf?
That said, I reckon you build your kernel in a rather weird way. Delete
/usr/obj/* and run "make cleandir && make cleandir" in /usr/src. Then
build your world and kernel like this: "make buildworld buildkernel
KERNCONF=TED". If that goes well, run "make installkernel
KERNCONF=TED", reboot, run "make installworld", run mergemaster and
reboot again.
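
The sequence described above can be written down as an ordered list of steps. The Python sketch below only prints the commands, so nothing is built or installed by running it. The reboot entries stand for manual steps, build_steps is a hypothetical helper, and both the "rm -rf /usr/obj/*" spelling of "delete /usr/obj/*" and the "&&" between the two cleandir runs are assumptions.

```python
def build_steps(kernconf):
    """Return the rebuild procedure as (directory, command) pairs, in order."""
    return [
        ("/usr/src", "rm -rf /usr/obj/*"),
        ("/usr/src", "make cleandir && make cleandir"),
        ("/usr/src", f"make buildworld buildkernel KERNCONF={kernconf}"),
        ("/usr/src", f"make installkernel KERNCONF={kernconf}"),
        (None, "reboot"),
        ("/usr/src", "make installworld"),
        ("/usr/src", "mergemaster"),
        (None, "reboot"),
    ]

# Print a numbered checklist; nothing is executed.
for number, (where, command) in enumerate(build_steps("TED"), start=1):
    location = f"cd {where} && " if where else ""
    print(f"{number}. {location}{command}")
```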


-- 
chs


Re: make buildkernel failing on zfs

2010-01-22 Thread N.J. Mann
In message 4b596838.9020...@southportcomputers.co.uk,
Colin (free...@southportcomputers.co.uk) wrote:

[snip]
 It goes for a while and then the buildkernel fails with this:
 
 cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS
 -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc
 -I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris
 -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs
 -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod
 -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common
 -I/usr/src/sys/modules/zfs/../..
 -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common/zfs
 -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common
 -I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS
 -include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq
 -finline-limit=8000 --param inline-unit-growth=100
 --param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED
 -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx
 -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Wall
 -Wredundant-decls -Wnested-externs -Wstrict-prototypes
 -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef
 -Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas
 -Wno-missing-prototypes -Wno-undef -Wno-strict-prototypes
 -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls
 -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline
 -Wno-switch -Wno-pointer-arith -c
 /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
 *** Error code 1

I think this was a temporary problem that has already been fixed.  Try
updating to the latest version and see if that builds okay.


Cheers,
   Nick.
-- 



Re: make buildkernel failing on zfs

2010-01-22 Thread Colin

Christer Solskogen wrote:

On Fri, Jan 22, 2010 at 9:56 AM, Colin free...@southportcomputers.co.uk wrote:

  

Anyone got any pointers?



Could you post your /etc/make.conf?
That said, I reckon you build your kernel in a rather weird way. Delete
/usr/obj/* and run "make cleandir && make cleandir" in /usr/src. Then
build your world and kernel like this: "make buildworld buildkernel
KERNCONF=TED". If that goes well, run "make installkernel
KERNCONF=TED", reboot, run "make installworld", run mergemaster and
reboot again.


  

Thanks for the reply.
I have deleted /usr/obj in between builds (forgot to mention that) but
didn't do make cleandir. I think the way I first did the build was
similar to what you suggested, but I was having problems with
installworld. After various reading and suggestions I dropped back to
the format above, which I read was a more paranoid way of doing it.


make.conf as below:

SUP=/usr/local/bin/cvsup
SUPFLAGS=   -g -L 2
SUPHOST=cvsup.FreeBSD.org
SUPFILE=/root/standard-supfile
PORTSSUPFILE=   /root/ports-supfile

#*REMOVE* OPENSSL_OVERWRITE_BASE=NO

# added by use.perl 2009-06-14 11:10:18
#PERL_VERSION=5.8.9
#BATCH=YES
#CRYPT_DES=0
#WITHOUT_ALT_CONFIG_PREFIX=YES

#CFLAGS= -O -pipe
NO_FORTRAN= true
NO_OBJC=true
NO_X=   true
NO_GAMES=true
NO_PROFILE=  true

BATCH=YES
WITHOUT_X11=YES
SKIP_DNS_CHECK=YES
CRYPT_DES=0
WITH_PORT_REPLACES_BASE_BIND8=YES
WITH_PORT_REPLACES_BASE_BIND9=YES
WITHOUT_ALT_CONFIG_PREFIX=YES
WITH_OPENSSL_PORT=YES
X11BASE=${LOCALBASE}



Re: make buildkernel failing on zfs

2010-01-22 Thread Colin

N.J. Mann wrote:

In message 4b596838.9020...@southportcomputers.co.uk,
Colin (free...@southportcomputers.co.uk) wrote:
  
[snip]
  

It goes for a while and then the buildkernel fails with this:

cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS
-D_KERNEL -DKLD_MODULE -std=c99 -nostdinc
-I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common
-I/usr/src/sys/modules/zfs/../..
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common/zfs
-I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common
-I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS
-include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq
-finline-limit=8000 --param inline-unit-growth=100
--param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED
-mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx
-mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Wall
-Wredundant-decls -Wnested-externs -Wstrict-prototypes
-Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef
-Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas
-Wno-missing-prototypes -Wno-undef -Wno-strict-prototypes
-Wno-cast-qual -Wno-parentheses -Wno-redundant-decls
-Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline
-Wno-switch -Wno-pointer-arith -c
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
*** Error code 1



I think this was a temporary problem that has already been fixed.  Try
updating to the latest version and see if that builds okay.


Cheers,
   Nick.
  
Ahh, if that is the case then the build I currently have running should
work, seeing as I ran a csup less than an hour ago.

Fingers crossed!
Regards,
Colin.


Re: Pack of CAM improvements

2010-01-22 Thread Harald Schmalzbauer

Alexander Motin schrieb am 19.01.2010 17:12 (localtime):
...

Patch can be found here:
http://people.freebsd.org/~mav/cam-ata.20100119.patch

Feedback as always welcome.


Again, thanks a lot for your ongoing great work!
The patch doesn't cleanly apply with vpo, but I don't use vpo so I 
didn't care.

Otherwise I couldn't find any problems.
The system detects reinserted SATA drives on ICH9 fine.

This was tested on a zfs backup server which went to the backbone 
yesterday, so I can't physically remove any devices any more for testing...


But I had some questions about zfs raidz states. I don't think it is a
matter of atacam, but when I removed one disk, zpool status still showed
the ada3 device as online.
After reinserting it (with proper detection/initialisation by cam; ada3
was present again) and running zpool clear, it marked the device as
UNAVAIL due to I/O errors.

I couldn't get the device into the pool again, no matter what I tried.
Only rebooting the machine helped. Then I could clear and scrub.

What are the needed steps to provide a reinserted hard disk to geom?
With the latest patches I don't need to issue any reset/rescan command,
right?

So it's a zfs problem, right? My mistake in understanding?

Thanks,

-Harry





Re: make buildkernel failing on zfs

2010-01-22 Thread Ruben de Groot
On Fri, Jan 22, 2010 at 10:40:17AM +0100, Christer Solskogen typed:
 On Fri, Jan 22, 2010 at 9:56 AM, Colin free...@southportcomputers.co.uk 
 wrote:
 
  Anyone got any pointers?
 
 Could you post your /etc/make.conf?
 That said, I reckon you build your kernel in a rather weird way. Delete
 /usr/obj/* and run "make cleandir && make cleandir" in /usr/src. Then

Bit redundant ;)
cleandir only affects /usr/obj, which you just blew away.

Ruben



Re: make buildkernel failing on zfs

2010-01-22 Thread Jeremy Chadwick
On Fri, Jan 22, 2010 at 09:45:46AM +, N.J. Mann wrote:
 In message 4b596838.9020...@southportcomputers.co.uk,
   Colin (free...@southportcomputers.co.uk) wrote:
 
 [snip]
  It goes for a while and then the buildkernel fails with this:
  
  cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS
  -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc
  -I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris
  -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs
  -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod
  -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common
  -I/usr/src/sys/modules/zfs/../..
  -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common/zfs
  -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common
  -I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS
  -include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq
  -finline-limit=8000 --param inline-unit-growth=100
  --param large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED
  -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx
  -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -Wall
  -Wredundant-decls -Wnested-externs -Wstrict-prototypes
  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef
  -Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas
  -Wno-missing-prototypes -Wno-undef -Wno-strict-prototypes
  -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls
  -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline
  -Wno-switch -Wno-pointer-arith -c
  /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
  *** Error code 1
 
 I think this was a temporary problem that has already been fixed.  Try
 updating to the latest version and see if that builds okay.

The FreeBSD tinderbox build system noticed this problem as well, dying
in the same piece of code.  It's not you.

 738 01/21 17:04  FreeBSD Tinderbox   (9.0K) [releng_7 tinderbox] failure on amd64/amd64
 739 01/21 17:44  FreeBSD Tinderbox   (8.7K) [releng_7 tinderbox] failure on i386/i386

Normally I'd shake my finger at the committer for committing code to
stable branches without testing, but I hold the committer (jhb) in very
high regard and he has a well-established history of not breaking
things + doing excellent work.  Mistakes happen, we're all human.  :-)

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: make buildkernel failing on zfs

2010-01-22 Thread Dmitry Morozovsky
On Fri, 22 Jan 2010, Ruben de Groot wrote:

RdG  Could you post your /etc/make.conf?
RdG  That said, I reckon you build your kernel in a rather weird way. Delete
RdG  /usr/obj/* and run "make cleandir && make cleandir" in /usr/src. Then
RdG 
RdG Bit redundant ;)
RdG cleandir only affects /usr/obj, which you just blew away.

Not exactly: without an objdir, cleandir removes built objects from the
source directory (where they may accidentally reside if one has typed
'make all' without a previous 'make obj').

-- 
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: ma...@freebsd.org ]

*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- ma...@rinet.ru ***



Re: ZFS performance degradation over time

2010-01-22 Thread Alexander Leidinger


Quoting Jeremy Chadwick free...@jdc.parodius.com (from Tue, 19 Jan  
2010 09:01:01 -0800):



On Tue, Jan 19, 2010 at 11:40:50AM -0500, Garrett Moore wrote:

I've been watching my memory usage and I have no idea what is consuming
memory as 'Active'.

Last night I had around 6500MB 'Active' again, 1500MB Wired, no inact, ~30MB
buf, no free, and ~100MB swap used. My performance copying ZFS-ZFS was
again slow (1MB/s). I tried killing rTorrent and no significant amount of
memory was reclaimed - maybe 100MB. `ps aux` showed no processes using any
significant amount of memory, and I was definitely nowhere near 6500MB
usage.

I tried running the perl oneliner again to hog a bunch of memory, and almost
all of the Active memory was IMMEDIATELY marked as Free, and my performance
was excellent again.

I'm not sure what in userland could be causing the issue. The only things
I've installed are rTorrent, lighttpd, samba, smartmontools, vim, bash,
Python, Perl, and SABNZBd. There is nothing that *should* be consuming any
serious amount of memory.


I've two recommendations:

1) Have you considered upgrading to RELENG_8 (e.g. 8.0-STABLE) instead
of sticking with 8.0-RELEASE?  There's been a recent MFC to RELENG_8
that pertains to ARC drainage.  I'm referring to the commit labelled
revision 1.22.2.2 (RELENG_8):

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c


This patch can be merged stand-alone if necessary, no need to go to  
RELENG_8 if there are reservations.



2) Have you tried using vfs.zfs.arc_max in loader.conf to limit the ARC
size?  I'd recommend picking something like 1GB as a cap (your machine


Or even less... to be determined by experimenting.


has 8GB total at present, if I remember right).  I believe long ago
someone said this isn't an explicit hard limit on the maximum size of
the ARC, but I believe this was during the RELENG_7 days and the ARC
stuff on FreeBSD has changed since then.  I wish the tunables were
better documented, or at least explained in detail (hello Wiki!).


The commit you refer to above does just this: it keeps the ARC closer to
arc_max than was the case before.
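
As a rough sketch of the arc_max suggestion: the tunable is set in /boot/loader.conf and takes effect at the next boot. The helper name arc_max_line below is made up for illustration, and 1024 is only the starting value proposed in this thread; the right cap has to be found by experimenting.

```shell
#!/bin/sh
# Hypothetical helper (not a FreeBSD tool): print a loader.conf line
# that caps the ZFS ARC at the given number of megabytes.
arc_max_line() {
    echo "vfs.zfs.arc_max=\"${1}M\""
}

# Candidate line for /boot/loader.conf; lower the number when testing
# smaller caps.
arc_max_line 1024
```

After a reboot, "sysctl vfs.zfs.arc_max" should report the configured limit in bytes.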


This patch is in 7-stable too (in case someone is interested).

Bye,
Alexander.

--
Johnson's First Law:
When any mechanical contrivance fails, it will do so at the
most inconvenient possible time.

http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org   netchild @ FreeBSD.org  : PGP ID = 72077137


Re: IPSec NAT-T in transport mode

2010-01-22 Thread VANHULLEBUS Yvan
Hi.

On Thu, Jan 21, 2010 at 04:36:12PM +, David Murray wrote:
[...]
 On 2010-01-20 Wed 1:22 pm, Crest wrote:
 
 Yes the NAT-T Patch has been integrated into FreeBSD 8.0.
 
 Just rebuild your kernel with these options:
 device crypto # IPsec depends on this
 options IPSEC
 options IPSEC_DEBUG
 options IPSEC_NAT_T
 
 I'm trying to do the same thing as the OP, so thanks for these replies.
 
 However, they seem to be at odds.  Are we saying that the NAT-T patch is 
 there, but is missing checksum re-calculation, so MPD's packets are 
 going to be discarded?

Yes, see my other mail in this thread.


 (FWIW, this seems to be what happens.  All the negotiation to set up 
 IPSEC SAs happens, but MPD's log never shows a single entry.  I hadn't 
 got as far as packet dumps when this thread popped up.)

And if you have a look at system stats, you'll see lots of UDP packets
dropped because of invalid checksums


Yvan.


Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available

2010-01-22 Thread John Baldwin
On Friday 22 January 2010 3:08:45 am Florian Smeets wrote:
 On 1/21/10 9:15 PM, John Baldwin wrote:
  On Thursday 21 January 2010 2:09:34 pm Florian Smeets wrote:
  On 1/21/10 8:05 PM, John Baldwin wrote:
  On Thursday 21 January 2010 1:33:35 pm Florian Smeets wrote:
  On 1/21/10 6:58 PM, John Baldwin wrote:
  On Thursday 21 January 2010 8:25:22 am Florian Smeets wrote:
  (kgdb) frame 8
  #8  0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at
  /usr/src/sys/netinet/ip_input.c:1307
  1307   m_copydata(m, 0, mcopy->m_len, mtod(mcopy, caddr_t));
  (kgdb) l
  1302   mcopy = NULL;
  1303   }
  1304   if (mcopy != NULL) {
  1305   mcopy->m_len = min(ip->ip_len, M_TRAILINGSPACE(mcopy));
  1306   mcopy->m_pkthdr.len = mcopy->m_len;
  1307   m_copydata(m, 0, mcopy->m_len, mtod(mcopy, caddr_t));
  1308   }
  1309   
  1310   #ifdef IPSTEALTH
  1311   if (!ipstealth) {
  (kgdb) p *m
  $1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc271e80e
  E\020, mh_len = 164, mh_flags = 3, mh_type = 1, pad = \000}, M_dat
  =
  {MH = {MH_pkthdr = {rcvif = 0xc20a4800, header = 0x0, len = 164,
  csum_flags = 3072,
  csum_data = 65535, tso_segsz = 0, ether_vtag = 0, tags =
  {slh_first = 0xc35bc380}}, MH_dat = {MH_ext = {ext_buf = 0xc271e800 ,
  ext_free = 0, ext_args = 0x0, ext_size = 2048, ref_cnt = 0xc2703ab4,
  ext_type = 6},
  MH_databuf =
  \000?q?\000\000\000\000\000\000\000\000\000\b\000\000?:p?
  \006\000\000\000dL?\t+?\202\200\020
  O/\207\000\000\001\001\b\n-?b\230qms?\000\000\004\001?l?
  \000\000\001%r???
  \200\000\034?Ot?\b?{sr\000\034org.jboss.mq.ConnectionToken?\b߼?
 
 
  \237N\002\000\005I\000\004hashZ\000\asameJVML\000\bclientIDt\000\022Ljava/l\000\220\032Ae\207\000\002?
  3...@\210d\021\000\001?
  \001b\000!e\000\...@bv\000\000@2\032$W\213\n\034...}},
 
  M_databuf =
  \000H\n?\000\000\000\000?\000\000\000\000\f\000\000??
  \000\000\000\000\000\000\200?[?\000?q?
  \000\000\000\000\000\000\000\000\000\b\000\000?:p?\006\000\000\000dL?
  \t+?
  \202\200\020
  O/\207\000\000\001\001\b\n-?b\230qms?\000\000\004\001?l?
  \000\000\001%r???
  \200\000\034?Ot?\b?{sr\000\034org.jboss.mq.ConnectionToken?\b߼?
 
 
  \237N\002\000\005I\000\004hashZ\000\asameJVML\000\bclientIDt\000\022Ljava/l\000\220\032Ae\207\000\002?
  3...}}
 
  Ok, can you do 'p *m_copy'?
 
 
  What ever you want :-)
 
  (kgdb) p *m_copy
  No symbol m_copy in current context.
  (kgdb) p *m_copydata
  $2 = {void (const struct mbuf *, int, int, caddr_t)} 0xc0572e10 <m_copydata>
  (kgdb) p *mcopy
  $1 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc23cce34
  E\020, mh_len = 204, mh_flags = 2, mh_type = 1, pad = \000}, M_dat =
  {MH = {MH_pkthdr = {rcvif = 0xc20a4800, header = 0x0,
 len = 204, csum_flags = 3072, csum_data = 65535, tso_segsz = 
  0,
  ether_vtag = 0, tags = {slh_first = 0xc23c3e00}}, MH_dat = {MH_ext =
  {ext_buf = 0x84001045Address 0x84001045 out of bounds,
 
  Hmm, ok.  Can you do 'p *ip'?  mcopy->m_len (204) is larger than m->m_len
  (164).  That shouldn't be the case unless ip->ip_len is somehow larger
  than m->m_len.
 
 
  (kgdb) p *ip
  $3 = {ip_hl = 5, ip_v = 4, ip_tos = 16 '\020', ip_len = 33792, ip_id =
  61492, ip_off = 64, ip_ttl = 64 '@', ip_p = 6 '\006', ip_sum = 34849,
  ip_src = {s_addr = 355576000}, ip_dst = {
s_addr = 2251401408}}
 
  Looks like ip_len is in network byte order instead of host byte order and
  that is causing the problem.  33792 == 0x8400.  Swapping that gives 0x84 ==
  132, which would be a reasonable length.  Are you using any firewall rules
  that would rewrite packets?  I wonder if you are having a packet rewritten
  and the new IP header is written in network byte order, but we swap the IP
  header len field to host byte order earlier in ip_input().  Luigi Rizzo may
  have some insight into this.
 
 
 Well, looking at MH_databuf I see JBoss MQ traffic, which would mean
 that this traffic was coming from or going to an IPsec tunnel. I could
 say for sure if I knew how to get an IP address from
 something like ip_src = {s_addr = 355576000}.

Something like this should show you the IP:

(gdb) set $i = 355576000
(gdb) printf "%d.%d.%d.%d\n", $i >> 24, $i >> 16 & 0xff, $i >> 8 & 0xff, $i & 0xff
21.49.168.192

In this case I probably printed it backwards, so it is probably
192.168.49.21.
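
For anyone who wants to repeat the two conversions from this thread outside of gdb, here is a small sh sketch. The helper names s_addr_to_ip and swap16 are made up for illustration, and the octet order assumes a little-endian host such as the i386 box in the trace.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any FreeBSD tool.

# Print a 32-bit s_addr value (as gdb shows it on a little-endian
# host) as a dotted quad: the first octet is the lowest byte.
s_addr_to_ip() {
    i=$1
    echo "$(( i & 255 )).$(( (i >> 8) & 255 )).$(( (i >> 16) & 255 )).$(( (i >> 24) & 255 ))"
}

# Swap the two bytes of a 16-bit value (network <-> host byte order
# on a little-endian machine).
swap16() {
    v=$1
    echo $(( ((v & 255) << 8) | ((v >> 8) & 255) ))
}

s_addr_to_ip 355576000     # ip_src from 'p *ip'
s_addr_to_ip 2251401408    # ip_dst from 'p *ip'
swap16 33792               # ip_len from 'p *ip'
```

With the values from the dump this prints 192.168.49.21, 192.168.49.134 and 132.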

 If it really is IPsec traffic then there are no rewrite rules, only 10 pf
 pass rules on the enc0 interface and a "scrub in all" rule.
 
 Perhaps it matters that i have these set:
 
 net.enc.out.ipsec_bpf_mask=0x0001
 net.enc.out.ipsec_filter_mask=0x0001
 net.enc.in.ipsec_bpf_mask=0x0002
 net.enc.in.ipsec_filter_mask=0x0002
 
 so that i can filter the encapsulated traffic.

I have no idea, I've cc'd mlaier@ (pf) and bz@ (ipsec) to see if they have
any ideas.

-- 
John Baldwin

Re: make buildkernel failing on zfs

2010-01-22 Thread John Baldwin
On Friday 22 January 2010 6:17:08 am Jeremy Chadwick wrote:
 On Fri, Jan 22, 2010 at 09:45:46AM +, N.J. Mann wrote:
  In message 4b596838.9020...@southportcomputers.co.uk,
  Colin (free...@southportcomputers.co.uk) wrote:
  
  [snip]
   It goes for a while and then the buildkernel fails with this:
   
   cc -O2 -fno-strict-aliasing -pipe -DFREEBSD_NAMECACHE -DBUILDING_ZFS  
   -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc  
   -I/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris 
   -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/commo
   n/fs/zfs 
   -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/zmod 
   -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common 
   -I/usr/src/sys/modules/zfs/../.. 
   -I/usr/src/sys/modules/zfs/../../cddl/contrib/openso
   laris/common/zfs 
   -I/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/common 
   -I/usr/src/sys/modules/zfs/../../../include -DHAVE_KERNEL_OPTION_HEADERS 
   -include /usr/obj/usr/src/sys/TED/opt_global.h -I. -I@ -I@/contrib/altq 
   -finline-l
   imit=8000 --param inline-unit-growth=100 --param 
   large-function-growth=1000 -fno-common -g -I/usr/obj/usr/src/sys/TED 
   -mno-align-long-strings -mpreferred-stack-boundary=2  -mno-mmx 
   -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -
   Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  
   -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef 
   -Wno-pointer-sign -fformat-extensions -Wno-unknown-pragmas 
   -Wno-missing-prototypes -Wno-undef -Wno-strict-pro
   totypes -Wno-cast-qual -Wno-parentheses -Wno-redundant-decls 
   -Wno-missing-braces -Wno-uninitialized -Wno-unused -Wno-inline 
   -Wno-switch -Wno-pointer-arith -c 
   /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_z
   node.c
   *** Error code 1
  
  I think this was a temporary problem that has already been fixed.  Try
  updating to the latest version and see if that builds okay.
 
 The FreeBSD tinderbox build system noticed this problem as well, dying
 in the same piece of code.  It's not you.
 
  738 01/21 17:04  FreeBSD Tinderbox   (9.0K) [releng_7 tinderbox] failure 
 on amd64/amd64
  739 01/21 17:44  FreeBSD Tinderbox   (8.7K) [releng_7 tinderbox] failure 
 on i386/i386
 
 Normally I'd shake my finger at the committer for committing code to
 stable branches without testing, but I hold the committer (jhb) in very
 high regards and he has a very established history of not breaking
 things + doing excellent work.  Mistakes happen, we're all human.  :-)

Kind words aside, in this case the testing wasn't quite adequate.  While I
did run-test it under UFS, I only built a GENERIC kernel w/o modules which is
how I missed ZFS.  Given that the patch touched ZFS I really should have
built the module.

-- 
John Baldwin


Re: FreeBSD NFS client/Linux NFS server issue

2010-01-22 Thread Mikolaj Golub
On Tue, 19 Jan 2010 10:02:57 +0200 Mikolaj Golub wrote:

 So, on some of our FreeBSD 7.1 NFS clients (and it looks like we have had a
 similar case with 6.3), which have several NFS mounts to the same CentOS 5.3
 NFS server (mount options: rw,-3,-T,-s,-i,-r=32768,-w=32768,-o=noinet6), at
 some moment the access to one of the NFS mounts gets stuck, while the access
 to the other mounts works ok.

 In all cases we have observed so far, the first process to get stuck was a
 php script (or two) that was (were) writing to a log file (appending). In
 tcpdump we see that every write to the file causes the following sequence of
 RPCs: ACCESS - READ - WRITE - COMMIT. At some moment this stops after the
 READ rpc call and its successful reply.

 After this in tcpdump successful readdir/access/lookup/fstat calls are
 observed from our other utilities, which just check the presence of some files
 and they work ok (df also works). The php process at this state is in bo_wwait
 invalidating buffer cache [1].

 If at this time we try accessing the share with mc then it hangs acquiring the
 vn_lock held by php process [2] and after this any operations with this NFS
 share hang (df hangs too).

 If instead some other process is started that writes to some other file on
 this share (append) then the first process unfreezes too (starting from the
 WRITE rpc, so there are no retransmits).

So it looks to me like the problem here is that the affected nfsmount
eventually ends up in this state:

(kgdb) p *nmp
$1 = {nm_mtx = {lock_object = {lo_name = 0xc0b808ee NFSmount lock, 
  lo_type = 0xc0b808ee NFSmount lock, lo_flags = 16973824, 
lo_witness_data = {lod_list = {
  stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 
0}, nm_flag = 35399, 
  nm_state = 1310720, nm_mountp = 0xc6b472cc, nm_numgrps = 16, 
  nm_fh = \001\000\000\000\000\223\000\000\...@\003\n, '\0' repeats 115 
times, nm_fhsize = 12, 
  nm_rpcclnt = {rc_flag = 0, rc_wsize = 0, rc_rsize = 0, rc_name = 0x0, rc_so = 
0x0, rc_sotype = 0, 
rc_soproto = 0, rc_soflags = 0, rc_timeo = 0, rc_retry = 0, rc_srtt = {0, 
0, 0, 0}, rc_sdrtt = {0, 
  0, 0, 0}, rc_sent = 0, rc_cwnd = 0, rc_timeouts = 0, rc_deadthresh = 0, 
rc_authtype = 0, 
rc_auth = 0x0, rc_prog = 0x0, rc_proctlen = 0, rc_proct = 0x0}, nm_so = 
0xc6e81d00, nm_sotype = 1, 
  nm_soproto = 0, nm_soflags = 44, nm_nam = 0xc6948640, nm_timeo = 6000, 
nm_retry = 2, nm_srtt = {15, 
15, 31, 52}, nm_sdrtt = {3, 3, 15, 15}, nm_sent = 0, nm_cwnd = 4096, 
nm_timeouts = 0, 
  nm_deadthresh = 9, nm_rsize = 32768, nm_wsize = 32768, nm_readdirsize = 4096, 
nm_readahead = 1, 
  nm_wcommitsize = 1177026, nm_acdirmin = 30, nm_acdirmax = 60, nm_acregmin = 
3, nm_acregmax = 60, 
  nm_verf = JК╬W\000\004oМ, nm_bufq = {tqh_first = 0xda82dc70, tqh_last = 
0xda8058e0}, 
  nm_bufqlen = 2, nm_bufqwant = 0, nm_bufqiods = 1, nm_maxfilesize = 
1099511627775, 
  nm_rpcops = 0xc0c2b5bc, nm_tprintf_initial_delay = 12, nm_tprintf_delay = 30, 
nm_nfstcpstate = {
rpcresid = 0, flags = 1, sock_send_inprog = 0}, 
  nm_hostname = 172.30.10.92\000/var/www/app31, '\0' repeats 60 times, 
nm_clientid = 0, nm_fsid = {
val = {0, 0}}, nm_lease_time = 0, nm_last_renewal = 0}

We have a nonempty nm_bufq and nm_bufqiods = 1, but actually there is no nfsiod
thread running for this mount, which is wrong -- nm_bufq will not be emptied
until some other process starts writing to the nfsmount and starts an nfsiod
thread for this mount.

Reviewing the code to see how this could happen, I see the following path.
Could someone confirm or refute it?

in nfs_bio.c:nfs_asyncio() we have:

    mtx_lock(&nfs_iod_mtx);
...
    /*
     * Find a free iod to process this request.
     */
    for (iod = 0; iod < nfs_numasync; iod++)
        if (nfs_iodwant[iod]) {
            gotiod = TRUE;
            break;
        }

    /*
     * Try to create one if none are free.
     */
    if (!gotiod) {
        iod = nfs_nfsiodnew();
        if (iod != -1)
            gotiod = TRUE;
    }

Let's consider the situation when a new nfsiod is created.

nfs_nfsiod.c:nfs_nfsiodnew() unlocks nfs_iod_mtx before creating the
nfssvc_iod thread:

    mtx_unlock(&nfs_iod_mtx);
    error = kthread_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID,
        0, "nfsiod %d", newiod);
    mtx_lock(&nfs_iod_mtx);


And nfs_nfsiod.c:nfssvc_iod() does the following:

    mtx_lock(&nfs_iod_mtx);
...
    nfs_iodwant[myiod] = curthread->td_proc;
    nfs_iodmount[myiod] = NULL;
...
    error = msleep(&nfs_iodwant[myiod], &nfs_iod_mtx, PWAIT | PCATCH,
        "-", timo);

Let's at this moment another nfs_asyncio() 

Re: Multiple serial consoles via null modem cable

2010-01-22 Thread Marin Atanasov
On Fri, Jan 22, 2010 at 10:02 AM, N.J. Mann n...@njm.me.uk wrote:

 In message 717f7a3e1001210137p7884adcbxc66a4f7fff928...@mail.gmail.com,
 Marin Atanasov (dna...@gmail.com) wrote:
  Hello Jeremy,
 
  Now I'm a little confused :)
 
  I've made some tests with my machines and a couple of null modem cables,
 and
  here's what I've got.
 
  On Wed, Jan 20, 2010 at 9:46 AM, Jeremy Chadwick
  free...@jdc.parodius.comwrote:
 
   On Wed, Jan 20, 2010 at 08:46:48AM +0200, Marin Atanasov wrote:
Hello,
   
Using `cu' only works with COM1 for me.
   
Currently I have two serial ports on the system, and only the first
 is
   able
to make the connection - the serial consoles are enabled in /etc/tty,
 but
   as
I said only COM1 is able to make the connection.
  
   I'm a little confused by this statement, so I'll add some clarify:
  
   /etc/ttys is for configuring a machine to tie getty (think login
 prompt)
   to a device (in this case, a serial port).  Meaning: the device on the
   other end of the serial cable will start seeing login: and so on
   assuming you attach to the serial port there.
  
   For example:
  
   box1 COM1/ttyu0 is wired to box2 COM3/ttyu2 using a null modem cable.
   box1 COM2/ttyu1 is wired to box2 COM4/ttyu3 using a null modem cable.
  
   On box1, you'd have something like this in /etc/ttys:
  
   ttyu0   /usr/libexec/getty std.9600   vt100  on secure
   ttyu1   /usr/libexec/getty std.9600   vt100  on secure
  
 
  Here's what I did:
 
  box1 COM1/ttyd0 - box2 COM1/ttyd0 - using null modem cable
  box1 COM2/ttyd1 - box3 COM1/ttyd0 - using null modem cable
 
  On box1 I have this in /etc/ttys:
 
  ttyd0   /usr/libexec/getty std.9600   vt100   on secure
  ttyd1   /usr/libexec/getty std.9600   vt100   on secure
 
  Now if I want to connect to box1 from box2 or box3 through the serial
  connection it should work, right?
  But I only can connect to box1 from box2, because box2's COM port is
  connected to box1's COM1 port.
 
  From box2 I can get a login prompt
  box2# cu -l /dev/cuad0 -s 9600
  Connected
 
  login:
  )
  (host.domain) (ttyd0)
 
  login: ~
  [EOT]
 
  But if I try to connect to box1 from box3 - no success there.
  box3# cu -l /dev/cuad0 -s 9600
  Connected
  ~
  [EOT]

 You need to reduce the number of unknowns, e.g. where is the problem:
 box1, box3 or in between.  So, swap the cables on box1 so that you now
 have box1:COM1 - box3:COM1 and box1:COM2 - box2:COM1.  Now repeat the
 tests above and post your results.


 Cheers,
   Nick.
 --


Seems I've found the issue I'm having: a broken null modem cable :(

The last time I was using that cable it was working fine. And now that I
connected a second one to the machine, it seemed that only the one connected
to COM1 was actually working, and I was left with the impression from the
documentation that only COM1 is able to do a serial console connection.

I'm very sorry to bother you like that. I'll continue setting up the servers
once I get a new null modem cable.

Thanks and regards,
Marin

-- 
Marin Atanasov Nikolov
dnaeon AT gmail DOT com
daemon AT unix-heaven DOT org


8.0-RELEASE - -STABLE and size of /

2010-01-22 Thread Oliver Brandmueller
Hi,

I just noticed something: I set up an 8.0-RELEASE amd64 box, where / is 512M by
default. The first step after setup was to csup to RELENG_8 and run buildkernel
and buildworld (no custom kernel, no make.conf).

Installing the new kernel failed, since /boot/kernel/ is already well
over 230 MBytes in size. Moving that to kernel.old and writing a new one
of about the same size fails with "no space left on device".

This is not a question; I do know how to get around this and how to
configure custom kernels so they are a fraction of that size afterwards.
However, I think this is a clear POLA violation. So the options would be
either a GENERIC with less debugging information (symbols and stuff), which
makes debugging harder, or a larger default for /, unless anyone else has
better ideas.


- Oliver



-- 
| Oliver Brandmueller  http://sysadm.in/ o...@sysadm.in |
|Ich bin das Internet. Sowahr ich Gott helfe. |


Re: 8.0-RELEASE - -STABLE and size of /

2010-01-22 Thread Marian Hettwer
Hi All,

On Fri, 22 Jan 2010 17:21:56 +0100, Oliver Brandmueller o...@e-gitt.net
wrote:
 Hi,
 
 I just noticed something: I set up an 8.0-RELEASE amd64 box, where / is 512M by
 default. The first step after setup was to csup to RELENG_8 and run buildkernel
 and buildworld (no custom kernel, no make.conf).
 
 Installing the new kernel failed, since /boot/kernel/ is already well
 over 230 MBytes in size. Moving that to kernel.old and writing a new one
 of about the same size fails with "no space left on device".
 
 This is not a question; I do know how to get around this and how to
 configure custom kernels so they are a fraction of that size afterwards.
 However, I think this is a clear POLA violation. So the options would be
 either a GENERIC with less debugging information (symbols and stuff), which
 makes debugging harder, or a larger default for /, unless anyone else has
 better ideas.

+1 vote for making / bigger. 
At least a size where a make installkernel runs through. 

I like FreeBSD because it honors POLA.
And as Oliver stated, this is a clear POLA violation.

Cheers,
Marian



Re: 8.0-RELEASE - -STABLE and size of /

2010-01-22 Thread Torfinn Ingolfsen
On Fri, 22 Jan 2010 17:21:56 +0100
Oliver Brandmueller o...@e-gitt.net wrote:

 Installing the new kernel failed, since /boot/kernel/ is already well
 over 230 MBytes in size. Moving that to kernel.old and writing a new one
 of about the same size fails with "no space left on device".
 
 This is not a question; I do know how to get around this and how to
 configure custom kernels so they are a fraction of that size afterwards.

It would also be nice if we knew how to configure the
whole make world procedure[1] to make a new kernel and modules without 
symbols. 
The FAQ doesn't seem to have that answer either.


References:
1) http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html
-- 
Regards,
Torfinn Ingolfsen



Re: Pack of CAM improvements

2010-01-22 Thread Freddie Cash
On Fri, Jan 22, 2010 at 2:23 AM, Harald Schmalzbauer 
h.schmalzba...@omnilan.de wrote:

 Alexander Motin schrieb am 19.01.2010 17:12 (localtime):
 ...

  Patch can be found here:
 http://people.freebsd.org/~mav/cam-ata.20100119.patch

 Feedback as always welcome.


 Again, thanks a lot for your ongoing great work!
 The patch doesn't cleanly apply with vpo, but I don't use vpo so I didn't
 care.
 Otherwise I couldn't find any problems.
 The system detects reinserted SATA drives on ICH9 fine.

 This was tested on a zfs backup server which went to the backbone
 yesterday, so I can't physically remove any devices any more for testing...

 But I had some questions about zfs raidz states. I think that isn't a
 matter of atacam, but when I removed one disk, zpool status still showed me
 the ada3 device online.
 After reinserting (and proper detection/initialisation with cam, ada3 was
 present again) and zpool clean, it set the device as UNAVAIL due to I/O
 errors.
 I couldn't get the device into the pool again, no matter what I tried.
 Only rebooting the machine helped. Then I could clean and scrub.

 What are the needed steps to provide a reinserted hard disk to geom? With
 the latest patches I don't need to issue any reset/rescan command, right?
 So it's a zfs problem, right? My mistake in understanding?

In my testing of pulling drives at random (using a 3Ware 9550SXU or 9650SE
controller), you have to run "zpool offline <pool> <device>" while the drive
is unplugged, before you can re-insert the same disk or a different disk.
Without doing that step, it's very hard to re-insert the same disk, or
replace it with a new one, without rebooting.

Took me a couple of reboots and drive replacements before I figured that one
out.  :)

-- 
Freddie Cash
fjwc...@gmail.com


Re: device.hints isn't setting what I want

2010-01-22 Thread Freddie Cash
Just as a side note:  does mergemaster or installworld handle the
installation of /boot/device.hints?

If it's mergemaster, then everything is fine, it'll detect your changes.
If it's installworld, you'll lose your changes at the next update.

Either way, I find it nicer/simpler to use /boot/loader.conf for this, as
nothing in the build/update process touches it, and it keeps all boot/kernel
options in one file.  And, it overrides any settings in device.hints.

-- 
Freddie Cash
fjwc...@gmail.com


Re: 8.0-RELEASE - -STABLE and size of /

2010-01-22 Thread Jeremy Chadwick
On Fri, Jan 22, 2010 at 05:27:52PM +0100, Marian Hettwer wrote:
 Hi All,
 
 On Fri, 22 Jan 2010 17:21:56 +0100, Oliver Brandmueller o...@e-gitt.net
 wrote:
  Hi,
  
  I just noticed something: I set up an 8.0-RELEASE amd64 box, where / is 512M
  by default. The first step after setup was to csup to RELENG_8 and run
  buildkernel and buildworld (no custom kernel, no make.conf).
  
  Installing the new kernel failed, since /boot/kernel/ is already well
  over 230 MBytes in size. Moving that to kernel.old and writing a new one
  of about the same size fails with "no space left on device".
  
  This is not a question; I do know how to get around this and how to
  configure custom kernels so they are a fraction of that size afterwards.
  However, I think this is a clear POLA violation. So the options would be
  either a GENERIC with less debugging information (symbols and stuff), which
  makes debugging harder, or a larger default for /, unless anyone else has
  better ideas.
 
 +1 vote for making / bigger. 
 At least a size where a make installkernel runs through. 
 
 I like FreeBSD because it honors POLA.
 And as Oliver stated, this is a clear POLA violation.

I'd like to see the root filesystem size default to 1GB.  For
most folks this works well.  If people are paranoid, 2GB should be more
than sufficient.


While I'm here, I figure I'd share how I end up partitioning most of the
server systems I maintain.  I use this general formula when building a
new system, unless it's a 4-disk box (see bottom of mail):

ad4s1a = /    = UFS2    = 1GB
ad4s1b = swap = (2*RAM) or (2*MaxRAMPossible)
ad4s1d = /var = UFS2+SU = 16GB  (mandatory: must be >= 2*RAM)
ad4s1e = /tmp = UFS2+SU = (2*RAM)
ad4s1f = /usr = UFS2+SU = 16GB

There's lots of leftover space on the disk of course -- for either
ad4s1g or ad4s2. 
 
For 1-disk boxes, I add ad8s1g = /home = UFS2+SU = remaining space,
or sometimes name it /storage (depends on the role of the box).

For 2-disk boxes, I almost always go with disks that are identical in
size, use the above formula, and add ZFS mirroring as so:

ad4s2  = ZFS mirror pool  = remaining space
ad6    = ZFS mirror pool  = entire disk

Then /home or /storage are ZFS filesystems in that pool.  Folks will say
"but that means you're losing/wasting gigs of space on ad6, since the
mirror size is based on the smallest pool member!"  Yep, but I consider
the trade off easily worth it.  Given the size of disks today (500GB to
2TB), I really don't stress about it:

Wasted space for 4GB RAM systems: 1 + 2*4 + 16 + 2*4 + 16 = 49GB
Wasted space for 8GB RAM systems: 1 + 2*8 + 16 + 2*8 + 16 = 65GB
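
The arithmetic behind those two figures is easy to redo for other RAM sizes. A minimal sh sketch, assuming the fixed sizes from the layout above; the function name layout_overhead_gb is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: GB taken by the UFS slice in the layout above,
# given the amount of RAM in GB.
# / (1) + swap (2*RAM) + /var (16) + /tmp (2*RAM) + /usr (16)
layout_overhead_gb() {
    ram=$1
    echo $(( 1 + 2 * ram + 16 + 2 * ram + 16 ))
}

for size in 4 8; do
    echo "${size}GB RAM: $(layout_overhead_gb "$size")GB"
done
```

For 4GB and 8GB of RAM this gives 49GB and 65GB, matching the numbers above.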

If the machine is 4-disk, I use a slightly modified formula:

ad4s1a = /      = UFS2    = 1GB
ad4s1b = swap   = (2*RAM) or (2*MaxRAMPossible)
ad4s1d = /var   = UFS2+SU = 16GB  (mandatory: must be >= 2*RAM)
ad4s1e = /tmp   = UFS2+SU = (2*RAM)
ad4s1f = /usr   = UFS2+SU = 16GB
ad4s1g = /spare = UFS2+SU = remaining space
ad6    = ZFS raidz1 pool  = entire disk
ad8    = ZFS raidz1 pool  = entire disk
ad10   = ZFS raidz1 pool  = entire disk

The ad4s1g part might seem silly, but I've found it useful.  If a
filesystem like /var goes awry (usually if bad blocks exist on the disk
where that filesystem lies), you can temporarily work around it by
rsync'ing as much data over to /spare, then remount /spare as /var to
avoid use of the sectors involved in ad4s1d.  I've had to do this on two
separate occasions.

There are network backups for all the boxes, so I don't OCD about it all
too much.  :-)

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: IPSec NAT-T in transport mode

2010-01-22 Thread David Murray

Hi Yvan,

On 10-01-22 Fri 1:19 pm, VANHULLEBUS Yvan wrote:


On Thu, Jan 21, 2010 at 04:36:12PM +, David Murray wrote:


On 2010-01-20 Wed 1:22 pm, Crest wrote:


Yes the NAT-T Patch has been integrated into FreeBSD 8.0.


Are we saying that the NAT-T patch is there, but is missing checksum 
re-calculation, so MPD's packets are going to be discarded?


Yes, see my other mail in this thread.


(FWIW, this seems to be what happens. All the negotiation to set up 
IPSEC SAs happens, but MPD's log never shows a single entry. I hadn't 
got as far as packet dumps when this thread popped up.)


And if you have a look at system stats, you'll see lots of UDP packets 
dropped because of invalid checksums


Thanks for taking the time to reply.

Actually, I find that each attempt to connect causes netstat -s -p udp 
to show a few UDP packets arriving and being dropped due to no socket, 
rather than bad checksums, so maybe I've got some other sort of problem 
with my mpd config, which I'll look into.
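
To watch just the two counters discussed here between connection attempts, a small filter can help. This is only a sketch: the function name udp_drop_counters is made up, and the matched phrases are an assumption about how netstat words these counters on the system in question.

```shell
#!/bin/sh
# Hypothetical filter: keep only the UDP counters relevant to this
# thread from `netstat -s -p udp` output read on stdin.
# The phrases matched are assumed wording; adjust them if the local
# netstat output differs.
udp_drop_counters() {
    grep -E 'bad checksum|no socket'
}

# Usage (on the FreeBSD box), before and after a connection attempt:
#   netstat -s -p udp | udp_drop_counters
```

Comparing the two runs shows which of the counters actually moves.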



--
David Murray





Re: Pack of CAM improvements

2010-01-22 Thread Jeremy Chadwick
On Fri, Jan 22, 2010 at 11:23:55AM +0100, Harald Schmalzbauer wrote:
 But I had some questions about zfs raidz states. I think that isn't
 a matter of atacam, but when I removed one disk, zpool status still
 showed me the ada3 device online.
 After reinserting (and proper detection/initialisation with cam,
 ada3 was present again) and zpool clean, it set the device as
 UNAVAIL due to I/O errors.
 I couldn't get the device into the pool again, no matter what I tried.
 Only rebooting the machine helped. Then I could clean and scrub.
 
 What are the needed steps to provide a reinserted hard disk to
 geom? With the latest patches I don't need to issue any reset/rescan
 command, right?
 So it's a zfs problem, right? My mistake in understanding?

I can't speak with regards to the new ATA-via-CAM stuff, but with
the classic AHCI (meaning ataahci(4)), the procedure I've used
reliably for quite some time on Intel ICHx controllers is this:

For SATA disks that are purely UFS/UFS2:

- Single-user mode might be required here; it varies
- Terminate any processes which rely on filesystems on that disk
- umount /filesystem
- atacontrol detach ataX (where X = channel associated with disk)
- Physically remove bad disk
- Physically insert new disk
- Wait 15 seconds for stuff to settle
- atacontrol attach ataX (where X = previous channel detached)
- sade / sysinstall / gpart / whatever you like
- Restore data...  :-)

For SATA disks part of a ZFS mirror or raidz[123] pool:

- zpool offline pool disk
- atacontrol detach ataX (where X = channel associated with disk)
- Physically remove bad disk
- Physically insert new disk
- Wait 15 seconds for stuff to settle
- atacontrol attach ataX (where X = previous channel detached)
- zpool replace pool disk
- zpool online pool disk
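
The ZFS procedure above can be sketched as a dry-run script that only prints the commands in order, so the sequence can be reviewed before anything is typed on a live box. The function name zfs_disk_swap_plan and the example names tank, ad6 and ata3 are placeholders, not taken from any real system.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for swapping a disk in a ZFS
# mirror/raidz pool in the order listed above. It executes nothing.
# Arguments: pool name, disk name, ATA channel (all placeholders).
zfs_disk_swap_plan() {
    pool=$1 disk=$2 channel=$3
    echo "zpool offline $pool $disk"
    echo "atacontrol detach $channel"
    echo "# physically swap the disk, then wait ~15 seconds"
    echo "atacontrol attach $channel"
    echo "zpool replace $pool $disk"
    echo "zpool online $pool $disk"
}

zfs_disk_swap_plan tank ad6 ata3
```

Printing instead of executing is deliberate: the physical swap in the middle cannot be scripted, and a wrong channel number in "atacontrol detach" would take the wrong disk offline.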

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available

2010-01-22 Thread Max Laier
On Friday 22 January 2010 15:20:13 John Baldwin wrote:
 On Friday 22 January 2010 3:08:45 am Florian Smeets wrote:
...
  If it really is IPsec traffic then there are no rewrite rules only 10 pf
  pass rules on the enc0 interface and a scrub in all rule.
 
  Perhaps it matters that i have these set:
 
  net.enc.out.ipsec_bpf_mask=0x0001
  net.enc.out.ipsec_filter_mask=0x0001
  net.enc.in.ipsec_bpf_mask=0x0002
  net.enc.in.ipsec_filter_mask=0x0002
 
  so that i can filter the encapsulated traffic.
 
 I have no idea, I've cc'd mlaier@ (pf) and bz@ (ipsec) to see if they have
 any ideas.

pf could be the culprit if it were present in the trace, but I don't see any 
sign of it:

On Thursday 21 January 2010 11:10:20 Florian Smeets wrote:
 #7  0xc0572e48 in m_copydata (m=0x0, off=0, len=40, cp=0xc23cced8
 \203??b??\237\f)h?M\220\224?\023?\205K(e??s?\???k?oQ?~\223\020g\030)
  at /usr/src/sys/kern/uipc_mbuf.c:815
 #8  0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at
 /usr/src/sys/netinet/ip_input.c:1307
 #9  0xc05fa30c in ip_input (m=0xc23dc900) at
 /usr/src/sys/netinet/ip_input.c:609
 #10 0xc05c83d5 in netisr_dispatch (num=2, m=0xc23dc900) at
 /usr/src/sys/net/netisr.c:185
 #11 0xc05bf581 in ether_demux (ifp=0xc20a4800, m=0xc23dc900) at
 /usr/src/sys/net/if_ethersubr.c:834
 #12 0xc05bf973 in ether_input (ifp=0xc20a4800, m=0xc23dc900) at
 /usr/src/sys/net/if_ethersubr.c:692
 #13 0xc04b8749 in sis_rxeof (sc=0xc2093800) at
 /usr/src/sys/dev/sis/if_sis.c:1476
 #14 0xc04b8973 in sis_intr (arg=0xc2093800) at
 /usr/src/sys/dev/sis/if_sis.c:1667
 #15 0xc050344b in ithread_loop (arg=0xc20ab410) at
 /usr/src/sys/kern/kern_intr.c:1126
 #16 0xc04ffe36 in fork_exit (callout=0xc05032a0 <ithread_loop>,
 arg=0xc20ab410, frame=0xc1f15d38) at /usr/src/sys/kern/kern_fork.c:811
 #17 0xc06d9180 in fork_trampoline () at
 /usr/src/sys/i386/i386/exception.s:271

pf does change the byte order in the pfil hook, but changes it back on return 
to the stack either when returning from the hook or when calling back into the 
stack.  There have been some issues where we missed returns to the stack that 
would result in this situation, but since pf is not in the trace, this is 
clearly not the case here.
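
To put a number on the byte-order point, here is a minimal sketch in C. It is not pf code: swap16() and the two len_after_*() helpers are names made up for illustration, and the unconditional swap stands in for what htons()/ntohs() do on a little-endian machine such as i386. A 40-byte length that is flipped on the way into a hook and not flipped back comes out as 10240.

```c
#include <stdint.h>

/* Unconditional 16-bit byte swap; on little-endian hosts this is what
 * htons() and ntohs() amount to. */
static uint16_t
swap16(uint16_t v)
{
        return (uint16_t)((v << 8) | (v >> 8));
}

/* Hook flips ip_len on entry and flips it back on every return path. */
static uint16_t
len_after_balanced_hook(uint16_t ip_len)
{
        ip_len = swap16(ip_len);        /* entering the hook */
        ip_len = swap16(ip_len);        /* returning to the stack */
        return ip_len;
}

/* Hypothetical bug: one return path forgets the second flip. */
static uint16_t
len_after_missed_flip(uint16_t ip_len)
{
        ip_len = swap16(ip_len);        /* entering the hook */
        return ip_len;                  /* returned without flipping back */
}
```

Anything downstream that trusts the length field would then try to copy far more data than the packet holds.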

It might indeed be related to enc(4).  I remember there have been some issues 
in IPSEC where it failed to properly copy a packet before modifying it.  Maybe 
this is what is happening.  Details escape me at the moment.

Can you also make sure that your if_enc.c has revision 174978:
http://svn.freebsd.org/viewvc/base/release/7.2.0/sys/net/if_enc.c?view=diff&r1=174977&r2=174978

Regards,
--
  Max


Re: do I want ch0 or pass1?

2010-01-22 Thread Dan Nelson
In the last episode (Jan 21), Dan Langille said:
 Please CC me on replies.
 
 I'm running into issues with hard-coding some devices (see recent post
 titled 'device.hints isn't setting what I want').
 
 Associated with this issue is confusion over whether I want to use ch0
 or pass1.  I have these devices:
 
 <DEC TL800 (C) DEC 0326>  at scbus1 target 0 lun 0 (ch0,pass1)
 <DEC TZ89 (C) DEC 1837>   at scbus1 target 5 lun 0 (sa1,pass2)
 
 My understanding: chio(1) will work with ch0, whereas mtx(1) will work with
 pass1.  Is this correct?  More information/elaboration will help I'm sure.
 
 Why do I ask? I can get the tape changer and tape drive hardwired to ch0
 and sa1 respectively.  I cannot [yet] do the same with pass1.

You can try wiring them down the same way you wire down regular devices, but
if they're created sequentially in probe order, that won't work.

Ideally, mtx should use cam_open_spec_device() which, when given a device
name, will automatically open the matching pass device.

-- 
Dan Nelson
dnel...@allantgroup.com


posting coding bounties, appropriate money amounts?

2010-01-22 Thread Dan Naumov
Hello

I am curious about posting some coding bounties. My current interest
revolves around improving the ZVOL functionality in FreeBSD: fixing
the known ZVOL swap reliability/stability problems as well as making
ZVOLs work as a dumpon device (as is already the case in OpenSolaris)
for crash dumps. I am a private individual, not some huge Fortune
100 company, and while I am not exactly rich, I am willing to put some
of my personal money towards this. What would be the best way to
approach this: directly approaching committer(s) with the
know-how-and-why of the areas involved, or going through the FreeBSD
Foundation? And how would one go about calculating the appropriate
amount of money for such a thing?

Thanks.

- Sincerely,
Dan Naumov


Re: IPSec NAT-T in transport mode

2010-01-22 Thread David Murray

Hi Yvan,

On 10-01-22 Fri 5:15 pm, David Murray wrote:


On 10-01-22 Fri 1:19 pm, VANHULLEBUS Yvan wrote:


On Thu, Jan 21, 2010 at 04:36:12PM +, David Murray wrote:


On 2010-01-20 Wed 1:22 pm, Crest wrote:


Yes the NAT-T Patch has been integrated into FreeBSD 8.0.


Are we saying that the NAT-T patch is there, but is missing checksum 
re-calculation, so MPD's packets are going to be discarded?


Yes, see my other mail in this thread.


(FWIW, this seems to be what happens. All the negotiation to set up 
IPSEC SAs happens, but MPD's log never shows a single entry. I 
hadn't got as far as packet dumps when this thread popped up.)


And if you have a look at system stats, you'll see lots of UDP 
packets dropped because of invalid checksums


Actually, I find that each attempt to connect causes netstat -s -p udp 
to show a few UDP packets arriving and being dropped due to no socket, 
rather than bad checksums, so maybe I've got some other sort of 
problem with my mpd config, which I'll look into.


Ah, yes, I'd forgotten that my external IP address had changed since I 
last tried this, so I needed to restart racoon and ipsec.


So now, like you say, I see UDP packets dropped due to bad checksums.

I'll have a look at the NAT-T RFCs just in case support for NAT-OA
payloads is something I could help with, but I suspect it'll need an
in-depth knowledge of the IP stack.
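
For anyone following along, a rough sketch in C of why the checksums go bad. The UDP checksum covers a pseudo-header that includes the IP addresses, so a datagram checksummed with the sender's private address no longer verifies once NAT has rewritten that address. This is plain RFC 1071 arithmetic, not code from the NAT-T patch; the function names and the addresses in the usage note are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the big-endian 16-bit words of data[] to a running sum. */
static uint32_t
sum16(const uint8_t *data, size_t len, uint32_t acc)
{
        size_t i;

        for (i = 0; i + 1 < len; i += 2)
                acc += ((uint32_t)data[i] << 8) | data[i + 1];
        if (len & 1)                    /* odd trailing byte, zero padded */
                acc += (uint32_t)data[len - 1] << 8;
        return acc;
}

/* Fold the carries back in and return the one's complement. */
static uint16_t
fold(uint32_t acc)
{
        while (acc >> 16)
                acc = (acc & 0xffff) + (acc >> 16);
        return (uint16_t)~acc;
}

/* Sum of the IPv4 pseudo-header: src, dst, zero, protocol 17, UDP length. */
static uint32_t
pseudo_sum(uint32_t src, uint32_t dst, size_t udp_len)
{
        uint8_t ph[12] = {
                (uint8_t)(src >> 24), (uint8_t)(src >> 16),
                (uint8_t)(src >> 8), (uint8_t)src,
                (uint8_t)(dst >> 24), (uint8_t)(dst >> 16),
                (uint8_t)(dst >> 8), (uint8_t)dst,
                0, 17, (uint8_t)(udp_len >> 8), (uint8_t)udp_len
        };

        return sum16(ph, sizeof(ph), 0);
}

/* Sender side: udp[] is header plus payload with the checksum field zero. */
static uint16_t
udp_checksum(uint32_t src, uint32_t dst, const uint8_t *udp, size_t len)
{
        return fold(sum16(udp, len, pseudo_sum(src, dst, len)));
}

/* Receiver side: with the checksum filled in, the folded result is 0. */
static int
udp_checksum_ok(uint32_t src, uint32_t dst, const uint8_t *udp, size_t len)
{
        return fold(sum16(udp, len, pseudo_sum(src, dst, len))) == 0;
}
```

Checksumming a datagram with source 192.168.1.10 and then verifying it with source 203.0.113.5 (same destination, same bytes) fails. As I understand it, that mismatch is what the NAT-OA payload is meant to let the receiver correct.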


Thanks!


--
David Murray




Re: Pack of CAM improvements

2010-01-22 Thread Steve Polyack

On 01/22/10 11:48, Freddie Cash wrote:

On Fri, Jan 22, 2010 at 2:23 AM, Harald Schmalzbauer
h.schmalzba...@omnilan.de  wrote:

   

Alexander Motin wrote on 19.01.2010 17:12 (localtime):
...

Patch can be found here:
http://people.freebsd.org/~mav/cam-ata.20100119.patch

Feedback as always welcome.

   

Again, thanks a lot for your ongoing great work!
The patch doesn't cleanly apply with vpo, but I don't use vpo so I didn't
care.
Otherwise I couldn't find any problems.
The system detects reinserted SATA drives on ICH9 fine.

This was tested on a zfs backup server which went to the backbone
yesterday, so I can't physically remove any devices any more for testing...

But I had some questions about zfs raidz states. I think that isn't a
matter of atacam but if I removed one disk, zpool status still showed me the
ada3 device online.
After reinserting (and proper detection/initialisation with cam, ada3 was
present again) and zpool clear, it set the device as UNAVAIL due to I/O
errors.
I couldn't get the device into the pool again, no matter what I tried.
Only rebooting the machine helped. Then I could clear and scrub.

What are the needed steps to provide a reinserted hard disk to geom? With
the latest patches I don't need to issue any reset/rescan command, right?
So it's a zfs problem, right? My mistake in understanding?

In my testing of pulling drives at random (using a 3Ware 9550SXU or 9650SE
controller), you have to zpool offline <pool> <device> while the drive is
unplugged, before you can re-insert the same disk or a different disk.
Without doing that step, it's very hard to re-insert the same disk, or
replace it with a new one, without rebooting.

Took me a couple of reboots and drive replacements before I figured that one
out.  :)

   
I think you can do it without the 'zpool offline pool device'
command.  I may be wrong, but I believe you can use 'zpool replace' to
accomplish what you're trying to do, i.e. if you have a bad drive ada3,
take it out, and replace it with a new disk, you can issue a 'zpool
replace pool /dev/ada3 /dev/ada3' (yes, the same device is specified
twice). ZFS should recognize that it's a different disk and/or that it is
lacking ZFS metadata and begin to resilver the pool onto the new
device.  If you watch 'zpool status' in the process you'll see something like:


        raidz1             DEGRADED     0     0     0
          label/ada4       ONLINE       0     0     0  12.4M resilvered
          label/ada5       ONLINE       0     0     0  12.4M resilvered
          label/ada6       ONLINE       0     0     0  12.3M resilvered
          replacing        DEGRADED     0     0     0
            label/ada3/old UNAVAIL      0   595     0  cannot open
            label/ada3     ONLINE       0     0     0  9.74G resilvered

Try it out and let me know if it works for you.





Re: do I want ch0 or pass1?

2010-01-22 Thread Robert
On Thu, 21 Jan 2010 20:23:43 -0500
Dan Langille d...@langille.org wrote:

 Please CC me on replies.
 
 I'm running into issues with hard-coding some devices (see recent post
 titled 'device.hints isn't setting what I want').
 
 Associated with this issue is confusion over whether I want to use ch0
 or pass1.  I have these devices:
 
 <DEC TL800 (C) DEC 0326>  at scbus1 target 0 lun 0 (ch0,pass1)
 <DEC TZ89 (C) DEC 1837>   at scbus1 target 5 lun 0 (sa1,pass2)
 
 My understanding: chio(1) will work with ch0, whereas mtx(1) will work with
 pass1.  Is this correct?  More information/elaboration will help I'm
 sure.
 
 Why do I ask? I can get the tape changer and tape drive hardwired to
 ch0 and sa1 respectively.  I cannot [yet] do the same with pass1.
 
 Thanks folks.
Hi Dan

You might take a look at this thread. It looks like what you want to do.

http://lists.freebsd.org/pipermail/freebsd-questions/2007-April/146738.html

HTH

Robert


Re: Pack of CAM improvements

2010-01-22 Thread Freddie Cash
On Fri, Jan 22, 2010 at 10:28 AM, Steve Polyack kor...@comcast.net wrote:

 On 01/22/10 11:48, Freddie Cash wrote:

 In my testing of pulling drives at random (using a 3Ware 9550SXU or 9650SE
 controller), you have to zpool offline <pool> <device> while the drive is
 unplugged, before you can re-insert the same disk or a different disk.
 Without doing that step, it's very hard to re-insert the same disk, or
 replace it with a new one, without rebooting.

 Took me a couple of reboots and drive replacements before I figured that
 one out.  :)


 I think you can do it without the 'zpool offline pool device' command.
 I may be wrong, but I believe you can use 'zpool replace' to accomplish
 what you're trying to do, i.e. if you have a bad drive ada3, take it
 out, and replace it with a new disk, you can issue a 'zpool replace pool
 /dev/ada3 /dev/ada3' (yes, the same device is specified twice). ZFS should
 recognize that it's a different disk and/or that it is lacking ZFS metadata
 and begin to resilver the pool onto the new device.  If you watch 'zpool
 status' in the process you'll see something like:

Yes, that does work ... but it's not nearly as reliable as doing the
offline first.

If you do things in the right order, drives can be replaced and resilvering
started within minutes (our process takes a little less than 5 minutes, but
the bulk of that is removing the dead drive from the caddy, and adding the
new drive to the caddy).

Do things in the wrong order, and it can take 15 minutes or more, and may
require rebooting the system (as our manager discovered trying to replace a
drive while I was away).  :)

Just because there are shortcuts available ... doesn't mean you should
always take them.  :D

-- 
Freddie Cash
fjwc...@gmail.com


Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available

2010-01-22 Thread John Baldwin
On Friday 22 January 2010 12:18:20 pm Max Laier wrote:
 On Friday 22 January 2010 15:20:13 John Baldwin wrote:
  On Friday 22 January 2010 3:08:45 am Florian Smeets wrote:
 ...
   If it really is IPsec traffic then there are no rewrite rules only 10 pf
   pass rules on the enc0 interface and a scrub in all rule.
  
   Perhaps it matters that i have these set:
  
   net.enc.out.ipsec_bpf_mask=0x0001
   net.enc.out.ipsec_filter_mask=0x0001
   net.enc.in.ipsec_bpf_mask=0x0002
   net.enc.in.ipsec_filter_mask=0x0002
  
   so that i can filter the encapsulated traffic.
  
  I have no idea, I've cc'd mlaier@ (pf) and bz@ (ipsec) to see if they have
  any ideas.
 
 pf could be the culprit if it were present in the trace, but I don't see any 
 sign of it:
 
 On Thursday 21 January 2010 11:10:20 Florian Smeets wrote:
  #7  0xc0572e48 in m_copydata (m=0x0, off=0, len=40, cp=0xc23cced8
  \203??b??\237\f)h?M\220\224?\023?\205K(e??s?\???k?oQ?~\223\020g\030)
   at /usr/src/sys/kern/uipc_mbuf.c:815
  #8  0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at
  /usr/src/sys/netinet/ip_input.c:1307
  #9  0xc05fa30c in ip_input (m=0xc23dc900) at
  /usr/src/sys/netinet/ip_input.c:609
  #10 0xc05c83d5 in netisr_dispatch (num=2, m=0xc23dc900) at
  /usr/src/sys/net/netisr.c:185
  #11 0xc05bf581 in ether_demux (ifp=0xc20a4800, m=0xc23dc900) at
  /usr/src/sys/net/if_ethersubr.c:834
  #12 0xc05bf973 in ether_input (ifp=0xc20a4800, m=0xc23dc900) at
  /usr/src/sys/net/if_ethersubr.c:692
  #13 0xc04b8749 in sis_rxeof (sc=0xc2093800) at
  /usr/src/sys/dev/sis/if_sis.c:1476
  #14 0xc04b8973 in sis_intr (arg=0xc2093800) at
  /usr/src/sys/dev/sis/if_sis.c:1667
  #15 0xc050344b in ithread_loop (arg=0xc20ab410) at
  /usr/src/sys/kern/kern_intr.c:1126
  #16 0xc04ffe36 in fork_exit (callout=0xc05032a0 <ithread_loop>,
  arg=0xc20ab410, frame=0xc1f15d38) at /usr/src/sys/kern/kern_fork.c:811
  #17 0xc06d9180 in fork_trampoline () at
  /usr/src/sys/i386/i386/exception.s:271
 
 pf does change the byte order in the pfil hook, but changes it back on return
 to the stack either when returning from the hook or when calling back into
 the stack.  There have been some issues where we missed returns to the stack
 that would result in this situation, but since pf is not in the trace, this
 is clearly not the case here.

That isn't necessarily the case.  ip_input() invokes the PFIL hooks which
then return after possibly modifying the packet.  The (possibly modified)
packet is then passed to ip_forward() from ip_input().  If the PFIL hook
modified the packet and returned ip_len in network byte order then it would
cause this breakage without showing up in the stack trace.
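
A toy model of that call sequence, in C, in case it helps. None of these names are the kernel's: run_input(), forward() and the two hooks are stand-ins I made up, and the "short buffer" verdict only approximates the way ip_forward() fell over. The point is just that the hook has already returned by the time the damage is noticed.

```c
#include <stddef.h>
#include <stdint.h>

struct packet {
        uint16_t ip_len;        /* length claimed by the header */
        uint16_t actual_len;    /* bytes really present in the buffer */
};

typedef void (*hook_fn)(struct packet *);

enum verdict { FORWARD_OK, FORWARD_SHORT_BUFFER };

/* Stand-in for ip_forward(): trusts ip_len and trips if it overruns. */
static enum verdict
forward(const struct packet *p)
{
        return (p->ip_len <= p->actual_len) ? FORWARD_OK : FORWARD_SHORT_BUFFER;
}

/* Stand-in for ip_input(): every hook has returned before forward() runs,
 * so a backtrace taken inside forward() shows no hook frame. */
static enum verdict
run_input(struct packet *p, const hook_fn *hooks, size_t nhooks)
{
        size_t i;

        for (i = 0; i < nhooks; i++)
                hooks[i](p);
        return forward(p);
}

/* A hook that leaves the packet alone. */
static void
clean_hook(struct packet *p)
{
        (void)p;
}

/* A hypothetical hook that hands ip_len back byte-swapped. */
static void
leaky_hook(struct packet *p)
{
        p->ip_len = (uint16_t)((p->ip_len << 8) | (p->ip_len >> 8));
}
```

With leaky_hook installed, a 40-byte packet reaches forward() claiming 10240 bytes, and forward() is the only frame that would show up in a trace.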

-- 
John Baldwin


Re: FreeBSD NFS client/Linux NFS server issue

2010-01-22 Thread Rick Macklem



On Fri, 22 Jan 2010, Mikolaj Golub wrote:



We have a nonempty nm_bufq and nm_bufqiods = 1, but there is actually no nfsiod
thread running for this mount, which is wrong -- nm_bufq will not be emptied
until some other process starts writing to the nfsmount and starts an nfsiod
thread for this mount.

Reviewing the code to see how this could happen, I see the following path.
Could someone confirm or refute this?

in nfs_bio.c:nfs_asyncio() we have:

  1363         mtx_lock(&nfs_iod_mtx);
...
  1374         /*
  1375          * Find a free iod to process this request.
  1376          */
  1377         for (iod = 0; iod < nfs_numasync; iod++)
  1378                 if (nfs_iodwant[iod]) {
  1379                         gotiod = TRUE;
  1380                         break;
  1381                 }
  1382
  1383         /*
  1384          * Try to create one if none are free.
  1385          */
  1386         if (!gotiod) {
  1387                 iod = nfs_nfsiodnew();
  1388                 if (iod != -1)
  1389                         gotiod = TRUE;
  1390         }

Let's consider the situation when a new nfsiod is created.

nfs_nfsiod.c:nfs_nfsiodnew() unlocks nfs_iod_mtx before creating the
nfssvc_iod thread:

   179         mtx_unlock(&nfs_iod_mtx);
   180         error = kthread_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID,
   181             0, "nfsiod %d", newiod);
   182         mtx_lock(&nfs_iod_mtx);


And nfs_nfsiod.c:nfssvc_iod() does the following:

   226         mtx_lock(&nfs_iod_mtx);
...
   238                 nfs_iodwant[myiod] = curthread->td_proc;
   239                 nfs_iodmount[myiod] = NULL;
...
   244                 error = msleep(&nfs_iodwant[myiod], &nfs_iod_mtx, PWAIT | PCATCH,
   245                     "-", timo);

Suppose that at this moment another nfs_asyncio() request arrives for a
different nfsmount and that thread locks nfs_iod_mtx. It will then find
nfs_iodwant[iod] set in the for loop and will use this iod. When the first
thread finally returns from nfs_nfsiodnew() it will insert its buffer into
nmp->nm_bufq, but the nfsiod will be processing the other nmp.
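
The interleaving can be replayed in a few lines of single-threaded C. This is a simplified model, not the kernel code: struct iod, struct mount and every function name here are invented, and the model assumes the iod has already picked up the second mount by the time the creating thread returns.

```c
#include <stddef.h>

struct mount {
        int bufq_len;   /* buffers queued for this mount (nm_bufq) */
        int bufqiods;   /* iods believed to be serving it (nm_bufqiods) */
};

struct iod {
        int idle;               /* non-zero once the iod waits for work */
        struct mount *serving;  /* mount the iod is working for */
};

/* Step 2: the new iod thread starts and advertises itself as idle. */
static void
iod_start(struct iod *d)
{
        d->idle = 1;
        d->serving = NULL;
}

/* Step 3: a thread scanning for a free iod takes any idle one. */
static int
claim_idle(struct iod *d, struct mount *m)
{
        if (!d->idle)
                return 0;
        d->idle = 0;
        d->serving = m;
        m->bufqiods++;
        return 1;
}

/* Step 4: the creator assumes the iod it created is still its own. */
static void
creator_queues(struct mount *m)
{
        m->bufqiods++;
        m->bufq_len++;
}

/* Replays the interleaving; returns the mount the iod ends up serving. */
static struct mount *
replay_race(struct mount *x, struct mount *y)
{
        struct iod d = { 0, NULL };     /* step 1: X's thread created it */

        iod_start(&d);
        claim_idle(&d, y);
        creator_queues(x);
        return d.serving;
}
```

After the replay, mount X has one queued buffer and a bufqiods count of 1, yet the only iod is serving mount Y, which matches the state observed above.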



Ok, good catch, I think you've found the problem (or at least a race
that might have caused it).


It looks like the fix for this situation would be to check nfs_iodwant[iod]
after nfs_nfsiodnew():

--- nfs_bio.c.orig  2010-01-22 15:38:02.0 +
+++ nfs_bio.c   2010-01-22 15:39:58.0 +
@@ -1385,7 +1385,7 @@ again:
         */
        if (!gotiod) {
                iod = nfs_nfsiodnew();
-               if (iod != -1)
+               if ((iod != -1) && (nfs_iodwant[iod] == NULL))
                        gotiod = TRUE;
        }



Unfortunately, I don't think the above fixes the problem.
If another thread that called nfs_asyncio() has stolen this iod,
it will have set nfs_iodwant[iod] == NULL (it was set non-NULL at #238)
and it will remain NULL until the other thread is done with it.

If you instead make it:
        if (iod != -1 && nfs_iodwant[iod] != NULL)
                gotiod = TRUE;
then I think it fixes your scenario above, but will break for the
case where the mtx_lock(&nfs_iod_mtx) call in nfs_nfsiodnew() (#182) wins
out over the one near the beginning of nfssvc_iod() (#226), since in that
case, nfs_iodwant[iod] will still be NULL because it hasn't yet been
set by nfssvc_iod() (#238).
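
Spelled out as a tiny truth table in C (the scenario names and helper functions are mine, purely for illustration): a NULL/non-NULL test sees the same value for a stolen iod and for one that has not run yet, so neither form of the check can tell them apart.

```c
enum scenario {
        IOD_STOLEN,             /* another thread took it: nfs_iodwant[iod] is NULL */
        IOD_NOT_YET_RUN,        /* nfssvc_iod() has not reached #238: still NULL */
        IOD_WAITING             /* #238 has run and nobody took it: non-NULL */
};

/* Whether nfs_iodwant[iod] is non-NULL in each scenario. */
static int
iodwant_set(enum scenario s)
{
        return s == IOD_WAITING;
}

/* First proposed test: accept the iod when nfs_iodwant[iod] == NULL. */
static int
accept_if_null(enum scenario s)
{
        return !iodwant_set(s);
}

/* Second proposed test: accept the iod when nfs_iodwant[iod] != NULL. */
static int
accept_if_nonnull(enum scenario s)
{
        return iodwant_set(s);
}
```

accept_if_null() wrongly accepts the stolen iod, accept_if_nonnull() wrongly rejects the one that has not started, and both return the same answer for those two cases.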

There should probably be some sort of 3 way handshake between
the code in nfs_asyncio() after calling nfs_nfsiodnew() and the
code near the beginning of nfssvc_iod(), but I think the following
somewhat cheesy fix might do the trick:

        if (!gotiod) {
                iod = nfs_nfsiodnew();
                if (iod != -1) {
                        if (nfs_iodwant[iod] == NULL) {
                                /*
                                 * Either another thread has acquired this
                                 * iod or I acquired the nfs_iod_mtx mutex
                                 * before the new iod thread did in
                                 * nfssvc_iod(). To be safe, go back and
                                 * try again after allowing another thread
                                 * to acquire the nfs_iod_mtx mutex.
                                 */
                                mtx_unlock(&nfs_iod_mtx);
                                /*
                                 * So long as mtx_lock() implements some
                                 * sort of fairness, nfssvc_iod() should
                                 * get nfs_iod_mtx here and set
                                 * nfs_iodwant[iod] != NULL for the case
                                 * where the iod has not been stolen by
                                 * another thread for a different mount
                                 * point.
                                 */
                                mtx_lock(&nfs_iod_mtx);
                                goto again;
                        }
                        gotiod = TRUE;
                }
        }

Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available

2010-01-22 Thread Max Laier
On Friday 22 January 2010 19:49:19 John Baldwin wrote:
 On Friday 22 January 2010 12:18:20 pm Max Laier wrote:
  On Friday 22 January 2010 15:20:13 John Baldwin wrote:
   On Friday 22 January 2010 3:08:45 am Florian Smeets wrote:
 
  ...
 
If it really is IPsec traffic then there are no rewrite rules only 10
pf pass rules on the enc0 interface and a scrub in all rule.
   
Perhaps it matters that i have these set:
   
net.enc.out.ipsec_bpf_mask=0x0001
net.enc.out.ipsec_filter_mask=0x0001
net.enc.in.ipsec_bpf_mask=0x0002
net.enc.in.ipsec_filter_mask=0x0002
   
so that i can filter the encapsulated traffic.
  
   I have no idea, I've cc'd mlaier@ (pf) and bz@ (ipsec) to see if they
   have any ideas.
 
  pf could be the culprit if it were present in the trace, but I don't see
  any sign of it:
 
  On Thursday 21 January 2010 11:10:20 Florian Smeets wrote:
   #7  0xc0572e48 in m_copydata (m=0x0, off=0, len=40, cp=0xc23cced8
   \203??b??\237\f)h?M\220\224?\023?\205K(e??s?\???k?oQ?~\223\020g\030)
at /usr/src/sys/kern/uipc_mbuf.c:815
   #8  0xc05f8b28 in ip_forward (m=0xc23dc900, srcrt=0) at
   /usr/src/sys/netinet/ip_input.c:1307
   #9  0xc05fa30c in ip_input (m=0xc23dc900) at
   /usr/src/sys/netinet/ip_input.c:609
   #10 0xc05c83d5 in netisr_dispatch (num=2, m=0xc23dc900) at
   /usr/src/sys/net/netisr.c:185
   #11 0xc05bf581 in ether_demux (ifp=0xc20a4800, m=0xc23dc900) at
   /usr/src/sys/net/if_ethersubr.c:834
   #12 0xc05bf973 in ether_input (ifp=0xc20a4800, m=0xc23dc900) at
   /usr/src/sys/net/if_ethersubr.c:692
   #13 0xc04b8749 in sis_rxeof (sc=0xc2093800) at
   /usr/src/sys/dev/sis/if_sis.c:1476
   #14 0xc04b8973 in sis_intr (arg=0xc2093800) at
   /usr/src/sys/dev/sis/if_sis.c:1667
   #15 0xc050344b in ithread_loop (arg=0xc20ab410) at
   /usr/src/sys/kern/kern_intr.c:1126
   #16 0xc04ffe36 in fork_exit (callout=0xc05032a0 <ithread_loop>,
   arg=0xc20ab410, frame=0xc1f15d38) at /usr/src/sys/kern/kern_fork.c:811
   #17 0xc06d9180 in fork_trampoline () at
   /usr/src/sys/i386/i386/exception.s:271
 
  pf does change the byte order in the pfil hook, but changes it back on
  return to the stack either when returning from the hook or when calling
  back into the stack.  There have been some issues where we missed returns
  to the stack that would result in this situation, but since pf is not in
  the trace, this is clearly not the case here.
 
 That isn't necessarily the case.  ip_input() invokes the PFIL hooks which
 then return after possibly modifying the packet.  The (possibly modified)
 packet is then passed to ip_forward() from ip_input().  If the PFIL hook
 modified the packet and returned ip_len in network byte order then it would
 cause this breakage without showing up in the stack trace.

What I meant to say was: if we return from the pfil hook we either report 
error (and/or consume the mbuf) or switch back to network byte order:

http://fxr.watson.org/fxr/source/contrib/pf/net/pf_ioctl.c?v=FREEBSD72#L3655

While I can't completely rule out that there is a double flip happening in 
some obscure path through pf, I very much doubt this is what is going on (or 
there would be more reports and it would happen straight away, not only after 
passing some data).  A quick search through the sources also didn't turn up 
any red flags.  All byte order operations inside pf are either temporary or
performed on a properly copied packet that is sent back through the stack
(icmp error, tcp packet, ...).

Depending on how easily this can be reproduced, my money is on modifying a 
shared mbuf (possibly inside enc(4)).
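
To illustrate what I mean by modifying a shared buffer, here is a minimal sketch in C. It is not mbuf code: struct shared_buf and writable() are hypothetical, and the reference count stands in for whatever marks an mbuf as shared. The idea is that anyone who wants to change the data takes a private copy first if another holder exists.

```c
#include <stdlib.h>
#include <string.h>

struct shared_buf {
        unsigned char bytes[64];
        size_t len;     /* valid bytes, at most sizeof(bytes) */
        int refcnt;     /* how many holders point at this buffer */
};

/*
 * Return a buffer the caller may modify.  If anyone else still holds a
 * reference, hand back a private copy and drop our reference to the
 * original; otherwise the original itself is safe to write.
 * Returns NULL if the copy cannot be allocated.
 */
static struct shared_buf *
writable(struct shared_buf *b)
{
        struct shared_buf *copy;

        if (b->refcnt <= 1)
                return b;
        copy = malloc(sizeof(*copy));
        if (copy == NULL)
                return NULL;
        memcpy(copy->bytes, b->bytes, b->len);
        copy->len = b->len;
        copy->refcnt = 1;
        b->refcnt--;
        return copy;
}
```

Skipping the writable() step and flipping a header field in place would change the bytes under the other holder as well, which is the kind of corruption that only shows up later and somewhere else.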

Regards,
--
  Max


Re: FreeBSD NFS client/Linux NFS server issue

2010-01-22 Thread Mikolaj Golub
On Fri, 22 Jan 2010 14:37:48 -0500 (EST) Rick Macklem wrote:

 --- nfs_bio.c.orig  2010-01-22 15:38:02.0 +
 +++ nfs_bio.c   2010-01-22 15:39:58.0 +
 @@ -1385,7 +1385,7 @@ again:
          */
         if (!gotiod) {
                 iod = nfs_nfsiodnew();
 -               if (iod != -1)
 +               if ((iod != -1) && (nfs_iodwant[iod] == NULL))
                         gotiod = TRUE;
         }


 Unfortunately, I don't think the above fixes the problem.
 If another thread that called nfs_asyncio() has stolen this iod,
 it will have set nfs_iodwant[iod] == NULL (it was set non-NULL at #238)
 and it will remain NULL until the other thread is done with it.

I see. I have missed this. Thanks.


 There should probably be some sort of 3 way handshake between
 the code in nfs_asyncio() after calling nfs_nfsiodnew() and the
 code near the beginning of nfssvc_iod(), but I think the following
 somewhat cheesy fix might do the trick:

         if (!gotiod) {
                 iod = nfs_nfsiodnew();
                 if (iod != -1) {
                         if (nfs_iodwant[iod] == NULL) {
                                 /*
                                  * Either another thread has acquired this
                                  * iod or I acquired the nfs_iod_mtx mutex
                                  * before the new iod thread did in
                                  * nfssvc_iod(). To be safe, go back and
                                  * try again after allowing another thread
                                  * to acquire the nfs_iod_mtx mutex.
                                  */
                                 mtx_unlock(&nfs_iod_mtx);
                                 /*
                                  * So long as mtx_lock() implements some
                                  * sort of fairness, nfssvc_iod() should
                                  * get nfs_iod_mtx here and set
                                  * nfs_iodwant[iod] != NULL for the case
                                  * where the iod has not been stolen by
                                  * another thread for a different mount
                                  * point.
                                  */
                                 mtx_lock(&nfs_iod_mtx);
                                 goto again;
                         }
                         gotiod = TRUE;
                 }
         }

 Does anyone else have a better solution?
 (Mikolaj, could you by any chance test this? You can test yours, but I
 think it breaks.)

Unfortunately we have observed this only on our production servers. A week ago
we changed the configuration as a workaround: we reconfigured cron not to run
the scripts simultaneously, and added cron jobs that periodically write a
line to a file on the nfs share (to unlock it if it is locked). We have not
observed problems since then and we would not like to experiment in
production. If I manage to produce a good test case in a test environment I
will be able to test the patch, but I am not sure...

-- 
Mikolaj Golub


Re: 8.0-RELEASE - -STABLE and size of /

2010-01-22 Thread Steven Friedrich
On Friday 22 January 2010 11:46:01 am Torfinn Ingolfsen wrote:
 On Fri, 22 Jan 2010 17:21:56 +0100
 
 Oliver Brandmueller o...@e-gitt.net wrote:
  Installing the new kernel failed, since /boot/kernel/ is already well
  over 230 MBytes in size. moving that to kernel.old and writing a new one
  with about the same size fails due to no space left on device.
 
  This is not a question; I do know how to get around this and how to
  configure custom kernels so they are a fragment of that size afterwards.
 
 It would also be nice if we knew how to configure the
 whole make world procedure[1] to make a new kernel and modules without
  symbols. The FAQ doesn't seem to have that answer either.
 
 
 References:
 1) http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html
 
In your kernel configuration file (makeoptions is kernel config syntax, so
it would not normally be in /etc/make.conf), do you have a line like:
makeoptions DEBUG=-g

If so, comment it out.


Re: Pack of CAM improvements

2010-01-22 Thread Olivier Smedts
2010/1/22 Harald Schmalzbauer h.schmalzba...@omnilan.de:
 Alexander Motin wrote on 19.01.2010 17:12 (localtime):
 ...

 Patch can be found here:
 http://people.freebsd.org/~mav/cam-ata.20100119.patch

 Feedback as always welcome.

 Again, thanks a lot for your ongoing great work!
 The patch doesn't cleanly apply with vpo, but I don't use vpo so I didn't
 care.

Since r202799 it applies cleanly to 8-STABLE.

 Otherwise I couldn't find any problems.
 The system detects reinserted SATA drives on ICH9 fine.

 This was tested on a zfs backup server which went to the backbone yesterday,
 so I can't physically remove any devices any more for testing...

 But I had some questions about zfs raidz states. I think that isn't a matter
 of atacam but if I removed one disk, zpool status still showed me the ada3
 device online.
 After reinserting (and proper detection/initialisation with cam, ada3 was
 present again) and zpool clear, it set the device as UNAVAIL due to I/O
 errors.
 I couldn't get the device into the pool again, no matter what I tried.
 Only rebooting the machine helped. Then I could clear and scrub.

 What are the needed steps to provide a reinserted hard disk to geom? With
 the latest patches I don't need to issue any reset/rescan command, right?
 So it's a zfs problem, right? My mistake in understanding?

 Thanks,

 -Harry





-- 
Olivier Smedts _
ASCII ribbon campaign ( )
e-mail: oliv...@gid0.org- against HTML email  vCards  X
www: http://www.gid0.org- against proprietary attachments / \

  There are only 10 kinds of people in the world:
  those who understand binary,
  and those who don't.


Re: Multiple serial consoles via null modem cable

2010-01-22 Thread Nicolas Rachinsky
* Jeremy Chadwick free...@jdc.parodius.com [2010-01-19 23:46 -0800]:
 You cannot do something like where box1 COM1 is wired to box2 COM1, and
 depending on what box you're on doing the cu -l ttyu0 from, get a
 login prompt on the other.  It doesn't work like that.  :-)

Isn't the reason for different dial-in and dial-out devices that this
should work? Or does that only work with modem?

http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/serial.html#ACCESS-SERIAL-PORTS

Nicolas
-- 
http://www.rachinsky.de/nicolas


Re: Multiple serial consoles via null modem cable

2010-01-22 Thread Michael Loftis



--On Friday, January 22, 2010 10:05 PM +0100 Nicolas Rachinsky 
fbsd-stabl...@ml.turing-complete.org wrote:



* Jeremy Chadwick free...@jdc.parodius.com [2010-01-19 23:46 -0800]:

You cannot do something like where box1 COM1 is wired to box2 COM1, and
depending on what box you're on doing the cu -l ttyu0 from, get a
login prompt on the other.  It doesn't work like that.  :-)


Isn't the reason for different dial-in and dial-out devices that this
should work? Or does that only work with modem?


You can't with two directly connected machines.  When the two are
physically wired together and getty is configured (via ttys) to fire up on
the port, it takes over the port.  If you connect two machines via a null
modem cable, both with getty on the same port, the gettys will be chatting
with each other.  The locking mechanism will break the chat loop when you
try to use the dialout device on one end or the other, but you may have to
wait some time before the other end restarts getty (because it previously
would have been dying very rapidly due to login failures).


A null modem connection is ALWAYS active.  A regular modem is NOT.  It has
a state of 'inactive' or 'waiting for ring', if you will.


The correct way to do what you want is as others have suggested, two serial 
null modem cables, and two com ports on each machine.




Re: 7.2-STABLE page fault with kernel from 12.01.2010 / crashinfo available

2010-01-22 Thread Florian Smeets

On 1/22/10 6:18 PM, Max Laier wrote:

pf does change the byte order in the pfil hook, but changes it back on return
to the stack either when returning from the hook or when calling back into the
stack.  There have been some issues where we missed returns to the stack that
would result in this situation, but since pf is not in the trace, this is
clearly not the case here.

It might indeed be related to enc(4).  I remember there have been some issues
in IPSEC where it failed to properly copy a packet before modifying it.  Maybe
this is what is happening.  Details escape me at the moment.

Can you also make sure that your if_enc.c has revision 174978:
http://svn.freebsd.org/viewvc/base/release/7.2.0/sys/net/if_enc.c?view=diff&r1=174977&r2=174978


Yes, I have the latest if_enc.c available on stable/7:

cvs 1.6.2.3, svn r183630

World and kernel were compiled from sources csuped on 12.01.2010.

Thanks,
Florian


Re: FreeBSD NFS client/Linux NFS server issue

2010-01-22 Thread Rick Macklem



On Fri, 22 Jan 2010, Rick Macklem wrote:



There should probably be some sort of 3 way handshake between
the code in nfs_asyncio() after calling nfs_nfsnewiod() and the
code near the beginning of nfssvc_iod(), but I think the following
somewhat cheesy fix might do the trick:


[stuff deleted]
I know it's a little weird to reply to my own posting, but I think
this might be a reasonable patch (I have only tested it for a few
minutes at this point).

I basically redefined nfs_iodwant[] as a tri-state variable (although
it was a struct proc *, it was only tested NULL/non-NULL).
0 - was NULL
1 - was non-NULL
-1 - just created by nfs_asyncio() and will be used by it

I'll keep testing it, but hopefully someone else can test and/or
review it... rick
ps: Mikolaj, I'm a sysadmin so I understand the problems with
production systems, but if you do get a chance to test it somehow,
that would be great.
pss: This is against -current, but hopefully stable/7 can be patched
about the same.

--- patch for nfsiod race against -current ---
--- nfsclient/nfs.h.sav 2010-01-22 16:21:53.0 -0500
+++ nfsclient/nfs.h 2010-01-22 16:22:04.0 -0500
@@ -252,7 +252,7 @@
 int	nfs_commit(struct vnode *vp, u_quad_t offset, int cnt,
struct ucred *cred, struct thread *td);
 int	nfs_readdirrpc(struct vnode *, struct uio *, struct ucred *);
-int	nfs_nfsiodnew(void);
+int	nfs_nfsiodnew(int);
 int	nfs_asyncio(struct nfsmount *, struct buf *, struct ucred *, struct thread *);
 int	nfs_doio(struct vnode *, struct buf *, struct ucred *, struct thread *);
 void   nfs_doio_directwrite (struct buf *);
--- nfsclient/nfsnode.h.sav 2010-01-22 14:56:34.0 -0500
+++ nfsclient/nfsnode.h 2010-01-22 14:56:52.0 -0500
@@ -180,7 +180,7 @@
  * Queue head for nfsiod's
  */
 extern TAILQ_HEAD(nfs_bufq, buf) nfs_bufq;
-extern struct proc *nfs_iodwant[NFS_MAXASYNCDAEMON];
+extern int nfs_iodwant[NFS_MAXASYNCDAEMON];
 extern struct nfsmount *nfs_iodmount[NFS_MAXASYNCDAEMON];

 #if defined(_KERNEL)
--- nfsclient/nfs_bio.c.sav 2010-01-22 14:57:28.0 -0500
+++ nfsclient/nfs_bio.c 2010-01-22 16:17:24.0 -0500
@@ -1377,7 +1377,7 @@
 * Find a free iod to process this request.
 */
for (iod = 0; iod < nfs_numasync; iod++)
-   if (nfs_iodwant[iod]) {
+   if (nfs_iodwant[iod] > 0) {
gotiod = TRUE;
break;
}
@@ -1386,7 +1386,7 @@
 * Try to create one if none are free.
 */
if (!gotiod) {
-   iod = nfs_nfsiodnew();
+   iod = nfs_nfsiodnew(1);
if (iod != -1)
gotiod = TRUE;
}
@@ -1398,7 +1398,7 @@
 */
NFS_DPF(ASYNCIO, ("nfs_asyncio: waking iod %d for mount %p\n",
iod, nmp));
-   nfs_iodwant[iod] = NULL;
+   nfs_iodwant[iod] = 0;
nfs_iodmount[iod] = nmp;
nmp->nm_bufqiods++;
wakeup(&nfs_iodwant[iod]);
--- nfsclient/nfs_nfsiod.c.sav  2010-01-22 14:57:28.0 -0500
+++ nfsclient/nfs_nfsiod.c  2010-01-22 16:32:31.0 -0500
@@ -113,7 +113,7 @@
 * than the new minimum, create some more.
 */
for (i = nfs_iodmin - nfs_numasync; i > 0; i--)
-   nfs_nfsiodnew();
+   nfs_nfsiodnew(0);
 out:
mtx_unlock(&nfs_iod_mtx);
return (0);
@@ -147,7 +147,7 @@
 */
iod = nfs_numasync - 1;
for (i = 0; i < nfs_numasync - nfs_iodmax; i++) {
-   if (nfs_iodwant[iod])
+   if (nfs_iodwant[iod] > 0)
wakeup(&nfs_iodwant[iod]);
iod--;
}
@@ -160,7 +160,7 @@
 "Max number of nfsiod kthreads");

 int
-nfs_nfsiodnew(void)
+nfs_nfsiodnew(int set_iodwant)
 {
int error, i;
int newiod;
@@ -176,12 +176,17 @@
}
if (newiod == -1)
return (-1);
+   if (set_iodwant > 0)
+   nfs_iodwant[i] = -1;
mtx_unlock(&nfs_iod_mtx);
error = kproc_create(nfssvc_iod, nfs_asyncdaemon + i, NULL, RFHIGHPID,
0, "nfsiod %d", newiod);
mtx_lock(&nfs_iod_mtx);
-   if (error)
+   if (error) {
+   if (set_iodwant > 0)
+   nfs_iodwant[i] = 0;
return (-1);
+   }
nfs_numasync++;
return (newiod);
 }
@@ -199,7 +204,7 @@
nfs_iodmin = NFS_MAXASYNCDAEMON;

for (i = 0; i < nfs_iodmin; i++) {
-   error = nfs_nfsiodnew();
+   error = nfs_nfsiodnew(0);
if (error == -1)
panic("nfsiod_setup: nfs_nfsiodnew failed");
}
@@ -236,7 +241,8 @@
goto finish;
if (nmp)
nmp->nm_bufqiods--;
-   nfs_iodwant[myiod] = curthread->td_proc;
+   if 

Re: top Segmentation faulting on 8.0p2 amd64

2010-01-22 Thread Mikolaj Golub
On Wed, 20 Jan 2010 08:06:23 +0100 Harald Schmalzbauer wrote:

 Dear all,

 I have no idea why top crashes with segmentation fault on my amd64
 machine running FreeBSD 8.0-RELEASE-p2.
 If someone wants to have a look at the core dump:
 http://www.schmalzbauer.de/downloads/top.core

A core file is useless without the binary and libraries, so it is better to
run gdb on your host, produce a backtrace and post it here:

gdb /usr/bin/top top.core
bt

And of course a backtrace from a top built with -g would be much better.

cd /usr/src/usr.bin/top
CFLAGS=-g make
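
If it helps, the backtrace can also be captured non-interactively so it is
easy to paste into a mail.  A minimal sketch, assuming the stock gdb from the
base system; the helper names write_bt_cmds and save_backtrace and the file
name bt.cmds are made up for illustration:

```shell
# write_bt_cmds FILE: write a one-line gdb command file that prints a backtrace.
write_bt_cmds() {
    printf 'bt\n' > "$1"
}

# save_backtrace BINARY CORE OUT: run gdb in batch mode against BINARY and
# CORE, executing the command file, and save everything gdb prints to OUT.
save_backtrace() {
    write_bt_cmds bt.cmds
    gdb -batch -x bt.cmds "$1" "$2" > "$3" 2>&1
}

# Example:
#   save_backtrace /usr/bin/top top.core top-backtrace.txt
```

The command file is used instead of typing bt at the prompt so that the whole
session, including the "Reading symbols" lines, ends up in one file.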

-- 
Mikolaj Golub


Re: posting coding bounties, appropriate money amounts?

2010-01-22 Thread Ivan Voras

Dan Naumov wrote:

Hello

I am curious about posting some coding bounties, my current interest
revolves around improving the ZVOL functionality in FreeBSD: fixing
the known ZVOL SWAP reliability/stability problems as well as making
ZVOLs work as a dumpon device (as is already the case in OpenSolaris)
for crash dumps. I am a private individual and not some huge Fortune
100 and while I am not exactly rich, I am willing to put some of my
personal money towards this. I am curious though, what would be the
best way to approach this: directly approaching committer(s) with the
know-how-and-why of the areas involved or through the FreeBSD
Foundation? And how would one go about calculating the appropriate
amount of money for such a thing?


Hi,

This idea (bounties) comes up approximately every 6 months, and it appears
there is no better way than contacting the developers directly. AFAIK
all attempts to pool such efforts have failed. One important
conclusion is that it cannot go through the Foundation, since they cannot
accept targeted donations.




Re: 8.0-RELEASE - -STABLE and size of /

2010-01-22 Thread Oliver Brandmueller
Hi,

On Fri, Jan 22, 2010 at 03:56:31PM -0500, Steven Friedrich wrote:
 in your /etc/make.conf, do you have a line like:
 makeoptions   DEBUG=-g
 if so, comment it out.

The GENERIC kernel by default has the following config:

makeoptions DEBUG=-g		# Build kernel with gdb(1) debug symbols

You don't need anything special in your make.conf

In fact having the debug symbols is useful in many cases. So raising the 
default size for the / partition might be the better option (OK, doesn't 
help for already installed systems of course).

- Oliver

-- 
| Oliver Brandmueller  http://sysadm.in/ o...@sysadm.in |
|Ich bin das Internet. Sowahr ich Gott helfe. |


IPSec NAT-T in transport mode

2010-01-22 Thread Nat Howard
I'm very interested in this problem -- I want to run an L2TP server myself.   
Is anyone actually working on this?  I might be able to chip in a few bucks...

But I'm not seeing bad checksums.   Here's my setup:


L2TP server A --- B FreeBSD NAT box C ---internal network--- D my Mac

Where should I be seeing the bad checksums?  A, B, C, or D?


Looking only at B, I don't see any bad udp checksums, but I'm seeing a bunch of 
these (IP numbers changed to bracketed names):



23:49:48.004107 IP (tos 0x0, ttl 64, id 52328, offset 0, flags [none], proto ICMP (1), length 56)
    [NAT Box] > [External Server] ICMP [NAT Box] udp port 58660 unreachable, length 36
        IP (tos 0x20, ttl 59, id 36320, offset 0, flags [none], proto UDP (17), length 143)
    [External Server].1701 > [NAT Box].58660:  [|l2tp]







Re: posting coding bounties, appropriate money amounts?

2010-01-22 Thread Matt Olander
On Fri, Jan 22, 2010 at 3:06 PM, Ivan Voras ivo...@freebsd.org wrote:
 Dan Naumov wrote:

 Hello

 I am curious about posting some coding bounties, my current interest
 revolves around improving the ZVOL functionality in FreeBSD: fixing
 the known ZVOL SWAP reliability/stability problems as well as making
 ZVOLs work as a dumpon device (as is already the case in OpenSolaris)
 for crash dumps. I am a private individual and not some huge Fortune
 100 and while I am not exactly rich, I am willing to put some of my
 personal money towards this. I am curious though, what would be the
 best way to approach this: directly approaching committer(s) with the
 know-how-and-why of the areas involved or through the FreeBSD
 Foundation? And how would one go about calculating the appropriate
 amount of money for such a thing?

 Hi,

 This idea (bounties) comes up approximately every 6 months, and it appears
 there is no better way than contacting the developers directly. AFAIK all
 attempts to pool such efforts have failed. One important
 conclusion is that it cannot go through the Foundation, since they cannot
 accept targeted donations.

A while back, we built a simple app for posting bounties, getting devs
and sponsors on board, posting the committed code in a browser-viewable
format, and then handling the final payout upon completion.
iXsystems is more than willing to handle financial details and I would
gladly be the first to sponsor this project on the site.

http://www.sponsorbsd.org

We would need a team leader *cough* Ivan *cough* that could make sure
developing contributors are actually involved so that the final payoff
can be shared accordingly.

It's a cakephp app and I'm sure it needs a bit more polish, but we
could do it on the fly and it shouldn't be too hard :)

Any cakephp or php devs interested in helping test and launch it, let
me know. I just haven't had much time to spend on launching it,
although I still think it's a great idea. If somebody would like to
spearhead this effort, that would be great.

For companies wishing to sponsor non-community code, it also has the
option of hiding the community committed code.

best,
-matt


Re: posting coding bounties, appropriate money amounts?

2010-01-22 Thread Jeremy Chadwick
On Fri, Jan 22, 2010 at 07:49:46PM +0200, Dan Naumov wrote:
 I am curious about posting some coding bounties, my current interest
 revolves around improving the ZVOL functionality in FreeBSD: fixing
 the known ZVOL SWAP reliability/stability problems as well as making
 ZVOLs work as a dumpon device (as is already the case in OpenSolaris)
 for crash dumps. I am a private individual and not some huge Fortune
 100 and while I am not exactly rich, I am willing to put some of my
 personal money towards this. I am curious though, what would be the
 best way to approach this: directly approaching committer(s) with the
 know-how-and-why of the areas involved or through the FreeBSD
 Foundation? And how would one go about calculating the appropriate
 amount of money for such a thing?

For what it's worth: count me in here, and not just with regards to
zvol.  I'd be more than happy to donate money to a pool (pun intended)
to get some of the ZFS-centric issues looked at / focused on, and
possibly fixed.

I'd be willing to put up a thousand USD or possibly more depending on
what sort of work was being considered.  I suppose a better choice would
be for someone here to make a list of issues which the community feels
need attention, and put the pooled donations to whatever things had
highest priority -- or, if that isn't plausible, then to what interested
developers wanted to work on.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: top Segmentation faulting on 8.0p2 amd64 (nss_ldapd problem?)

2010-01-22 Thread Harald Schmalzbauer

Mikolaj Golub schrieb am 22.01.2010 23:26 (localtime):

On Wed, 20 Jan 2010 08:06:23 +0100 Harald Schmalzbauer wrote:


Dear all,

I have no idea why top crashes with segmentation fault on my amd64
machine running FreeBSD 8.0-RELEASE-p2.
If someone wants to have a look at the core dump:
http://www.schmalzbauer.de/downloads/top.core


A core file is useless without the binary and libraries, so it is better to
run gdb on your host, produce a backtrace and post it here:

gdb /usr/bin/top top.core
bt

And of course a backtrace from a top built with -g would be much better.

cd /usr/src/usr.bin/top
CFLAGS=-g make


Unfortunately nss_ldap seems to be the culprit.

gdb /usr/bin/top top.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `top'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libncurses.so.8...done.
Loaded symbols for /lib/libncurses.so.8
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libkvm.so.5...done.
Loaded symbols for /lib/libkvm.so.5
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /usr/local/lib/nss_ldap.so.1...done.
Loaded symbols for /usr/local/lib/nss_ldap.so.1
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
bt:
#0  0x000800d08403 in __nss_compat_gethostbyname () from /usr/local/lib/nss_ldap.so.1
#0  0x000800d08403 in __nss_compat_gethostbyname () from /usr/local/lib/nss_ldap.so.1
#1  0x000800d0606f in _nss_ldap_getpwent_r () from /usr/local/lib/nss_ldap.so.1
#2  0x0008009ffc54 in __nss_compat_getpwent_r () from /lib/libc.so.7
#3  0x000800a84a3d in nsdispatch () from /lib/libc.so.7
#4  0x000800a50976 in getpwent_r () from /lib/libc.so.7
#5  0x000800a50596 in sysctlbyname () from /lib/libc.so.7
#6  0x00406c6d in machine_init (statics=0x7fffea30, do_unames=1 '\001')
    at /usr/src/usr.bin/top/machine.c:257
#7  0x00407a10 in main (argc=1, argv=0x7fffeb08)
    at /usr/src/usr.bin/top/../../contrib/top/top.c:458

I'm using nss_ldapd-0.7.2 and there's no way to live without ldap...

Any help highly appreciated!

Thanks,

-Harry





Re: posting coding bounties, appropriate money amounts?

2010-01-22 Thread Adam Vande More
On Fri, Jan 22, 2010 at 6:40 PM, Jeremy Chadwick
free...@jdc.parodius.com wrote:

 On Fri, Jan 22, 2010 at 07:49:46PM +0200, Dan Naumov wrote:
  I am curious about posting some coding bounties, my current interest
  revolves around improving the ZVOL functionality in FreeBSD: fixing
  the known ZVOL SWAP reliability/stability problems as well as making
  ZVOLs work as a dumpon device (as is already the case in OpenSolaris)
  for crash dumps. I am a private individual and not some huge Fortune
  100 and while I am not exactly rich, I am willing to put some of my
  personal money towards this. I am curious though, what would be the
  best way to approach this: directly approaching committer(s) with the
  know-how-and-why of the areas involved or through the FreeBSD
  Foundation? And how would one go about calculating the appropriate
  amount of money for such a thing?

 For what it's worth: count me in here, and not just with regards to
 zvol.  I'd be more than happy to donate money to a pool (pun intended)
 to get some of the ZFS-centric issues looked at / focused on, and
 possibly fixed.

 I'd be willing to put up a thousand USD or possibly more depending on
 what sort of work was being considered.  I suppose a better choice would
 be for someone here to make a list of issues which the community feels
 need attention, and put the pooled donations to whatever things had
 highest priority -- or, if that isn't plausible, then to what interested
 developers wanted to work on.

 --
 | Jeremy Chadwick   j...@parodius.com |
 | Parodius Networking   http://www.parodius.com/ |
 | UNIX Systems Administrator  Mountain View, CA, USA |
 | Making life hard for others since 1977.  PGP: 4BD6C0CB |


To the best of my understanding, that is basically what donating to the
FreeBSD Foundation accomplishes, although it would be nice to see some more
transparency in their decision-making process.

-- 
Adam Vande More


Re: 8.0-RELEASE - -STABLE and size of /

2010-01-22 Thread Adrian Wontroba
On Fri, Jan 22, 2010 at 05:21:56PM +0100, Oliver Brandmueller wrote:
 
 I just noticed something: I set up an 8.0-RELEASE amd64 box, / is default
 512M. First step after setup was to csup to RELENG_8 and buildkernel and 
 buildworld (no custom kernel, no make.conf).
 
 Installing the new kernel failed, since /boot/kernel/ is already well
 over 230 MBytes in size. moving that to kernel.old and writing a new one 
 with about the same size fails due to no space left on device.
 
 This is not a question; I do know how to get around this and how to 
 configure custom kernels so they are a fragment of that size afterwards. 
 However, I think this is a clear POLA violation. So either a GENERIC with
 less debugging information (symbols and stuff), which makes debugging
 harder, or a higher default size for / would be options, unless anyone
 else has better ideas.

/usr/src/UPDATING has this which will allow you to remove symbols when
installing a kernel:

20060118:
This actually occured some time ago, but installing the kernel
now also installs a bunch of symbol files for the kernel modules.
This increases the size of /boot/kernel to about 67Mbytes. You
will need twice this if you will eventually back this up to kernel.old
on your next install.
If you have a shortage of room in your root partition, you should add
-DINSTALL_NODEBUG to your make arguments or add INSTALL_NODEBUG=yes
to your /etc/make.conf.
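
A quick way to tell beforehand whether this is needed is to compare the size
of the kernel directory with the free space on /.  The sketch below is only
an approximation: it assumes installkernel renames the existing /boot/kernel
to /boot/kernel.old, so the new copy needs roughly one kernel's worth of free
space.  The function name kernel_fits is made up for illustration.

```shell
# kernel_fits KERNEL_KB FREE_KB: succeed if a kernel directory of KERNEL_KB
# kilobytes fits into FREE_KB kilobytes of free space on the root filesystem.
kernel_fits() {
    [ "$1" -le "$2" ]
}

# Example on a live system (sizes in KB):
#   kernel_kb=$(du -sk /boot/kernel | awk '{print $1}')
#   free_kb=$(df -k / | awk 'NR==2 {print $4}')
#   kernel_fits "$kernel_kb" "$free_kb" || echo "consider -DINSTALL_NODEBUG"
#
# Installing without the symbol files, as the UPDATING entry describes:
#   cd /usr/src && make installkernel -DINSTALL_NODEBUG
```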

I concur that the 235 MB size of an amd64 8.0 kernel is a bit of a
surprise. An i386 kernel is a mere 135 MB.  IMO increasing the sysinstall
default root slice size for at least amd64 would be a good thing.

-- 
Adrian Wontroba


Re: 8.0-RELEASE - -STABLE and size of /

2010-01-22 Thread Steven Friedrich
On Friday 22 January 2010 06:32:02 pm Oliver Brandmueller wrote:
 Hi,
 
 On Fri, Jan 22, 2010 at 03:56:31PM -0500, Steven Friedrich wrote:
  in your /etc/make.conf, do you have a line like:
  makeoptions DEBUG=-g
  if so, comment it out.
 
 The GENERIC kernel by default has the following config:
 
 makeoptions DEBUG=-g    # Build kernel with gdb(1) debug symbols
 
 You don't need anything special in your make.conf
 
 In fact having the debug symbols is useful in many cases. So raising the
 default size for the / partition might be the better option (OK, doesn't
 help for already installed systems of course).
 
 - Oliver
 
I'm sorry.  My response to him should have been more precise.
I was trying to clue him in on how to build a non-debug kernel, but my answer 
was in fact wrong.
I said he may have a line in make.conf, but that was a mistake. I pulled the 
line from a kernel config file.
If he wants to build a kernel with no symbols, as he stated he does, he needs
to comment out the line and build a kernel. He could buildworld and
installworld, too.
But he and I went off topic. I should have changed the subject line to start a 
new thread to discuss building without symbols. He was complaining that it 
wasn't in the FAQ or the handbook. It's in GENERIC, which is required reading 
if you're ever going to build custom kernels.

As for the main topic, I have been making 4GB root partitions for some time.
Our disk requirements have been soaring over the last decade, while cost per
MB has plummeted. I don't want to have to guess what size each partition
should be.