Re: ZFS - moving from a zraid1 to zraid2 pool with 1.5tb disks
On Thu, Jan 06, 2011 at 03:45:04PM +0200 I heard the voice of
Daniel Kalchev, and lo! it spake thus:
> You should also know that having a large L2ARC requires that you also
> have a larger ARC, because there are data pointers in the ARC that
> point to the L2ARC data. Someone would do the community a service by
> publishing some reasonable estimates of the memory needs, so that
> people do not end up with large but unusable L2ARC setups.

Estimates I've read in the past are that L2ARC consumes ARC space at
around 1-2%.

--
Matthew Fuller (MF4839) | fulle...@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
Hi,

OpenOffice hangs on NFS when I try to save a file, or even when I just
try to open the save dialog.

$ 17:25:35 ron...@ronald [~] procstat -kk 85575
  PID    TID COMM        TDNAME         KSTACK
85575 100322 soffice.bin initial thread mi_switch+0x176 sleepq_wait+0x3b __lockmgr_args+0x655 vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x44 vget+0x67 vfs_hash_get+0xeb nfs_nget+0xa8 nfs_lookup+0x65e VOP_LOOKUP_APV+0x40 lookup+0x48a namei+0x518 kern_statat_vnhook+0x82 kern_statat+0x15 lstat+0x22 syscallenter+0x186 syscall+0x40
85575 100502 soffice.bin -              mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0 do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
85575 100576 soffice.bin -              mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0 do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
85575 100577 soffice.bin -              mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _sleep+0x25d kern_accept+0x19c accept+0xfe syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
85575 100578 soffice.bin -              mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _cv_wait_sig+0x10e seltdwait+0xed poll+0x457 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
85575 100579 soffice.bin -              mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d seltdwait+0x79 poll+0x457 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2

$ 17:25:35 ron...@ronald [~] uname -a
FreeBSD ronald.office.base.nl 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #6: Mon Dec 27 23:49:30 CET 2010 r...@ronald.office.base.nl:/usr/obj/usr/src/sys/GENERIC amd64

It is not possible to exit or kill soffice.bin. I had a slightly
different procstat stack before, but that was fixed a couple of days
ago.

Any thoughts? Enabling local locks in NFS doesn't fix it. The NFS
server is an up-to-date Linux Debian 5 with kernel 2.6.26.

If more info is needed, I can easily reproduce this.

Ronald.
Re: ZFS - moving from a zraid1 to zraid2 pool with 1.5tb disks
> 2011/1/7 Jeremy Chadwick free...@jdc.parodius.com:
>> On Fri, Jan 07, 2011 at 12:29:17PM +1100, Jean-Yves Avenard wrote:
>>> On 6 January 2011 22:26, Chris Forgeron cforge...@acsi.ca wrote:
>>>> You know, these days I'm not as happy with SSD's for ZIL. I may
>>>> blog about some of the speed results I've been getting over the
>>>> last 6mo-1yr that I've been running them with ZFS.
>>>>
>>>> I think people should be using hardware RAM drives. You can get
>>>> old Gigabyte i-RAM drives with 4 gig of memory for the cost of a
>>>> 60 gig SSD, and it will trounce the SSD for speed. I'd put your
>>>> SSD to L2ARC (cache).
>>>
>>> Where do you find those, though? I've looked and looked, and all
>>> references I could find were to that battery-powered RAM card that
>>> Sun used in their test setup, but it's not publicly available..
>>
>> DDRdrive:
>> http://www.ddrdrive.com/
>> http://www.engadget.com/2009/05/05/ddrdrives-ram-based-ssd-is-snappy-costly/
>>
>> ACard ANS-9010:
>> http://techreport.com/articles.x/16255
>>
>> GC-RAMDISK (i-RAM) products:
>> http://us.test.giga-byte.com/Products/Storage/Default.aspx
>>
>> Be aware these products are absurdly expensive for what they offer
>> (the cost isn't justified), not to mention in some cases a
>> bottleneck is imposed by use of a SATA-150 interface. I'm also not
>> sure if all of them offer BBU capability.
>>
>> In some respects you might be better off just buying more RAM for
>> your system and making md(4) memory disks that are used by L2ARC
>> (cache). I've mentioned this in the past (specifically back in the
>> days when the ARC piece of ZFS on FreeBSD was causing havoc), and
>> asked if one could work around the complexity by using L2ARC with
>> md(4) drives instead.
>
> Once you have got the extra RAM, why not just reserve it directly
> for the ARC (via vm.kmem_size[_max] and vfs.zfs.arc_max)?
>
> Markiyan.

I tried this, but couldn't get rc.d/mdconfig2 to do what I wanted on
startup WRT the aforementioned.

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP 4BD6C0CB |
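[For anyone who wants to experiment with either idea above, a rough
sketch of both approaches follows; the pool name "tank", the md unit,
and all sizes are illustrative assumptions, not values from this
thread:

    # md(4)-backed L2ARC: create a swap-backed 4 GB memory disk and
    # attach it to the pool as a cache (L2ARC) device:
    mdconfig -a -t swap -s 4g -u 0
    zpool add tank cache /dev/md0

    # Or reserve the RAM for the ARC directly, per Markiyan's
    # suggestion, via /boot/loader.conf tunables (example values only):
    #   vm.kmem_size="12G"
    #   vfs.zfs.arc_max="8G"
]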
Re: ZFS - moving from a zraid1 to zraid2 pool with 1.5tb disks
On 1/7/11 1:10 PM, Markiyan Kushnir wrote:
> 2011/1/7 Jeremy Chadwick free...@jdc.parodius.com:
>> [...]
>> In some respects you might be better off just buying more RAM for
>> your system and making md(4) memory disks that are used by L2ARC
>> (cache). [...]
>
> Once you have got the extra RAM, why not just reserve it directly
> for the ARC (via vm.kmem_size[_max] and vfs.zfs.arc_max)?
>
> Markiyan.

I haven't calculated it yet, but perhaps SSDs are cheaper by the GB
than raw RAM. Not to mention DIMM slots are usually scarce; disk
slots aren't.
Re: NFS - DNS fail stops boot in mountlate
On Thursday, January 06, 2011 2:26:10 pm grarpamp wrote:
> RELENG_8.
>
> ### setup
>
> mount -d -a -l -v -t nfs
> exec: mount_nfs -o ro -o tcp -o bg -o nolockd -o intr 192.168.0.10:/tmp /mnt
> exec: mount_nfs -o ro -o tcp -o bg -o nolockd -o intr foo:/tmp /mnt
>
> 192.168.0.10 has been unplugged, no arp entry.
> Host foo not found: 3(NXDOMAIN)
>
> ### result
>
> mount -v 192.168.0.10:/tmp ; echo $?
> [tcp] 192.168.0.10:/tmp: RPCPROG_NFS: RPC: Port mapper failure - RPC: Timed out
> mount_nfs: Cannot immediately mount 192.168.0.10:/tmp, backgrounding
> /dev/ad0s1a on / (ufs, local, read-only, fsid snip1)
> 0 [this is ok.]

I've seen a regression in 8 at work where NFS mounts seem to fail on
DNS on every boot (we have a small number of mounts, < 10) whereas 7
worked fine on every boot. I haven't tracked it down yet, but 8 is
certainly more fragile than 7 for mounting NFS on boot.

--
John Baldwin
Re: NFSv4 - how to set up at FreeBSD 8.1 ?
> On 7 January 2011 08:16, Rick Macklem rmack...@uoguelph.ca wrote:

When I said I recalled that they didn't do TCP because of excessive
overhead, I forgot to mention that my recollection could be wrong.
Also, I suspect you are correct w.r.t. the above statement (ie. Sun's
official position vs something I heard). Anyhow, apologies if I gave
the impression that I was correcting your statement. My intent was
just to throw out another statement that I vaguely recalled someone
at Sun stating.

> After hitting yet another serious bug in 8.2, I reverted back to 8.1.
> Interestingly, it now complains about having "V4: /" in /etc/exports.

At one time the V4: line was required to be at the end of the
/etc/exports file. (You could consider that a bug left over from the
OpenBSD port, where it was a separate section of /etc/exports.) I
removed that restriction from mountd.c at some point, but maybe after
8.1. So, try just moving the V4: line to the end of /etc/exports.

> NFSv4 isn't available in 8.1?

It should be there, rick
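[For what it's worth, a minimal /etc/exports sketch with the V4: line
moved to the end, as suggested above; the paths and network here are
made-up examples, not from this thread:

    /usr/home -maproot=root -network 192.168.1.0/24
    V4: / -network 192.168.1.0/24

The V4: line names the root of the NFSv4 tree; v4 clients then mount
paths relative to that root.]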
Re: NFS - DNS fail stops boot in mountlate
> On Thu, Jan 06, 2011 at 09:19:06PM -0500, grarpamp wrote:
>> So what was unclear? mount_nfs emits a nonzero exit status upon
>> failing to look up an FQDN, causing mountlate to trigger a dump to
>> shell on boot during rc processing. That's a *showstopper*. The
>> right thing to do is to hack mount_nfs to punt to background
>> mounting in this case with an appropriate exit status.
>>
>> Personally I'd distinguish mount_nfs exit codes between:
>>   0 - mounted
>>   1 - backgrounded, for any reason
>>   2 - none of the above
>> and adjust the rc's to deal with it accordingly.
>>
>> Words are subject to interpretation and take time. Though perhaps
>> masked by brevity, I believe all the above elements were in the
>> prior concise post. Thanks everybody :)
>
> So basically the problem is that the bg option in mount_nfs only
> applies to "network unreachable" conditions and not "DNS resolution
> failed" conditions.
>
> Initially I was going to refute the above request until I looked
> closely at the mount_nfs(8) man page, which has the following
> clauses:
>
>   For non-critical file systems, the bg and retrycnt options provide
>   mechanisms to prevent the boot process from hanging if the server
>   is unavailable.
>
>   [...describing the bg option...] Useful for fstab(5), where the
>   file system mount is not critical to multiuser operation.
>
> I read these statements to mean "if -o bg is used, the system should
> not hang/stall/fail during the boot process". Dumping to /bin/sh on
> boot as a result of a DNS lookup failure violates those statements,
> IMHO. I would agree that DNS resolution should be part of the
> bg/retry feature of bg in mount_nfs. How/whether this is feasible to
> implement is unknown to me.

I don't think punting to bg when a DNS failure occurs is a
particularly good idea, mostly because it doesn't help for critical
mounts. (I haven't looked to see if the change is feasible, either.)
It would be nice to get DNS working more reliably early in boot and
then, of course, there is what Doug stated w.r.t. using IP numbers or
putting entries in /etc/hosts for NFS servers.

rick
ps: I do think that "server unavailable" doesn't imply "server is
available, but DNS can't resolve its address".
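[For reference, the /etc/hosts workaround mentioned above, plus a
bg-style fstab entry, might look like this; the server name and
address are made-up examples:

    # /etc/hosts -- pin the NFS server's address so boot-time mounts
    # don't depend on DNS:
    192.168.0.10    nfsserver

    # /etc/fstab -- a non-critical NFS mount using bg:
    nfsserver:/export  /mnt  nfs  ro,tcp,intr,bg,retrycnt=3  0  0
]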
Re: NFS - DNS fail stops boot in mountlate
> I've seen a regression in 8 at work where NFS mounts seem to fail on
> DNS on every boot (we have a small number of mounts, < 10) whereas 7
> worked fine on every boot. I haven't tracked it down yet, but 8 is
> certainly more fragile than 7 for mounting NFS on boot.

/etc/defaults/rc.conf has
  nfs_server_flags="-u -t -n 4"

Once I had a server with, for example, /etc/rc.conf set to -n 10 that
had problems when I added partitions beyond e.g. 10 ... so I suggest
checking rc.conf against fstab & /etc/exports.

Cheers,
Julian
--
Julian Stacey, BSD Unix Linux C Sys Eng Consultants Munich http://berklix.com
Mail plain text; Not quoted-printable, or HTML or base 64.
Avoid top posting, it cripples itemised cumulative responses.
Re: NFS - DNS fail stops boot in mountlate
On Friday, January 07, 2011 10:29:22 am Julian H. Stacey wrote:
>> I've seen a regression in 8 at work where NFS mounts seem to fail on
>> DNS on every boot (we have a small number of mounts, < 10) whereas 7
>> worked fine on every boot. I haven't tracked it down yet, but 8 is
>> certainly more fragile than 7 for mounting NFS on boot.
>
> /etc/defaults/rc.conf has
>   nfs_server_flags="-u -t -n 4"
>
> Once I had a server with, for example, /etc/rc.conf set to -n 10 that
> had problems when I added partitions beyond e.g. 10 ... so I suggest
> checking rc.conf against fstab & /etc/exports.

That should not matter for establishing mounts. Also, keep in mind
that 7 worked fine with the same settings. In fact, I'm just booting
an 8 kernel on the same 7 userland currently and it's only the 8
kernel that has the problem (a pure 8 system also has the same
symptoms, so it's not a problem due to mixing a 7 world with an 8
kernel).

--
John Baldwin
Re: NFS - DNS fail stops boot in mountlate
Hi,
Reference:
> From: John Baldwin j...@freebsd.org
> Date: Fri, 7 Jan 2011 11:03:35 -0500
> Message-id: 201101071103.35500@freebsd.org

John Baldwin wrote:
> On Friday, January 07, 2011 10:29:22 am Julian H. Stacey wrote:
> [...]
>
> That should not matter for establishing mounts. Also, keep in mind
> that 7 worked fine with the same settings. In fact, I'm just booting
> an 8 kernel on the same 7 userland currently and it's only the 8
> kernel that has the problem (a pure 8 system also has the same
> symptoms, so it's not a problem due to mixing a 7 world with an 8
> kernel).

OK. I have no /etc/fstab direct-invoked NFS mounts, just amd-invoked
NFS (on a mixed net of releases: 8, 7, 6, 4). I guess my DNS has
longer to start (or my amd+NFS falls back to other hosts already
running DNS).

Good luck tracing it.

Cheers,
Julian
--
Julian Stacey, BSD Unix Linux C Sys Eng Consultants Munich http://berklix.com
Mail plain text; Not quoted-printable, or HTML or base 64.
Avoid top posting, it cripples itemised cumulative responses.
Re: NFSv4 - how to set up at FreeBSD 8.1 ?
On Fri, Jan 7, 2011 at 7:05 AM, Rick Macklem rmack...@uoguelph.ca wrote:
>> NFSv4 isn't available in 8.1?
>
> It should be there, rick

It is. I'm running FreeBSD 8.1 i386 at home, using NFSv4 to share
folders out to my media PC/laptop beast thingy.

[fc...@rogue /home/fcash]$ uname -a
FreeBSD rogue.ashesofthe.net 8.1-RELEASE FreeBSD 8.1-RELEASE #0 r211388: Sun Aug 22 15:18:36 PDT 2010 r...@rogue.ashesofthe.net:/usr/obj/usr/src-8/sys/ROGUE i386

[fc...@rogue /home/fcash]$ cat /etc/exports
/home -mapall=fcash -network 172.20.0.0/24
V4: /home/samba/shared -sec=sys -network 172.20.0.0/24

Never tried it with the V4: line anywhere but at the end of the file.

--
Freddie Cash
fjwc...@gmail.com
Re: ZFS - moving from a zraid1 to zraid2 pool with 1.5tb disks
On Fri, Jan 7, 2011 at 3:16 AM, Matthew D. Fuller fulle...@over-yonder.net wrote:
> On Thu, Jan 06, 2011 at 03:45:04PM +0200 I heard the voice of
> Daniel Kalchev, and lo! it spake thus:
>> You should also know that having a large L2ARC requires that you also
>> have a larger ARC, because there are data pointers in the ARC that
>> point to the L2ARC data. [...]
>
> Estimates I've read in the past are that L2ARC consumes ARC space at
> around 1-2%.

Each record in L2ARC takes about 250 bytes in ARC. If I understand it
correctly, not all records are 128K, which is the default record size
on ZFS. If you end up with a lot of small records (for instance, if
you have a lot of small files, or due to a lot of synchronous writes,
or if the record size is set to a lower value), then you could
potentially end up with much higher ARC requirements.

So, 1-2% seems to be a reasonable estimate assuming that ZFS deals
with ~10K-20K records most of the time. If you mostly store large
files, your ratio would probably be much better.

One way to get the specific ratio for *your* pool would be to collect
record size statistics from your pool using "zdb -L -b pool" and then
calculate the L2ARC:ARC ratio based on the average record size. I'm
not sure, though, whether L2ARC stores records in compressed or
uncompressed form.

--Artem
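[To put rough numbers on the 250-bytes-per-record figure, here is a
back-of-the-envelope sketch in sh; the 100 GB L2ARC size and the two
record sizes are illustrative assumptions, not values from this
thread:

    # ARC bytes consumed ~= (L2ARC bytes / average record size) * 250
    # 100 GB (107374182400 bytes) of L2ARC at 128K average records:
    echo $(( (107374182400 / 131072) * 250 ))   # ~205 MB, about 0.2%
    # The same 100 GB at 8K average records:
    echo $(( (107374182400 / 8192) * 250 ))     # ~3.3 GB, about 3%

So a pool full of small records can need more than an order of
magnitude more ARC headroom per byte of L2ARC than one full of 128K
records, consistent with the 1-2% estimate only holding for mid-sized
records.]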
Supermicro Bladeserver
I am trying to track down a problem being experienced at icir.org
using SuperMicro bladeservers: the SERDES 82575 interfaces are having
connectivity, or perhaps autoneg, problems, resulting in link
transitions and watchdog resets.

The closest hardware my org at Intel has is a Fujitsu server whose
blades also have this device, but testing on that has failed to repro
the problem. I was wondering if anyone else out there has this
hardware; if so, could you let me know your experience, whether you
have had problems or not, etc.?

Thanks much for any information!

Jack
Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
> Hi,
>
> OpenOffice hangs on NFS when I try to save a file, or even when I
> just try to open the save dialog.
>
> [procstat -kk and uname output snipped; quoted in full in the
> original message above]

I think all the above tells us is that the thread is waiting for a
vnode lock. The question then becomes "what is holding a lock on that
vnode, and why?".

> It is not possible to exit or kill soffice.bin. I had a slightly
> different procstat stack before, but that was fixed a couple of days
> ago.

Yea, it will be in an uninterruptible sleep when waiting for a vnode
lock.

> Any thoughts? Enabling local locks in NFS doesn't fix it.

Here's some things you could try:

1 - Apply the attached patch. It fixes a known problem w.r.t. the
client side of the krpc. Not likely to fix this, but I can hope:-)

2 - If #1 doesn't fix the problem:
- before making it hang, start capturing packets via:
  # tcpdump -s 0 -w xxx host server
- then make it hang, kill the above, and run:
  # procstat -ka
  # ps axHlww
and capture the output of both of these. Hopefully these 2 commands
will indicate what is holding the vnode lock and maybe why. The xxx
file can be looked at in wireshark to see what, if any, NFS traffic
is happening. If you aren't comfortable looking at the above, you can
email them to me and I'll take a stab at them someday.

3 - Try the experimental client to see if it behaves differently. The
mount command is:
  # mount -t newnfs -o nfsv3,<the options you already use> server:/path /mntpath
(This might identify if the regular client has an infrequently
executed code path that forgets to unlock the vnode, since it uses a
somewhat different RPC layer. The buffer cache handling etc. are
almost the same, but the RPC stuff is fairly different.)

> The nfs server is an up-to-date Linux Debian 5 with kernel 2.6.26.

I'm afraid I can't blame Linux (at least not until we have more
info;-).

> If more info is needed, I can easily reproduce this.

See above, #2.

Good luck with it and let us know how it goes, rick
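[A small sh wrapper for step #2, for anyone who wants to script the
capture; the output paths are arbitrary and "nfsserver" is a
placeholder for the real server host, not a name from this thread:

    #!/bin/sh
    # Capture NFS traffic plus lock/process state around the hang.
    tcpdump -s 0 -w /var/tmp/nfs-hang.pcap host nfsserver &
    TDPID=$!
    echo "Reproduce the hang now, then press Enter."
    read dummy
    kill $TDPID
    procstat -ka > /var/tmp/procstat-ka.txt
    ps axHlww > /var/tmp/ps-axHlww.txt
]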
Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
On Fri, Jan 07, 2011 at 02:37:25PM -0500, Rick Macklem wrote:
> [quoted problem report and procstat output snipped]
>
> Here's some things you could try:
>
> 1 - Apply the attached patch. It fixes a known problem w.r.t. the
> client side of the krpc. Not likely to fix this, but I can hope:-)

1a - Look around for other processes in the uninterruptible sleep
state; quite possibly one of them also owns the lock that OpenOffice
is waiting for. Also see
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
Of particular interest are the witness output and the backtraces for
all threads that witness reports as owning vnode locks.

> 2 - If #1 doesn't fix the problem:
> [tcpdump / procstat / ps capture instructions snipped]
>
> 3 - Try the experimental client to see if it behaves differently.
> [mount -t newnfs instructions snipped]
>
> Good luck with it and let us know how it goes, rick
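[As a pointer for anyone following suggestion 1a, the handbook page
referenced above boils down to roughly the following; note it assumes
a kernel rebuilt with debugging options, which GENERIC does not
enable by default:

    # Kernel config additions for deadlock debugging (per the handbook):
    #   options KDB
    #   options DDB
    #   options WITNESS
    #   options INVARIANTS
    #   options INVARIANT_SUPPORT
    #
    # Then, at the ddb> prompt after breaking into the debugger:
    #   ps                 # find threads stuck waiting on vnode locks
    #   show alllocks      # WITNESS: which threads own which locks
    #   show lockedvnods   # locked vnodes and their lock owners
    #   alltrace           # stack backtraces for all threads
]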
stunnel transparent proxy
Folks,

Would it be possible to devise an ipfw 'fwd' rule to pass along a
socket connection with IP_BINDANY set via stunnel that forwards it to
another process? The problem I'm having is that the vnc service on
the other side cannot reply back to the IP address, because the
routing does not redirect back through stunnel. I am testing
configurations using apache (port 80 and 443) for convenience.

Request:    ext ip -> stunnel -> vnc svc
Response:   vnc svc -X-> ext ip
instead of: vnc svc -> stunnel -> ext ip

With stunnel's "transparent" option set, traffic looks like:

19:31:34.162337 IP 192.168.103.69.52671 > 127.0.0.1.80: Flags [S], seq 2050938762, win 65535, options [mss 16344,nop,wscale 3,sackOK,TS val 7437993 ecr 0], length 0
19:31:37.153079 IP 192.168.103.69.52671 > 127.0.0.1.80: Flags [S], snip..
19:31:40.351804 IP 192.168.103.69.52671 > 127.0.0.1.80: Flags [S], snip..
19:31:43.550543 IP 192.168.103.69.52671 > 127.0.0.1.80: Flags [S], seq 2050938762, win 65535, options [mss 16344,sackOK,eol], length 0

Without "transparent", traffic flows fine, and looks like:

19:32:55.883404 IP 127.0.0.1.30326 > 127.0.0.1.80: Flags [S], seq 2147354729, win 65535, options [mss 16344,nop,wscale 3,sackOK,TS val 7446169 ecr 0], length 0
19:32:55.883575 IP 127.0.0.1.80 > 127.0.0.1.30326: Flags [S.], seq 2770470513, ack 2147354730, win 65535, options [mss 16344,nop,wscale 3,sackOK,TS val 1229815108 ecr 7446169], length 0
19:32:55.883589 IP 127.0.0.1.30326 > 127.0.0.1.80: Flags [.], ack 1, win 8960, options [nop,nop,TS val 7446169 ecr 1229815108], length 0
...

I did try to devise pf rules to redirect, or rdr and nat, but neither
worked. I am only vaguely familiar with ipfw, but some of my research
led me to believe it may be possible.

Thanks

P.S. I did post the same question earlier on the freebsd-pf list as
well:
http://lists.freebsd.org/pipermail/freebsd-pf/2011-January/005914.html
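[One possible shape for the missing reply-path rule, offered only as
an untested sketch: the rule number and port are illustrative, the
service is assumed to listen on 127.0.0.1:80 as in the apache test
above, and whether this interacts correctly with stunnel's IP_BINDANY
transparent sockets is exactly the open question here:

    # Forward replies coming FROM the local service back into the
    # local stack (where stunnel holds the transparent connection),
    # instead of routing them straight to the spoofed external source:
    ipfw add 100 fwd 127.0.0.1 tcp from 127.0.0.1 80 to any out

    # Note: on 8.x the kernel needs "options IPFIREWALL_FORWARD" for
    # fwd rules to take effect.
]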
Re: Supermicro Bladeserver
In article aanlktinkcvjy19nxwon_hfngae9etebtpg=szff2w...@mail.gmail.com,
Jack Vogel jfvo...@gmail.com writes:
> I am trying to track down a problem being experienced at icir.org
> using SuperMicro bladeservers: the SERDES 82575 interfaces are having
> connectivity, or perhaps autoneg, problems, resulting in link
> transitions and watchdog resets.
> [...]

My machine has the following em(4) device, and it has an autoneg
problem. When I was using an 8-stable kernel from 2010/11/01, it had
no problem. But after updating to 8-stable from 2010/12/01, the link
only comes up at 10M.

e...@pci0:0:25:0: class=0x02 card=0x13d510cf chip=0x104a8086 rev=0x02 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82566DM Gigabit Network Connection'
    class = network
    subclass = ethernet

---
TAKAHASHI Yoshihiro n...@freebsd.org