Re: 9-STABLE - NFS - NetAPP:

2013-02-19 Thread John Baldwin
On Friday, February 15, 2013 11:31:11 pm Marc Fournier wrote: Trying the patch now … but what do you mean by using 'SIGSTOP'? I generally do a 'kill -HUP' then when that doesn't work 'kill -9' … should I use -STOP instead of 9? No. This patch only helps if you are using kill -STOP to pause
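
For readers unsure how 'kill -STOP' differs from 'kill -9', here is a minimal sketch. It is only an illustration, not part of the patch under discussion. The helper name stop_and_resume and the sleeping child process are hypothetical. SIGSTOP pauses a process until it receives SIGCONT, while SIGKILL ends it outright.

```python
import os
import signal
import subprocess
import sys


def stop_and_resume():
    """Pause a throwaway child with SIGSTOP, resume it, then kill it."""
    # Hypothetical child: a Python process that does nothing but sleep.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        # Equivalent to `kill -STOP <pid>`: the child is paused, not ended.
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid report the stop instead of waiting for exit.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        # Equivalent to `kill -CONT <pid>`: the child runs again.
        os.kill(child.pid, signal.SIGCONT)
    finally:
        # Equivalent to `kill -9 <pid>`: the child is terminated.
        child.kill()
        child.wait()
    return stopped, child.returncode


if __name__ == "__main__":
    print(stop_and_resume())
```

A process paused this way is what ps reports in state T.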

Re: 9-STABLE - NFS - NetAPP:

2013-02-18 Thread Marc Fournier
2 days, 6 hrs since reboot with new kernel, server shows unreachable: # ssh mercury ssh_exchange_identification: Connection closed by remote host although runtime shows it is up: mercury up 2+06:17, 0 users, load 0.63, 0.69, 0.70 Remote console shows: I could press

Re: 9-STABLE - NFS - NetAPP:

2013-02-18 Thread Marc Fournier
According to /var/log/messages, everything seems to have been running (at least against the local file system) up until the reboot: === Feb 18 12:00:00 mercury kernel: bce1: promiscuous mode disabled Feb 18 12:00:00 mercury kernel: bce1: promiscuous mode enabled Feb 18 12:13:55 mercury syslogd:

Re: 9-STABLE - NFS - NetAPP:

2013-02-17 Thread Rick Macklem
Marc Fournier wrote: On 2013-02-15, at 7:21 AM, Rick Macklem rmack...@uoguelph.ca wrote: Righto. Thanks jhb and kib for looking at this. Btw John, PBDRY still gets set for sleeps in the sys/rpc code. However, as far as I can tell, it just sets TDF_SBDRY when it is already set and

Re: 9-STABLE - NFS - NetAPP:

2013-02-16 Thread Marc Fournier
On 2013-02-15, at 7:21 AM, Rick Macklem rmack...@uoguelph.ca wrote: Righto. Thanks jhb and kib for looking at this. Btw John, PBDRY still gets set for sleeps in the sys/rpc code. However, as far as I can tell, it just sets TDF_SBDRY when it is already set and seems harmless. (Since this

Re: 9-STABLE - NFS - NetAPP:

2013-02-15 Thread John Baldwin
On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote: Marc Fournier wrote: On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote: The pid that is in T state for the ps auxlH. Different server, last kernel update on Jan 22nd, https process this time instead

Re: 9-STABLE - NFS - NetAPP:

2013-02-15 Thread Konstantin Belousov
On Fri, Feb 15, 2013 at 08:44:43AM -0500, John Baldwin wrote: On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote: Marc Fournier wrote: On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote: The pid that is in T state for the ps auxlH. Different

Re: 9-STABLE - NFS - NetAPP:

2013-02-15 Thread Rick Macklem
Konstantin Belousov wrote: On Fri, Feb 15, 2013 at 08:44:43AM -0500, John Baldwin wrote: On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote: Marc Fournier wrote: On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote: The pid that is in T state

Re: 9-STABLE - NFS - NetAPP:

2013-02-15 Thread John Baldwin
On Friday, February 15, 2013 10:21:11 am Rick Macklem wrote: Konstantin Belousov wrote: On Fri, Feb 15, 2013 at 08:44:43AM -0500, John Baldwin wrote: On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote: Marc Fournier wrote: On 2013-02-13, at 3:54 PM, Rick Macklem

Re: 9-STABLE - NFS - NetAPP:

2013-02-15 Thread Marc Fournier
Trying the patch now … but what do you mean by using 'SIGSTOP'? I generally do a 'kill -HUP' then when that doesn't work 'kill -9' … should I use -STOP instead of 9? On 2013-02-15, at 5:44 AM, John Baldwin j...@freebsd.org wrote: I think this is the right idea, but in HEAD with the

Re: 9-STABLE - NFS - NetAPP:

2013-02-14 Thread Rick Macklem
Marc Fournier wrote: On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote: The pid that is in T state for the ps auxlH. Different server, last kernel update on Jan 22nd, https process this time instead of du last time. I've attached: ps auxlH ps auxlH of just the

Re: 9-STABLE - NFS - NetAPP:

2013-02-14 Thread Marc G. Fournier
On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote: Btw Marc, if you just want this problem to go away, I suspect getting rid of the intr mount option would do that. Am more interested in fixing the problem (if possible) than just masking it, but ... Based on the man page

Re: 9-STABLE - NFS - NetAPP:

2013-02-14 Thread Rick Macklem
Marc Fournier wrote: On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote: Btw Marc, if you just want this problem to go away, I suspect getting rid of the intr mount option would do that. Am more interested in fixing the problem (if possible) than just masking it, but

Re: 9-STABLE - NFS - NetAPP:

2013-02-14 Thread Marc G. Fournier
On 2013-02-14, at 16:24 , Rick Macklem rmack...@uoguelph.ca wrote: Marc Fournier wrote: On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote: Btw Marc, if you just want this problem to go away, I suspect getting rid of the intr mount option would do that. Am more

Re: 9-STABLE - NFS - NetAPP:

2013-02-14 Thread Rick Macklem
Marc Fournier wrote: On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote: The pid that is in T state for the ps auxlH. Different server, last kernel update on Jan 22nd, https process this time instead of du last time. I've attached: ps auxlH ps auxlH of just the

Re: 9-STABLE - NFS - NetAPP:

2013-02-14 Thread Rick Macklem
Marc Fournier wrote: On 2013-02-14, at 16:24 , Rick Macklem rmack...@uoguelph.ca wrote: Marc Fournier wrote: On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote: Btw Marc, if you just want this problem to go away, I suspect getting rid of the intr mount option

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Konstantin Belousov
On Tue, Feb 12, 2013 at 08:50:39PM -0500, Rick Macklem wrote: Marc Fournier wrote: Just reset server, so any further details will have to be 'next time' … but, just did a csup and am rebuilding … the following three files were modified since last build: grep nfs /tmp/output Edit

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Rick Macklem
Konstantin Belousov wrote: On Tue, Feb 12, 2013 at 08:50:39PM -0500, Rick Macklem wrote: Marc Fournier wrote: Just reset server, so any further details will have to be 'next time' … but, just did a csup and am rebuilding … the following three files were modified since last

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Marc G. Fournier
On 2013-02-13, at 14:50 , Rick Macklem rmack...@uoguelph.ca wrote: He does get the odd error reported by nfs_getpages() and I don't think we've isolated why yet. The error is 13 (EACCES), but jhb@ thought it might be because of the bug he fixed where the krpc reported EACCES for the EINTR

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Konstantin Belousov
On Wed, Feb 13, 2013 at 05:50:13PM -0500, Rick Macklem wrote: I got it resent from him. I've attached it to this post, just in case you are interested in taking a look at it. I do not see the voffset wchains surprising. All of them seem to occur in the multithreading process. The usual reason

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Marc G. Fournier
On 2013-02-13, at 15:16 , Konstantin Belousov kostik...@gmail.com wrote: On Wed, Feb 13, 2013 at 05:50:13PM -0500, Rick Macklem wrote: I got it resent from him. I've attached it to this post, just in case you are interested in taking a look at it. I do not see the voffset wchains

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Rick Macklem
Marc Fournier wrote: On 2013-02-13, at 14:50 , Rick Macklem rmack...@uoguelph.ca wrote: He does get the odd error reported by nfs_getpages() and I don't think we've isolated why yet. The error is 13 (EACCES), but jhb@ thought it might be because of the bug he fixed where the krpc

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Marc Fournier
On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote: The pid that is in T state for the ps auxlH. Different server, last kernel update on Jan 22nd, https process this time instead of du last time. I've attached: ps auxlH ps auxlH of just the processes that are in TJ

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Marc Fournier
Note that checking the console, there are no errors pertaining to this on it … On 2013-02-13, at 9:26 PM, Marc Fournier scra...@hub.org wrote: On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote: The pid that is in T state for the ps auxlH. Different server, last

Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Marc Fournier
I don't know if this provides any benefit, but I just shut down all the VPSs on that server, so that all the 'noise' is removed from the ps listing, which I've attached … On 2013-02-13, at 9:31 PM, Marc Fournier scra...@hub.org wrote: Note that checking the console, there are no errors

Re: 9-STABLE - NFS - NetAPP:

2013-02-12 Thread Rick Macklem
Marc Fournier wrote: Just reset server, so any further details will have to be 'next time' … but, just did a csup and am rebuilding … the following three files were modified since last build: grep nfs /tmp/output Edit src/sys/fs/nfs/nfs_commonsubs.c Edit src/sys/fs/nfsclient/nfs_clrpcops.c

Re: 9-STABLE - NFS - NetAPP:

2013-02-10 Thread Rick Macklem
Marc Fournier wrote: Hi John … Does this help? root@io:~ # ps auxl | grep du root 1054 0.0 0.1 16176 6600 ?? D 3:15AM 0:05.38 du -skx /vm/2799 0 81426 0 20 0 newnfs root 12353 0.0 0.1 16176 5104 ?? D Sat03AM 0:05.41 du -skx /vm/2799 0 91597 0 20 0 newnfs root 64529 0.0 0.1 16176 5164

Re: 9-STABLE - NFS - NetAPP:

2013-02-10 Thread Marc Fournier
On 2013-02-10, at 4:31 PM, Rick Macklem rmack...@uoguelph.ca wrote: Marc Fournier wrote: Hi John … Does this help? root@io:~ # ps auxl | grep du root 1054 0.0 0.1 16176 6600 ?? D 3:15AM 0:05.38 du -skx /vm/2799 0 81426 0 20 0 newnfs root 12353 0.0 0.1 16176 5104 ?? D Sat03AM 0:05.41

Re: 9-STABLE - NFS - NetAPP:

2013-02-10 Thread Marc Fournier
Just reset server, so any further details will have to be 'next time' … but, just did a csup and am rebuilding … the following three files were modified since last build: grep nfs /tmp/output Edit src/sys/fs/nfs/nfs_commonsubs.c Edit src/sys/fs/nfsclient/nfs_clrpcops.c Edit

Re: 9-STABLE - NFS - NetAPP:

2013-02-09 Thread Marc Fournier
Hi John … Does this help? root@io:~ # ps auxl | grep du root 1054 0.0 0.1 16176 6600 ?? D 3:15AM 0:05.38 du -skx /vm/2799 0 81426 0 20 0 newnfs root 12353 0.0 0.1 16176 5104 ?? D Sat03AM 0:05.41 du -skx /vm/2799 0 91597 0 20 0 newnfs root

Re: 9-STABLE - NFS - NetAPP:

2013-02-09 Thread Marc Fournier
Thanks … # procstat -kk 64529 PID TID COMM TDNAME KSTACK 64529 100963 du - mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5cb nfs_lock1+0x4a VOP_LOCK1_APV+0x46 _vn_lock+0x47 vget+0x70 cache_lookup_times+0x54f

Re: 9-STABLE - NFS - NetAPP:

2013-01-20 Thread John Baldwin
On Sunday, January 20, 2013 01:10:29 AM Hub- Marketing wrote: On 2013-01-19, at 4:57 AM, John Baldwin j...@freebsd.org wrote: On Tuesday, December 18, 2012 11:58:36 PM Hub- Marketing wrote: I'm running a few servers sitting on top of a NetAPP file server … everything runs great, but

Re: 9-STABLE - NFS - NetAPP:

2013-01-20 Thread Marc Fournier
Yup, saw those commits … am going through the servers and doing upgrades on them … will report on any issues post-upgrade … thx On 2013-01-20, at 6:47 AM, John Baldwin j...@freebsd.org wrote: On Sunday, January 20, 2013 01:10:29 AM Hub- Marketing wrote: On 2013-01-19, at 4:57 AM, John

Re: 9-STABLE - NFS - NetAPP:

2013-01-19 Thread John Baldwin
On Tuesday, December 18, 2012 11:58:36 PM Hub- Marketing wrote: I'm running a few servers sitting on top of a NetAPP file server … everything runs great, but periodically I'm getting: nfs_getpages: error 13 vm_fault: pager read error, pid 11355 (https) Are you using interruptible mounts

Re: 9-STABLE - NFS - NetAPP:

2013-01-19 Thread Hub- Marketing
On 2013-01-19, at 4:57 AM, John Baldwin j...@freebsd.org wrote: On Tuesday, December 18, 2012 11:58:36 PM Hub- Marketing wrote: I'm running a few servers sitting on top of a NetAPP file server … everything runs great, but periodically I'm getting: nfs_getpages: error 13 vm_fault: pager

Re: 9-STABLE - NFS - NetAPP:

2012-12-19 Thread Rick Macklem
Hub-Marketing wrote: I'm running a few servers sitting on top of a NetAPP file server … everything runs great, but periodically I'm getting: nfs_getpages: error 13 vm_fault: pager read error, pid 11355 (https) 13 is EACCES. This message means that the Netapp server is replying EACCES to a
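
As a quick way to check the errno mapping mentioned above, the following sketch looks up the symbolic name and message for a numeric error code. The helper name describe is made up for illustration. The values shown are those on FreeBSD and Linux, and other systems may number them differently.

```python
import errno
import os


def describe(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    return errno.errorcode[code], os.strerror(code)


if __name__ == "__main__":
    # 13 is the value printed by "nfs_getpages: error 13" in this thread;
    # EINTR is included because the thread also discusses interrupted RPCs.
    for code in (errno.EACCES, errno.EINTR):
        name, text = describe(code)
        print(code, name, text)
```

This only decodes the number; it says nothing about whether the NetApp server or the client's RPC layer produced the error.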

9-STABLE - NFS - NetAPP:

2012-12-18 Thread Hub- Marketing
I'm running a few servers sitting on top of a NetAPP file server … everything runs great, but periodically I'm getting: nfs_getpages: error 13 vm_fault: pager read error, pid 11355 (https) errors on my screen … not always same pid … the annoying part is that it seems to always affect the same