On Friday, February 15, 2013 11:31:11 pm Marc Fournier wrote:
Trying the patch now … but what do you mean by using 'SIGSTOP'? I generally
do a 'kill -HUP' then when that doesn't work 'kill -9' … should I use -STOP
instead of 9?
No. This patch only helps if you are using kill -STOP to pause
2days, 6hrs since reboot with new kernel, server shows unreachable:
# ssh mercury
ssh_exchange_identification: Connection closed by remote host
although ruptime shows it is up:
mercury up 2+06:17, 0 users, load 0.63, 0.69, 0.70
Remote console shows:
I could press
According to /var/log/messages, everything seems to have been running (at least
against the local file system) up until the reboot:
===
Feb 18 12:00:00 mercury kernel: bce1: promiscuous mode disabled
Feb 18 12:00:00 mercury kernel: bce1: promiscuous mode enabled
Feb 18 12:13:55 mercury syslogd:
Marc Fournier wrote:
On 2013-02-15, at 7:21 AM, Rick Macklem rmack...@uoguelph.ca wrote:
Righto. Thanks jhb and kib for looking at this.
Btw John, PBDRY still gets set for sleeps in the sys/rpc code.
However,
as far as I can tell, it just sets TDF_SBDRY when it is already set
and
On 2013-02-15, at 7:21 AM, Rick Macklem rmack...@uoguelph.ca wrote:
Righto. Thanks jhb and kib for looking at this.
Btw John, PBDRY still gets set for sleeps in the sys/rpc code. However,
as far as I can tell, it just sets TDF_SBDRY when it is already set
and seems harmless. (Since this
On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote:
Marc Fournier wrote:
On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote:
The pid that is in T state for the ps auxlH.
Different server, last kernel update on Jan 22nd, https process this
time instead
On Fri, Feb 15, 2013 at 08:44:43AM -0500, John Baldwin wrote:
On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote:
Marc Fournier wrote:
On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote:
The pid that is in T state for the ps auxlH.
Different
Konstantin Belousov wrote:
On Fri, Feb 15, 2013 at 08:44:43AM -0500, John Baldwin wrote:
On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote:
Marc Fournier wrote:
On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca
wrote:
The pid that is in T state
On Friday, February 15, 2013 10:21:11 am Rick Macklem wrote:
Konstantin Belousov wrote:
On Fri, Feb 15, 2013 at 08:44:43AM -0500, John Baldwin wrote:
On Thursday, February 14, 2013 10:05:56 pm Rick Macklem wrote:
Marc Fournier wrote:
On 2013-02-13, at 3:54 PM, Rick Macklem
Trying the patch now … but what do you mean by using 'SIGSTOP'? I generally do
a 'kill -HUP' then when that doesn't work 'kill -9' … should I use -STOP instead
of 9?
On 2013-02-15, at 5:44 AM, John Baldwin j...@freebsd.org wrote:
I think this is the right idea, but in HEAD with the
Marc Fournier wrote:
On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote:
The pid that is in T state for the ps auxlH.
Different server, last kernel update on Jan 22nd, https process this
time instead of du last time.
I've attached:
ps auxlH
ps auxlH of just the
On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote:
Btw Marc, if you just want this problem to go away, I suspect getting rid
of the intr mount option would do that.
Am more interested in fixing the problem (if possible) than just masking it,
but ...
Based on the man page
Marc Fournier wrote:
On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote:
Btw Marc, if you just want this problem to go away, I suspect
getting rid
of the intr mount option would do that.
Am more interested in fixing the problem (if possible) than just
masking it, but
On 2013-02-14, at 16:24 , Rick Macklem rmack...@uoguelph.ca wrote:
Marc Fournier wrote:
On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote:
Btw Marc, if you just want this problem to go away, I suspect
getting rid
of the intr mount option would do that.
Am more
Marc Fournier wrote:
On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote:
The pid that is in T state for the ps auxlH.
Different server, last kernel update on Jan 22nd, https process this
time instead of du last time.
I've attached:
ps auxlH
ps auxlH of just the
Marc Fournier wrote:
On 2013-02-14, at 16:24 , Rick Macklem rmack...@uoguelph.ca wrote:
Marc Fournier wrote:
On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca
wrote:
Btw Marc, if you just want this problem to go away, I suspect
getting rid
of the intr mount option
On Tue, Feb 12, 2013 at 08:50:39PM -0500, Rick Macklem wrote:
Marc Fournier wrote:
Just reset server, so any further details will have to be 'next time'
… but, just did a csup and am rebuilding … the following three files
were modified since last build:
grep nfs /tmp/output
Edit
Konstantin Belousov wrote:
On Tue, Feb 12, 2013 at 08:50:39PM -0500, Rick Macklem wrote:
Marc Fournier wrote:
Just reset server, so any further details will have to be 'next
time'
… but, just did a csup and am rebuilding … the following three
files
were modified since last
On 2013-02-13, at 14:50 , Rick Macklem rmack...@uoguelph.ca wrote:
He does get the odd error reported by nfs_getpages() and I don't
think we've isolated why yet. The error is 13 (EACCES), but jhb@
thought it might be because of the bug he fixed where the krpc
reported EACCES for the EINTR
On Wed, Feb 13, 2013 at 05:50:13PM -0500, Rick Macklem wrote:
I got it resent from him. I've attached it to this post, just in case you
are interested in taking a look at it.
I do not see the voffset wchains surprising. All of them seem to occur
in the multithreading process. The usual reason
On 2013-02-13, at 15:16 , Konstantin Belousov kostik...@gmail.com wrote:
On Wed, Feb 13, 2013 at 05:50:13PM -0500, Rick Macklem wrote:
I got it resent from him. I've attached it to this post, just in case you
are interested in taking a look at it.
I do not see the voffset wchains
Marc Fournier wrote:
On 2013-02-13, at 14:50 , Rick Macklem rmack...@uoguelph.ca wrote:
He does get the odd error reported by nfs_getpages() and I don't
think we've isolated why yet. The error is 13 (EACCES), but jhb@
thought it might be because of the bug he fixed where the krpc
On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote:
The pid that is in T state for the ps auxlH.
Different server, last kernel update on Jan 22nd, https process this time
instead of du last time.
I've attached:
ps auxlH
ps auxlH of just the processes that are in TJ
Note that checking the console, there are no errors pertaining to this on it …
On 2013-02-13, at 9:26 PM, Marc Fournier scra...@hub.org wrote:
On 2013-02-13, at 3:54 PM, Rick Macklem rmack...@uoguelph.ca wrote:
The pid that is in T state for the ps auxlH.
Different server, last
I don't know if this provides any benefit, but I just shut down all the VPSs on
that server, so that all the 'noise' is removed from the ps listing, which I've
attached …
On 2013-02-13, at 9:31 PM, Marc Fournier scra...@hub.org wrote:
Note that checking the console, there are no errors
Marc Fournier wrote:
Just reset server, so any further details will have to be 'next time'
… but, just did a csup and am rebuilding … the following three files
were modified since last build:
grep nfs /tmp/output
Edit src/sys/fs/nfs/nfs_commonsubs.c
Edit src/sys/fs/nfsclient/nfs_clrpcops.c
Marc Fournier wrote:
Hi John …
Does this help?
root@io:~ # ps auxl | grep du
root 1054 0.0 0.1 16176 6600 ?? D 3:15AM 0:05.38 du -skx /vm/2799 0
81426 0 20 0 newnfs
root 12353 0.0 0.1 16176 5104 ?? D Sat03AM 0:05.41 du -skx /vm/2799 0
91597 0 20 0 newnfs
root 64529 0.0 0.1 16176 5164
On 2013-02-10, at 4:31 PM, Rick Macklem rmack...@uoguelph.ca wrote:
Marc Fournier wrote:
Hi John …
Does this help?
root@io:~ # ps auxl | grep du
root 1054 0.0 0.1 16176 6600 ?? D 3:15AM 0:05.38 du -skx /vm/2799 0
81426 0 20 0 newnfs
root 12353 0.0 0.1 16176 5104 ?? D Sat03AM 0:05.41
Just reset server, so any further details will have to be 'next time' … but,
just did a csup and am rebuilding … the following three files were modified
since last build:
grep nfs /tmp/output
Edit src/sys/fs/nfs/nfs_commonsubs.c
Edit src/sys/fs/nfsclient/nfs_clrpcops.c
Edit
Hi John …
Does this help?
root@io:~ # ps auxl | grep du
root 1054 0.0 0.1 16176 6600 ?? D 3:15AM 0:05.38 du -skx
/vm/2799 0 81426 0 20 0 newnfs
root 12353 0.0 0.1 16176 5104 ?? D Sat03AM 0:05.41 du -skx
/vm/2799 0 91597 0 20 0 newnfs
root
Thanks …
# procstat -kk 64529
PID TID COMM TDNAME KSTACK
64529 100963 du - mi_switch+0x186 sleepq_wait+0x42
__lockmgr_args+0x5cb nfs_lock1+0x4a VOP_LOCK1_APV+0x46 _vn_lock+0x47 vget+0x70
cache_lookup_times+0x54f
On Sunday, January 20, 2013 01:10:29 AM Hub- Marketing wrote:
On 2013-01-19, at 4:57 AM, John Baldwin j...@freebsd.org wrote:
On Tuesday, December 18, 2012 11:58:36 PM Hub- Marketing wrote:
I'm running a few servers sitting on top of a NetAPP file server …
everything runs great, but
Yup, saw those commits … am going through the servers and doing upgrades on them
… will report on any issues post-upgrade …
thx
On 2013-01-20, at 6:47 AM, John Baldwin j...@freebsd.org wrote:
On Sunday, January 20, 2013 01:10:29 AM Hub- Marketing wrote:
On 2013-01-19, at 4:57 AM, John
On Tuesday, December 18, 2012 11:58:36 PM Hub- Marketing wrote:
I'm running a few servers sitting on top of a NetAPP file server …
everything runs great, but periodically I'm getting:
nfs_getpages: error 13
vm_fault: pager read error, pid 11355 (https)
Are you using interruptible mounts
On 2013-01-19, at 4:57 AM, John Baldwin j...@freebsd.org wrote:
On Tuesday, December 18, 2012 11:58:36 PM Hub- Marketing wrote:
I'm running a few servers sitting on top of a NetAPP file server …
everything runs great, but periodically I'm getting:
nfs_getpages: error 13
vm_fault: pager
Hub-Marketing wrote:
I'm running a few servers sitting on top of a NetAPP file server …
everything runs great, but periodically I'm getting:
nfs_getpages: error 13
vm_fault: pager read error, pid 11355 (https)
13 is EACCES. This message means that the Netapp server is
replying EACCES to a
I'm running a few servers sitting on top of a NetAPP file server … everything
runs great, but periodically I'm getting:
nfs_getpages: error 13
vm_fault: pager read error, pid 11355 (https)
errors on my screen … not always same pid … the annoying part is that it seems
to always affect the same