Re: kernel process [nfscl] high cpu

2017-10-15 Thread Rick Macklem
Well, a couple of comments:
1 - I have no idea if NFSv4 mounts (or any NFS mount for that matter)
 will work correctly within jails. (I don't do jails and know nothing
 about them. Or, at least very little about them.)
2 - I do know that the "nfsuserd" daemon is badly broken when
 jails are in use.
 - There are two things you can do about this.
   1 - Set vfs.nfsd.enable_stringtouid=1 on the server, and do not run
       the nfsuserd daemon on either the client or the server. (This will
       require a small change to /etc/rc.d/nfsd to avoid it being started
       at boot. See the /etc/rc.d/nfsd script in head/current. The change
       wasn't MFC'd because it was considered a POLA violation.)
       --> This causes the user/group strings to be just the numbers on the
           wire, so nfsuserd is no longer needed.
           (The Linux client will do this by default.)
       If it exists on the client, also set vfs.nfs.enable_uidtostring=1.
       (I think this is in stable/11, but not 11.1.)
   OR
   2 - Carry the patches in head/current as r320698, r320757 and
       r320758 over to the 11.1 sources and build the kernel plus
       nfsuserd from these patched sources. (These patches make
       the nfsuserd daemon use an AF_LOCAL socket, which allows it
       to work within a jail.)
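
For illustration only, option 1 could be sketched in sh roughly as below. Only the two sysctl names come from the text above; the helper name nfs_idmap_settings and the idea of printing lines suitable for /etc/sysctl.conf are assumptions made for this sketch.

```sh
#!/bin/sh
# Hypothetical helper: print the sysctl setting from option 1 for one role.
# Only the two sysctl names are taken from the message; the function name
# and the "role" argument are illustrative assumptions.
nfs_idmap_settings() {
    case "$1" in
        server) echo "vfs.nfsd.enable_stringtouid=1" ;;
        client) echo "vfs.nfs.enable_uidtostring=1" ;;
        *) echo "usage: nfs_idmap_settings server|client" >&2; return 1 ;;
    esac
}

# Show what would be set on each side.
nfs_idmap_settings server
nfs_idmap_settings client
```

On a real system each printed line could be passed to sysctl(8) as root. On the client it is probably worth checking first that the sysctl exists, since it may be missing on 11.1.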

As noted at the beginning, I know that nfsuserd breaks when jails
are in use, but I do not know what other NFSv4 related things break
within jails, so fixing the nfsuserd situation may not resolve your problems.

rick


From: Fabian Freyer <fabian.fre...@physik.tu-berlin.de>
Sent: Sunday, October 15, 2017 4:45:00 PM
To: freebsd-stable@freebsd.org
Cc: Rick Macklem; li...@searchy.net; z...@physik.tu-berlin.de; 
st...@physik.tu-berlin.de
Subject: Re: kernel process [nfscl] high cpu

Hi,

(I'm not on this list, so please CC me in future replies.
My apologies to those who get this message twice; I had a typo in the
To: header and had to re-send it to the list.)

Sorry for reviving such an old thread, but I've run into this problem
lately as well, on an 11.1-RELEASE-p1 jail host mounting NFSv4.0 mounts
into jails.

On 24.09.2015 23:17, Rick Macklem wrote:
> Frank de Bot wrote:
>> Rick Macklem wrote:
>>> Frank de Bot wrote:
>>>> Rick Macklem wrote:
>>>>> Frank de Bot wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
>>>>>> jail.
>>>>>> Because it's a server only to test, there is a low load. But the
>>>>>> [nfscl]
>>>>>> process is hogging a CPU after a while. This happens pretty fast,
>>>>>> within
>>>>>> 1 or 2 days. I'm noticing the high CPU of the process when I want to
>>>>>> do
>>>>>> some test after a little while (those 1 or 2 days).

Here's my ps ax | grep nfscl:
# ps ax | grep nfscl
1  -  DL  932:08.74 [nfscl]
11572  -  DL  442:27.42 [nfscl]
30396  -  DL  933:44.13 [nfscl]
35902  -  DL  442:08.70 [nfscl]
40881  -  DL  938:56.04 [nfscl]
43276  -  DL  932:38.88 [nfscl]
49178  -  DL  934:24.77 [nfscl]
56314  -  DL  935:21.55 [nfscl]
60085  -  DL  936:37.11 [nfscl]
71788  -  DL  933:10.96 [nfscl]
82001  -  DL  934:45.76 [nfscl]
86222  -  DL  931:42.94 [nfscl]
92353  -  DL 1186:53.38 [nfscl]
21105 20  S+     0:00.00 grep nfscl

And this is on a 12-core with Hyperthreading:
# uptime
7:28PM  up 11 days,  4:50, 4 users, load averages: 25.49, 21.91, 20.25

Most of this load is being generated by the nfscl threads.

>>>>>> My jail.conf look like:
>>>>>>
>>>>>> exec.start = "/bin/sh /etc/rc";
>>>>>> exec.stop = "/bin/sh /etc/rc.shutdown";
>>>>>> exec.clean;
>>>>>> mount.devfs;
>>>>>> exec.consolelog = "/var/log/jail.$name.log";
>>>>>> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
>>>>>>
>>>>>> test01 {
>>>>>> host.hostname = "test01_hosting";
>>>>>> ip4.addr = somepublicaddress;
>>>>>> ip4.addr += someprivateaddress;
>>>>>>
>>>>>> mount = "10.13.37.2:/tank/hostingbase  /opt/jails/test01
>>>>>>nfs nfsv4,minorversion=1,pnfs,ro,noatime0   0";
>>>>>> mount +=  "10.13.37.2:/tank/hosting/test
>>>>>> /opt/jails/test01/opt   nfs nfsv4,minorversion=1,pnfs,noatime
>>>>>>  0   0";
>>>>>>

Re: kernel process [nfscl] high cpu

2017-10-15 Thread Fabian Freyer
Hi,

(I'm not on this list, so please CC me in future replies.
My apologies to those who get this message twice; I had a typo in the
To: header and had to re-send it to the list.)

Sorry for reviving such an old thread, but I've run into this problem
lately as well, on an 11.1-RELEASE-p1 jail host mounting NFSv4.0 mounts
into jails.

On 24.09.2015 23:17, Rick Macklem wrote:
> Frank de Bot wrote:
>> Rick Macklem wrote:
>>> Frank de Bot wrote:
>>>> Rick Macklem wrote:
>>>>> Frank de Bot wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
>>>>>> jail.
>>>>>> Because it's a server only to test, there is a low load. But the
>>>>>> [nfscl]
>>>>>> process is hogging a CPU after a while. This happens pretty fast,
>>>>>> within
>>>>>> 1 or 2 days. I'm noticing the high CPU of the process when I want to
>>>>>> do
>>>>>> some test after a little while (those 1 or 2 days).

Here's my ps ax | grep nfscl:
# ps ax | grep nfscl
1  -  DL  932:08.74 [nfscl]
11572  -  DL  442:27.42 [nfscl]
30396  -  DL  933:44.13 [nfscl]
35902  -  DL  442:08.70 [nfscl]
40881  -  DL  938:56.04 [nfscl]
43276  -  DL  932:38.88 [nfscl]
49178  -  DL  934:24.77 [nfscl]
56314  -  DL  935:21.55 [nfscl]
60085  -  DL  936:37.11 [nfscl]
71788  -  DL  933:10.96 [nfscl]
82001  -  DL  934:45.76 [nfscl]
86222  -  DL  931:42.94 [nfscl]
92353  -  DL 1186:53.38 [nfscl]
21105 20  S+     0:00.00 grep nfscl

And this is on a 12-core with Hyperthreading:
# uptime
7:28PM  up 11 days,  4:50, 4 users, load averages: 25.49, 21.91, 20.25

Most of this load is being generated by the nfscl threads.
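
To put a number on that, the TIME column of the [nfscl] lines can be added up. The following is a minimal sketch, assuming the "ps ax" layout shown above (TIME as minutes:seconds in the second-to-last field); the function name nfscl_cpu_minutes is made up for this example.

```sh
#!/bin/sh
# Hypothetical helper: read "ps ax" output on stdin and sum the accumulated
# CPU time of all [nfscl] kernel threads.
# Assumes TIME is printed as minutes:seconds.hundredths, as in the listing
# above, and that the command name is the last field.
nfscl_cpu_minutes() {
    awk '$NF == "[nfscl]" {
        split($(NF - 1), t, ":")     # t[1] = minutes, t[2] = seconds
        total += t[1] + t[2] / 60
        n++
    }
    END { printf "%d threads, %.1f CPU minutes\n", n, total }'
}

# Usage on a live system (not run here):
#   ps ax | nfscl_cpu_minutes
```

Running it twice a few minutes apart and comparing the totals gives a rough idea of how much CPU the threads are burning right now, as opposed to since boot.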

>>>>>> My jail.conf look like:
>>>>>>
>>>>>> exec.start = "/bin/sh /etc/rc";
>>>>>> exec.stop = "/bin/sh /etc/rc.shutdown";
>>>>>> exec.clean;
>>>>>> mount.devfs;
>>>>>> exec.consolelog = "/var/log/jail.$name.log";
>>>>>> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
>>>>>>
>>>>>> test01 {
>>>>>> host.hostname = "test01_hosting";
>>>>>> ip4.addr = somepublicaddress;
>>>>>> ip4.addr += someprivateaddress;
>>>>>>
>>>>>> mount = "10.13.37.2:/tank/hostingbase  /opt/jails/test01
>>>>>> nfs nfsv4,minorversion=1,pnfs,ro,noatime 0 0";
>>>>>> mount +=  "10.13.37.2:/tank/hosting/test
>>>>>> /opt/jails/test01/opt   nfs nfsv4,minorversion=1,pnfs,noatime
>>>>>>  0   0";
>>>>>>
>>>>>> path = "/opt/jails/test01";
>>>>>> }
>>>>>>
>>>>>> Last test was with NFS 4.1, I also worked with NFS 4.(0) with the
>>>>>> same
>>>>>> result. In the readonly nfs share there are symbolic links point to
>>>>>> the
>>>>>> read-write share for logging, storing .run files, etc. When I monitor
>>>>>> my
>>>>>> network interface with tcpdump, there is little nfs traffic, only
>>>>>> when I
>>>>>> do try to access the shares there is activity.
>>>>>>
>>>>>> What is causing nfscl to run around in circles, hogging the CPU (it
>>>>>> makes the system slow to respond too) or how can I found out what's
>>>>>> the
>>>>>> cause?
>>>>>>
>>>>> Well, the nfscl does server->client RPCs referred to as callbacks. I
>>>>> have no idea what the implications of running it in a jail is, but I'd
>>>>> guess that these server->client RPCs get blocked somehow, etc...
>>>>> (The NFSv4.0 mechanism requires a separate IP address that the server
>>>>>  can connect to on the client. For NFSv4.1, it should use the same
>>>>>  TCP connection as is used for the client->server RPCs. The latter
>>>>>  seems like it should work, but there is probably some glitch.)
>>>>>
>>>>> ** Just run without the nfscl daemon (it is only needed for delegations
>>>>> or
>>>>> pNFS).

>>>> How can I disable the nfscl daemon?

>>> Well, the daemon for the callbacks is called nfscbd.
>>> You should check via "ps ax", to see if you have it running.
>>> (For NFSv4.0 you probably don't want it running, but for NFSv4.1 you
>>>  do need it. pNFS won't work at all without it, but unless you have a
>>>  server that supports pNFS, it won't work anyhow. Unless your server is
>>>  a clustered Netapp Filer, you should probably not have the "pnfs" option.)
>>>
>>> To run the "nfscbd" daemon you can set:
>>> nfscbd_enable="TRUE"
>>> in your /etc/rc.conf will start it on boot.
>>> Alternately, just type "nfscbd" as root.
>>>
>>> The "nfscl" thread is always started when an NFSv4 mount is done. It does
>>> an assortment of housekeeping things, including a Renew op to make sure the
>>> lease doesn't expire. If for some reason the jail blocks these Renew RPCs,
>>> it will try to do them over and over and ... because having the lease
>>> expire is bad news for NFSv4. How could you tell?
>>> Well, capturing packets between the client and server, then looking at them
>>> in wireshark is probably the only way. (Or maybe a large count for Renew
>>> in the output from "nfsstat -e".)
>>>
>>> "nfscbd" is optional for NFSv4.0. Without it, you simply don't do
>>> callbacks/delegations.
>>> 

Re: kernel process [nfscl] high cpu

2015-12-15 Thread Rick Macklem
Frank de Bot wrote:
> Rick Macklem wrote:
> > Frank de Bot wrote:
> >> Rick Macklem wrote:
> >>> Frank de Bot wrote:
> >>>> Hi,
> >>>>
> >>>> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
> >>>> jail.
> >>>> Because it's a server only to test, there is a low load. But the
> >>>> [nfscl]
> >>>> process is hogging a CPU after a while. This happens pretty fast,
> >>>> within
> >>>> 1 or 2 days. I'm noticing the high CPU of the process when I want to
> >>>> do
> >>>> some test after a little while (those 1 or 2 days).
> >>>>
> >>>> My jail.conf look like:
> >>>>
> >>>> exec.start = "/bin/sh /etc/rc";
> >>>> exec.stop = "/bin/sh /etc/rc.shutdown";
> >>>> exec.clean;
> >>>> mount.devfs;
> >>>> exec.consolelog = "/var/log/jail.$name.log";
> >>>> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
> >>>>
> >>>> test01 {
> >>>> host.hostname = "test01_hosting";
> >>>> ip4.addr = somepublicaddress;
> >>>> ip4.addr += someprivateaddress;
> >>>>
> >>>> mount = "10.13.37.2:/tank/hostingbase  /opt/jails/test01
> >>>> nfs nfsv4,minorversion=1,pnfs,ro,noatime 0 0";
> >>>> mount +=  "10.13.37.2:/tank/hosting/test
> >>>> /opt/jails/test01/opt   nfs nfsv4,minorversion=1,pnfs,noatime
> >>>>  0   0";
> >>>>
> >>>> path = "/opt/jails/test01";
> >>>> }
> >>>>
> >>>> Last test was with NFS 4.1, I also worked with NFS 4.(0) with the
> >>>> same
> >>>> result. In the readonly nfs share there are symbolic links point to
> >>>> the
> >>>> read-write share for logging, storing .run files, etc. When I monitor
> >>>> my
> >>>> network interface with tcpdump, there is little nfs traffic, only
> >>>> when I
> >>>> do try to access the shares there is activity.
> >>>>
> >>>> What is causing nfscl to run around in circles, hogging the CPU (it
> >>>> makes the system slow to respond too) or how can I found out what's
> >>>> the
> >>>> cause?
> 
> >>> Well, the nfscl does server->client RPCs referred to as callbacks. I
> >>> have no idea what the implications of running it in a jail is, but I'd
> >>> guess that these server->client RPCs get blocked somehow, etc...
> >>> (The NFSv4.0 mechanism requires a separate IP address that the server
> >>>  can connect to on the client. For NFSv4.1, it should use the same
> >>>  TCP connection as is used for the client->server RPCs. The latter
> >>>  seems like it should work, but there is probably some glitch.)
> >>>
> >>> ** Just run without the nfscl daemon (it is only needed for delegations
> >>> or
> >>> pNFS).
> >>
> >> How can I disable the nfscl daemon?
> >>
> > Well, the daemon for the callbacks is called nfscbd.
> > You should check via "ps ax", to see if you have it running.
> > (For NFSv4.0 you probably don't want it running, but for NFSv4.1 you
> >  do need it. pNFS won't work at all without it, but unless you have a
> >  server that supports pNFS, it won't work anyhow. Unless your server is
> >  a clustered Netapp Filer, you should probably not have the "pnfs" option.)
> > 
> > To run the "nfscbd" daemon you can set:
> > nfscbd_enable="TRUE"
> > in your /etc/rc.conf will start it on boot.
> > Alternately, just type "nfscbd" as root.
> > 
> > The "nfscl" thread is always started when an NFSv4 mount is done. It does
> > an assortment of housekeeping things, including a Renew op to make sure the
> > lease doesn't expire. If for some reason the jail blocks these Renew RPCs,
> > it will try to do them over and over and ... because having the lease
> > expire is bad news for NFSv4. How could you tell?
> > Well, capturing packets between the client and server, then looking at them
> > in wireshark is probably the only way. (Or maybe a large count for Renew
> > in the output from "nfsstat -e".)
> > 
> > "nfscbd" is optional for NFSv4.0. Without it, you simply don't do
> > callbacks/delegations.
> > For NFSv4.1 it is pretty much required, but doesn't need a separate
> > server->client TCP
> > connection.
> > --> I'd enable it for NFSv4.1, but disable it for NFSv4.0 at least as a
> > starting point.
> > 
> > And as I said before, none of this is tested within jails, so I have no
> > idea
> > what effect the jails have. Someone who understands jails might have some
> > insight
> > w.r.t. this?
> > 
> > rick
> > 
> 
> Since last time I haven't tried to use pnfs and just sticked with
> nfsv4.0. nfscbd is not running. The server is now running 10.2. The
> number of renews is not very high (56k, getattr is for example 283M)
> View with wireshark, renew calls look good ,the nfs status is ok.
> 
> Is there a way to know what [nfscl] is active with?
> 
> I do understand nfs + jails could have issues, but I like to understand
> them.
> 
It is conceivable that this high load is caused by the problem identified in
PR#205193, where jails can't talk to the nfsuserd because 127.0.0.1 gets
translated to another ip address for the machine.

The attached patches are the same ones as in the PR, which change 

Re: kernel process [nfscl] high cpu

2015-09-24 Thread Frank de Bot
Rick Macklem wrote:
> Frank de Bot wrote:
>> Hi,
>>
>> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
>> jail.
>> Because it's a server only to test, there is a low load. But the
>> [nfscl]
>> process is hogging a CPU after a while. This happens pretty fast,
>> within
>> 1 or 2 days. I'm noticing the high CPU of the process when I want to
>> do
>> some test after a little while (those 1 or 2 days).
>>
>> My jail.conf look like:
>>
>> exec.start = "/bin/sh /etc/rc";
>> exec.stop = "/bin/sh /etc/rc.shutdown";
>> exec.clean;
>> mount.devfs;
>> exec.consolelog = "/var/log/jail.$name.log";
>> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
>>
>> test01 {
>> host.hostname = "test01_hosting";
>> ip4.addr = somepublicaddress;
>> ip4.addr += someprivateaddress;
>>
>> mount = "10.13.37.2:/tank/hostingbase  /opt/jails/test01
>>nfs nfsv4,minorversion=1,pnfs,ro,noatime0   0";
>> mount +=  "10.13.37.2:/tank/hosting/test
>> /opt/jails/test01/opt   nfs nfsv4,minorversion=1,pnfs,noatime
>>  0   0";
>>
>> path = "/opt/jails/test01";
>> }
>>
>> Last test was with NFS 4.1, I also worked with NFS 4.(0) with the
>> same
>> result. In the readonly nfs share there are symbolic links point to
>> the
>> read-write share for logging, storing .run files, etc. When I monitor
>> my
>> network interface with tcpdump, there is little nfs traffic, only
>> when I
>> do try to access the shares there is activity.
>>
>> What is causing nfscl to run around in circles, hogging the CPU (it
>> makes the system slow to respond too) or how can I found out what's
>> the
>> cause?
>>
> Well, the nfscl does server->client RPCs referred to as callbacks. I
> have no idea what the implications of running it in a jail is, but I'd
> guess that these server->client RPCs get blocked somehow, etc...
> (The NFSv4.0 mechanism requires a separate IP address that the server
>  can connect to on the client. For NFSv4.1, it should use the same
>  TCP connection as is used for the client->server RPCs. The latter
>  seems like it should work, but there is probably some glitch.)
> 
> ** Just run without the nfscl daemon (it is only needed for delegations or 
> pNFS).

How can I disable the nfscl daemon?


> 
> Since a big Netapp filer (the cluster ones) are about the only servers
> that currently support pNFS (no FreeBSD server support yet), you can
> probably forget about pNFS (I'd get rid of the "pnfs" mount option).
> It also won't work unless this callback path is working.
> 
> As for delegations, they aren't required for NFSv4.[0-1] to work correctly
> and aren't enabled by default on the FreeBSD server.
> --> Running without the nfscl daemon will just ensure no delegations
> are issued, even if enabled on the server.
> 
> rick
> 
>>
>> Regards,
>>
>> Frank de Bot
>> ___
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to
>> "freebsd-stable-unsubscr...@freebsd.org"
>>
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
> 

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: kernel process [nfscl] high cpu

2015-09-24 Thread Frank de Bot
Rick Macklem wrote:
> Frank de Bot wrote:
>> Rick Macklem wrote:
>>> Frank de Bot wrote:
>>>> Hi,
>>>>
>>>> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
>>>> jail.
>>>> Because it's a server only to test, there is a low load. But the
>>>> [nfscl]
>>>> process is hogging a CPU after a while. This happens pretty fast,
>>>> within
>>>> 1 or 2 days. I'm noticing the high CPU of the process when I want to
>>>> do
>>>> some test after a little while (those 1 or 2 days).
>>>>
>>>> My jail.conf look like:
>>>>
>>>> exec.start = "/bin/sh /etc/rc";
>>>> exec.stop = "/bin/sh /etc/rc.shutdown";
>>>> exec.clean;
>>>> mount.devfs;
>>>> exec.consolelog = "/var/log/jail.$name.log";
>>>> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
>>>>
>>>> test01 {
>>>> host.hostname = "test01_hosting";
>>>> ip4.addr = somepublicaddress;
>>>> ip4.addr += someprivateaddress;
>>>>
>>>> mount = "10.13.37.2:/tank/hostingbase  /opt/jails/test01
>>>> nfs nfsv4,minorversion=1,pnfs,ro,noatime 0 0";
>>>> mount +=  "10.13.37.2:/tank/hosting/test
>>>> /opt/jails/test01/opt   nfs nfsv4,minorversion=1,pnfs,noatime
>>>>  0   0";
>>>>
>>>> path = "/opt/jails/test01";
>>>> }
>>>>
>>>> Last test was with NFS 4.1, I also worked with NFS 4.(0) with the
>>>> same
>>>> result. In the readonly nfs share there are symbolic links point to
>>>> the
>>>> read-write share for logging, storing .run files, etc. When I monitor
>>>> my
>>>> network interface with tcpdump, there is little nfs traffic, only
>>>> when I
>>>> do try to access the shares there is activity.
>>>>
>>>> What is causing nfscl to run around in circles, hogging the CPU (it
>>>> makes the system slow to respond too) or how can I found out what's
>>>> the
>>>> cause?
>>>>
>>> Well, the nfscl does server->client RPCs referred to as callbacks. I
>>> have no idea what the implications of running it in a jail is, but I'd
>>> guess that these server->client RPCs get blocked somehow, etc...
>>> (The NFSv4.0 mechanism requires a separate IP address that the server
>>>  can connect to on the client. For NFSv4.1, it should use the same
>>>  TCP connection as is used for the client->server RPCs. The latter
>>>  seems like it should work, but there is probably some glitch.)
>>>
>>> ** Just run without the nfscl daemon (it is only needed for delegations or
>>> pNFS).
>>
>> How can I disable the nfscl daemon?
>>
> Well, the daemon for the callbacks is called nfscbd.
> You should check via "ps ax", to see if you have it running.
> (For NFSv4.0 you probably don't want it running, but for NFSv4.1 you
>  do need it. pNFS won't work at all without it, but unless you have a
>  server that supports pNFS, it won't work anyhow. Unless your server is
>  a clustered Netapp Filer, you should probably not have the "pnfs" option.)
> 
> To run the "nfscbd" daemon you can set:
> nfscbd_enable="TRUE"
> in your /etc/rc.conf will start it on boot.
> Alternately, just type "nfscbd" as root.
> 
> The "nfscl" thread is always started when an NFSv4 mount is done. It does
> an assortment of housekeeping things, including a Renew op to make sure the
> lease doesn't expire. If for some reason the jail blocks these Renew RPCs,
> it will try to do them over and over and ... because having the lease
> expire is bad news for NFSv4. How could you tell?
> Well, capturing packets between the client and server, then looking at them
> in wireshark is probably the only way. (Or maybe a large count for Renew
> in the output from "nfsstat -e".)
> 
> "nfscbd" is optional for NFSv4.0. Without it, you simply don't do 
> callbacks/delegations.
> For NFSv4.1 it is pretty much required, but doesn't need a separate 
> server->client TCP
> connection.
> --> I'd enable it for NFSv4.1, but disable it for NFSv4.0 at least as a 
> starting point.
> 
> And as I said before, none of this is tested within jails, so I have no idea
> what effect the jails have. Someone who understands jails might have some 
> insight
> w.r.t. this?
> 
> rick
> 

Since last time I haven't tried to use pnfs and have just stuck with
nfsv4.0. nfscbd is not running. The server is now running 10.2. The
number of renews is not very high (56k; getattr, for example, is at 283M).
Viewed with wireshark, the renew calls look good and the nfs status is ok.

Is there a way to find out what [nfscl] is busy with?

I do understand nfs + jails could have issues, but I'd like to understand
them.


Frank



Re: kernel process [nfscl] high cpu

2015-09-24 Thread Rick Macklem
Frank de Bot wrote:
> Rick Macklem wrote:
> > Frank de Bot wrote:
> >> Hi,
> >>
> >> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
> >> jail.
> >> Because it's a server only to test, there is a low load. But the
> >> [nfscl]
> >> process is hogging a CPU after a while. This happens pretty fast,
> >> within
> >> 1 or 2 days. I'm noticing the high CPU of the process when I want to
> >> do
> >> some test after a little while (those 1 or 2 days).
> >>
> >> My jail.conf look like:
> >>
> >> exec.start = "/bin/sh /etc/rc";
> >> exec.stop = "/bin/sh /etc/rc.shutdown";
> >> exec.clean;
> >> mount.devfs;
> >> exec.consolelog = "/var/log/jail.$name.log";
> >> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
> >>
> >> test01 {
> >> host.hostname = "test01_hosting";
> >> ip4.addr = somepublicaddress;
> >> ip4.addr += someprivateaddress;
> >>
> >> mount = "10.13.37.2:/tank/hostingbase  /opt/jails/test01
> >>nfs nfsv4,minorversion=1,pnfs,ro,noatime0   0";
> >> mount +=  "10.13.37.2:/tank/hosting/test
> >> /opt/jails/test01/opt   nfs nfsv4,minorversion=1,pnfs,noatime
> >>  0   0";
> >>
> >> path = "/opt/jails/test01";
> >> }
> >>
> >> Last test was with NFS 4.1, I also worked with NFS 4.(0) with the
> >> same
> >> result. In the readonly nfs share there are symbolic links point to
> >> the
> >> read-write share for logging, storing .run files, etc. When I monitor
> >> my
> >> network interface with tcpdump, there is little nfs traffic, only
> >> when I
> >> do try to access the shares there is activity.
> >>
> >> What is causing nfscl to run around in circles, hogging the CPU (it
> >> makes the system slow to respond too) or how can I found out what's
> >> the
> >> cause?
> >>
> > Well, the nfscl does server->client RPCs referred to as callbacks. I
> > have no idea what the implications of running it in a jail is, but I'd
> > guess that these server->client RPCs get blocked somehow, etc...
> > (The NFSv4.0 mechanism requires a separate IP address that the server
> >  can connect to on the client. For NFSv4.1, it should use the same
> >  TCP connection as is used for the client->server RPCs. The latter
> >  seems like it should work, but there is probably some glitch.)
> > 
> > ** Just run without the nfscl daemon (it is only needed for delegations or
> > pNFS).
> 
> How can I disable the nfscl daemon?
> 
Well, the daemon for the callbacks is called nfscbd.
You should check via "ps ax", to see if you have it running.
(For NFSv4.0 you probably don't want it running, but for NFSv4.1 you
 do need it. pNFS won't work at all without it, but unless you have a
 server that supports pNFS, it won't work anyhow. Unless your server is
 a clustered Netapp Filer, you should probably not have the "pnfs" option.)

To run the "nfscbd" daemon, set:
nfscbd_enable="TRUE"
in your /etc/rc.conf, which will start it on boot.
Alternatively, just type "nfscbd" as root.
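
As a rough sketch, whether an rc.conf-style file enables nfscbd could be checked like this. The function name nfscbd_enabled is hypothetical, and the accepted values (YES, TRUE, ON, 1, case-insensitive) are what I understand rc.subr's checkyesno to accept; the parsing is deliberately simple and ignores lines with trailing comments.

```sh
#!/bin/sh
# Hypothetical helper: succeed if nfscbd_enable is switched on in the given
# rc.conf-style file (default /etc/rc.conf).
nfscbd_enabled() {
    conf=${1:-/etc/rc.conf}
    # Take the value of the last nfscbd_enable= line, with or without
    # double quotes. Lines with trailing comments are not matched.
    val=$(sed -n 's/^nfscbd_enable="\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' \
        "$conf" 2>/dev/null | tail -n 1)
    case "$val" in
        [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live system (not run here):
#   nfscbd_enabled && echo "nfscbd will be started at boot"
```

On FreeBSD, "sysrc nfscbd_enable" should report the same setting without any hand-written parsing.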

The "nfscl" thread is always started when an NFSv4 mount is done. It does
an assortment of housekeeping things, including a Renew op to make sure the
lease doesn't expire. If for some reason the jail blocks these Renew RPCs,
it will try to do them over and over and ... because having the lease
expire is bad news for NFSv4. How could you tell?
Well, capturing packets between the client and server, then looking at them
in wireshark is probably the only way. (Or maybe a large count for Renew
in the output from "nfsstat -e".)
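
A capture for wireshark could be set up along these lines. This is only a sketch: the helper name nfs_capture_cmd, the interface name and the output file are placeholders, and it assumes the server listens on the standard NFS port 2049. It prints the tcpdump command rather than running it, so it can be reviewed first.

```sh
#!/bin/sh
# Hypothetical helper: print a tcpdump(1) command line that writes full
# packets exchanged with one NFS server to a file wireshark can open.
# Assumptions: helper name, argument order, and the standard NFS port 2049.
nfs_capture_cmd() {
    iface=$1
    server=$2
    outfile=$3
    printf 'tcpdump -i %s -s 0 -w %s host %s and port 2049\n' \
        "$iface" "$outfile" "$server"
}

# Example using the server address from the jail.conf quoted in this thread;
# "em0" and "/tmp/nfs.pcap" are placeholders.
nfs_capture_cmd em0 10.13.37.2 /tmp/nfs.pcap
```

The printed command would be run as root on the client while the [nfscl] thread is busy, and the resulting file inspected for repeated Renew operations.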

"nfscbd" is optional for NFSv4.0. Without it, you simply don't do 
callbacks/delegations.
For NFSv4.1 it is pretty much required, but doesn't need a separate
server->client TCP connection.
--> I'd enable it for NFSv4.1, but disable it for NFSv4.0, at least as a
starting point.

And as I said before, none of this is tested within jails, so I have no idea
what effect the jails have. Someone who understands jails might have some
insight w.r.t. this?

rick

> 
> > 
> > Since a big Netapp filer (the cluster ones) are about the only servers
> > that currently support pNFS (no FreeBSD server support yet), you can
> > probably forget about pNFS (I'd get rid of the "pnfs" mount option).
> > It also won't work unless this callback path is working.
> > 
> > As for delegations, they aren't required for NFSv4.[0-1] to work correctly
> > and aren't enabled by default on the FreeBSD server.
> > --> Running without the nfscl daemon will just ensure no delegations
> > are issued, even if enabled on the server.
> > 
> > rick
> > 
> >>
> >> Regards,
> >>
> >> Frank de Bot
> >> ___
> >> freebsd-stable@freebsd.org mailing list
> >> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> >> To unsubscribe, send any mail to
> >> "freebsd-stable-unsubscr...@freebsd.org"
> >>
> > ___
> > freebsd-stable@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable

Re: kernel process [nfscl] high cpu

2015-09-24 Thread Rick Macklem
Frank de Bot wrote:
> Rick Macklem wrote:
> > Frank de Bot wrote:
> >> Rick Macklem wrote:
> >>> Frank de Bot wrote:
> >>>> Hi,
> >>>>
> >>>> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
> >>>> jail.
> >>>> Because it's a server only to test, there is a low load. But the
> >>>> [nfscl]
> >>>> process is hogging a CPU after a while. This happens pretty fast,
> >>>> within
> >>>> 1 or 2 days. I'm noticing the high CPU of the process when I want to
> >>>> do
> >>>> some test after a little while (those 1 or 2 days).
> >>>>
> >>>> My jail.conf look like:
> >>>>
> >>>> exec.start = "/bin/sh /etc/rc";
> >>>> exec.stop = "/bin/sh /etc/rc.shutdown";
> >>>> exec.clean;
> >>>> mount.devfs;
> >>>> exec.consolelog = "/var/log/jail.$name.log";
> >>>> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
> >>>>
> >>>> test01 {
> >>>> host.hostname = "test01_hosting";
> >>>> ip4.addr = somepublicaddress;
> >>>> ip4.addr += someprivateaddress;
> >>>>
> >>>> mount = "10.13.37.2:/tank/hostingbase  /opt/jails/test01
> >>>> nfs nfsv4,minorversion=1,pnfs,ro,noatime 0 0";
> >>>> mount +=  "10.13.37.2:/tank/hosting/test
> >>>> /opt/jails/test01/opt   nfs nfsv4,minorversion=1,pnfs,noatime
> >>>>  0   0";
> >>>>
> >>>> path = "/opt/jails/test01";
> >>>> }
> >>>>
> >>>> Last test was with NFS 4.1, I also worked with NFS 4.(0) with the
> >>>> same
> >>>> result. In the readonly nfs share there are symbolic links point to
> >>>> the
> >>>> read-write share for logging, storing .run files, etc. When I monitor
> >>>> my
> >>>> network interface with tcpdump, there is little nfs traffic, only
> >>>> when I
> >>>> do try to access the shares there is activity.
> >>>>
> >>>> What is causing nfscl to run around in circles, hogging the CPU (it
> >>>> makes the system slow to respond too) or how can I found out what's
> >>>> the
> >>>> cause?
> 
> >>> Well, the nfscl does server->client RPCs referred to as callbacks. I
> >>> have no idea what the implications of running it in a jail is, but I'd
> >>> guess that these server->client RPCs get blocked somehow, etc...
> >>> (The NFSv4.0 mechanism requires a separate IP address that the server
> >>>  can connect to on the client. For NFSv4.1, it should use the same
> >>>  TCP connection as is used for the client->server RPCs. The latter
> >>>  seems like it should work, but there is probably some glitch.)
> >>>
> >>> ** Just run without the nfscl daemon (it is only needed for delegations
> >>> or
> >>> pNFS).
> >>
> >> How can I disable the nfscl daemon?
> >>
> > Well, the daemon for the callbacks is called nfscbd.
> > You should check via "ps ax", to see if you have it running.
> > (For NFSv4.0 you probably don't want it running, but for NFSv4.1 you
> >  do need it. pNFS won't work at all without it, but unless you have a
> >  server that supports pNFS, it won't work anyhow. Unless your server is
> >  a clustered Netapp Filer, you should probably not have the "pnfs" option.)
> > 
> > To run the "nfscbd" daemon you can set:
> > nfscbd_enable="TRUE"
> > in your /etc/rc.conf will start it on boot.
> > Alternately, just type "nfscbd" as root.
> > 
> > The "nfscl" thread is always started when an NFSv4 mount is done. It does
> > an assortment of housekeeping things, including a Renew op to make sure the
> > lease doesn't expire. If for some reason the jail blocks these Renew RPCs,
> > it will try to do them over and over and ... because having the lease
> > expire is bad news for NFSv4. How could you tell?
> > Well, capturing packets between the client and server, then looking at them
> > in wireshark is probably the only way. (Or maybe a large count for Renew
> > in the output from "nfsstat -e".)
> > 
> > "nfscbd" is optional for NFSv4.0. Without it, you simply don't do
> > callbacks/delegations.
> > For NFSv4.1 it is pretty much required, but doesn't need a separate
> > server->client TCP
> > connection.
> > --> I'd enable it for NFSv4.1, but disable it for NFSv4.0 at least as a
> > starting point.
> > 
> > And as I said before, none of this is tested within jails, so I have no
> > idea
> > what effect the jails have. Someone who understands jails might have some
> > insight
> > w.r.t. this?
> > 
> > rick
> > 
> 
> Since last time I haven't tried to use pnfs and just sticked with
> nfsv4.0. nfscbd is not running. The server is now running 10.2. The
> number of renews is not very high (56k, getattr is for example 283M)
> View with wireshark, renew calls look good ,the nfs status is ok.
> 
> Is there a way to know what [nfscl] is active with?
> 
Btw, I'm an old-school debugger, which means I'd add a bunch of "printf()s"
to the function called nfscl_renewthread() in sys/fs/nfsclient/nfs_clstate.c.
(That's the nfscl thread. It should only do the "for(;;)" loop once/sec, but
 if you get lots of loop iterations, you might be able to isolate why via 
printf()s.)

You did say it was a test system. 

Re: kernel process [nfscl] high cpu

2015-09-24 Thread Rick Macklem
Frank de Bot wrote:
> Rick Macklem wrote:
> > Frank de Bot wrote:
> >> Rick Macklem wrote:
> >>> Frank de Bot wrote:
>  Hi,
> 
>  On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
>  jail.
>  Because it's a server only to test, there is a low load. But the
>  [nfscl]
>  process is hogging a CPU after a while. This happens pretty fast,
>  within
>  1 or 2 days. I'm noticing the high CPU of the process when I want to
>  do
>  some test after a little while (those 1 or 2 days).
> 
>  My jail.conf look like:
> 
>  exec.start = "/bin/sh /etc/rc";
>  exec.stop = "/bin/sh /etc/rc.shutdown";
>  exec.clean;
>  mount.devfs;
>  exec.consolelog = "/var/log/jail.$name.log";
>  #mount.fstab = "/usr/local/etc/jail.fstab.$name";
> 
>  test01 {
>  host.hostname = "test01_hosting";
>  ip4.addr = somepublicaddress;
>  ip4.addr += someprivateaddress;
> 
>  mount = "10.13.37.2:/tank/hostingbase /opt/jails/test01 nfs nfsv4,minorversion=1,pnfs,ro,noatime 0 0";
>  mount += "10.13.37.2:/tank/hosting/test /opt/jails/test01/opt nfs nfsv4,minorversion=1,pnfs,noatime 0 0";
> 
>  path = "/opt/jails/test01";
>  }
> 
>  The last test was with NFS 4.1; I also tried NFS 4.0, with the same
>  result. In the readonly nfs share there are symbolic links pointing to the
>  read-write share for logging, storing .run files, etc. When I monitor my
>  network interface with tcpdump, there is little nfs traffic; only when I
>  try to access the shares is there activity.
> 
>  What is causing nfscl to run around in circles, hogging the CPU (it
>  makes the system slow to respond too), or how can I find out what the
>  cause is?
> 
> >>> Well, the nfscl does server->client RPCs referred to as callbacks. I
> >>> have no idea what the implications of running it in a jail is, but I'd
> >>> guess that these server->client RPCs get blocked somehow, etc...
> >>> (The NFSv4.0 mechanism requires a separate IP address that the server
> >>>  can connect to on the client. For NFSv4.1, it should use the same
> >>>  TCP connection as is used for the client->server RPCs. The latter
> >>>  seems like it should work, but there is probably some glitch.)
> >>>
> >>> ** Just run without the nfscl daemon (it is only needed for delegations
> >>> or pNFS).
> >>
> >> How can I disable the nfscl daemon?
> >>
> > Well, the daemon for the callbacks is called nfscbd.
> > You should check via "ps ax", to see if you have it running.
> > (For NFSv4.0 you probably don't want it running, but for NFSv4.1 you
> >  do need it. pNFS won't work at all without it, but unless you have a
> >  server that supports pNFS, it won't work anyhow. Unless your server is
> >  a clustered Netapp Filer, you should probably not have the "pnfs" option.)
> > 
> > > To run the "nfscbd" daemon, setting:
> > > nfscbd_enable="TRUE"
> > > in your /etc/rc.conf will start it on boot.
> > > Alternatively, just type "nfscbd" as root.
> > 
> > The "nfscl" thread is always started when an NFSv4 mount is done. It does
> > an assortment of housekeeping things, including a Renew op to make sure the
> > lease doesn't expire. If for some reason the jail blocks these Renew RPCs,
> > it will try to do them over and over and ... because having the lease
> > expire is bad news for NFSv4. How could you tell?
> > Well, capturing packets between the client and server, then looking at them
> > in wireshark is probably the only way. (Or maybe a large count for Renew
> > in the output from "nfsstat -e".)
> > 
> > "nfscbd" is optional for NFSv4.0. Without it, you simply don't do
> > callbacks/delegations.
> > > For NFSv4.1 it is pretty much required, but doesn't need a separate
> > > server->client TCP connection.
> > --> I'd enable it for NFSv4.1, but disable it for NFSv4.0 at least as a
> > starting point.
> > 
> > > And as I said before, none of this is tested within jails, so I have no
> > > idea what effect the jails have. Someone who understands jails might have
> > > some insight w.r.t. this?
> > 
> > rick
> > 
> 
> Since last time I haven't tried to use pnfs and have just stuck with
> nfsv4.0. nfscbd is not running. The server is now running 10.2. The
> number of renews is not very high (56k; getattr, for example, is 283M).
> Viewed with wireshark, the renew calls look good and the nfs status is ok.
> 
> Is there a way to know what [nfscl] is busy with?
> 
Not that I can think of. When I do "ps axHl" I see it in DL state and not
doing much of anything. (You could try setting "sysctl vfs.nfs.debuglevel=4",
but I don't think you'll see anything syslog'd that is useful?)
This is what I'd expect for an NFSv4.0 mount without the nfscbd running.
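
To judge whether a Renew count such as the 56k quoted above is large, it can
be compared with a rough expectation. A minimal sketch, assuming one Renew per
fixed interval; the interval is a caller-supplied guess, since it depends on
the server's lease time, which is not given in this thread:

```sh
#!/bin/sh
# Sketch only: expected number of Renew ops if one is sent every
# INTERVAL seconds over UPTIME seconds. Both arguments are assumptions
# supplied by the caller, not values taken from the NFS client.
expected_renews() {
    uptime_s=$1
    interval_s=$2
    if [ "$interval_s" -le 0 ]; then
        echo "interval must be positive" >&2
        return 1
    fi
    echo $(( uptime_s / interval_s ))
}
```

For example, "expected_renews 2592000 60" (30 days of uptime and a
hypothetical 60-second interval) prints 43200. A Renew count in "nfsstat -e"
that is many times larger than such an estimate might point at the renew loop.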

Basically, when the nfscbd isn't running the server shouldn't 

Re: kernel process [nfscl] high cpu

2015-05-02 Thread Rick Macklem
Frank de Bot wrote:
> Hi,
> 
> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a jail.
> Because it's a server only to test, there is a low load. But the [nfscl]
> process is hogging a CPU after a while. This happens pretty fast, within
> 1 or 2 days. I'm noticing the high CPU of the process when I want to do
> some test after a little while (those 1 or 2 days).
> 
> My jail.conf looks like:
> 
> exec.start = "/bin/sh /etc/rc";
> exec.stop = "/bin/sh /etc/rc.shutdown";
> exec.clean;
> mount.devfs;
> exec.consolelog = "/var/log/jail.$name.log";
> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
> 
> test01 {
> host.hostname = "test01_hosting";
> ip4.addr = somepublicaddress;
> ip4.addr += someprivateaddress;
> 
> mount = "10.13.37.2:/tank/hostingbase /opt/jails/test01 nfs nfsv4,minorversion=1,pnfs,ro,noatime 0 0";
> mount += "10.13.37.2:/tank/hosting/test /opt/jails/test01/opt nfs nfsv4,minorversion=1,pnfs,noatime 0 0";
> 
> path = "/opt/jails/test01";
> }
> 
> The last test was with NFS 4.1; I also tried NFS 4.0, with the same
> result. In the readonly nfs share there are symbolic links pointing to the
> read-write share for logging, storing .run files, etc. When I monitor my
> network interface with tcpdump, there is little nfs traffic; only when I
> try to access the shares is there activity.
> 
> What is causing nfscl to run around in circles, hogging the CPU (it
> makes the system slow to respond too), or how can I find out what the
> cause is?
> 
 
Well, the nfscl does server->client RPCs referred to as callbacks. I
have no idea what the implications of running it in a jail is, but I'd
guess that these server->client RPCs get blocked somehow, etc...
(The NFSv4.0 mechanism requires a separate IP address that the server
 can connect to on the client. For NFSv4.1, it should use the same
 TCP connection as is used for the client->server RPCs. The latter
 seems like it should work, but there is probably some glitch.)

** Just run without the nfscl daemon (it is only needed for delegations or 
pNFS).

Since the big Netapp filers (the clustered ones) are about the only servers
that currently support pNFS (no FreeBSD server support yet), you can
probably forget about pNFS (I'd get rid of the pnfs mount option).
pNFS also won't work unless this callback path is working.

As for delegations, they aren't required for NFSv4.[0-1] to work correctly
and aren't enabled by default on the FreeBSD server.
--> Running without the nfscl daemon will just ensure no delegations
are issued, even if enabled on the server.

rick

 
> Regards,
> 
> Frank de Bot
 
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org