In my last email I stated that the clients were not mounting the Ceph
file system with the kernel driver. Re-checking the client systems, the
file systems are in fact mounted, but all of the IO is going through
Ganesha NFS using the Ceph file system library (libcephfs) interface.
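
For anyone who wants to repeat the check: a kernel mount shows up with
filesystem type "ceph" in /proc/mounts, while the gateways show up as
extra client sessions on the active MDS (ede-c2-mds03 in the ceph -s
output below). Something along these lines, run on a client and on the
MDS host respectively, should confirm it:

# grep ceph /proc/mounts
# ceph daemon mds.ede-c2-mds03 session ls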

On Sun, Jul 12, 2015 at 9:02 PM, Eric Eastman
<eric.east...@keepertech.com> wrote:
> Hi John,
>
> I am seeing this problem with Ceph v9.0.1 on the 4.1 kernel on all
> nodes.  The setup has 4 CephFS client systems. They all have the
> CephFS kernel driver loaded, but none of them are mounting the file
> system. All 4 clients use the libcephfs VFS interface to Ganesha NFS
> (V2.2.0-2) and Samba (Version 4.3.0pre1-GIT-0791bb0) to share out the
> Ceph file system.
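>
> For context, the Ganesha export is roughly along these lines (the
> export ID and pseudo path here are illustrative, not the exact
> production settings):
>
> EXPORT {
>     Export_Id = 1;
>     Path = "/";
>     Pseudo = "/cephfs";
>     Access_Type = RW;
>     FSAL { Name = CEPH; }
> }
>
> and the Samba share uses the vfs_ceph module, roughly:
>
> [cephfs]
>     path = /
>     vfs objects = ceph
>     ceph:config_file = /etc/ceph/ceph.conf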
>
> # ceph -s
>     cluster 6d8aae1e-1125-11e5-a708-001b78e265be
>      health HEALTH_WARN
>             4 near full osd(s)
>             mds0: Client ede-c2-gw01 failing to respond to cache pressure
>             mds0: Client ede-c2-gw02:cephfs failing to respond to cache pressure
>             mds0: Client ede-c2-gw03:cephfs failing to respond to cache pressure
>      monmap e1: 3 mons at
> {ede-c2-mon01=10.15.2.121:6789/0,ede-c2-mon02=10.15.2.122:6789/0,ede-c2-mon03=10.15.2.123:6789/0}
>             election epoch 8, quorum 0,1,2
> ede-c2-mon01,ede-c2-mon02,ede-c2-mon03
>      mdsmap e912: 1/1/1 up {0=ede-c2-mds03=up:active}, 2 up:standby
>      osdmap e272: 8 osds: 8 up, 8 in
>       pgmap v225264: 832 pgs, 4 pools, 188 GB data, 5173 kobjects
>             212 GB used, 48715 MB / 263 GB avail
>                  832 active+clean
>   client io 1379 kB/s rd, 20653 B/s wr, 98 op/s
>
> # cat /proc/version
> Linux version 4.1.0-040100-generic (kernel@gomeisa) (gcc version 4.6.3
> (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201506220235 SMP Mon Jun 22 06:36:19
> UTC 2015
>
> # ceph -v
> ceph version 9.0.1 (997b3f998d565a744bfefaaf34b08b891f8dbf64)
>
> The systems are all running Ubuntu Trusty upgraded to the 4.1 kernel.
> These are all physical machines, no VMs.  The test run that triggered
> the problem was creating and verifying 5 million small files.
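>
> For what it's worth, the workload was essentially a loop of this shape
> (only a sketch, not the actual test scripts; the mount point and file
> size are made up):
>
> for i in $(seq 1 5000000); do
>     head -c 4096 /dev/urandom > /cephfs/smallfiles/f$i
>     md5sum /cephfs/smallfiles/f$i >> /tmp/md5sums
> done
> md5sum -c /tmp/md5sums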
>
> We have some tools that flag when Ceph is in a WARN state, so it would
> be nice to get rid of this warning.
>
> Please let me know what additional information you need.
>
> Thanks,
>
> Eric
>
> On Fri, Jul 10, 2015 at 4:19 AM, 谷枫 <feiche...@gmail.com> wrote:
>> Thank you John,
>> All my servers are Ubuntu 14.04 with the 3.16 kernel.
>> Not all of the clients show this problem, and the cluster seems to be
>> functioning well now.
>> As you suggest, I will change the mds_cache_size from 100000 to 500000
>> and run a test. Thanks again!
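>>
>> In case it helps, the change I have in mind is along these lines
>> (mds.0 is just the single active MDS in my cluster):
>>
>> # ceph tell mds.0 injectargs '--mds_cache_size 500000'
>>
>> and, to make it persistent across restarts, in /etc/ceph/ceph.conf:
>>
>> [mds]
>>     mds cache size = 500000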
>>
>> 2015-07-10 17:00 GMT+08:00 John Spray <john.sp...@redhat.com>:
>>>
>>>
>>> This is usually caused by use of older kernel clients.  I don't remember
>>> exactly what version it was fixed in, but iirc we've seen the problem with
>>> 3.14 and seen it go away with 3.18.
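>>>
>>> If you're not sure which kernel a given client is on, "uname -r" on
>>> the client is the quickest check. From the cluster side, something
>>> like this on the active MDS host lists the connected client sessions
>>> (the daemon name is a placeholder and the output format varies by
>>> release):
>>>
>>> # ceph daemon mds.<name> session ls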
>>>
>>> If your system is otherwise functioning well, this is not a critical error
>>> -- it just means that the MDS might not be able to fully control its memory
>>> usage (i.e. it can exceed mds_cache_size).
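>>>
>>> If you want to see how far past the limit it is going, something like
>>> this on the MDS host gives a rough picture (counter names vary a bit
>>> between releases, and mds.<name> is again a placeholder):
>>>
>>> # ceph daemon mds.<name> perf dump | grep -i inode
>>> # ceph daemon mds.<name> config show | grep mds_cache_size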
>>>
>>> John
>>>