Re: [ceph-users] mds "Behind on trimming"

2016-03-23 Thread Dzianis Kahanovich
(replied off-list by mistake; copying to the list)

John Spray writes:

>> It looks like it happened both times at night - probably during long
>> backup/write operations (something like a compressed local root backup to
>> cephfs). Also, all local mounts inside the cluster (fuse) were moved to
>> automount to reduce client pressure. There are still 5 permanent kernel
>> clients.
>>
>> Now I have remounted all but one kernel client. The message persists.
> 
> There's probably a reason you haven't already done this, but the next
> logical debugging step would be to try unmounting that last kernel client
> (and to mention what version it is).

4.5.0. This VM eventually deadlocked in a few places (possibly a problem with
the same root cause) and was hard-restarted; it is now mounted again. The
message persists.

About a week ago I removed some additional mount options. From the old days
(when the VMs ran on the same servers as the cluster) I had been mounting with
"wsize=131072,rsize=131072,write_congestion_kb=128,readdir_max_bytes=131072"
(and net.ipv4.tcp_notsent_lowat = 131072) to conserve RAM. After getting
dedicated servers for the VMs I removed them. Maybe it would be better to turn
them back on for a better congestion quantum.
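
For reference, re-enabling those options would look roughly like the sketch
below; the monitor address, mount point and secret file are placeholders, not
values from this thread:

  # /etc/fstab entry for a kernel cephfs client (hypothetical example)
  192.168.0.1:6789:/  /mnt/cephfs  ceph  name=admin,secretfile=/etc/ceph/admin.secret,wsize=131072,rsize=131072,write_congestion_kb=128,readdir_max_bytes=131072,noatime,_netdev  0 0

  # matching sysctl, applied at runtime
  sysctl -w net.ipv4.tcp_notsent_lowat=131072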

-- 
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/


Re: [ceph-users] mds "Behind on trimming"

2016-03-21 Thread Dzianis Kahanovich
PS: I have now stopped this MDS, the active role migrated, and the warning cleared. I cannot test any further.

Dzianis Kahanovich writes:
> John Spray writes:
> 
>>> It looks like it happened both times at night - probably during long
>>> backup/write operations (something like a compressed local root backup to
>>> cephfs). Also, all local mounts inside the cluster (fuse) were moved to
>>> automount to reduce client pressure. There are still 5 permanent kernel
>>> clients.
>>>
>>> Now I have remounted all but one kernel client. The message persists.
>>
>> There's probably a reason you haven't already done this, but the next
>> logical debugging step would be to try unmounting that last kernel client
>> (and to mention what version it is).
> 
> 4.5.0. This VM eventually deadlocked in a few places (possibly a problem with
> the same root cause) and was hard-restarted; it is now mounted again. The
> message persists.
> 
> About a week ago I removed some additional mount options. From the old days
> (when the VMs ran on the same servers as the cluster) I had been mounting with
> "wsize=131072,rsize=131072,write_congestion_kb=128,readdir_max_bytes=131072"
> (and net.ipv4.tcp_notsent_lowat = 131072) to conserve RAM. After getting
> dedicated servers for the VMs I removed them. Maybe it would be better to turn
> them back on for a better congestion quantum.
> 


-- 
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/


Re: [ceph-users] mds "Behind on trimming"

2016-03-21 Thread John Spray
On Mon, Mar 21, 2016 at 7:44 AM, Dzianis Kahanovich
 wrote:
> I have (for the second time) a stuck MDS warning: "Behind on trimming (63/30)".
> The cluster looks like it is working. What does it mean and how can I avoid it?
> And how can I fix it (other than stopping/migrating the active MDS)?

The MDS has a metadata journal, whose length is measured in
"segments", and it is trimmed when the number of segments gets greater
than a certain limit.  The warning is telling you that the journal is
meant to be trimmed after 30 segments, but you currently have 63
segments.
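
As a rough illustration (the daemon name "a" is a placeholder, and raising the
limit only hides the symptom if something is pinning old segments), the segment
limit can be inspected and overridden at runtime:

  # on the MDS host, via the admin socket
  ceph daemon mds.a config get mds_log_max_segments

  # temporary runtime override; not a fix for a client holding references
  ceph tell mds.a injectargs '--mds_log_max_segments=200'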

This can happen when something (including a client) is failing to
properly clean up after itself, and leaving extra references to
something in one of the older segments.  In fact, a bug in the kernel
client was the original motivation for adding this warning message.
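
To see which clients are connected (for example, to spot kernel clients that
might be holding references), the MDS admin socket can list sessions; the
daemon name is again a placeholder:

  # lists connected client sessions (ids, addresses, client metadata)
  ceph daemon mds.a session ls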

John

> It looks like it happened both times at night - probably during long
> backup/write operations (something like a compressed local root backup to
> cephfs). Also, all local mounts inside the cluster (fuse) were moved to
> automount to reduce client pressure. There are still 5 permanent kernel
> clients.
>
> Now I have remounted all but one kernel client. The message persists.

There's probably a reason you haven't already done this, but the next
logical debugging step would be to try unmounting that last kernel client
(and to mention what version it is).
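
A minimal sketch of that step, with a placeholder mount point:

  # on the kernel client: note the kernel version, then unmount
  uname -r
  umount /mnt/cephfs

  # on an admin node: check whether the trimming warning clears
  ceph health detail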

John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com