Re: [ceph-users] CephFS MDS stuck (failed to rdlock when getattr / lookup)

2018-04-15 Thread Oliver Freyermuth
On 15.04.2018 at 23:04, John Spray wrote: > On Fri, Apr 13, 2018 at 5:16 PM, Oliver Freyermuth wrote: >> Dear Cephalopodians, >> in our cluster (CentOS 7.4, EC pool, Snappy compression, Luminous 12.2.4), we often have all (~40) clients accessing one file in

Re: [ceph-users] CephFS MDS stuck (failed to rdlock when getattr / lookup)

2018-04-15 Thread John Spray
On Fri, Apr 13, 2018 at 5:16 PM, Oliver Freyermuth wrote: > Dear Cephalopodians, > in our cluster (CentOS 7.4, EC pool, Snappy compression, Luminous 12.2.4), we often have all (~40) clients accessing one file in readonly mode, even with multiple processes per
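When an MDS is stuck like this, the usual next step is to query its admin socket with `ceph daemon mds.<name> dump_ops_in_flight` and look for old ops whose flag point mentions the rdlock. A minimal sketch of filtering that output, assuming the Luminous-era JSON layout (`ops`, `age`, `type_data.flag_point`); field names should be checked against your own dump:

```python
import json

def stuck_ops(dump_json, min_age=30.0):
    """Return (description, age) for ops older than min_age seconds
    whose flag_point mentions "rdlock".

    The JSON layout ("ops", "age", "type_data"/"flag_point") is assumed
    from Luminous-era `dump_ops_in_flight` output.
    """
    data = json.loads(dump_json)
    stuck = []
    for op in data.get("ops", []):
        flag = op.get("type_data", {}).get("flag_point", "")
        if op.get("age", 0.0) > min_age and "rdlock" in flag:
            stuck.append((op.get("description", ""), op.get("age")))
    return stuck

# Illustrative sample, not real MDS output.
sample = json.dumps({"ops": [
    {"description": "client_request(client.123:45 getattr)",
     "age": 120.5,
     "type_data": {"flag_point": "failed to rdlock, waiting"}},
    {"description": "client_request(client.124:46 lookup)",
     "age": 0.2,
     "type_data": {"flag_point": "acquired locks"}},
]})
print(stuck_ops(sample))  # only the 120.5 s getattr op is reported
```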

Re: [ceph-users] High TCP retransmission rates, only with Ceph

2018-04-15 Thread Robert Stanford
I should have been clearer: the TCP retransmissions are on the OSD host. On Sun, Apr 15, 2018 at 1:48 PM, Paweł Sadowski wrote: > On 04/15/2018 08:18 PM, Robert Stanford wrote: >> Iperf gives about 7Gb/s between a radosgw host and one of my OSD hosts (8 disks, 8

Re: [ceph-users] High TCP retransmission rates, only with Ceph

2018-04-15 Thread Paweł Sadowski
On 04/15/2018 08:18 PM, Robert Stanford wrote: > Iperf gives about 7Gb/s between a radosgw host and one of my OSD hosts (8 disks, 8 OSD daemons, one of 3 OSD hosts). When I benchmark radosgw with cosbench I see high TCP retransmission rates (from sar -n ETCP 1). I don't see this with iperf.

[ceph-users] High TCP retransmission rates, only with Ceph

2018-04-15 Thread Robert Stanford
Iperf gives about 7Gb/s between a radosgw host and one of my OSD hosts (8 disks, 8 OSD daemons, one of 3 OSD hosts). When I benchmark radosgw with cosbench I see high TCP retransmission rates (from sar -n ETCP 1). I don't see this with iperf. Why would Ceph, but not iperf, cause high TCP
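The retransmission rate that `sar -n ETCP` reports can also be derived directly from the kernel's cumulative TCP counters in /proc/net/snmp (RetransSegs over OutSegs). A minimal sketch of that calculation; the sample below is a shortened, illustrative version of the file, which on a real system has many more fields:

```python
def tcp_retrans_ratio(snmp_text):
    """Compute RetransSegs / OutSegs from /proc/net/snmp content.

    /proc/net/snmp holds two "Tcp:" lines: a header with field names,
    then a line with the matching cumulative counter values.
    """
    lines = [l for l in snmp_text.splitlines() if l.startswith("Tcp:")]
    header, values = lines[0].split()[1:], lines[1].split()[1:]
    counters = dict(zip(header, (int(v) for v in values)))
    return counters["RetransSegs"] / counters["OutSegs"]

# Shortened illustrative sample of /proc/net/snmp.
sample = (
    "Tcp: RtoAlgorithm RtoMin RtoMax OutSegs RetransSegs\n"
    "Tcp: 1 200 120000 1000000 2500\n"
)
print(tcp_retrans_ratio(sample))  # 0.0025
```

To get a rate rather than a lifetime average, sample the counters twice and divide the deltas, which is effectively what sar does per interval.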

[ceph-users] ZeroDivisionError: float division by zero in /usr/lib/ceph/mgr/dashboard/module.py (12.2.4)

2018-04-15 Thread Nicolas Huillard
Hi, I'm not sure if this has been solved since 12.2.4. The same code occurs in a different file on GitHub: https://github.com/ceph/ceph/blob/50412f7e9c2691ec10132c8bf9310a05a40e9f9d/src/pybind/mgr/status/module.py The ZeroDivisionError occurs when the dashboard is open, and there is a network
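The crash pattern here is a plain division whose denominator can be zero when no samples arrive (for example during a network interruption while the dashboard keeps polling). A hedged sketch of the usual guard; the function and variable names are illustrative, not the actual mgr module code:

```python
def rate(delta_value, delta_time):
    """Return delta_value / delta_time, or 0.0 when no time has elapsed.

    Guards against the ZeroDivisionError seen when two successive
    samples carry the same timestamp (or no data arrived at all).
    """
    if delta_time == 0:
        return 0.0
    return delta_value / delta_time

print(rate(100, 0))  # 0.0 instead of raising ZeroDivisionError
print(rate(100, 4))  # 25.0
```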