Hi Mohamad!
On 31/12/2018 19:30, Mohamad Gebai wrote:
> On 12/31/18 4:51 AM, Marcus Murwall wrote:
>> What you say does make sense though as I also get the feeling that the
>> osds are just waiting for something. Something that never happens and
>> the request finally timeout...
>
> So the OSDs are just completely idle? If not, try using strace and/or
> perf to [...]

On 12/31/18 4:51 AM, Marcus Murwall wrote:
> What you say does make sense though as I also get the feeling that the
> osds are just waiting for something. Something that never happens and
> the request finally timeout...
So the OSDs are just completely idle? If not, try using strace and/or
perf to [...]
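A minimal sketch of that approach, assuming <osd-pid> stands for the PID
of one of the affected ceph-osd daemons (run both on the OSD node itself):

strace -c -f -p <osd-pid>   # attach to all threads and count syscalls; Ctrl-C prints the summary
perf top -p <osd-pid>       # live view of where the daemon spends CPU time

If the daemons really are idle, both should show next to no syscall or
CPU activity.
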
Hi Mohamad,
The cluster network runs over 2x25Gbit bonded interfaces and I see no
signs of congestion. Also, if I benchmark against a replicated pool I
can't reproduce these issues: I can push a lot more data against a
replicated pool and everything works just fine. If it was a network
congestion issue, [...]
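A couple of quick checks along those lines (a sketch; bond0 and the pool
names are placeholders for the actual ones):

cat /proc/net/bonding/bond0   # bond mode, slave link state, link failure counts
ip -s link show bond0         # RX/TX counters, drops and errors
rados bench -p <replicated-pool> 60 write   # baseline write throughput
rados bench -p <ec-pool> 60 write           # same benchmark against the EC pool

If the EC pool stalls while the interface counters stay clean and the
replicated run is fast, that points away from the network.
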
Hi Marcus,
On 12/27/18 4:21 PM, Marcus Murwall wrote:
> Hey Mohamad
>
> I work with Florian on this issue.
> Just reinstalled the ceph cluster and triggered the error again.
> Looking at iostat -x 1 there is basically no activity at all against
> any of the osds.
> We get blocked ops all over the place [...]
What is happening on the individual nodes when you reach that point
(iostat -x 1 on the OSD nodes)? Also, what throughput do you get when
benchmarking the replicated pool?
I guess one way to start would be by looking at ongoing operations at
the OSD level:
ceph daemon osd.X dump_blocked_ops
ceph [...]
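For illustration, assuming osd.12 is one of the OSDs reporting blocked
ops (ceph daemon talks to the local admin socket, so run this on the
node hosting that OSD):

ceph daemon osd.12 dump_blocked_ops     # ops blocked past the complaint threshold
ceph daemon osd.12 dump_ops_in_flight   # everything currently in flight
ceph daemon osd.12 dump_historic_ops    # a sample of recently completed ops

The per-op "flag_point" and event timestamps in the output should show
where each op last made progress, e.g. whether it is waiting on sub-ops
from other OSDs.
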
Hi everyone,
We have a Luminous cluster (12.2.10) on Ubuntu Xenial, though we have
also observed the same behavior on 12.2.7 on Bionic (download.ceph.com
doesn't build Luminous packages for Bionic, and 12.2.7 is the latest
distro build).
The primary use case for this cluster is radosgw. 6 OSD nodes [...]