Are you seeing "peering" PGs when the blocked requests are happening? That's 
what we see regularly when starting OSDs.
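
A quick way to check is to watch the cluster while you flip the OSD in; the
commands are stock Ceph CLI, the grep patterns are just examples:

    # watch cluster events for peering / blocked requests while the OSD goes in
    ceph -w | grep -Ei 'peering|slow|blocked'

    # or after the fact
    ceph health detail | grep -i blocked
    ceph pg dump_stuck inactive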

I'm not sure this can be solved completely (or whether newer Ceph versions 
improve it much), but it can be sped up by:
1) making sure you have free (and not dirty or fragmented) memory on the node 
where you are starting the OSD
        - that means dropping caches before starting the OSD if lots of your 
"free" RAM is actually tied up in the VFS page cache (a rough sketch follows 
the list)
2) starting the OSDs one by one instead of booting several of them at once 
(also sketched below)
3) if you pin the OSDs to CPUs/cores, do that only after the OSD is in - I 
found it best to pin the OSD to a cgroup limited to one NUMA node and then 
restrict it to a subset of cores once it has run for a bit, since an OSD tends 
to use hundreds of % of CPU when booting (cpuset sketch below)
4) you could possibly prewarm the page cache for the OSD's data in 
/var/lib/ceph/osd... before starting it (see the last sketch below)
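
For 1) and 2), roughly something like this (the OSD ids and the systemd unit 
name are just examples - use "service ceph start osd.<id>" on sysvinit hosts):

    # 1) give the starting OSD clean memory: flush dirty pages, drop VFS caches
    sync
    echo 3 > /proc/sys/vm/drop_caches

    # 2) start OSDs sequentially and let peering settle before the next one
    for id in 12 13 14; do
        systemctl start ceph-osd@$id
        sleep 10
        while ceph pg stat | grep -q peering; do sleep 5; done
    done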
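
For 3), a minimal cpuset sketch (assumes cgroup v1 mounted at 
/sys/fs/cgroup/cpuset; core/node numbers and the OSD pid are placeholders for 
your topology):

    # while booting: give the OSD all cores of NUMA node 0
    mkdir -p /sys/fs/cgroup/cpuset/osd12
    echo 0-11 > /sys/fs/cgroup/cpuset/osd12/cpuset.cpus
    echo 0    > /sys/fs/cgroup/cpuset/osd12/cpuset.mems
    echo <ceph-osd pid> > /sys/fs/cgroup/cpuset/osd12/tasks

    # once it has settled: shrink it to the cores you actually want it on
    echo 0-1 > /sys/fs/cgroup/cpuset/osd12/cpuset.cpus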
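
For 4), the crude version is simply reading the OSD's store into the page 
cache before starting it ("ceph-12" is an example id under the usual per-OSD 
mount point; vmtouch works too if you have it):

    find /var/lib/ceph/osd/ceph-12 -xdev -type f -print0 | xargs -0 cat > /dev/null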

It's unclear to me whether the MONs influence this (the peering stage) somehow, 
but I have observed that their CPU usage and IO also spike when OSDs are 
started, so make sure they are not under load.
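
Watching the MONs while an OSD comes in is enough to tell; pidstat is from the 
sysstat package, /var/lib/ceph/mon is the default mon store path:

    pidstat -u -d -p $(pidof ceph-mon) 1
    iostat -x 1    # the device holding /var/lib/ceph/mon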

Jan


> On 09 Dec 2015, at 11:03, Christian Kauhaus <k...@flyingcircus.io> wrote:
> 
> Hi,
> 
> I'm getting blocked requests (>30s) every time an OSD is set to "in" in
> our clusters. Once this has happened, backfills run smoothly.
> 
> I have currently no idea where to start debugging. Has anyone a hint what to
> examine first in order to narrow this issue?
> 
> TIA
> 
> Christian
> 
> -- 
> Dipl-Inf. Christian Kauhaus <>< · k...@flyingcircus.io · +49 345 219401-0
> Flying Circus Internet Operations GmbH · http://flyingcircus.io
> Forsterstraße 29 · 06112 Halle (Saale) · Deutschland
> HR Stendal 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
