[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-11 Thread Anthony D'Atri
Very large omaps can take quite a while.

>> Your metadata PGs *are* backfilling. It is the "61 keys/s" figure in the
>> recovery I/O line of the ceph status output. If this is too slow, increase
>> osd_max_backfills and osd_recovery_max_active.
>>
>> Or just have some coffee ...

> I
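
For anyone finding this thread later: both tunables can be raised at runtime. A minimal sketch, assuming Nautilus and purely illustrative values (the defaults are 1 and 3):

    # apply to all OSDs at runtime
    ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'

    # or persist via the config database (Nautilus and later)
    ceph config set osd osd_max_backfills 4
    ceph config set osd osd_recovery_max_active 8

Higher values speed up backfill at the cost of client I/O, so it is worth stepping them up gradually rather than all at once.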

[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-11 Thread Anthony D'Atri
Parallelism. The backfill/recovery tunables control how many recovery ops a given OSD will perform. If you’re adding a new OSD, naturally it is the bottleneck. For other forms of data movement, early on one has multiple OSDs reading and writing independently. Toward the end, increasingly few
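
To watch the narrowing set of PGs and OSDs described above, something like the following should work (exact listing syntax may differ slightly between releases):

    # list PGs currently backfilling, with their up/acting OSD sets
    ceph pg ls backfilling

    # or filter the brief PG dump for backfill states
    ceph pg dump pgs_brief | grep backfill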

[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-11 Thread Eugen Block
Yeah, we also noticed decreasing recovery speed when it comes to the last PGs, but we never came up with a theory. I think your explanation makes sense. Next time I'll try with much higher values, thanks for sharing that.

Regards,
Eugen

Quoting Frank Schilder:
> I did a lot of data movement late

[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-11 Thread Frank Schilder
I did a lot of data movement lately and my observation is that backfill is very fast (high bandwidth and many thousands of keys/s) as long as it is many-to-many between OSDs. The number of OSDs participating slowly decreases over time until there is only 1 disk left that is written to. This becomes really
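
If the tail end really is bound by a single OSD, one option is to raise the limits on just that OSD rather than cluster-wide. A sketch, where osd.7 stands in for the remaining target OSD and the values are illustrative:

    # raise backfill concurrency only on the remaining target OSD
    ceph config set osd.7 osd_max_backfills 8
    ceph config set osd.7 osd_recovery_max_active 8

    # or inject at runtime without persisting
    ceph tell osd.7 injectargs '--osd-max-backfills 8'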

[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-11 Thread Eugen Block
> Your metadata PGs *are* backfilling. It is the "61 keys/s" figure in the
> recovery I/O line of the ceph status output. If this is too slow, increase
> osd_max_backfills and osd_recovery_max_active.
>
> Or just have some coffee ...

I had already increased osd_max_backfills and osd_recovery_max_

[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-11 Thread Frank Schilder
Your metadata PGs *are* backfilling. It is the "61 keys/s" figure in the recovery I/O line of the ceph status output. If this is too slow, increase osd_max_backfills and osd_recovery_max_active.

Or just have some coffee ...

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygn

[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-10 Thread Eugen Block
Please ignore my email, the PGs eventually recovered, it just took much more time than expected, or than observed for the other PGs. I'll try to be more patient next time. ;-)