[ceph-users] after update to 14.2.16 osd daemons begin to crash

2021-02-17 Thread Boris Behrens
Hi, we are currently experiencing OSD daemon crashes and I can't pin down the issue. I hope someone can help me with it. * We operate multiple clusters (440 SSD - 1PB, 36 SSD - 126TB, 40 SSD - 100TB, 84 HDD - 680TB) * All clusters were updated around the same time (2021-02-03) * We restarted ALL ceph daemons
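A minimal way to inspect such crashes on Nautilus, assuming the mgr crash module is enabled (the commands are illustrative, not taken from the thread):

    # list crash reports collected since the update
    ceph crash ls
    # show backtrace and metadata for one report
    ceph crash info <crash-id>
    # confirm every daemon actually runs 14.2.16 after the restart
    ceph versions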

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
I've outed osd.18 and osd.54 and let it sync for some time and now the problem is gone. *shrugs* Thank you for the hints. On Mon, 8 Feb 2021 at 14:46, Boris Behrens wrote: > Hi, > sure > > ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF > -1 672.684
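For reference, marking OSDs out so their PGs get remapped, using the ids from the thread (a sketch, not the exact commands from the mail):

    # take the OSDs out of data placement so CRUSH remaps their PGs
    ceph osd out 18 54
    # follow backfill/recovery until the PG is active+clean
    ceph -s
    # optionally bring them back in afterwards
    ceph osd in 18 54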

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
priate? I've seen stuck PGs because of OSD weight imbalance. Is > the OSD in the correct subtree? > > > Quoting Boris Behrens: > > > Hi Eugen, > > > > I've set it to 0 but the "degraded objects" count does not go down. > > > > On Mon.
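Checking whether an OSD sits in the expected CRUSH subtree and how it is weighted would look like this (illustrative):

    # CRUSH hierarchy with weights and reweight values
    ceph osd tree
    # same hierarchy plus per-OSD utilisation
    ceph osd df tree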

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
Hi Eugen, I've set it to 0 but the "degraded objects" count does not go down. On Mon, 8 Feb 2021 at 14:23, Eugen Block wrote: > Hi, > > one option would be to decrease (or set to 0) the primary-affinity of > osd.14 and see if that brings the pg back. > > Regards, > Eugen > > -- Die
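The change Eugen suggests would look roughly like this for osd.14 (a sketch, not quoted from the mail):

    # stop osd.14 from being chosen as primary for its PGs
    ceph osd primary-affinity osd.14 0
    # restore the default once the PG has recovered
    ceph osd primary-affinity osd.14 1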

[ceph-users] one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
+undersized+degraded+remapped+backfilling 2m 301810'52824786 301818:1346392477 [18,14,54]p18 [14]p14 2021-01-02 06:28:01.124161 2021-01-02 06:28:01.124161 PG_AVAILABILITY Reduced data availability: 1 pg inactive, 1 pg down; pg 10.1a is down, acting [18,54] Cheers Boris -- Die
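Typical commands to dig into a down/inactive PG such as 10.1a (illustrative, not from the thread):

    # full peering state of the PG, including blocked_by
    ceph pg 10.1a query
    # list PGs stuck inactive
    ceph pg dump_stuck inactive
    # health detail usually names the OSDs the PG is waiting for
    ceph health detail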

[ceph-users] Re: enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Boris Behrens
67.2196 TiB pool 44 7.9 MiB 2.01k 68 MiB 0 96 TiB pool 5519 B 2 36 KiB 0 96 TiB On Tue, 16 Jun 2020 at 14:13, Dan van der Ster wrote: > > On Tue, Jun 16, 2020 at 2:00 PM Boris B

[ceph-users] Re: enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Boris Behrens
See inline comments. On Tue, 16 Jun 2020 at 13:29, Zhenshi Zhou wrote: > > I did this on my cluster and a huge number of PGs were rebalanced. > I think setting this option to 'on' is a good idea if it's a brand new > cluster. > On our new cluster we enabled them, but not on our

[ceph-users] Re: enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Boris Behrens
ize_ratio or > target_size_bytes accordingly. > > BTW, do you have some feeling that your 17000 PGs are currently not > correctly proportioned for your cluster? > > -- Dan > > On Tue, Jun 16, 2020 at 11:31 AM Boris Behrens wrote: > > > > Hi, > > > >
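Setting the hints Dan refers to could look like this; the pool name is a placeholder and the values are examples only:

    # tell the autoscaler this pool is expected to use ~20% of raw capacity
    ceph osd pool set <pool> target_size_ratio 0.2
    # or give an absolute expected size instead
    ceph osd pool set <pool> target_size_bytes 100T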

[ceph-users] enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Boris Behrens
Hi, I would like to enable the pg_autoscaler on our Nautilus cluster. Someone told me that I should be really, really careful NOT to have customer impact. Maybe someone can share some experience on this? The cluster has 455 OSDs on 19 hosts with ~17000 PGs and ~1 petabyte raw storage, of which ~600TB
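A cautious rollout on an existing cluster usually means starting each pool in warn mode and only switching to on once the proposed targets look sane; a sketch with a placeholder pool name:

    # let the autoscaler report what it would do, without acting
    ceph osd pool set <pool> pg_autoscale_mode warn
    # compare PG_NUM with NEW PG_NUM and check RATIO / TARGET RATIO
    ceph osd pool autoscale-status
    # when the proposed changes are acceptable, enable it per pool
    ceph osd pool set <pool> pg_autoscale_mode on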

[ceph-users] Re: MAX AVAIL goes up when I reboot an OSD node

2020-05-29 Thread Boris Behrens
Hi Sinan, this happens with any node, and any single OSD. On Fri, May 29, 2020 at 10:09 AM si...@turka.nl wrote: > > Does this happen with any random node, or is it specific to 1 node? > > If specific to 1 node, does this node hold more data compared to other nodes > (ceph osd df)? > > Sinan Polat >

[ceph-users] Re: MAX AVAIL goes up when I reboot an OSD node

2020-05-29 Thread Boris Behrens
Well, this happens when any OSD goes offline. (I stopped a single OSD service on one of our OSD nodes) On Fri, May 29, 2020 at 8:44 AM KervyN wrote: > > Hi Eugene, > no. The mgr services are located on our mon servers. > > This happens when I reboot any OSD node. > > >
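As far as I understand it, MAX AVAIL in ceph df is an estimate derived from the fullest OSD a pool maps to (scaled by full_ratio and the replica count), so taking a single OSD offline changes the estimate. Commands to observe this (illustrative):

    # pool-level stats including MAX AVAIL
    ceph df detail
    # per-OSD utilisation; the most-full OSD bounds MAX AVAIL
    ceph osd df tree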

[ceph-users] MAX AVAIL goes up when I reboot an OSD node

2020-05-28 Thread Boris Behrens
s. Cheers and thanks in advance Boris

[ceph-users] Bluestore cache parameter precedence

2020-02-04 Thread Boris Epstein
to a certain value. Thanks in advance. Boris.
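For context, the cache options in question (bluestore_cache_size, its _ssd/_hdd variants, and osd_memory_target when cache autotuning is on) can be checked on a running OSD like this; osd.0 is a placeholder:

    # effective values as reported through the mon/mgr config interface
    ceph config show osd.0 | grep -E 'bluestore_cache|osd_memory_target'
    # or ask the daemon directly via its admin socket
    ceph daemon osd.0 config show | grep -E 'bluestore_cache|osd_memory_target'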
