[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2022-04-19 Thread Kai Stian Olstad

On 18.04.2022 21:35, Wesley Dillingham wrote:
If you mark an OSD "out" but not down / you don't stop the daemon, do the PGs
go remapped or do they go degraded then as well?


First I made sure the balancer was active, then I marked one OSD "out" with 
"ceph osd out 34" and checked the status every 2 seconds for 2 minutes; no 
degraded messages.
The only new messages in ceph -s were 12 remapped pgs, "11 
active+remapped+backfilling" and "1 active+remapped+backfill_wait".


Previously I had to set all OSDs (15 disks) on a host to out, and there was 
no issue with PGs in a degraded state.
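
For reference, a rough sketch of the commands for this kind of test; the host 
bucket name below is just a placeholder, and the 2-second watch is simply how 
the status was polled:

  ceph osd out 34                              # mark a single OSD out, as in the test above
  watch -n 2 ceph -s                           # poll the cluster status every 2 seconds

  # To mark every OSD on one host out, the ids can be taken from its CRUSH bucket
  # ("pech-hd-1" is a placeholder, not a real host name from this cluster):
  ceph osd out $(ceph osd ls-tree pech-hd-1)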



--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2022-04-18 Thread Wesley Dillingham
If you mark an OSD "out" but not down / you don't stop the daemon, do the PGs
go remapped or do they go degraded then as well?

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Thu, Apr 14, 2022 at 5:15 AM Kai Stian Olstad 
wrote:

> On 29.03.2022 14:56, Sandor Zeestraten wrote:
> > I was wondering if you ever found out anything more about this issue.
>
> Unfortunately no, so I turned it off.
>
>
> > I am running into similar degradation issues while running rados bench
> > on a
> > new 16.2.6 cluster.
> > In our case it's with a replicated pool, but the degradation problems
> > also
> > go away when we turn off the balancer.
>
> So this goes a long way towards confirming there is something wrong with the
> balancer, since we now see it on two different installations.
>
>
> --
> Kai Stian Olstad
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2022-04-14 Thread Kai Stian Olstad

On 29.03.2022 14:56, Sandor Zeestraten wrote:

I was wondering if you ever found out anything more about this issue.


Unfortunately no, so I turned it off.


I am running into similar degradation issues while running rados bench on a 
new 16.2.6 cluster.
In our case it's with a replicated pool, but the degradation problems also 
go away when we turn off the balancer.


So this goes a long way towards confirming there is something wrong with the 
balancer, since we now see it on two different installations.



--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2021-09-22 Thread Kai Stian Olstad

On 21.09.2021 09:11, Kobi Ginon wrote:

for sure the balancer affects the status


Of course, but setting several PGs to degraded is something else.



I doubt that your customers will be writing so many objects at the same
rate as the test.


I only need 2 hosts running rados bench to get several PGs in a degraded 
state.




maybe you need to play with the balancer configuration a bit.


Maybe, but a balancer should not set the cluster health to warning with 
several PGs in a degraded state.
It should be possible to do this cleanly: copy the data and delete the 
source when the copy is OK.




Could start with this:
The balancer mode can be changed to crush-compat mode, which is backward
compatible with older clients, and will make small changes to the data
distribution over time to ensure that OSDs are equally utilized.
https://docs.ceph.com/en/latest/rados/operations/balancer/


I will probably just turn it off before I set the cluster in production.
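
For reference, a sketch of the relevant commands (see the documentation linked 
above):

  ceph balancer mode crush-compat    # the backward-compatible mode mentioned above
  ceph balancer off                  # disable the balancer entirely
  ceph balancer status               # verify the current mode and state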



Side note: I am indeed using an old version of Ceph (Nautilus) with the
balancer configured and run rados benchmarks, but did not see such a problem.
On the other hand, I am not using pg_autoscaler; I set the pools' PG numbers
in advance according to an assumption of the percentage each pool will be using.
Could be that you do use this mode, and the combination of autoscaler and 
balancer is what reveals this issue.


If you look at my initial post you will see that the pool is created with 
--autoscale-mode=off.
The cluster is running 16.2.5 and is empty except for one pool with one 
PG created by Cephadm.
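
If anyone wants to double-check the same thing on their own cluster, something 
like this should show it (a sketch; the pool name is the one from my initial 
post):

  ceph osd pool autoscale-status                                        # autoscaler overview per pool
  ceph osd pool get pool-ec32-isa-reed_sol_van-hdd pg_autoscale_mode    # should report "off"
  ceph balancer status                                                  # current balancer mode and state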



--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2021-09-20 Thread Kai Stian Olstad

On 17.09.2021 16:10, Eugen Block wrote:
Since I'm trying to test different erasure encoding plugins and techniques 
I don't want the balancer active.
So I tried setting it to none as Eugen suggested, and to my surprise 
I did not get any degraded messages at all, and the cluster was in 
HEALTH_OK the whole time.


Interesting, maybe the balancer works differently now? Or it works
differently under heavy load?


It would be strange if the balancer's normal operation were to put the 
cluster in a degraded state.




The only suspicious lines I see are these:

 Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug
2021-09-17T06:30:01.402+ 7f66b0329700  1 heartbeat_map
reset_timeout 'Monitor::cpu_tp thread 0x7f66b0329700' had timed out
after 0.0s

But I'm not sure if this is related. The out OSDs shouldn't have any
impact on this test.

Did you monitor the network saturation during these tests with iftop
or something similar?


I did not, so I reran the test this morning.

All the servers have 2x25 Gbit/s NICs in a bond with LACP 802.3ad 
layer3+4.

The peak on the active monitor was 27 Mbit/s, and less on the other 2 
monitors.
I also checked the CPU (Xeon 5222 3.8 GHz); none of the cores were 
saturated, and the network statistics show no errors or drops.
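
For the record, roughly what was checked, sketched here with bond0 as a 
placeholder for the bond interface name:

  iftop -i bond0                 # live bandwidth per connection
  sar -n DEV 2                   # interface throughput and packet counters every 2 seconds
  cat /proc/net/bonding/bond0    # LACP 802.3ad state of the bond
  ip -s link show bond0          # error and drop counters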


So perhaps there is a bug in the balancer code?

--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2021-09-17 Thread Eugen Block
Since I'm trying to test different erasure encoding plugins and techniques 
I don't want the balancer active.
So I tried setting it to none as Eugen suggested, and to my surprise 
I did not get any degraded messages at all, and the cluster 
was in HEALTH_OK the whole time.


Interesting, maybe the balancer works differently now? Or it works  
differently under heavy load?


The logs you provided indeed mention the balancer many times in lines  
like these:


 Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug  
2021-09-17T06:30:01.322+ 7f66afb28700  0 mon.pech-mon-1@0(leader)  
e7 handle_command mon_command({"prefix": "osd pg-upmap-items",  
"format": "json", "pgid": "12.309", "id": [311, 344]} v 0) v1


The only suspicious lines I see are these:

 Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug  
2021-09-17T06:30:01.402+ 7f66b0329700  1 heartbeat_map  
reset_timeout 'Monitor::cpu_tp thread 0x7f66b0329700' had timed out  
after 0.0s


But I'm not sure if this is related. The out OSDs shouldn't have any  
impact on this test.


Did you monitor the network saturation during these tests with iftop  
or something similar?



Zitat von Kai Stian Olstad :


On 16.09.2021 15:51, Josh Baergen wrote:

I assume it's the balancer module. If you write lots of data quickly
into the cluster the distribution can vary and the balancer will try
to even out the placement.


The balancer won't cause degradation, only misplaced objects.


Since I'm trying to test different erasure encoding plugins and techniques 
I don't want the balancer active.
So I tried setting it to none as Eugen suggested, and to my surprise 
I did not get any degraded messages at all, and the cluster 
was in HEALTH_OK the whole time.




   Degraded data redundancy: 260/11856050 objects degraded
(0.014%), 1 pg degraded


That status definitely indicates that something is wrong. Check your
cluster logs on your mons (/var/log/ceph/ceph.log) for the cause; my
guess is that you have OSDs flapping (rapidly going down and up again)
due to either overload (disk or network) or some sort of
misconfiguration.


So I enabled the balancer and ran the rados bench again, and the 
degraded messages are back.


I guess the equivalent log to /var/log/ceph/ceph.log in Cephadm is
  journalctl -u  
ceph-b321e76e-da3a-11eb-b75c-4f948441...@mon.pech-mon-1.service


There are no messages about OSDs being marked down, so I don't 
understand why this is happening.

I probably need to raise some verbose value.

I have attached the log from journalctl; it starts at 06:30:00 when I 
started the rados bench and includes a few lines after the first 
degraded message at 06:31:06.
Just be aware that 15 OSDs are set to out; since I have some problem 
with an HBA on one host, all tests have been done with those 15 OSDs 
in status out.


--
Kai Stian Olstad




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2021-09-17 Thread Kai Stian Olstad

On 16.09.2021 15:51, Josh Baergen wrote:

I assume it's the balancer module. If you write lots of data quickly
into the cluster the distribution can vary and the balancer will try
to even out the placement.


The balancer won't cause degradation, only misplaced objects.


Since I'm trying to test different erasure encoding plugins and techniques 
I don't want the balancer active.
So I tried setting it to none as Eugen suggested, and to my surprise I 
did not get any degraded messages at all, and the cluster was in 
HEALTH_OK the whole time.




Degraded data redundancy: 260/11856050 objects degraded
(0.014%), 1 pg degraded


That status definitely indicates that something is wrong. Check your
cluster logs on your mons (/var/log/ceph/ceph.log) for the cause; my
guess is that you have OSDs flapping (rapidly going down and up again)
due to either overload (disk or network) or some sort of
misconfiguration.


So I enabled the balancer and ran the rados bench again, and the degraded 
messages are back.


I guess the equivalent log to /var/log/ceph/ceph.log in Cephadm is
  journalctl -u 
ceph-b321e76e-da3a-11eb-b75c-4f948441...@mon.pech-mon-1.service


There are no messages about OSDs being marked down, so I don't understand 
why this is happening.

I probably need to raise some verbose value.
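
Something along these lines should do it, I think (just a sketch; 10/10 is an 
arbitrary verbosity, not a recommendation):

  ceph config set mon debug_mon 10/10    # more verbose monitor logging
  ceph config set osd debug_osd 10/10    # and on the OSD side, if needed
  # back to the defaults afterwards:
  ceph config rm mon debug_mon
  ceph config rm osd debug_osd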

I have attached the log from journalctl; it starts at 06:30:00 when I 
started the rados bench and includes a few lines after the first degraded 
message at 06:31:06.
Just be aware that 15 OSDs are set to out; since I have some problem with 
an HBA on one host, all tests have been done with those 15 OSDs in 
status out.


--
Kai Stian Olstad

Sep 17 06:30:00 pech-mon-1 conmon[1337]: debug 2021-09-17T06:29:59.994+ 
7f66b232d700  0 log_channel(cluster) log [INF] : overall HEALTH_OK
Sep 17 06:30:00 pech-mon-1 conmon[1337]: cluster 
2021-09-17T06:29:59.317530+ mgr.pech-mon-1.ptrsea
Sep 17 06:30:00 pech-mon-1 conmon[1337]:  (mgr.245802) 345745 : cluster [DBG] 
pgmap v347889: 1025 pgs: 1025 active+clean; 0 B data, 73 TiB used, 2.8 PiB / 
2.9 PiB avail
Sep 17 06:30:00 pech-mon-1 conmon[1337]: cluster 
2021-09-17T06:30:00.000143+ mon.pech-mon-1 (mon.0) 1166236 : 
Sep 17 06:30:00 pech-mon-1 conmon[1337]: cluster [INF] overall HEALTH_OK
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.318+ 
7f66afb28700  0 mon.pech-mon-1@0(leader) e7 handle_command 
mon_command({"prefix": "osd pg-upmap-items", "format": "json", "pgid": "12.6d", 
"id": [293, 327]} v 0) v1
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.318+ 
7f66afb28700  0 log_channel(audit) log [INF] : from='mgr.245802 
10.0.1.10:0/136830414' entity='mgr.pech-mon-1.ptrsea' cmd=[{"prefix": "osd 
pg-upmap-items", "format": "json", "pgid": "12.6d", "id": [293, 327]}]: dispatch
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.318+ 
7f66afb28700  0 mon.pech-mon-1@0(leader) e7 handle_command 
mon_command({"prefix": "osd pg-upmap-items", "format": "json", "pgid": 
"12.144", "id": [307, 351]} v 0) v1
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.318+ 
7f66afb28700  0 log_channel(audit) log [INF] : from='mgr.245802 
10.0.1.10:0/136830414' entity='mgr.pech-mon-1.ptrsea' cmd=[{"prefix": "osd 
pg-upmap-items", "format": "json", "pgid": "12.144", "id": [307, 351]}]: 
dispatch
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.322+ 
7f66afb28700  0 mon.pech-mon-1@0(leader) e7 handle_command 
mon_command({"prefix": "osd pg-upmap-items", "format": "json", "pgid": 
"12.17d", "id": [144, 136]} v 0) v1
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.322+ 
7f66afb28700  0 log_channel(audit) log [INF] : from='mgr.245802 
10.0.1.10:0/136830414' entity='mgr.pech-mon-1.ptrsea' cmd=[{"prefix": "osd 
pg-upmap-items", "format": "json", "pgid": "12.17d", "id": [144, 136]}]: 
dispatch
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.322+ 
7f66afb28700  0 mon.pech-mon-1@0(leader) e7 handle_command 
mon_command({"prefix": "osd pg-upmap-items", "format": "json", "pgid": 
"12.1a2", "id": [199, 189]} v 0) v1
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.322+ 
7f66afb28700  0 log_channel(audit) log [INF] : from='mgr.245802 
10.0.1.10:0/136830414' entity='mgr.pech-mon-1.ptrsea' cmd=[{"prefix": "osd 
pg-upmap-items", "format": "json", "pgid": "12.1a2", "id": [199, 189]}]: 
dispatch
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.322+ 
7f66afb28700  0 mon.pech-mon-1@0(leader) e7 handle_command 
mon_command({"prefix": "osd pg-upmap-items", "format": "json", "pgid": 
"12.1e1", "id": [289, 344]} v 0) v1
Sep 17 06:30:01 pech-mon-1 conmon[1337]: debug 2021-09-17T06:30:01.322+ 
7f66afb28700  0 log_channel(audit) log [INF] : from='mgr.245802 
10.0.1.10:0/136830414' entity='mgr.pech-mon-1.ptrsea' cmd=[{"prefix": "osd 
pg-upmap-items", "format": "json", "pgid": "12.1e1", "id": [289, 344]}]: 
dispatch
Sep 17 06:30:01 

[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2021-09-16 Thread Eugen Block
You’re absolutely right, of course, the balancer wouldn’t cause 
degraded PGs. Flapping OSDs seem very likely here.



Zitat von Josh Baergen :


I assume it's the balancer module. If you write lots of data quickly
into the cluster the distribution can vary and the balancer will try
to even out the placement.


The balancer won't cause degradation, only misplaced objects.


Degraded data redundancy: 260/11856050 objects degraded
(0.014%), 1 pg degraded


That status definitely indicates that something is wrong. Check your
cluster logs on your mons (/var/log/ceph/ceph.log) for the cause; my
guess is that you have OSDs flapping (rapidly going down and up again)
due to either overload (disk or network) or some sort of
misconfiguration.

Josh




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2021-09-16 Thread Josh Baergen
> I assume it's the balancer module. If you write lots of data quickly
> into the cluster the distribution can vary and the balancer will try
> to even out the placement.

The balancer won't cause degradation, only misplaced objects.

> Degraded data redundancy: 260/11856050 objects degraded
> (0.014%), 1 pg degraded

That status definitely indicates that something is wrong. Check your
cluster logs on your mons (/var/log/ceph/ceph.log) for the cause; my
guess is that you have OSDs flapping (rapidly going down and up again)
due to either overload (disk or network) or some sort of
misconfiguration.
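
A quick way to check for that would be something like this (a rough sketch; 
the exact log wording can vary by release):

  grep -E "marked (down|out)" /var/log/ceph/ceph.log    # OSDs going down show up in the cluster log
  ceph osd tree down                                    # any OSDs currently down, with their location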

Josh
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2021-09-16 Thread Eugen Block

Hi,

I assume it's the balancer module. If you write lots of data quickly  
into the cluster the distribution can vary and the balancer will try  
to even out the placement. You can check the status with


ceph balancer status

and disable it if necessary:

ceph balancer mode none

Regards,
Eugen


Zitat von Kai Stian Olstad :


Hi

I'm testing a Ceph cluster with "rados bench"; it's an empty Cephadm 
install that only has one pool, device_health_metrics.


I create a pool with 1024 PGs on the HDD devices (15 servers have HDDs 
and 13 have SSDs):
ceph osd pool create pool-ec32-isa-reed_sol_van-hdd 1024 1024 erasure ec32-isa-reed_sol_van-hdd --autoscale-mode=off
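
(The ec32-isa-reed_sol_van-hdd profile itself was created beforehand, roughly 
like this; the k=3/m=2 values are only assumed from the profile name and are 
not shown in this mail:)

  ceph osd erasure-code-profile set ec32-isa-reed_sol_van-hdd \
      k=3 m=2 plugin=isa technique=reed_sol_van crush-device-class=hdd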


I then run "rados bench" from the 13 SSD hosts at the same time.
rados bench -p pool-ec32-isa-reed_sol_van-hdd 600 write --no-cleanup
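
One way to start it on all 13 SSD hosts at roughly the same time is a loop 
like this (a sketch; the hostnames are placeholders):

  for h in pech-ssd-{01..13}; do     # placeholder names for the 13 SSD hosts
    ssh "$h" "rados bench -p pool-ec32-isa-reed_sol_van-hdd 600 write --no-cleanup" &
  done
  wait                               # wait for all the benchmarks to finish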

After just a few seconds "ceph -s" starts to report degraded data redundancy.

Here are some examples from the 10-minute test period:
Degraded data redundancy: 260/11856050 objects degraded  
(0.014%), 1 pg degraded
Degraded data redundancy: 260/1856050 objects degraded (0.014%),  
1 pg degraded

Degraded data redundancy: 1 pg undersized
Degraded data redundancy: 1688/3316225 objects degraded  
(0.051%), 3 pgs degraded
Degraded data redundancy: 5457/7005845 objects degraded  
(0.078%), 3 pgs degraded, 9 pgs undersized

Degraded data redundancy: 1 pg undersized
Degraded data redundancy: 4161/7005845 objects degraded  
(0.059%), 3 pgs degraded
Degraded data redundancy: 4315/7005845 objects degraded  
(0.062%), 2 pgs degraded, 4 pgs undersized



So my question is: is it normal that Ceph reports degraded data redundancy 
under normal use, or do I have a problem somewhere that I need to investigate?


--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io