Thanks, guys,

I was out of the office; I will try your suggestions and get back to you.

Extending the cluster is something I will do in the near future, I just 
thought it would be better to get the cluster health back to "Normal" 
first.

Thanks,
Herbert


From: ceph-users [mailto:[email protected]] On Behalf Of Vadim 
Bulst
Sent: Tuesday, 12 June 2018 22:34
To: [email protected]
Subject: Re: [ceph-users] Problems with CephFS


Well Herbert,

as Paul mentioned, you should raise the backfill-full threshold of your OSDs 
first and reweight second. Paul has already sent you some hints.

Jewel documentation: http://docs.ceph.com/docs/jewel/rados/

osd backfill full ratio
Description: Refuse to accept backfill requests when the Ceph OSD Daemon's 
full ratio is above this value.
Type: Float
Default: 0.85


You could put this into your config with a value of 0.90 on all OSD servers 
and restart the OSD daemons. Don't forget "ceph osd set noout" before 
restarting. After restarting the daemons, "ceph osd unset noout"; resync 
should take place instantly. Then set the reweight on OSDs 0, 1 and 2 to a 
value like 0.9: "ceph osd reweight 1 0.9" and so on.
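The steps above, as a rough sketch (the 0.90 ratio, the reweight value and 
the OSD ids 0, 1 and 2 are taken from this thread; the restart command 
depends on your init system -- adjust everything to your own cluster before 
running anything):

```shell
# 1. Raise the backfill-full threshold in ceph.conf on every OSD server:
#      [osd]
#      osd backfill full ratio = 0.90

# 2. Prevent the cluster from rebalancing while OSDs restart:
ceph osd set noout

# 3. Restart the OSD daemons on each OSD server, e.g. with systemd:
#      systemctl restart ceph-osd.target

# 4. Allow recovery again; backfill should resume right away:
ceph osd unset noout

# 5. Lower the reweight on the fullest OSDs so data moves off them:
for id in 0 1 2; do
    ceph osd reweight "$id" 0.9
done

# 6. Watch the cluster drain the backfill_toofull PGs:
ceph -w
```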

Herbert, you really should extend your cluster, and/or evacuate your data and 
rebuild it from scratch.

Cheers,

Vadim

On 12.06.2018 16:42, Steininger, Herbert wrote:

Hi,



Thanks, guys, for your answers.



'ceph osd df' gives me:

[root@pcl241 ceph]# ceph osd df
ID WEIGHT   REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 1 18.18999  1.00000 18625G 15705G  2919G 84.32 1.04 152
 0 18.18999  1.00000 18625G 15945G  2680G 85.61 1.06 165
 3 18.18999  1.00000 18625G 14755G  3870G 79.22 0.98 162
 4 18.18999  1.00000 18625G 14503G  4122G 77.87 0.96 158
 2 18.18999  1.00000 18625G 15965G  2660G 85.72 1.06 165
 5 18.18999  1.00000 21940G 16054G  5886G 73.17 0.91 159
               TOTAL   112T 92929G 22139G 80.76
MIN/MAX VAR: 0.91/1.06  STDDEV: 4.64





And



[root@pcl241 ceph]# ceph osd df tree
ID  WEIGHT    REWEGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME
 -1 109.13992        -      0      0      0     0    0   0 root default
 -2         0        -      0      0      0     0    0   0     host A1214-2950-01
 -3         0        -      0      0      0     0    0   0     host A1214-2950-02
 -4         0        -      0      0      0     0    0   0     host A1214-2950-04
 -5         0        -      0      0      0     0    0   0     host A1214-2950-05
 -6         0        -      0      0      0     0    0   0     host A1214-2950-03
 -7  18.18999        - 18625G 15705G  2919G 84.32 1.04   0     host cuda002
  1  18.18999  1.00000 18625G 15705G  2919G 84.32 1.04 152         osd.1
 -8  18.18999        - 18625G 15945G  2680G 85.61 1.06   0     host cuda001
  0  18.18999  1.00000 18625G 15945G  2680G 85.61 1.06 165         osd.0
 -9  18.18999        - 18625G 14755G  3870G 79.22 0.98   0     host cuda005
  3  18.18999  1.00000 18625G 14755G  3870G 79.22 0.98 162         osd.3
-10  18.18999        - 18625G 14503G  4122G 77.87 0.96   0     host cuda003
  4  18.18999  1.00000 18625G 14503G  4122G 77.87 0.96 158         osd.4
-11  18.18999        - 18625G 15965G  2660G 85.72 1.06   0     host cuda004
  2  18.18999  1.00000 18625G 15965G  2660G 85.72 1.06 165         osd.2
-12  18.18999        - 21940G 16054G  5886G 73.17 0.91   0     host A1214-2950-06
  5  18.18999  1.00000 21940G 16054G  5886G 73.17 0.91 159         osd.5
-13         0        -      0      0      0     0    0   0     host pe9
                 TOTAL   112T 92929G 22139G 80.76
MIN/MAX VAR: 0.91/1.06  STDDEV: 4.64
[root@pcl241 ceph]#





Is it wise to reduce the weight?

Thanks,

Best,

Herbert







-----Original Message-----
From: ceph-users [mailto:[email protected]] On Behalf Of Vadim 
Bulst
Sent: Tuesday, 12 June 2018 11:16
To: [email protected]
Subject: Re: [ceph-users] Problems with CephFS



Hi Herbert,



could you please run "ceph osd df"?



Cheers,



Vadim





On 12.06.2018 11:06, Steininger, Herbert wrote:

Hi Guys,



I've inherited a CephFS cluster; I'm fairly new to CephFS.
The cluster was down and I somehow managed to bring it up again.
But now there are some problems that I can't fix that easily.
This is what 'ceph -s' gives me:

[root@pcl241 ceph]# ceph -s
     cluster cde1487e-f930-417a-9403-28e9ebf406b8
      health HEALTH_WARN
             2 pgs backfill_toofull
             1 pgs degraded
             1 pgs stuck degraded
             2 pgs stuck unclean
             1 pgs stuck undersized
             1 pgs undersized
             recovery 260/29731463 objects degraded (0.001%)
             recovery 798/29731463 objects misplaced (0.003%)
             2 near full osd(s)
             crush map has legacy tunables (require bobtail, min is firefly)
             crush map has straw_calc_version=0
      monmap e8: 3 mons at {cephcontrol=172.22.12.241:6789/0,slurmbackup=172.22.20.4:6789/0,slurmmaster=172.22.20.3:6789/0}
             election epoch 48, quorum 0,1,2 cephcontrol,slurmmaster,slurmbackup
       fsmap e2288: 1/1/1 up {0=pcl241=up:active}
      osdmap e10865: 6 osds: 6 up, 6 in; 2 remapped pgs
             flags nearfull
       pgmap v14103169: 320 pgs, 3 pools, 30899 GB data, 9678 kobjects
             92929 GB used, 22139 GB / 112 TB avail
             260/29731463 objects degraded (0.001%)
             798/29731463 objects misplaced (0.003%)
                  316 active+clean
                    2 active+clean+scrubbing+deep
                    1 active+undersized+degraded+remapped+backfill_toofull
                    1 active+remapped+backfill_toofull
[root@pcl241 ceph]#





[root@pcl241 ceph]# ceph osd tree
ID  WEIGHT    TYPE NAME              UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -1 109.13992 root default
 -2         0     host A1214-2950-01
 -3         0     host A1214-2950-02
 -4         0     host A1214-2950-04
 -5         0     host A1214-2950-05
 -6         0     host A1214-2950-03
 -7  18.18999     host cuda002
  1  18.18999         osd.1               up  1.00000          1.00000
 -8  18.18999     host cuda001
  0  18.18999         osd.0               up  1.00000          1.00000
 -9  18.18999     host cuda005
  3  18.18999         osd.3               up  1.00000          1.00000
-10  18.18999     host cuda003
  4  18.18999         osd.4               up  1.00000          1.00000
-11  18.18999     host cuda004
  2  18.18999         osd.2               up  1.00000          1.00000
-12  18.18999     host A1214-2950-06
  5  18.18999         osd.5               up  1.00000          1.00000
-13         0     host pe9








Could someone please point me in the right direction about what to do to fix 
these problems?
It seems that two OSDs are nearly full, but how can I solve that if I don't 
have additional hardware available?
It also seems that the cluster is running different Ceph versions (Hammer and 
Jewel); how do I solve that?
ceph-mds/-mon/-osd is running on Scientific Linux.
If more info is needed, just let me know.



Thanks in Advance,

Steininger Herbert



---

Herbert Steininger

Leiter EDV

Administrator

Max-Planck-Institut für Psychiatrie - EDV
Kraepelinstr. 2-10
80804 München
Tel   +49 (0)89 / 30622-368
Mail  [email protected]
Web   http://www.psych.mpg.de





_______________________________________________

ceph-users mailing list

[email protected]<mailto:[email protected]>

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Vadim Bulst

Universität Leipzig / URZ
04109 Leipzig, Augustusplatz 10

phone: +49-341-97-33380
mail:  [email protected]








