[ceph-users] Re: v18.2.1 Reef released

2023-12-19 Thread Robert W. Eckert
Yes - I was on Ceph 18.2.0; I had to update the ceph.repo file in 
/etc/yum.repos.d to point to 18.2.1 to get the latest Ceph client.  
Meanwhile, the initial pull using --image worked flawlessly, so all my services 
were updated.
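
In case it helps anyone else, roughly what that looked like (a sketch only - repo
IDs and baseurls differ per distro, and the sed assumes the version is pinned in
the file):

sed -i 's/18\.2\.0/18.2.1/g' /etc/yum.repos.d/ceph.repo   # point the repo at 18.2.1
dnf clean all
dnf update ceph-common                                    # pull the 18.2.1 client packages
ceph --version                                            # client should now report 18.2.1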

- Rob

-Original Message-
From: Matthew Vernon  
Sent: Tuesday, December 19, 2023 4:32 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: v18.2.1 Reef released

On 19/12/2023 06:37, Eugen Block wrote:
> Hi,
> 
> I thought the fix for that would have made it into 18.2.1. It was 
> marked as resolved two months ago 
> (https://tracker.ceph.com/issues/63150,
> https://github.com/ceph/ceph/pull/53922).

Presumably that will only take effect once ceph orch is version 18.2.1 (whereas 
the reporter is still on 18.2.0)? i.e. one has to upgrade to
18.2.1 before this bug will be fixed and so the upgrade _to_ 18.2.1 is still 
affected.

Regards,

Matthew
___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Logging control

2023-12-19 Thread Tim Holloway
OK. Found some loglevel overrides in the monitor and reset them.

Restarted the mgr and monitor just in case.

Still getting a lot of stuff that looks like this.

Dec 19 17:10:51 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: debug 2023-12-19T22:10:51.314+ 7f36d7291700  4 rocksdb: [db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compact>
Dec 19 17:10:51 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: debug 2023-12-19T22:10:51.314+ 7f36d7291700  4 rocksdb: [db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compact>
Dec 19 17:10:51 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: debug 2023-12-19T22:10:51.314+ 7f36d7291700  4 rocksdb: [db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compact>
Dec 19 17:10:51 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: debug 2023-12-19T22:10:51.314+ 7f36d7291700  4 rocksdb: [db_impl/db_impl_compaction_flush.cc:1443] [default] Manual compact>
Dec 19 17:10:51 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: cluster 2023-12-19T22:10:49.542670+ mgr.xyz1 (mgr.6889303) 177 : cluster [DBG] pgmap v160: 649 pgs: 1 active+clean+scrubbin>
Dec 19 17:10:51 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:10:51.542+ 7fa1d9fc7700  0 log_channel(cluster) log [DBG] : pgmap v161: 649 pgs: 1 active+clean+scrubbi>
Dec 19 17:10:52 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: cluster 2023-12-19T22:10:51.543403+ mgr.xyz1 (mgr.6889303) 178 : cluster [DBG] pgmap v161: 649 pgs: 1 active+clean+scrubbin>
Dec 19 17:10:52 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:10:52.748+ 7fa1c74a2700  0 [progress INFO root] Processing OSDMap change 20239..20239
Dec 19 17:10:53 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:10:53.544+ 7fa1d9fc7700  0 log_channel(cluster) log [DBG] : pgmap v162: 649 pgs: 1 active+clean+scrubbi>
Dec 19 17:10:54 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: cluster 2023-12-19T22:10:53.545649+ mgr.xyz1 (mgr.6889303) 179 : cluster [DBG] pgmap v162: 649 pgs: 1 active+clean+scrubbin>
Dec 19 17:10:55 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:10:55.545+ 7fa1d9fc7700  0 log_channel(cluster) log [DBG] : pgmap v163: 649 pgs: 1 active+clean+scrubbi>
Dec 19 17:10:55 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: debug 2023-12-19T22:10:55.834+ 7f36de29f700  1 mon.xyz1@1(peon).osd e20239 _set_new_cache_sizes cache_size:1020054731 inc_a>
Dec 19 17:10:56 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: cluster 2023-12-19T22:10:55.546197+ mgr.xyz1 (mgr.6889303) 180 : cluster [DBG] pgmap v163: 649 pgs: 1 active+clean+scrubbin>
Dec 19 17:10:57 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:10:57.546+ 7fa1d9fc7700  0 log_channel(cluster) log [DBG] : pgmap v164: 649 pgs: 1 active+clean+scrubbi>
Dec 19 17:10:57 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:10:57.751+ 7fa1c74a2700  0 [progress INFO root] Processing OSDMap change 20239..20239
Dec 19 17:10:58 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: cluster 2023-12-19T22:10:57.547657+ mgr.xyz1 (mgr.6889303) 181 : cluster [DBG] pgmap v164: 649 pgs: 1 active+clean+scrubbin>
Dec 19 17:10:59 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:10:59.548+ 7fa1d9fc7700  0 log_channel(cluster) log [DBG] : pgmap v165: 649 pgs: 1 active+clean+scrubbi>
Dec 19 17:11:00 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: :::10.0.1.2 - - [19/Dec/2023:22:11:00] "GET /metrics HTTP/1.1" 200 215073 "" "Prometheus/2.33.4"
Dec 19 17:11:00 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:11:00.762+ 7fa1b6e42700  0 [prometheus INFO cherrypy.access.140332751105776] :::10.0.1.2 - - [19/De>
Dec 19 17:11:00 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: debug 2023-12-19T22:11:00.835+ 7f36de29f700  1 mon.xyz1@1(peon).osd e20239 _set_new_cache_sizes cache_size:1020054731 inc_a>
Dec 19 17:11:00 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-xyz1[1906961]: cluster 2023-12-19T22:10:59.549152+ mgr.xyz1 (mgr.6889303) 182 : cluster [DBG] pgmap v165: 649 pgs: 1 active+clean+scrubbin>
Dec 19 17:11:01 xyz1.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mgr-xyz1[1906755]: debug 2023-12-19T22:11:01.548+ 7fa1d9fc7700  0
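
For the record, a sketch of what I plan to check next - the option names here are
my guesses at the likely culprits, not yet confirmed for this cluster:

ceph config show mon.xyz1 | grep -i -e debug -e cluster_log   # overrides still active on the mon
ceph config get mon debug_rocksdb                             # the rocksdb chatter above
ceph config get mon mon_cluster_log_to_stderr                 # cluster [DBG] pgmap lines reaching journald
ceph config rm mon debug_rocksdb                              # example of dropping an override back to default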

[ceph-users] Re: Logging control

2023-12-19 Thread Tim Holloway
The problem with "ceph daemon" is that the results I listed DID come
from running the command on the same machine as the daemon.

But "ceph tell" seems to be more promising.

There's more to the story, since I tried some blind brute-force
adjustments and they also failed, but let me see if ceph tell gives a
better idea of what I should be doing.

   Tim
On Tue, 2023-12-19 at 16:21 -0500, Wesley Dillingham wrote:
> "ceph daemon" commands need to be run local to the machine where the
> daemon is running. So in this case if you arent on the node where
> osd.1 lives it wouldnt work. "ceph tell" should work anywhere there
> is a client.admin key.
> 
> 
> Respectfully,
> 
> Wes Dillingham
> w...@wesdillingham.com
> LinkedIn
> 
> 
> On Tue, Dec 19, 2023 at 4:02 PM Tim Holloway 
> wrote:
> > Ceph version is Pacific (16.2.14), upgraded from a sloppy Octopus.
> > 
> > I ran afoul of all the best bugs in Octopus, and in the process
> > switched on a lot of stuff better left alone, including some
> > detailed
> > debug logging. Now I can't turn it off.
> > 
> > I am confidently informed by the documentation that the first step
> > would be the command:
> > 
> > ceph daemon osd.1 config show | less
> > 
> > But instead of config information I get back:
> > 
> > Can't get admin socket path: unable to get conf option admin_socket
> > for
> > osd: b"error parsing 'osd': expected string of the form TYPE.ID,
> > valid
> > types are: auth, mon, osd, mds, mgr, client\n"
> > 
> > Which seems to be kind of insane.
> > 
> > Attempting to get daemon config info on a monitor on that machine
> > gives:
> > 
> > admin_socket: exception getting command descriptions: [Errno 2] No
> > such
> > file or directory
> > 
> > Which doesn't help either.
> > 
> > Anyone got an idea?
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Logging control

2023-12-19 Thread Wesley Dillingham
"ceph daemon" commands need to be run local to the machine where the daemon
is running. So in this case if you arent on the node where osd.1 lives it
wouldnt work. "ceph tell" should work anywhere there is a client.admin key.
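
For example (a sketch - substitute your own daemon IDs; the cephadm wrapper is
only needed on containerized deployments):

ceph tell osd.1 config show | less             # works from any host with a client.admin key
ceph tell osd.1 config set debug_osd 0/5       # runtime change, same idea
# the admin socket route only works on the host running osd.1, and on cephadm
# clusters typically from inside the daemon's container:
cephadm shell --name osd.1 -- ceph daemon osd.1 config show | less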


Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Dec 19, 2023 at 4:02 PM Tim Holloway  wrote:

> Ceph version is Pacific (16.2.14), upgraded from a sloppy Octopus.
>
> I ran afoul of all the best bugs in Octopus, and in the process
> switched on a lot of stuff better left alone, including some detailed
> debug logging. Now I can't turn it off.
>
> I am confidently informed by the documentation that the first step
> would be the command:
>
> ceph daemon osd.1 config show | less
>
> But instead of config information I get back:
>
> Can't get admin socket path: unable to get conf option admin_socket for
> osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid
> types are: auth, mon, osd, mds, mgr, client\n"
>
> Which seems to be kind of insane.
>
> Attempting to get daemon config info on a monitor on that machine
> gives:
>
> admin_socket: exception getting command descriptions: [Errno 2] No such
> file or directory
>
> Which doesn't help either.
>
> Anyone got an idea?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Logging control

2023-12-19 Thread Josh Baergen
I would start with "ceph tell osd.1 config diff", as I find that
output the easiest to read when trying to understand where various
config overrides are coming from. You almost never need to use "ceph
daemon" in Octopus+ systems since "ceph tell" should be able to access
pretty much all commands for daemons from any node.
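
E.g. something along these lines (a sketch - adjust the option names to whatever
the diff actually shows was overridden):

ceph tell osd.1 config diff | less         # values differing from defaults, and where they were set
ceph config rm osd debug_osd               # remove a centrally stored override for all OSDs
ceph tell osd.1 config set debug_osd 0/5   # or adjust a single daemon at runtime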

Josh

On Tue, Dec 19, 2023 at 2:02 PM Tim Holloway  wrote:
>
> Ceph version is Pacific (16.2.14), upgraded from a sloppy Octopus.
>
> I ran afoul of all the best bugs in Octopus, and in the process
> switched on a lot of stuff better left alone, including some detailed
> debug logging. Now I can't turn it off.
>
> I am confidently informed by the documentation that the first step
> would be the command:
>
> ceph daemon osd.1 config show | less
>
> But instead of config information I get back:
>
> Can't get admin socket path: unable to get conf option admin_socket for
> osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid
> types are: auth, mon, osd, mds, mgr, client\n"
>
> Which seems to be kind of insane.
>
> Attempting to get daemon config info on a monitor on that machine
> gives:
>
> admin_socket: exception getting command descriptions: [Errno 2] No such
> file or directory
>
> Which doesn't help either.
>
> Anyone got an idea?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Logging control

2023-12-19 Thread Tim Holloway
Ceph version is Pacific (16.2.14), upgraded from a sloppy Octopus.

I ran afoul of all the best bugs in Octopus, and in the process
switched on a lot of stuff better left alone, including some detailed
debug logging. Now I can't turn it off.

I am confidently informed by the documentation that the first step
would be the command:

ceph daemon osd.1 config show | less

But instead of config information I get back:

Can't get admin socket path: unable to get conf option admin_socket for
osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid
types are: auth, mon, osd, mds, mgr, client\n"

Which seems to be kind of insane.

Attempting to get daemon config info on a monitor on that machine
gives:

admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory

Which doesn't help either.

Anyone got an idea?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] No User + Dev Monthly Meetup this week - Happy Holidays!

2023-12-19 Thread Laura Flores
Hi all,

A quick reminder that the User + Dev Monthly Meetup that was scheduled for
this week December 21 is cancelled due to the holidays.

The User + Dev Monthly Meetup will resume in the new year on January 18. If
you have a topic you'd like to present at an upcoming meetup, you're
welcome to submit it here:
https://docs.google.com/forms/d/e/1FAIpQLSdboBhxVoBZoaHm8xSmeBoemuXoV_rmh4vJDGBrp6d-D3-BlQ/viewform?usp=sf_link

Wishing everyone a happy holiday season!

Laura Flores
-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage 

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com 
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can not activate some OSDs after upgrade (bad crc on label)

2023-12-19 Thread Huseyin Cotuk
Hello again,

With the help of Dan van der Ster and Mykola from Clyso, we understood that the 
issue arose from a hardware crash. After upgrading Ceph, we encountered an 
unexpected crash that resulted in a reboot. 

After comparing the first blocks of running and failed OSDs, we found that the HW 
crash had corrupted the first 23 bytes of the block devices. 

First few bytes of the block device of a failed OSD contains:

:          
0010:    0a63 3965 6533 6566 362d  ...c9ee3ef6-
0020: 3733 6437 2d34 3032 392d 3963 6436 2d30  73d7-4029-9cd6-0
0030: 3836 6363 3935 6432 6632 370a 0201 a901  86cc95d2f27.
0040:  c9ee 3ef6 73d7 4029 9cd6 086c c95d  >.s.@)...l.]
0050: 2f27  4002 4707  8ceb 6665 c827  /'..@.G.fe.'
0060: 0409 0400  6d61 696e 0d00  0a00  ..main……

and a running OSD contains:

: 626c 7565 7374 6f72 6520 626c 6f63 6b20  bluestore block 
0010: 6465 7669 6365 0a38 6637 3732 3532 312d  device.8f772521-
0020: 6535 3663 2d34 6135 622d 6239 3763 2d31  e56c-4a5b-b97c-1
0030: 6233 3630 6439 6266 6135 340a 0201 a901  b360d9bfa54.
0040:  8f77 2521 e56c 4a5b b97c 1b36 0d9b  ...w%!.lJ[.|.6..
0050: fa54  4002 4707  c8eb 6665 cd4c  .t...@.g.fe.L
0060: 6233 0400  6d61 696e 0d00  0a00  b3main……

It turned out that the first 23 bytes of data were corrupted during the HW crash. So 
we copied the first 23 bytes from a running OSD with the following 
command:

dd if=/dev/ceph-block-21/block-21 of=/root/header.21.dat bs=23 count=1

Then we copied those exact 23 bytes to every failed OSD block device (after taking 
a backup), and the problem was resolved. 

for i in {12..20} ; do dd if=/dev/ceph-block-$i/block-$i of=/root/backup.$i.1M 
bs=1M count=1 ;  dd if=/root/header.21.dat of=/dev/ceph-block-$i/block-$i bs=23 
count=1 ; done 
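
A quick way to verify the fix before bringing the OSDs back (a sketch, using osd.13
from the example above):

ceph-bluestore-tool show-label --dev /dev/ceph-block-13/block-13   # should now print the label instead of the crc error
ceph-volume lvm activate 13 c9ee3ef6-73d7-4029-9cd6-086cc95d2f27   # activation then succeeds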

At the end of the day, it turned out that the lsiutil tool was not compatible 
with our kernel and caused the crash. The following link contains the detailed 
information:

https://support.huawei.com/enterprise/en/knowledge/KB101578

I want to thank Dan and Mykola from Clyso, and I appreciate their help.

BR,
Huseyin Cotuk
hco...@gmail.com
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v18.2.1 Reef released

2023-12-19 Thread Berger Wolfgang
Hi Community,
I'd like to report that Ceph (cephadm managed, RGW S3) works perfectly fine in 
my Debian 12 based LAB environment (Multi-Site Setup).
Huge thanks to all involved.
BR
Wolfgang

-Original Message-
From: Eugen Block  
Sent: Tuesday, December 19, 2023 10:50 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: v18.2.1 Reef released

Right, that makes sense.

Quoting Matthew Vernon :

> On 19/12/2023 06:37, Eugen Block wrote:
>> Hi,
>>
>> I thought the fix for that would have made it into 18.2.1. It was 
>> marked as resolved two months ago 
>> (https://tracker.ceph.com/issues/63150,
>> https://github.com/ceph/ceph/pull/53922).
>
> Presumably that will only take effect once ceph orch is version
> 18.2.1 (whereas the reporter is still on 18.2.0)? i.e. one has to 
> upgrade to 18.2.1 before this bug will be fixed and so the upgrade 
> _to_ 18.2.1 is still affected.
>
> Regards,
>
> Matthew
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
> email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Can not activate some OSDs after upgrade (bad crc on label)

2023-12-19 Thread Huseyin Cotuk
Hello Cephers,

I have two identical Ceph clusters with 32 OSDs each, running radosgw with EC. 
They were running Octopus on Ubuntu 20.04. 

On one of these clusters, I have upgraded OS to Ubuntu 22.04 and Ceph version 
is upgraded to Quincy 17.2.6. This cluster completed the process without any 
issue and it works as expected. 

On the second cluster, I followed the same procedure and upgraded the cluster. 
After the upgrade, 9 of the 32 OSDs cannot be activated. AFAIU, the labels of these OSDs 
cannot be read. The ceph-volume lvm activate {osd.id} {osd_fsid} command fails as 
below:

 stderr: failed to read label for /dev/ceph-block-13/block-13: (5) Input/output 
error
 stderr: 2023-12-19T11:46:25.310+0300 7f088cd7ea80 -1 
bluestore(/dev/ceph-block-13/block-13) _read_bdev_label bad crc on label, 
expected 2340927273 != actual 2067505886

All ceph-bluestore-tool and ceph-objectstore-tool commands fail with the same 
message, so I can not try repair, fsck or migrate. 

# ceph-bluestore-tool  repair --deep yes --path /var/lib/ceph/osd/ceph-13/
failed to load os-type: (2) No such file or directory
2023-12-19T13:57:06.551+0300 7f39b1635a80 -1 
bluestore(/var/lib/ceph/osd/ceph-13/block) _read_bdev_label bad crc on label, 
expected 2340927273 != actual 2067505886

I also tried show-label with ceph-bluestore-tool, without success. 

# ceph-bluestore-tool show-label --dev /dev/ceph-block-13/block-13
unable to read label for /dev/ceph-block-13/block-13: (5) Input/output error
2023-12-19T14:01:19.668+0300 7fdcdd111a80 -1 
bluestore(/dev/ceph-block-13/block-13) _read_bdev_label bad crc on label, 
expected 2340927273 != actual 2067505886

I can get the information, including the osd_fsid and block_uuid, of all failed OSDs via 
ceph-volume lvm list, as shown below. 

== osd.13 ==

  [block]   /dev/ceph-block-13/block-13

  block device  /dev/ceph-block-13/block-13
  block uuidjFaTba-ln5r-muQd-7Ef9-3tWe-JwvO-qW9nqi
  cephx lockbox secret  
  cluster fsid  4e7e7d1c-22db-49c7-9f24-5a75cd3a3b9f
  cluster name  ceph
  crush device classNone
  encrypted 0
  osd fsid  c9ee3ef6-73d7-4029-9cd6-086cc95d2f27
  osd id13
  osdspec affinity  
  type  block
  vdo   0
  devices   /dev/mapper/mpathb

All vgs and lvs look healthy. 

# lvdisplay ceph-block-13/block-13
  --- Logical volume ---
  LV Path/dev/ceph-block-13/block-13
  LV Nameblock-13
  VG Nameceph-block-13
  LV UUIDjFaTba-ln5r-muQd-7Ef9-3tWe-JwvO-qW9nqi
  LV Write Accessread/write
  LV Creation host, time ank-backup01, 2023-11-29 10:41:53 +0300
  LV Status  available
  # open 0
  LV Size<7.28 TiB
  Current LE 1907721
  Segments   1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device   253:33

This is a single node cluster running only radosgw. The environment is as 
follows:

# ceph -v 
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)

# lsb_release -a 
No LSB modules are available.
Distributor ID: Ubuntu
Description:Ubuntu 22.04.3 LTS
Release:22.04
Codename:   jammy

# ceph osd crush rule dump  
[
{
"rule_id": 0,
"rule_name": "osd_replicated_rule",
"type": 1,
"steps": [
{
"op": "take",
"item": -2,
"item_name": "default~hdd"
},
{
"op": "choose_firstn",
"num": 0,
"type": "osd"
},
{
"op": "emit"
}
]
},
{
"rule_id": 2,
"rule_name": "default.rgw.buckets.data",
"type": 3,
"steps": [
{
"op": "set_chooseleaf_tries",
"num": 5
},
{
"op": "set_choose_tries",
"num": 100
},
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "choose_indep",
"num": 0,
"type": "osd"
},
{
"op": "emit"
}
]
}
]

ID  CLASS  WEIGHT     TYPE NAME          STATUS  REWEIGHT  PRI-AFF
-1         226.29962  root default
-3         226.29962    host ank-backup01
 0  hdd    7.2            osd.0              up     1.0      1.0
 1  hdd    7.2            osd.1              up     1.0      1.0
 2  hdd    7.2            osd.2              up     1.0      1.0
 3  hdd    7.2  

[ceph-users] Re: v18.2.1 Reef released

2023-12-19 Thread Eugen Block

Right, that makes sense.

Quoting Matthew Vernon :


On 19/12/2023 06:37, Eugen Block wrote:

Hi,

I thought the fix for that would have made it into 18.2.1. It was  
marked as resolved two months ago  
(https://tracker.ceph.com/issues/63150,  
https://github.com/ceph/ceph/pull/53922).


Presumably that will only take effect once ceph orch is version  
18.2.1 (whereas the reporter is still on 18.2.0)? i.e. one has to  
upgrade to 18.2.1 before this bug will be fixed and so the upgrade  
_to_ 18.2.1 is still affected.


Regards,

Matthew
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v18.2.1 Reef released

2023-12-19 Thread Matthew Vernon

On 19/12/2023 06:37, Eugen Block wrote:

Hi,

I thought the fix for that would have made it into 18.2.1. It was marked 
as resolved two months ago (https://tracker.ceph.com/issues/63150, 
https://github.com/ceph/ceph/pull/53922).


Presumably that will only take effect once ceph orch is version 18.2.1 
(whereas the reporter is still on 18.2.0)? i.e. one has to upgrade to 
18.2.1 before this bug will be fixed and so the upgrade _to_ 18.2.1 is 
still affected.
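
A quick way to check where an upgrade stands (sketching from memory):

ceph versions              # which daemons - including the mgr running the orchestrator - are on 18.2.1
ceph orch upgrade status   # whether a cephadm-driven upgrade is still in progress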


Regards,

Matthew
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm Adding OSD wal device on a new

2023-12-19 Thread Eugen Block

Hi,

first, I'd recommend using drivegroups [1] to apply OSD  
specifications to entire hosts instead of manually adding an OSD  
daemon. If you run 'ceph orch daemon add osd hostname:/dev/nvme0n1'  
then the OSD is already fully deployed, meaning wal, db and data  
device are all on the same disk (which is totally fine for NVMEs or  
SSDs, depending on the use-case). In that case you would need to move  
the WAL to your target device using ceph-bluestore-tool:


$ ceph-bluestore-tool bluefs-bdev-new-wal --path <osd path> --dev-target <new device>
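
For the drivegroup route, a rough sketch of a spec - the size filters below are
assumptions, adjust them to your actual drives:

cat > osd-spec.yaml <<'EOF'
service_type: osd
service_id: osd-with-separate-wal
placement:
  host_pattern: '*'
spec:
  data_devices:
    size: '5TB:'      # assumption: the large NVMEs carry the data
  wal_devices:
    size: ':400GB'    # assumption: the small NVMEs carry the WAL LVs
EOF
ceph orch apply -i osd-spec.yaml --dry-run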


The question is: if you already use NVMEs as OSD devices, what's the  
point of separating the WAL onto different NVMEs? That doesn't make much  
sense to me, but if I misunderstand your setup, please clarify.


[1]  
https://docs.ceph.com/en/quincy/cephadm/services/osd/#advanced-osd-service-specifications



Zitat von "Adiga, Anantha" :


Hi,

After adding a node to the (3-node) cluster with cephadm, how do I  
add OSDs with the same configuration as on the other nodes?

The other nodes have
12 drives for data (osd-block) AND 2 drives for WAL (osd-wal). There are  
6 LVs on each WAL disk for the 12 data drives.

I have added the OSDs with
   ceph orch daemon add osd hostname:/dev/nvme0n1

How do I attach the wal devices to the OSDs?
I have the WAL volumes created
nvme3n1   
 259:50 349.3G  0 disk
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--dad0df4e--149a--4e80--b451--79f9b81838b8   253:12   0  58.2G  0  
lvm
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--a5f2e93a--7bf0--4904--a233--3946b855c764   253:13   0  58.2G  0  
lvm
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--cc949e1b--2560--4d38--bc27--558550881726   253:14   0  58.2G  0  
lvm
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--8846f50e--7e92--4f66--a738--ce3a89650019   253:15   0  58.2G  0  
lvm
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--6d646762--483a--40ca--8c51--ea54e0684a94   253:16   0  58.2G  0  
lvm
`-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--74e58163--de1d--4062--a658--5b0356d43a87   253:17   0  58.2G  0  
lvm


How do I attach the wal volumes to the OSDs


osd.36 nvme0n1  259:10   5.8T  0 disk
`-ceph--3df7c5c3--c2c0--4498--9e17--2af79e448abc-osd--block--804b50ea--d44c--4cad--9177--8d722f737df9 253:00   5.8T  0  
lvm
Osd.37 nvme1n1  259:30   5.8T  0 disk
`-ceph--30858acc--c48b--4a08--bb98--4c9b59112c59-osd--block--0a3a198b--66ec--4ed9--94da--fb171e190e38 253:10   5.8T  0  
lvm

nvme3n1



Thank you,

Anantha



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Support of SNMP on CEPH ansible

2023-12-19 Thread Eugen Block

Hi,

I don't have an answer for the SNMP part; I guess you could just bring  
up your own SNMP daemon and configure it to your needs. As for the  
orchestrator backend, you have these three options (I don't know what  
"test_orchestrator" does, but it doesn't sound like it should be used  
in production):


enum_allowed=['cephadm', 'rook', 'test_orchestrator'],

If you intend to use the orchestrator, I suggest moving to cephadm  
(you can convert an existing cluster by following this guide:  
https://docs.ceph.com/en/latest/cephadm/adoption/). Although the  
orchestrator module shows as "on", it still requires a backend.


Regards,
Eugen

Quoting Lokendra Rathour :


Hi Team,
please help with reference to the issue raised below.


Best Regards,
Lokendra

On Wed, Dec 13, 2023 at 2:33 PM Kushagr Gupta 
wrote:


Hi Team,

*Environment:*
We have deployed a Ceph setup using ceph-ansible.
Ceph version: 18.2.0
OS: AlmaLinux 8.8
We have a 3-node setup.

*Queries:*

1. Is SNMP supported for ceph-ansible? Is there some other way to set up an
SNMP gateway for the Ceph cluster?
2. Do we have a procedure to set the backend for the Ceph orchestrator via
ceph-ansible? Which backend should we use?
3. Are there any Ceph MIB files which work independently of Prometheus?


*Description:*
We are trying to perform SNMP monitoring for the Ceph cluster using the
following links:

1.
https://docs.ceph.com/en/quincy/cephadm/services/snmp-gateway/
2.
https://www.ibm.com/docs/en/storage-ceph/7?topic=traps-deploying-snmp-gateway

But when we follow the steps mentioned in the above links, any "ceph orch"
command fails with the following error:
"Error ENOENT: No orchestrator configured (try `ceph orch set backend`)"

After going through following links:
1.
https://www.ibm.com/docs/en/storage-ceph/5?topic=operations-use-ceph-orchestrator
2.
https://forum.proxmox.com/threads/ceph-mgr-orchestrator-enabled-but-showing-missing.119145/
3. https://docs.ceph.com/en/latest/mgr/orchestrator_modules/
I think since we have deployed the cluster using ceph-ansible, we can't
use the ceph-orch commands.
When we checked in the cluster, the following are the enabled modules:
"
[root@storagenode1 ~]# ceph mgr module ls
MODULE
balancer   on (always on)
crash  on (always on)
devicehealth   on (always on)
orchestrator   on (always on)
pg_autoscaler  on (always on)
progress   on (always on)
rbd_supporton (always on)
status on (always on)
telemetry  on (always on)
volumeson (always on)
alerts on
iostat on
nfson
prometheus on
restfulon
dashboard  -
influx -
insights   -
localpool  -
mds_autoscaler -
mirroring  -
osd_perf_query -
osd_support-
rgw-
selftest   -
snap_schedule  -
stats  -
telegraf   -
test_orchestrator  -
zabbix -
[root@storagenode1 ~]#
"
As can be seen above, orchestrator is on.

Also, we were exploring more about SNMP, and as per the file
"/etc/prometheus/ceph/ceph_default_alerts.yml" on the Ceph storage nodes, the
OIDs in that file represent the OIDs for Ceph components via Prometheus.
For example, for the following OID: 1.3.6.1.4.1.50495.1.2.1.2.1
[root@storagenode3 ~]# snmpwalk -v 2c -c 209ijvfwer0df92jd -O e 10.0.1.36
1.3.6.1.4.1.50495.1.2.1.2.1
CEPH-MIB::promHealthStatusError = No Such Object available on this agent
at this OID
[root@storagenode3 ~]#

Kindly help us for the same.

Thanks and regards,
Kushagra Gupta




--
~ Lokendra
skype: lokendrarathour
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io