[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-09 Thread Venky Shankar
Hi Yuri, On Fri, Nov 10, 2023 at 4:55 AM Yuri Weinstein wrote: > > I've updated all approvals and merged PRs in the tracker and it looks > like we are ready for gibba, LRC upgrades pending approval/update from > Venky. The smoke test failure is caused by missing (kclient) patches in Ubuntu

[ceph-users] mds hit find_exports balancer runs too long

2023-11-09 Thread zxcs
Hi, Experts, we have a CephFS cluster running 16.2.* with multiple active MDS enabled, and found that the MDS somehow complains with the following: mds.*.bal find_exports balancer runs too long. We have already set the config below: mds_bal_interval = 30 mds_bal_sample_interval = 12 and then we can
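A minimal sketch of how the settings mentioned above could be applied at runtime via the config database (values are the ones from the message; whether they resolve the warning is a separate question):

    ceph config set mds mds_bal_interval 30
    ceph config set mds mds_bal_sample_interval 12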

[ceph-users] Re: Ceph Dashboard - Community News Sticker [Feedback]

2023-11-09 Thread Nizamudeen A
Thank you everyone for the feedback! It's always good to hear whether something adds value to the UI and to users before we go ahead and start building it. And btw, if people are wondering whether we are short on features, the short answer is no. Along with Multi-Cluster Management & monitoring

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-11-09 Thread Xiubo Li
On 11/10/23 00:18, Frank Schilder wrote: Hi Xiubo, I will try to answer the questions from all three of your e-mails here, together with some new information we have. New: the problem occurs in newer Python versions when using the shutil.copy function. There is also a function shutil.copy2 for which

[ceph-users] Re: MDS stuck in rejoin

2023-11-09 Thread Xiubo Li
On 11/9/23 23:41, Frank Schilder wrote: Hi Xiubo, great! I'm not sure if we observed this particular issue, but we did see the "oldest_client_tid updates not advancing" message in a context that might be related. If this fix is not too large, it would be really great if it could be included

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-09 Thread Yuri Weinstein
I've updated all approvals and merged PRs in the tracker and it looks like we are ready for gibba, LRC upgrades pending approval/update from Venky. On Thu, Nov 9, 2023 at 1:31 PM Radoslaw Zarzynski wrote: > > rados approved! > > Details are here: >

[ceph-users] Re: ceph fs (meta) data inconsistent

2023-11-09 Thread Frank Schilder
Hi Xiubo, I will try to answer the questions from all three of your e-mails here, together with some new information we have. New: the problem occurs in newer Python versions when using the shutil.copy function. There is also a function shutil.copy2 for which the problem does not show up. Copy2 behaves a

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-09 Thread Yuri Weinstein
On Wed, Nov 8, 2023 at 6:33 AM Travis Nielsen wrote: > > Yuri, we need to add this issue as a blocker for 18.2.1. We discovered this > issue after the release of 17.2.7, and don't want to hit the same blocker in > 18.2.1 where some types of OSDs are failing to be created in new clusters, or >

[ceph-users] Re: MDS stuck in rejoin

2023-11-09 Thread Frank Schilder
Hi Xiubo, great! I'm not sure if we observed this particular issue, but we did see the "oldest_client_tid updates not advancing" message in a context that might be related. If this fix is not too large, it would be really great if it could be included in the last Pacific point release. Best

[ceph-users] Re: Redeploy ceph orch OSDs after reboot, but don't mark as 'unmanaged'

2023-11-09 Thread Janek Bevendorff
I meant this one: https://tracker.ceph.com/issues/55395 Ah, alright, almost forgot about that one. Is there an "unmanaged: true" statement in this output? ceph orch ls osd --export No, it only contains the managed services that I configured. Just out of curiosity, is there a

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-09 Thread Casey Bodley
On Wed, Nov 8, 2023 at 11:10 AM Yuri Weinstein wrote: > > We merged 3 PRs and rebuilt "reef-release" (Build 2) > > Seeking approvals/reviews for: > > smoke - Laura, Radek 2 jobs failed in "objectstore/bluestore" tests > (see Build 2) > rados - Neha, Radek, Travis, Ernesto, Adam King > rgw - Casey

[ceph-users] Re: HDD cache

2023-11-09 Thread quag...@bol.com.br

[ceph-users] Re: Redeploy ceph orch OSDs after reboot, but don't mark as 'unmanaged'

2023-11-09 Thread Eugen Block
I meant this one: https://tracker.ceph.com/issues/55395 Is there an "unmanaged: true" statement in this output? ceph orch ls osd --export Just out of curiosity, is there a "service_name" in your unit.meta for that OSD? grep service_name /var/lib/ceph/{fsid}/osd.{id}/unit.meta Quoting
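Spelled out, the two checks suggested here look roughly like this (the fsid and OSD id are placeholders for the values on the affected host):

    ceph orch ls osd --export                                    # look for an "unmanaged: true" line
    grep service_name /var/lib/ceph/{fsid}/osd.{id}/unit.meta    # check which service the OSD belongs to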

[ceph-users] Re: Redeploy ceph orch OSDs after reboot, but don't mark as 'unmanaged'

2023-11-09 Thread Janek Bevendorff
Hi Eugen, I stopped one OSD (which was deployed by ceph orch before) and this is what the MGR log says: 2023-11-09T13:35:36.941+ 7f067f1f0700  0 [cephadm DEBUG cephadm.services.osd] osd id 96 daemon already exists Before and after that are JSON dumps of the LVM properties of all OSDs.

[ceph-users] Re: Help needed with Grafana password

2023-11-09 Thread Eugen Block
It's the '#' character: everything after it (including the '#' itself) is cut off. I tried with single and double quotes, which also failed. But as I already said, use a simple password and then change it within Grafana. That way you also don't have the actual password lying around in clear text
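One hedged way to sidestep the quoting problem entirely is to set a simple initial password through the Grafana service spec and change it after the first login; a sketch assuming a cephadm-managed Grafana, with an arbitrary spec file name (grafana-spec.yaml) containing:

    service_type: grafana
    placement:
      count: 1
    spec:
      initial_admin_password: admin

and applied with:

    ceph orch apply -i grafana-spec.yaml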

[ceph-users] Re: Ceph Dashboard - Community News Sticker [Feedback]

2023-11-09 Thread Anthony D'Atri
IMHO we don't need yet another place to look for information, especially one that some operators never see. ymmv. > >> Hello, >> >> We wanted to get some feedback on one of the features that we are planning >> to bring in for upcoming releases. >> >> On the Ceph GUI, we thought it could be

[ceph-users] Re: Help needed with Grafana password

2023-11-09 Thread Eugen Block
I just tried it on a 17.2.6 test cluster; although I don't have a stack trace, the complicated password doesn't seem to be applied (I don't know why yet). But since it's an "initial" password, you can choose something simple like "admin", and during the first login you are asked to change it

[ceph-users] Re: Stretch mode size

2023-11-09 Thread Sake Ceph
I believe they are working on it, or want to work on it, to allow reverting from a stretched cluster, for the reason you mention: if the other datacenter has totally burned down, you may want to switch to a single-datacenter setup for the time being. Best regards, Sake > Op 09-11-2023 11:18 CET schreef

[ceph-users] Re: Help needed with Grafana password

2023-11-09 Thread Sake Ceph
I tried everything at this point, even waited an hour, still no luck. I got it working once by accident, but with a placeholder for a password. I tried with the correct password, nothing, and trying again with the placeholder didn't work anymore. So I thought to switch the manager, maybe something

[ceph-users] Re: Ceph Dashboard - Community News Sticker [Feedback]

2023-11-09 Thread Reto Gysi
Hi, No, I think it's not very useful at best, and bad at worst. In the IT organizations I've worked in so far, any systems that actually store data were in the highest security zone, where no incoming or outgoing connections to the internet were allowed. Our systems couldn't even resolve

[ceph-users] Re: Memory footprint of increased PG number

2023-11-09 Thread Eugen Block
I was going through the hardware recommendations for a customer and wanted to cite the memory section from the current docs [1]: Setting the osd_memory_target below 2GB is not recommended. Ceph may fail to keep the memory consumption under 2GB and extremely slow performance is likely.
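For reference, a minimal sketch of raising the per-OSD memory target above that floor (4 GiB here is just an example value, not a recommendation from the thread):

    ceph config set osd osd_memory_target 4294967296   # 4 GiB per OSD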

[ceph-users] Re: one cephfs volume becomes very slow

2023-11-09 Thread Eugen Block
Do you see high disk utilization? Are those OSDs HDD-only, or do they at least have their DB on SSDs? I'd say the HDDs are the bottleneck. There was a recent thread [1] where Zakhar explained nicely how many IOPS you can expect from an HDD-only cluster. Maybe that helps. [1]
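A quick way to check is to watch per-device utilization and latency on one of the OSD hosts; a sketch assuming the sysstat package is installed:

    iostat -x 1    # watch %util and await on the OSD data disks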

[ceph-users] High iowait when using Ceph NVME

2023-11-09 Thread Huy Nguyen
Hi, Currently I'm testing Ceph v17.2.7 with NVMe. When mapping an rbd image to the physical compute host, "fio bs=4k iodepth=128 randwrite" gives 150k IOPS. I have a VM located on that compute host, and there fio gives ~40k IOPS with 50% iowait. I know there is a bottleneck, I'm not sure
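For comparison, a sketch of the kind of test being described, run against a mapped RBD device (pool, image and device names are placeholders, not the poster's actual setup):

    rbd map mypool/myimage    # maps to e.g. /dev/rbd0
    fio --name=randwrite-test --filename=/dev/rbd0 \
        --ioengine=libaio --direct=1 --rw=randwrite \
        --bs=4k --iodepth=128 --numjobs=1 --runtime=60 --time_based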

[ceph-users] Re: Help needed with Grafana password

2023-11-09 Thread Eugen Block
Usually, removing the grafana service should be enough. I also have this directory (custom_config_files/grafana.) but it's empty. Can you confirm that after running 'ceph orch rm grafana' the service is actually gone ('ceph orch ls grafana')? The directory underneath

[ceph-users] Stretch mode size

2023-11-09 Thread Eugen Block
Hi, I'd like to ask for confirmation of how I understand the docs on stretch mode [1]. Does it require exactly size 4 for the rule? Are other sizes, for example size 6, not supported and won't work? Are there clusters out there which use this stretch mode? Once stretch mode is enabled, it's not possible
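For context, the upstream docs pair stretch mode with a dedicated CRUSH rule; a hedged sketch along those lines (rule id, bucket names and the tiebreaker monitor name are placeholders, and monitor/OSD locations must be set beforehand):

    rule stretch_rule {
        id 1
        type replicated
        step take default
        step choose firstn 0 type datacenter
        step chooseleaf firstn 2 type host
        step emit
    }

    ceph mon set election_strategy connectivity
    ceph mon enable_stretch_mode tiebreaker-mon stretch_rule datacenter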

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-09 Thread Venky Shankar
Hi Yuri, On Wed, Nov 8, 2023 at 4:10 PM Venky Shankar wrote: > > Hi Yuri, > > On Wed, Nov 8, 2023 at 2:32 AM Yuri Weinstein wrote: > > > > 3 PRs above mentioned were merged and I am returning some tests: > > https://pulpito.ceph.com/?sha1=55e3239498650453ff76a9b06a37f1a6f488c8fd > > > > Still

[ceph-users] Re: Help needed with Grafana password

2023-11-09 Thread Sake Ceph
Using podman version 4.4.1 on RHEL 8.8, Ceph 17.2.7. I used 'podman system prune -a -f' and 'podman volume prune -f' to clean up files, but this leaves a lot of files behind in /var/lib/containers/storage/overlay and an empty folder /var/lib/ceph//custom_config_files/grafana.. Found those files

[ceph-users] Re: Help needed with Grafana password

2023-11-09 Thread Eugen Block
What doesn't work exactly? For me it did... Quoting Sake Ceph: Too bad, that doesn't work :( On 09-11-2023 09:07 CET, Sake Ceph wrote: Hi, Well, to get promtail working with Loki, you need to set up a password in Grafana. But promtail wasn't working with the 17.2.6 release, the URL was

[ceph-users] Re: Help needed with Grafana password

2023-11-09 Thread Sake Ceph
Too bad, that doesn't work :( > On 09-11-2023 09:07 CET, Sake Ceph wrote: > > > Hi, > > Well, to get promtail working with Loki, you need to set up a password in > Grafana. > But promtail wasn't working with the 17.2.6 release, the URL was set to > containers.local. So I stopped using it,

[ceph-users] Re: Crush map & rule

2023-11-09 Thread David C.
(I wrote it freehand, test before applying.) If your goal is to have a replication of 3 within a row and to be able to switch to the secondary row, then you need 2 rules, and you change the crush rule on the pool side: rule primary_location { (...) step take primary class ssd step chooseleaf
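Purely as an illustration of where such a rule might be heading, a hypothetical completion (the rule id and failure domain are guesses, not David's actual rule; test before applying):

    rule primary_location {
        id 10                               # hypothetical id
        type replicated
        step take primary class ssd         # restrict to the "primary" row, ssd devices
        step chooseleaf firstn 0 type host  # spread replicas across hosts in that row
        step emit
    }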

[ceph-users] Re: Ceph Dashboard - Community News Sticker [Feedback]

2023-11-09 Thread Daniel Baumann
On 11/9/23 07:35, Nizamudeen A wrote: > On the Ceph GUI, we thought it could be interesting to show information > regarding the community events, ceph release information Like others have already said, it's not the right place to put that information, for lots of reasons. One more to add: putting

[ceph-users] Re: Redeploy ceph orch OSDs after reboot, but don't mark as 'unmanaged'

2023-11-09 Thread Eugen Block
Hi Janek, I don't really have a solution, but I tend to disagree that 'ceph cephadm osd activate' looks for OSDs to create. The docs specifically state that it activates existing OSDs, and it did work in my test environment. I also commented on the tracker issue you
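For reference, a minimal sketch of the activation command being discussed (the hostname is a placeholder):

    ceph cephadm osd activate osd-host-01    # activate existing OSDs on that host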

[ceph-users] Re: HDD cache

2023-11-09 Thread Konstantin Shalygin
Hi Peter, > On Nov 8, 2023, at 20:32, Peter wrote: > > Anyone experienced this can advise? You can try: * check for current cache status smartctl -x /dev/sda | grep "Write cache" * turn off write cache smartctl -s wcache-sct,off,p /dev/sda * check again smartctl -x /dev/sda | grep "Write
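Spelled out as separate commands (the device name is an example; the last check repeats the first):

    smartctl -x /dev/sda | grep "Write cache"    # check current cache status
    smartctl -s wcache-sct,off,p /dev/sda        # turn off write cache (persistent)
    smartctl -x /dev/sda | grep "Write cache"    # check again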

[ceph-users] Re: Memory footprint of increased PG number

2023-11-09 Thread Eugen Block
Hi, I don't think increasing the PGs has an impact on the OSDs' memory, at least I'm not aware of such reports and haven't seen it myself. But your cluster could get into trouble, as it already is: only 24 GB for 16 OSDs is too low. It can work (and apparently does) when everything is calm,

[ceph-users] Re: Help needed with Grafana password

2023-11-09 Thread Sake Ceph
Hi, Well, to get promtail working with Loki, you need to set up a password in Grafana. But promtail wasn't working with the 17.2.6 release; the URL was set to containers.local. So I stopped using it, but forgot to click save in KeePass :( I didn't configure anything special in Grafana, the