[ceph-users] Re: After adding New Osd's, Pool Max Avail did not changed.

2021-08-31 Thread Josh Baergen
Yeah, I would suggest inspecting your CRUSH tree. Unfortunately the grep above removed that information from 'df tree', but from the information you provided there does appear to be a significant imbalance remaining. Josh On Tue, Aug 31, 2021 at 6:02 PM mhnx wrote: > > Hello Josh! > > I use
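
For anyone following along: a minimal way to inspect the tree and balancer state without the grep filter (generic commands, not taken from this thread):

  ceph osd df tree        # per-host/per-OSD usage with the CRUSH hierarchy visible
  ceph osd crush tree     # just the CRUSH buckets and their weights
  ceph balancer status    # shows the mode (crush-compat vs upmap) and whether it is active

With the host and device-class columns visible, it is usually obvious whether the new OSDs landed in the expected CRUSH subtree and how far the fullest OSD deviates from the mean.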

[ceph-users] Re: tcmu-runner crashing on 16.2.5

2021-08-31 Thread Paul Giralt (pgiralt)
However, the gwcli command is still showing the other two gateways, which are no longer enabled. Where does this list of gateways get stored? All of this configuration is stored in the "gateway.conf" object in the "rbd" pool. How do I access this object? Is it a file or some kind of object
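
It is a RADOS object, not a file on disk. A minimal sketch of how to look at it, assuming the default pool/object names mentioned above; the object is managed by the ceph-iscsi tools and should not be edited by hand:

  rados -p rbd ls | grep gateway.conf              # confirm the object exists
  rados -p rbd get gateway.conf /tmp/gateway.conf  # dump its contents to a local file for inspection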

[ceph-users] Re: podman daemons in error state - where to find logs?

2021-08-31 Thread 胡 玮文
With cephadm, you can find the logs with the “journalctl” command outside of the container. Or you can change the config to use traditional log files: ceph config set global log_to_file true > On 1 Sep 2021, at 09:50, Nigel Williams wrote: > > to answer my own question, the logs are meant to be in >
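
A minimal sketch of both approaches, with placeholder fsid/daemon names:

  journalctl -u ceph-<fsid>@osd.3.service   # cephadm systemd unit, readable outside the container
  cephadm logs --name osd.3                 # convenience wrapper around the same journal
  ceph config set global log_to_file true   # write traditional files under /var/log/ceph/<fsid>/ again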

[ceph-users] Re: podman daemons in error state - where to find logs?

2021-08-31 Thread Nigel Williams
to answer my own question, the logs are meant to be in /var/log/ceph//... however on this host they were all zero length. On Tue, 31 Aug 2021 at 20:51, Nigel Williams wrote: > > Where to find more detailed logs? or do I need to adjust a log-level > first? thanks. > >

[ceph-users] Re: tcmu-runner crashing on 16.2.5

2021-08-31 Thread Paul Giralt (pgiralt)
Thank you. This is exactly what I was looking for. If I’m understanding correctly, what gets listed as the “owner” is what gets advertised via ALUA as the primary path, but the lock owner indicates which gateway currently owns the lock for that image and is allowed to pass traffic for that

[ceph-users] Re: tcmu-runner crashing on 16.2.5

2021-08-31 Thread Paul Giralt (pgiralt)
Xiubo, Thank you for all the help so far. I was finally able to figure out what the trigger for the issue was and how to make sure it doesn’t happen - at least not in a steady state. There is still the possibility of running into the bug in a failover scenario of some kind, but at least for

[ceph-users] Re: After adding New Osd's, Pool Max Avail did not changed.

2021-08-31 Thread mhnx
Hello Josh! I use the balancer: active - crush-compat. Balancing is done and there are no remapped PGs in ceph -s. ceph osd df tree | grep 'CLASS\|ssd' ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME 19 ssd 0.87320 1.0 894 GiB 402

[ceph-users] Re: After adding New Osd's, Pool Max Avail did not changed.

2021-08-31 Thread Josh Baergen
Hi there, Could you post the output of "ceph osd df tree"? I would highly suspect that this is a result of imbalance, and that's the easiest way to see if that's the case. It would also confirm that the new disks have taken on PGs. Josh On Tue, Aug 31, 2021 at 10:50 AM mhnx wrote: > > I'm

[ceph-users] Re: nautilus cluster down by loss of 2 mons

2021-08-31 Thread Marcel Kuiper
During normal operation the size is under 1G. After the network ordeal it was 65G. I gave the last mon all the disk space I could find under /var/lib/ceph and started the mon again. It is now reaching 90G and still growing. Does anyone have an idea how much free disk space would be needed to get the job
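
There is no fixed rule for how much headroom the sync needs, but once the mon is up and the cluster is healthy the store can usually be shrunk again; a hedged sketch (generic commands, not from this thread):

  ceph tell mon.<id> compact      # trigger an online compaction of the mon store
  # or, in ceph.conf, to compact on every monitor start:
  [mon]
  mon_compact_on_start = true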

[ceph-users] Re: nautilus cluster down by loss of 2 mons

2021-08-31 Thread Frank Schilder
Hi Mac, when I started with Ceph, there was a page with hardware recommendations (https://docs.ceph.com/en/mimic/start/hardware-recommendations/) and some reference configurations from big vendors to look at. Firstly, this page seems to have disappeared in the "latest" docs and, secondly, I have to agree

[ceph-users] Re: cephadm Pacific bootstrap hangs waiting for mon

2021-08-31 Thread Matthew Pounsett
On Tue, 31 Aug 2021 at 03:24, Arnaud MARTEL wrote: > > Hi Matthew, > > I don't know if it will be helpful but I had the same problem using Debian 10 > and the solution was to install Docker from docker.io and not from the Debian > package (too old). > Ah, that makes sense. Thanks! > Arnaud >

[ceph-users] nautilus cluster down by loss of 2 mons

2021-08-31 Thread Marcel Kuiper
Hi, We have a Nautilus cluster that was plagued by a network failure. One of the monitors fell out of quorum. Once the network settled down and all OSDs were back online again, we got that mon synchronizing. However, the filesystem suddenly exploded in a minute or so from 63G usage to 93G

[ceph-users] Re: LARGE_OMAP_OBJECTS: any proper action possible?

2021-08-31 Thread Frank Schilder
Hi Dan, unfortunately, the file/directory names were generated like one would do for temporary files. No clue about their location. I would need to find such a file while it exists. Of course, I could execute a find on the snapshot ... Just kidding. The large omap count is going down already,

[ceph-users] Re: MDS daemons stuck in resolve, please help

2021-08-31 Thread Frank Schilder
Hi Dan, I'm running the latest mimic version. Thanks for the link to the PR, this looks good. Directory pinning does not work in mimic; I had another case on that. The required xattrs are not implemented although documented. The default load balancing seems to work quite well for us - I saw the

[ceph-users] Re: Missing OSD in SSD after disk failure

2021-08-31 Thread Eric Fahnle
Hi David, no problem, thanks for your help! Went through your commands, here are the results: - 4 servers with OSDs - Server "nubceph04" has 2 OSDs (osd.0 and osd.7 on /dev/sdb and /dev/sdc respectively, and db_device on /dev/sdd) # capture "db device" and raw device associated with OSD (just for
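
For reference, the mapping between an OSD, its raw device and its db device can be read straight from ceph-volume on the host (a generic command, not necessarily the exact one used in the thread):

  ceph-volume lvm list    # lists block and block.db devices, LV/VG names and OSD ids per OSD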

[ceph-users] Re: nautilus cluster down by loss of 2 mons

2021-08-31 Thread Marc
Could someone also explain the logic behind the decision to dump so much data to the disk? Especially in container environments with resource limits this is not really nice. > -Original Message- > Sent: Tuesday, 31 August 2021 19:16 > To: ceph-users@ceph.io > Subject: [ceph-users]

[ceph-users] Re: [Ceph Dashboard] Alert configuration.

2021-08-31 Thread Ernesto Puerta
Hi Lokendra, The alerts are configured in and triggered by Alertmanager, so I don't see any way to have alerts without Alertmanager. Kind Regards, Ernesto On Tue, Aug 24, 2021 at 7:25 AM Lokendra Rathour wrote: > Hi Daniel, > Thanks for the response !! > If we talk about dashboard alerts,
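
For completeness, the dashboard only displays what Alertmanager sends it; a minimal sketch of the wiring, with a placeholder URL:

  ceph dashboard set-alertmanager-api-host 'http://<alertmanager-host>:9093'
  ceph dashboard get-alertmanager-api-host    # verify what the dashboard will query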

[ceph-users] After adding New Osd's, Pool Max Avail did not changed.

2021-08-31 Thread mhnx
I'm using Nautilus 14.2.16. I had 20 SSD OSDs in my cluster and I added 10 more. "Each SSD = 960GB" The size increased to *(26TiB)* as expected, but the replicated (3) pool MAX AVAIL didn't change *(3.5TiB)*. I've increased pg_num and the PG rebalance is also done. Do I need any special treatment
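
Worth noting for this thread: MAX AVAIL is not raw free space divided by the replica count; it is derived from the fullest OSD that the pool's CRUSH rule can use, so an imbalanced cluster reports far less MAX AVAIL than the added capacity suggests. Two generic commands to see both numbers side by side:

  ceph df detail      # per-pool STORED / USED / MAX AVAIL
  ceph osd df tree    # per-OSD %USE - the most-full OSD is what caps MAX AVAIL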

[ceph-users] Re: LARGE_OMAP_OBJECTS: any proper action possible?

2021-08-31 Thread Frank Schilder
Dear Dan and Patrick, the find didn't return anything. With this and the info below, am I right to assume that these were temporary working directories that got caught in a snapshot (we use rolling snapshots)? I would really appreciate any ideas on how to find out the original file system

[ceph-users] Re: MDS daemons stuck in resolve, please help

2021-08-31 Thread Frank Schilder
I seem to be hit by the problem discussed here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/AOYWQSONTFROPB4DXVYADWW7V25C3G6Z/ In my case, what helped get the cache size growth somewhat under control was ceph config set mds mds_recall_max_caps 1 I'm not sure about

[ceph-users] Re: LARGE_OMAP_OBJECTS: any proper action possible?

2021-08-31 Thread Dan van der Ster
Hi, I don't know how to find a full path from a dir object. But perhaps you can make an educated guess based on what you see in: rados listomapkeys --pool=con-fs2-meta1 1000eec35f5.0100 | head -n 100 Those should be the directory entries. (s/_head//) -- Dan On Tue, Aug 31, 2021 at 2:31 PM
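
If the directory still exists in the live filesystem, its full path can sometimes be recovered from the 'parent' backtrace xattr on the inode's first object; a hedged sketch, assuming the usual <inode-hex>.00000000 object naming in the metadata pool:

  rados -p con-fs2-meta1 getxattr <inode-hex>.00000000 parent > parent.bin
  ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json

The decoded backtrace lists the ancestor dentries, from which the path can be reassembled.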

[ceph-users] Re: MDS daemons stuck in resolve, please help

2021-08-31 Thread Dan van der Ster
Hi Frank, It helps if you start threads reminding us which version you're running. During nautilus the caps recall issue (which is AFAIK the main cause of mds cache overruns) should be solved with this PR: https://github.com/ceph/ceph/pull/39134/files If you're not running >= 14.2.17 then you
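
A quick way to confirm which fix level the daemons are actually running, plus the setting discussed in this thread (placeholder daemon name):

  ceph versions                                    # per-daemon-type version summary
  ceph config get mds.<name> mds_recall_max_caps   # currently configured value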

[ceph-users] Re: radosgw manual deployment

2021-08-31 Thread Eugen Block
How exactly did you create the rgw(s), realms, users etc.? I have a single node (Pacific) where connecting the dashboard worked just fine. Basically this is what I did: # create realm, zonegroup, zone radosgw-admin realm create --rgw-realm=pacific-realm --default radosgw-admin zonegroup create
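
A hedged sketch of how such a single-zone setup typically continues (the names are illustrative, following the pacific-realm naming above, and are not necessarily Eugen's exact commands):

  radosgw-admin zonegroup create --rgw-zonegroup=pacific-zonegroup --rgw-realm=pacific-realm --master --default
  radosgw-admin zone create --rgw-zonegroup=pacific-zonegroup --rgw-zone=pacific-zone --master --default
  radosgw-admin period update --commit
  ceph dashboard set-rgw-credentials    # Pacific: lets the dashboard create/discover its system user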

[ceph-users] Re: Very beginner question for cephadm: config file for bootstrap and osd_crush_chooseleaf_type

2021-08-31 Thread Ignacio García
Just for experimenting, which are those single-host defaults? Maybe these?: mon_allow_pool_size_one = 1 osd_pool_default_size = 1 Ignacio On 30/8/21 at 17:31, Sebastian Wagner wrote: Try running `cephadm bootstrap --single-host-defaults` On 20.08.21 at 18:23, Eugen Block wrote: Hi,
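
Rather than guessing, the options the flag actually sets can be read back from the cluster after bootstrapping (generic commands, not from this thread):

  cephadm bootstrap --single-host-defaults ...
  ceph config dump    # lists every option stored in the mon config database, with its source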

[ceph-users] podman daemons in error state - where to find logs?

2021-08-31 Thread Nigel Williams
Ubuntu 20.04.3, Octopus 15.2.13, cephadm + podman. After a routine reboot, all OSDs on a host did not come up. After a few iterations of cephadm deploy, and fixing the missing config file, the daemons remain in the error state, but neither journalctl nor systemctl shows any log errors other than exit
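
When the journal is empty, the container runtime itself sometimes still has the exit reason; a minimal sketch, with placeholder names:

  cephadm ls                  # per-daemon state as cephadm sees it (running/error, container id)
  podman ps -a | grep ceph    # exited containers and their exit codes
  podman logs <container-id>  # stdout/stderr captured by podman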

[ceph-users] Re: Dashboard no longer listening on all interfaces after upgrade to 16.2.5

2021-08-31 Thread Ernesto Puerta
Hi Oliver, This issue has already been discussed in this mailing list ([1] and [2]
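
For readers hitting the same issue: the bind address is controlled by a mgr option, and 0.0.0.0 restores listening on all interfaces; a hedged sketch:

  ceph config set mgr mgr/dashboard/server_addr 0.0.0.0
  ceph mgr module disable dashboard && ceph mgr module enable dashboard   # restart the module to pick up the change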

[ceph-users] Re: A practical approach to efficiently store 100 billions small objects in Ceph

2021-08-31 Thread Loïc Dachary
Hi, I incorrectly wrote that EOS[0] was using Ceph and RBD. I edited the object storage design[1] to remove that and avoid any confusion. Now I'll try to figure out what led me to make this significant mistake. My apologies for misrepresenting EOS. Cheers [0]

[ceph-users] Re: cephadm Pacific bootstrap hangs waiting for mon

2021-08-31 Thread Arnaud MARTEL
Hi Matthew, I don't know if it will be helpful but I had the same problem using Debian 10 and the solution was to install Docker from docker.io and not from the Debian package (too old). Arnaud - Original Message - From: "Matthew Pounsett" To: "ceph-users" Sent: Monday, 30 August 2021