Re: [ceph-users] MDS crash - FAILED assert(omap_num_objs <= MAX_OBJECTS)

2019-12-05 Thread Stefan Kooman
Hi, Quoting Yan, Zheng (uker...@gmail.com): > Please check if https://github.com/ceph/ceph/pull/32020 works Thanks! 13.2.6 with this patch is running in production now. We will continue the cleanup process that *might* have triggered this tomorrow morning. Gr. Stefan -- | BIT BV

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Joe Comeau
Just a note that we use SUSE for our Ceph/VMware system. These are the general Ceph docs for VMware/iSCSI: https://docs.ceph.com/docs/master/rbd/iscsi-initiator-esx/ and these are the SUSE docs: https://documentation.suse.com/ses/6/html/ses-all/cha-ceph-iscsi.html They differ; I'll tell you what

Re: [ceph-users] 2 different ceph-users lists?

2019-12-05 Thread Rodrigo Severo - Fábrica
On Thu, Dec 5, 2019 at 16:38, Marc Roos wrote: > ceph-users@lists.ceph.com is the old one; why this is, I also do not know. Ok Marc. Thanks for the information. Rodrigo

Re: [ceph-users] 2 different ceph-users lists?

2019-12-05 Thread Marc Roos
ceph-users@lists.ceph.com is the old one; why this is, I also do not know. https://www.mail-archive.com/search?l=all&q=ceph -Original Message- From: Rodrigo Severo - Fábrica [mailto:rodr...@fabricadeideias.com] Sent: Thursday, 5 December 2019 20:37 To: ceph-users@lists.ceph.com;

[ceph-users] 2 different ceph-users lists?

2019-12-05 Thread Rodrigo Severo - Fábrica
Hi, Are there 2 different ceph-users lists? ceph-users@lists.ceph.com and ceph-us...@ceph.io Why? What's the difference? Regards, Rodrigo Severo

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread DHilsbos
Patrick; I agree with Ranjan, though not in the particulars. The issue is that "oversized" is ambiguous, though "undersized" is also ambiguous. I personally prefer unambiguous error messages which also suggest solutions, like: "1 MDSs reporting cache exceeds 'mds cache memory limit' of: ." My

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Philip Brown
Okay then... how DO you load balance across Ceph iSCSI gateways? You said "check the docs", but as far as I can tell, that info isn't in there. Or at least not in the logical place, such as the iSCSI gateway setup pages, under https://docs.ceph.com/docs/master/rbd/iscsi-targets/ - Original

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Patrick Donnelly
On Thu, Dec 5, 2019 at 9:45 AM Ranjan Ghosh wrote: > Ah, that seems to have fixed it. Hope it stays that way. I've raised it > to 4 GB. Thanks to you both! Just be aware the warning could come back. You just moved the goal posts. The 1GB default is probably too low for most deployments, I have

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Paul Emmerich
No, you obviously don't need multiple pools for load balancing. -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Thu, Dec 5, 2019 at 6:46 PM Philip Brown wrote: > Hmm...

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Ranjan Ghosh
Ah, I understand now. Makes a lot of sense. Well, we have a LOT of small files so that might be the reason. I'll keep an eye on whether the message shows up again. Thank you! Ranjan On 05.12.19 at 19:40, Patrick Donnelly wrote: > On Thu, Dec 5, 2019 at 9:45 AM Ranjan Ghosh wrote: >> Ah,

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Philip Brown
Hmm... I reread the docs in and around https://docs.ceph.com/docs/master/rbd/iscsi-targets/ and it mentions iSCSI multipathing through multiple Ceph storage gateways... but it doesn't seem to say anything about needing multiple POOLS. When you wrote, "1 pool per storage class

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Ranjan Ghosh
Hi, Ah, that seems to have fixed it. Hope it stays that way. I've raised it to 4 GB. Thanks to you both! Although I have to say that the message is IMHO *very* misleading: "1 MDSs report oversized cache" sounds to me like the cache is too large (i.e. wasting RAM unnecessarily). Shouldn't the

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Paul Emmerich
ceph-iscsi doesn't support round-robin multi-pathing, so you need at least one LUN per gateway to utilize all of them. Please see https://docs.ceph.com for basics about RBDs and pools. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH
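
A minimal sketch of that approach, with the pool name, image names and sizes purely illustrative: create one RBD image per gateway and export each as its own LUN (e.g. through gwcli), so the active/optimized path for one LUN can sit on one gateway and the active path for the other LUN on the second gateway.

# rbd create --size 2T rbd/esx-lun0    # backing image for LUN 0
# rbd create --size 2T rbd/esx-lun1    # backing image for LUN 1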

Re: [ceph-users] best pool usage for vmware backing

2019-12-05 Thread Philip Brown
Interesting. I thought when you defined a pool, and then defined an RBD within that pool... that any auto-replication stayed within that pool? So what kind of "load balancing" do you mean? I'm confused. - Original Message - From: "Paul Emmerich" To: "Philip Brown" Cc: "ceph-users"

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Nathan Fish
MDS cache size scales with the number of files recently opened by clients. If you have RAM to spare, increase "mds cache memory limit". I have raised mine from the default of 1GiB to 32GiB. My rough estimate is 2.5kiB per inode in recent use. On Thu, Dec 5, 2019 at 10:39 AM Ranjan Ghosh wrote:
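
A sketch of what Nathan describes; the 32 GiB value (expressed in bytes) and the daemon name mds.a are illustrative only:

# ceph config set mds mds_cache_memory_limit 34359738368    # persist the new limit in the cluster configuration database
# ceph daemon mds.a config set mds_cache_memory_limit 34359738368    # apply it to a running MDS without a restart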

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Eugen Block
Hi, can you provide more details? ceph daemon mds. cache status ceph config show mds. | grep mds_cache_memory_limit Regards, Eugen Quoting Ranjan Ghosh: Okay, now, after I settled the issue with the oneshot service thanks to the amazing help of Paul and Richard (thanks again!), I still
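
With a hypothetical daemon name of mds.a, the two checks Eugen asks for would look like:

# ceph daemon mds.a cache status
# ceph config show mds.a | grep mds_cache_memory_limit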

Re: [ceph-users] HEALTH_WARN 1 MDSs report oversized cache

2019-12-05 Thread Ranjan Ghosh
Okay, now, after I settled the issue with the oneshot service thanks to the amazing help of Paul and Richard (thanks again!), I still wonder: What could I do about that MDS warning: === health: HEALTH_WARN 1 MDSs report oversized cache === If anybody has any ideas? I tried googling it, of

Re: [ceph-users] What does the ceph-volume@simple-crazyhexstuff SystemD service do? And what to do about oversized MDS cache?

2019-12-05 Thread Ranjan Ghosh
Hi Richard, Ah, I think I understand now, brilliant. It's *supposed* to do exactly that: mount it once on boot and then just exit. So everything is working as intended. Great. Thanks Ranjan On 05.12.19 at 15:18, Richard wrote: > On 2019-12-05 7:19 AM, Ranjan Ghosh wrote: >> Why is my

Re: [ceph-users] Shall host weight auto reduce on hdd failure?

2019-12-05 Thread Milan Kupcevic
On 2019-12-05 02:33, Janne Johansson wrote: > On Thu, 5 Dec 2019 at 00:28, Milan Kupcevic (milan_kupce...@harvard.edu) wrote: >> There is plenty of space to take more than a few failed nodes. But the >> question was about what is going on inside a node with a few failed >>

Re: [ceph-users] What does the ceph-volume@simple-crazyhexstuff SystemD service do? And what to do about oversized MDS cache?

2019-12-05 Thread Ranjan Ghosh
Hi Paul, thanks for the explanation. I didn't know about the JSON file yet. That's certainly good to know. What I still don't understand, though: Why is my service marked inactive/dead? Shouldn't it be running? If I run: systemctl start

Re: [ceph-users] What does the ceph-volume@simple-crazyhexstuff SystemD service do? And what to do about oversized MDS cache?

2019-12-05 Thread Paul Emmerich
The ceph-volume services make sure that the right partitions are mounted at /var/lib/ceph/osd/ceph-X. In "simple" mode the service gets the necessary information from a JSON file (long-hex-string.json) in /etc/ceph. ceph-volume simple scan/activate creates the JSON file and systemd unit. ceph-disk
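
Concretely, the simple-mode workflow boils down to the two commands Ranjan mentions elsewhere in this thread; /dev/sdb1 is just an example device:

# ceph-volume simple scan /dev/sdb1    # captures the OSD metadata into the JSON file Paul describes
# ceph-volume simple activate --all    # creates the systemd units and mounts the OSD data partitions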

[ceph-users] What does the ceph-volume@simple-crazyhexstuff SystemD service do? And what to do about oversized MDS cache?

2019-12-05 Thread Ranjan Ghosh
Hi all, After upgrading to Ubuntu 19.10 and consequently from Mimic to Nautilus, I had a mini-shock when my OSDs didn't come up. Okay, I should have read the docs more closely; I had to do: # ceph-volume simple scan /dev/sdb1 # ceph-volume simple activate --all Hooray. The OSDs came back to

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-12-05 Thread Florian Haas
On 02/12/2019 16:48, Florian Haas wrote: > Doc patch PR is here, for anyone who feels inclined to review: > > https://github.com/ceph/ceph/pull/31893 Landed, here's the new documentation: https://docs.ceph.com/docs/master/rbd/rbd-exclusive-locks/ Thanks everyone for chiming in, and

Re: [ceph-users] Is a scrub error (read_error) on a primary osd safe to repair?

2019-12-05 Thread Caspar Smit
Konstantin, Thanks for your answer, I will run a ceph pg repair. Could you maybe elaborate in general on how this repair process works? Does it just try to re-read from the OSD with the read_error? IIRC there was a time when a ceph pg repair wasn't considered 'safe' because it just copied the primary OSD shard
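
For context, a minimal sketch of the usual sequence, with a hypothetical placement group ID of 2.5: find the inconsistent PG, inspect which shard reported the read_error, then trigger the repair.

# ceph health detail    # lists the PGs currently flagged inconsistent
# rados list-inconsistent-obj 2.5 --format=json-pretty    # shows which OSD/shard reported the error
# ceph pg repair 2.5    # instructs the primary OSD to reconcile the copies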