On Tue, Jul 14, 2015 at 10:53 AM, Jan Schermer <[email protected]> wrote:
> Thank you for your reply.
> Comments inline.
>
> I’m still hoping to get some more input, but there are many people running
> Ceph on ext4, and it sounds like it works pretty well out of the box.
> Maybe I’m overthinking this, then?
I think so — somebody did a lot of work making sure we were well-tuned on
the standard filesystems; I believe it was David.
-Greg

> Jan
>
>> On 13 Jul 2015, at 21:04, Somnath Roy <[email protected]> wrote:
>>
>> <<inline
>>
>> -----Original Message-----
>> From: ceph-users [mailto:[email protected]] On Behalf Of Jan Schermer
>> Sent: Monday, July 13, 2015 2:32 AM
>> To: [email protected]
>> Subject: Re: [ceph-users] xattrs vs omap
>>
>> Sorry for reviving an old thread, but could I get some input on this,
>> pretty please?
>>
>> ext4 has 256-byte inodes by default (at least according to the docs),
>> but the fragment below says:
>> OPTION(filestore_max_inline_xattr_size_other, OPT_U32, 512)
>>
>> A default of 512 bytes is too much if the inode is just 256 bytes, so
>> shouldn't it be 256 bytes for people who use the default ext4 inode size?
>>
>> Anyway, is it better to format ext4 with larger inodes (say 2048 bytes)
>> and set filestore_max_inline_xattr_size_other=1536, or to leave
>> everything at the defaults?
>> [Somnath] Why 1536? Why not 1024 or any other power of 2? I am not
>> seeing any harm, though; just curious.
>
> AFAIK the inode holds other information besides xattrs, and you also need
> to count the xattr names against this limit - so storing 1536 bytes of
> "values" costs more than 1536 bytes, and there still needs to be some
> space left over.
>
>> (As I understand it, xattrs on ext4 are limited to the space in the
>> inode plus one additional block - maybe someone knows better.)
>>
>> [Somnath] The xattr size ("_") is now more than 256 bytes, so it will
>> spill over; a bigger inode size will be good. But I would suggest you
>> benchmark before putting it into production.
>
> Good point, and I am going to do that, but I'd like to avoid the
> guesswork. Also, not all access patterns are easy to reproduce....
>
>> Is filestore_max_inline_xattr_size an absolute limit, or is the
>> effective limit really
>> filestore_max_inline_xattr_size * filestore_max_inline_xattrs?
>>
>> [Somnath] The *_size option limits the size of each individual xattr,
>> and *_inline_xattrs limits the number of inline attributes allowed. So
>> if an xattr is larger than *_size it goes to omap, and likewise if the
>> total number of xattrs exceeds *_inline_xattrs, they go to omap.
>> If you are only using rbd, the number of inline xattrs will always be 2,
>> so it will never cross that default limit.
>
> If I'm reading this correctly, then with my setting of
> filestore_max_inline_xattr_size_other=1536 the two xattrs could actually
> consume 3072 bytes, so I should really use 4K inodes...?
>
>> Does the OSD do the sane thing if for some reason the xattrs do not fit?
>> What are the performance implications of storing the xattrs in leveldb?
>>
>> [Somnath] I don't have the exact numbers, but there is a significant
>> overhead if the xattrs go to leveldb.
>>
>> And lastly - what size of xattrs should I really expect if all I use is
>> RBD for OpenStack instances? (No radosgw, no CephFS, but heavy on rbd
>> image and pool snapshots.) This overhead is quite large.
>>
>> [Somnath] It will be 2 xattrs. The default "_" will be a little bigger
>> than 256 bytes, and "_snapset" is small - it depends on the number of
>> snaps/clones, but it is unlikely to cross the 256-byte range.
>
> I have a few pool snapshots and lots (hundreds) of (nested) snapshots on
> rbd volumes. Does this come into play somehow?
>
>> My plan so far is to format the drives like this:
>> mkfs.ext4 -I 2048 -b 4096 -i 524288 -E stride=32,stripe-width=256
>> (2048-byte inodes, 4096-byte blocks, one inode per 512 KiB of space)
>> and set filestore_max_inline_xattr_size_other=1536.
>> [Somnath] Not much idea on ext4, sorry..
>>
>> Does that make sense?
>>
>> Thanks!
>>
>> Jan
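For concreteness, here is Jan's plan as a runnable sketch; /dev/sdX and the
mount point are placeholders, and the values are the ones under debate in
this thread, not recommendations:

    # Format with 2048-byte inodes, 4 KiB blocks, one inode per 512 KiB;
    # stride/stripe-width come from Jan's plan and depend on the RAID layout.
    mkfs.ext4 -I 2048 -b 4096 -i 524288 -E stride=32,stripe-width=256 /dev/sdX

    # Verify the inode size that actually took effect.
    tune2fs -l /dev/sdX | grep -i 'inode size'

    # user_xattr is typically a default on ext4, but being explicit is harmless.
    mount -o noatime,user_xattr /dev/sdX /var/lib/ceph/osd/ceph-0

    # The matching ceph.conf fragment:
    #   [osd]
    #   filestore_max_inline_xattr_size_other = 1536

Per Somnath's description above, two attributes of up to 1536 bytes each,
plus their names, can exceed a 2048-byte inode's xattr space, which seems to
be why Jan wonders about 4K inodes; whatever does not fit inline falls back
to omap rather than failing.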
>>
>>> On 02 Jul 2015, at 12:18, Jan Schermer <[email protected]> wrote:
>>>
>>> Does anyone have a known-good set of parameters for ext4? I want to try
>>> it as well, but I'm a bit worried about what happens if I get it wrong.
>>>
>>> Thanks
>>>
>>> Jan
>>>
>>>> On 02 Jul 2015, at 09:40, Nick Fisk <[email protected]> wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: ceph-users [mailto:[email protected]] On Behalf Of
>>>>> Christian Balzer
>>>>> Sent: 02 July 2015 02:23
>>>>> To: Ceph Users
>>>>> Subject: Re: [ceph-users] xattrs vs omap
>>>>>
>>>>> On Thu, 2 Jul 2015 00:36:18 +0000 Somnath Roy wrote:
>>>>>
>>>>>> It was replaced with the following config options:
>>>>>>
>>>>>> // Use omap for xattrs for attrs over
>>>>>> // filestore_max_inline_xattr_size or
>>>>>> OPTION(filestore_max_inline_xattr_size, OPT_U32, 0) // Override
>>>>>> OPTION(filestore_max_inline_xattr_size_xfs, OPT_U32, 65536)
>>>>>> OPTION(filestore_max_inline_xattr_size_btrfs, OPT_U32, 2048)
>>>>>> OPTION(filestore_max_inline_xattr_size_other, OPT_U32, 512)
>>>>>>
>>>>>> // for more than filestore_max_inline_xattrs attrs
>>>>>> OPTION(filestore_max_inline_xattrs, OPT_U32, 0) // Override
>>>>>> OPTION(filestore_max_inline_xattrs_xfs, OPT_U32, 10)
>>>>>> OPTION(filestore_max_inline_xattrs_btrfs, OPT_U32, 10)
>>>>>> OPTION(filestore_max_inline_xattrs_other, OPT_U32, 2)
>>>>>>
>>>>>> If these limits are crossed, the xattrs will be stored in omap.
>>>>>>
>>>>> Sounds fair.
>>>>>
>>>>> Since I only use RBD I don't think it will ever exceed this.
>>>>
>>>> Possibly - see my thread about the performance difference between new
>>>> and old pools. Still not quite sure what's going on, but for some
>>>> reason some of the objects behind RBDs have larger xattrs, which is
>>>> causing really poor performance.
>>>>
>>>>> Thanks,
>>>>>
>>>>> Chibi
>>>>>> For ext4, you can use either the filestore_max_*_other options or
>>>>>> filestore_max_inline_xattrs / filestore_max_inline_xattr_size. In
>>>>>> any case, the latter two override everything.
>>>>>>
>>>>>> Thanks & Regards
>>>>>> Somnath
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Christian Balzer [mailto:[email protected]]
>>>>>> Sent: Wednesday, July 01, 2015 5:26 PM
>>>>>> To: Ceph Users
>>>>>> Cc: Somnath Roy
>>>>>> Subject: Re: [ceph-users] xattrs vs omap
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> On Wed, 1 Jul 2015 15:24:13 +0000 Somnath Roy wrote:
>>>>>>
>>>>>>> It doesn't matter; I think filestore_xattr_use_omap is a no-op and
>>>>>>> not used in Hammer.
>>>>>>>
>>>>>> Then what was this functionality replaced with, especially
>>>>>> considering ext4-based OSDs?
>>>>>>
>>>>>> Chibi
>>>>>>> Thanks & Regards
>>>>>>> Somnath
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: ceph-users [mailto:[email protected]] On Behalf
>>>>>>> Of Adam Tygart
>>>>>>> Sent: Wednesday, July 01, 2015 8:20 AM
>>>>>>> To: Ceph Users
>>>>>>> Subject: [ceph-users] xattrs vs omap
>>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I've got a coworker who put "filestore_xattr_use_omap = true" in
>>>>>>> the ceph.conf when we first started building the cluster. Now he
>>>>>>> can't remember why. He thinks it may be a holdover from our first
>>>>>>> Ceph cluster (running dumpling on ext4, iirc).
>>>>>>>
>>>>>>> In the newly built cluster, we are using XFS with 2048-byte inodes,
>>>>>>> running Ceph 0.94.2. It currently has production data in it.
>>>>>>>
>>>>>>> From my reading of other threads, it looks like this is probably
>>>>>>> not something you want set to true (at least on XFS), due to
>>>>>>> performance implications. Is this something you can change on a
>>>>>>> running cluster? Is it worth the hassle?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Adam
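On Adam's question of changing this on a running cluster: since
filestore_xattr_use_omap is a no-op in Hammer (per Somnath above), dropping
it from ceph.conf should be safe by itself. For the options that do still
matter, a hedged sketch of reading and injecting values at runtime; osd.0 is
a placeholder, injected values do not survive a restart, and some filestore
options are only read at startup, so verify the effect before relying on it:

    # Read the live value via the OSD's admin socket (run on the OSD's host).
    ceph daemon osd.0 config get filestore_max_inline_xattr_size_other

    # Push a new value to all OSDs; mirror it in ceph.conf so it persists.
    ceph tell osd.* injectargs '--filestore_max_inline_xattr_size_other 1536'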
>>>>>
>>>>> --
>>>>> Christian Balzer        Network/Systems Engineer
>>>>> [email protected]        Global OnLine Japan/Fusion Communications
>>>>> http://www.gol.com/
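Finally, to tie the "_" / "_snapset" size discussion to something
measurable: the inline xattrs FileStore writes can be inspected directly on
an OSD's disk. The PG directory and object file name below are placeholders;
attributes that have spilled to omap live in leveldb and will not show up
here, which is itself a quick way to spot spilling:

    # Dump all xattr names and hex values (with sizes) on one object file.
    getfattr -d -m '.*' -e hex \
        /var/lib/ceph/osd/ceph-0/current/<pg>_head/<object-file>

On an RBD-only pool you should see something like user.ceph._ and
user.ceph._snapset; if _snapset grows with hundreds of nested snapshots, as
in Jan's case, it is worth measuring before settling on an inode size.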
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
