Hi Yuan,

Thanks for sharing the link, it is an interesting read. My understanding of the test results is that with a fixed total xattr size, using a smaller stripe size incurs larger read latency, which kind of makes sense: there are more k-v pairs to fetch, and at that total size the xattrs have to spill out to extents anyway.
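As a rough back-of-the-envelope sketch of that effect (the 64 KiB payload and the candidate stripe sizes below are made-up numbers for illustration, not figures from the linked tests), more chunks per object means more getxattr calls per read:

    import math

    payload = 64 * 1024  # hypothetical fixed total xattr payload
    for stripe in (255, 4096, 65535):
        pairs = math.ceil(payload / stripe)
        print(f"stripe={stripe:>5}B -> {pairs:>4} k-v pairs (one getxattr each on read)")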
Correct me if I am wrong here...

Thanks,
Guang

> From: [email protected]
> To: [email protected]; [email protected]
> CC: [email protected]; [email protected]
> Subject: RE: xattrs vs. omap with radosgw
> Date: Wed, 17 Jun 2015 01:32:35 +0000
>
> FWIW, there was some discussion in OpenStack Swift and their performance
> tests showed 255 is not the best in recent XFS. They decided to use a large
> xattr boundary size (65535).
>
> https://gist.github.com/smerritt/5e7e650abaa20599ff34
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Sage Weil
> Sent: Wednesday, June 17, 2015 3:43 AM
> To: GuangYang
> Cc: [email protected]; [email protected]
> Subject: Re: xattrs vs. omap with radosgw
>
> On Tue, 16 Jun 2015, GuangYang wrote:
>> Hi Cephers,
>> While looking at disk utilization on OSD, I noticed the disk was constantly
>> busy with a large number of small writes. Further investigation showed that,
>> as radosgw uses xattrs to store metadata (e.g. etag, content-type, etc.),
>> the xattrs were pushed from local (inline) storage out to extents, which
>> incurred extra I/O.
>>
>> I would like to check if anybody has experience with offloading the metadata
>> to omap:
>> 1> Offload everything to omap? If this is the case, should we make the
>> inode size 512 (instead of 2k)?
>> 2> Partially offload the metadata to omap, e.g. only offloading the
>> rgw-specific metadata to omap.
>>
>> Any sharing is deeply appreciated. Thanks!
>
> Hi Guang,
>
> Is this hammer or firefly?
>
> With hammer the size of object_info_t crossed the 255 byte boundary, which is
> the max xattr value that XFS can inline. We've since merged something that
> stripes over several small xattrs so that we can keep things inline, but it
> hasn't been backported to hammer yet. See
> c6cdb4081e366f471b372102905a1192910ab2da. Perhaps this is what you're seeing?
>
> I think we're still better off with larger XFS inodes and inline xattrs if it
> means we avoid leveldb at all for most objects.
>
> sage
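For reference, here is a minimal sketch of the striping idea Sage refers to (commit c6cdb408): split a value that exceeds XFS's ~255-byte inline limit into chunks small enough to stay inline in the inode. The "name@N" spillover-key scheme is an assumption made for illustration; the chunk naming in the actual commit may differ. On Linux, unprivileged xattr names need a namespace prefix such as "user.".

    import os

    CHUNK = 254  # keep each xattr value under XFS's ~255-byte inline limit

    def set_striped_xattr(path, name, value):
        # Split the value into inline-sized chunks; spillover chunks go
        # under suffixed keys ("name@1", "name@2", ...). A real
        # implementation would also remove stale spillover keys.
        chunks = [value[i:i + CHUNK] for i in range(0, len(value), CHUNK)] or [b""]
        os.setxattr(path, name, chunks[0])
        for i, chunk in enumerate(chunks[1:], start=1):
            os.setxattr(path, f"{name}@{i}", chunk)

    def get_striped_xattr(path, name):
        # Reassemble by reading the base key, then each spillover key in order.
        value = os.getxattr(path, name)
        names = set(os.listxattr(path))
        i = 1
        while f"{name}@{i}" in names:
            value += os.getxattr(path, f"{name}@{i}")
            i += 1
        return value

    # e.g. set_striped_xattr("/mnt/xfs/obj", "user.ceph._", b"\x00" * 1000)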
