Hi Haomai, do you use the filestore_fiemap=true parameter on CentOS 7 with Hammer/Infernalis in any production Ceph environment for RBD-style storage? Is it safe to use in a production environment?
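
To be clear, what I'd be setting is the filestore option in ceph.conf, roughly like this (just a sketch of the setting itself, not a tested production config):

    [osd]
    # use FIEMAP to do sparse copies in the filestore
    # (disabled by default; OSDs need a restart to pick it up)
    filestore fiemap = true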
Thanks,
Özhan

On Wed, Nov 18, 2015 at 8:12 AM, Haomai Wang <[email protected]> wrote:
> Yes, it's an expected case. Actually, if you use Hammer, you can enable
> filestore_fiemap to use sparse copy, which is especially useful for rbd
> snapshot copy. But keep in mind that some old kernels are *broken* in
> fiemap. CentOS 7 is the only distro I verified as working fine with this
> feature.
>
>
> On Wed, Nov 18, 2015 at 12:25 PM, Will Bryant <[email protected]> wrote:
> > Hi,
> >
> > We’ve been running an all-SSD Ceph cluster for a few months now and are
> > generally very happy with it.
> >
> > However, we’ve noticed that if we create a snapshot of an RBD device,
> > then writing to the RBD goes massively slower than before we took the
> > snapshot. Similarly, we get poor performance if we make a clone of that
> > snapshot and write to it.
> >
> > For example, using fio to run a 2-worker 4 KB synchronous random write
> > benchmark, we normally get about 5000 IOPS to RBD on our test-sized
> > cluster (Intel 3710, 10G networking, Ubuntu 14.04). But as soon as I take
> > a snapshot, this goes down to about 100 IOPS, and with high variability -
> > at times 0 IOPS, 60 IOPS, or 300 IOPS.
> >
> > I realise that after a snapshot, any write will trigger a copy of the
> > block, which by default would be 4 MB of data - to minimize this effect
> > I’ve reduced the RBD order to 18, i.e. 256 KB blocks.
> >
> > But shouldn’t that effect only degrade it to the same performance as we
> > get on a completely new RBD image that has no snapshots and no data? For
> > us that is more like 1000-1500 IOPS, i.e. still at least 10x better than
> > the performance we get after a snapshot is taken.
> >
> > Is there something particularly inefficient about the copy-on-write
> > block implementation that makes it much worse than writing to fresh
> > blocks? Note that we get this performance drop even if the other data on
> > the blocks is cached in memory, and since we’re using fast SSDs, the time
> > to read in the rest of the 256 KB should be negligible.
> >
> > We’re currently using Hammer, but we also tested with Infernalis and it
> > didn’t seem any better.
> >
> > Cheers,
> > Will
>
> --
> Best Regards,
>
> Wheat
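
For anyone who wants to reproduce the slowdown Will describes before turning on fiemap in production, this is roughly the kind of test I have in mind (the pool/image names and the exact fio options are my own guesses, not what Will actually ran):

    # 256 KB objects (order 18), as in Will's setup; pool/image names are arbitrary
    rbd create rbd/fiemap-test --size 10240 --order 18
    rbd map rbd/fiemap-test        # prints the device, e.g. /dev/rbd0

    # baseline: 2 jobs of 4 KB synchronous random writes
    fio --name=randwrite --filename=/dev/rbd0 --rw=randwrite --bs=4k \
        --numjobs=2 --iodepth=1 --direct=1 --sync=1 --runtime=60 \
        --time_based --group_reporting

    # take a snapshot, then re-run the same fio job to see the drop
    rbd snap create rbd/fiemap-test@before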
