Hi Haomai,

Do you use the filestore_fiemap=true parameter on CentOS 7 with Hammer/Infernalis
in any production Ceph environment for RBD-style storage? Is it safe to use in
production?
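
For context, this is the option Haomai mentions below; a minimal sketch of how it
could be set in ceph.conf (the [osd] section is the usual place for it, and my note
about restarting OSDs is an assumption worth verifying for your release):

    [osd]
    # enable FIEMAP-based sparse copy in FileStore (off by default)
    # assumption: OSDs need a restart to pick this up
    filestore_fiemap = true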

Thanks
Özhan

On Wed, Nov 18, 2015 at 8:12 AM, Haomai Wang <[email protected]> wrote:

> Yes, it's an expected case. Actually, if you use Hammer, you can enable
> filestore_fiemap to use sparse copy, which is especially useful for rbd
> snapshot copy. But keep in mind that some old kernels are *broken* in
> fiemap. CentOS 7 is the only distro I have verified works fine with this feature.
>
>
> On Wed, Nov 18, 2015 at 12:25 PM, Will Bryant <[email protected]>
> wrote:
> > Hi,
> >
> > We’ve been running an all-SSD Ceph cluster for a few months now and are
> generally very happy with it.
> >
> > However, we’ve noticed that if we create a snapshot of an RBD device,
> writing to the RBD becomes massively slower than it was before we took the
> snapshot.  Similarly, we get poor performance if we make a clone of that
> snapshot and write to it.
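
To make the scenario concrete, the sequence described above is essentially the
following; the pool and image names here are placeholders, not from the original
report:

    # snapshot an existing image, then clone it
    # (the parent image must be a format 2 image for cloning)
    rbd snap create rbd/vol1@snap1
    rbd snap protect rbd/vol1@snap1   # a snapshot must be protected before cloning
    rbd clone rbd/vol1@snap1 rbd/vol1-clone
    # writes to rbd/vol1 (or to the clone) now go through the copy-on-write path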
> >
> > For example, using fio to run a 2-worker 4 KB synchronous random write
> benchmark, we normally get about 5000 IOPS to RBD on our test-sized cluster
> (Intel 3710, 10G networking, Ubuntu 14.04).  But as soon as I take a
> snapshot, this drops to about 100 IOPS, with high variability - at
> times 0 IOPS, 60 IOPS, or 300 IOPS.
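
The exact fio job file isn't given in the thread; a minimal sketch matching the
description (2 workers, 4 KB synchronous random writes, fio's rbd engine, with
placeholder pool/image/client names) might look like:

    [global]
    ; one outstanding IO per job (iodepth=1) to approximate synchronous writes
    ; pool/image/client names below are placeholders
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=testimg
    rw=randwrite
    bs=4k
    iodepth=1
    numjobs=2
    runtime=60
    time_based
    group_reporting

    [randwrite-4k]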
> >
> > I realise that after a snapshot, any write will trigger a copy of the
> block, which by default would be 4 MB of data.  To minimize this effect
> I’ve reduced the RBD order to 18, i.e. 256 KB blocks.
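
For what it's worth, the object size (order) can only be chosen at image creation
time; a sketch with placeholder names, where order 18 means 2^18 = 256 KB objects:

    # create an image with 256 KB objects instead of the default 4 MB (order 22)
    rbd create rbd/vol-order18 --size 10240 --order 18 --image-format 2
    rbd info rbd/vol-order18    # should report order 18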
> >
> > But shouldn’t that effect only degrade performance to the level we
> get on a completely new RBD image that has no snapshots and no data?  For
> us that is more like 1000-1500 IOPS, i.e. still at least 10x better than the
> performance we get after a snapshot is taken.
> >
> > Is there something particularly inefficient about the copy-on-write
> block implementation that makes it much worse than writing to fresh
> blocks?  Note that we get this performance drop even if the other data on
> the blocks is cached in memory, and since we’re using fast SSDs, the time
> to read in the rest of the 256 KB should be negligible.
> >
> > We’re currently using Hammer but we also tested with Infernalis and it
> didn’t seem any better.
> >
> > Cheers,
> > Will
>
>
>
> --
> Best Regards,
>
> Wheat
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
