On 10/10/2012 09:23 AM, Sage Weil wrote:
On Wed, 10 Oct 2012, Andrey Korolyov wrote:
Hi,

Recent tests on my test rack with a 20G IB interconnect (IPoIB, 64k MTU,
default CUBIC, CFQ, LSI SAS 2108 w/ WB cache) show quite fantastic
performance - on both reads and writes Ceph completely utilizes the disk
bandwidth, reaching as high as 0.9 of the theoretical limit (the sum of
all disk bandwidths, taking the replication level into account). The only
thing that brings down overall performance is O_SYNC|O_DIRECT writes,
which almost every database server will issue in its default setup.
Assuming that the database config may be untouchable, and that I can
somehow build a very reliable hardware setup which will never lose power,
should Ceph have an option to ignore these flags? Maybe there are other
real-world cases for including such an option, or maybe I am very wrong
to even think about fooling the client application in this way.
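(For reference, a minimal sketch of the kind of O_SYNC|O_DIRECT write such
a database issues; the file name and block size are just placeholders:)

  #define _GNU_SOURCE              /* for O_DIRECT on Linux/glibc */
  #include <fcntl.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
      /* O_DIRECT requires block-aligned buffers; 4096 covers most devices. */
      void *buf;
      if (posix_memalign(&buf, 4096, 4096) != 0)
          return 1;
      memset(buf, 'x', 4096);

      /* Bypass the page cache and do not return until the device reports
       * the data stable -- each write becomes a synchronous round trip. */
      int fd = open("dbfile", O_WRONLY | O_CREAT | O_DIRECT | O_SYNC, 0644);
      if (fd < 0)
          return 1;
      if (write(fd, buf, 4096) != 4096)
          return 1;
      close(fd);
      free(buf);
      return 0;
  }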

I certainly wouldn't recommend it, but there are probably use cases where
it makes sense (i.e., the data isn't as important as the performance).
Any such option would probably be called

  rbd async flush danger danger = true

and would trigger a flush but not wait for it, or perhaps

  rbd ignore flush danger danger = true

which would not honor flush at all.
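(If such a knob were added it would presumably be a client-side setting; a
hypothetical ceph.conf sketch - neither option name above is real, this only
shows where such a setting would live:)

  [client]
      # hypothetical option, shown only for illustration
      rbd ignore flush danger danger = true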

qemu already has a cache=unsafe option which does exactly that.
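(For example, for an RBD-backed drive - pool, image, and user names below
are placeholders:)

  qemu-system-x86_64 -m 1024 \
    -drive file=rbd:rbd/vm-disk:id=admin,format=raw,if=virtio,cache=unsafe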

This would jeopardize the integrity of any file system living on the RBD
image; file systems rely on flush to order their commits, and playing fast
and loose with that can lead to any number of corruptions.  The only silver
lining is that in the not-so-distant past (3-4 years ago) this was poorly
supported by the block layer and file systems alike, and ext3 didn't crash
and burn quite as often as you might have expected.
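(As a rough illustration - the function and record names are made up - the
ordering a journaling file system depends on looks like:)

  #define _XOPEN_SOURCE 500
  #include <unistd.h>

  /* Write the journal blocks, flush so they are durable, then write the
   * commit record and flush again.  If the flushes are silently dropped,
   * the commit record can reach the disk before the journal blocks it
   * describes, and a crash leaves a corrupt journal. */
  void commit_transaction(int fd, const char *journal, size_t jlen,
                          const char *commit_rec, size_t clen, off_t off)
  {
      pwrite(fd, journal, jlen, off);
      fsync(fd);                         /* flush #1: journal is durable */
      pwrite(fd, commit_rec, clen, off + (off_t)jlen);
      fsync(fd);                         /* flush #2: commit is durable  */
  }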

Anyway, not something I would recommend, certainly not for a generic VM
platform.  Maybe if you have a specific performance-sensitive application
you can afford to let crash and burn...

sage
