On Wed, Oct 10, 2018 at 8:48 PM Kjetil Joergensen <[email protected]> wrote:
>
> Hi,
>
> We tested bcache, dm-cache/lvmcache, and one more whose name eludes me, with 
> PCIe NVMe on top of large spinning rust drives behind a SAS3 expander - and 
> decided this was not for us.
>
> This was probably Jewel with filestore, and our primary reason for trying to 
> go down this path was that leveldb compaction was killing us, and putting 
> omap/leveldb and the like on separate locations was only "so-so" supported (IIRC: 
> some were explicitly supported, some you could do with a bit of symlink or mount 
> trickery).
>
> The caching worked - although, when we started doing power-failure 
> survivability testing (power cycle the entire rig, wait for recovery, repeat), 
> we ended up with seriously corrupted XFS filesystems on top of the cached 
> block device within a handful of power cycles. We did not test fully 
> disabling the spinning rust's on-device cache (which was the leading 
> hypothesis for why this actually failed, potentially combined with the 
> ordering of FLUSH+FUA ending up slightly funky given the rather asymmetric 
> commit latencies). Just to rule out anything else, we ran the same power-fail 
> test regimen for days without the nvme-over-spinning-rust caching, without 
> triggering the same filesystem corruption.
>
> So yeah - I'd recommend looking at e.g. bluestore and sticking rocksdb, the 
> journal, and anything else performance-critical on faster storage instead.
>
> If you do decide to go down the dm-cache/lvmcache/(other cache) road - I'd 
> recommend thoroughly testing failure scenarios like power loss, so you don't 
> find out accidentally when you have a multi-failure-domain outage. :)

Yeah, definitely do a lot of pulling disks and power cycle testing.
dm-cache had a data-corruption-on-power-loss bug in 4.9+:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b1fe7bec8a8d0cc547a22e7ddc2bd59acd67de4
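One variable worth controlling in that kind of power-fail rig is the drives'
own volatile write cache, since that was Kjetil's leading hypothesis above. A
minimal sketch for checking and disabling it - device names like /dev/sdb are
placeholders, and this assumes hdparm and sdparm are installed:

```shell
# SATA drives: query the current write-cache setting
hdparm -W /dev/sdb

# SATA drives: disable the on-device volatile write cache
hdparm -W 0 /dev/sdb

# SAS drives (e.g. behind a SAS3 expander): use sdparm instead
sdparm --get WCE /dev/sdb          # read the write-cache-enable bit
sdparm --set WCE=0 --save /dev/sdb # clear it, and persist across power cycles
```

Note that `hdparm -W 0` does not survive a power cycle on all drives, so
re-checking the setting after each cycle of the test regimen is prudent.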

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com