Thanks all for your input. I'm aware of the overlap, I'm aware I need to turn Ceph replication off, and I'm aware this isn't ideal. Nonetheless, in one of my environments, instead of raw disk to install C* on, I'm likely to just have Ceph storage. This is a fully managed environment (except for C*) and that's their standard.
cheers
Colin

On 2 February 2015 at 14:42, Daniel Compton <daniel.compton.li...@gmail.com> wrote:

> As Jan has already mentioned, Ceph and Cassandra do almost all of the same
> things. "Replicated self healing data storage on commodity hardware without
> a SPOF" describes both of these systems. If you did manage to get it
> running it would be a nightmare to reason about what's happening at the
> disk and network level.
>
> You're going to get write amplification by your replication factor of both
> Cassandra, and Ceph unless you turn one of them down. This impacts disk
> I/O, disk space, CPU, and network bandwidth. If you turned down Ceph
> replication I think it would be possible for all of the replicated data for
> some chunk to be stored on one node and be at risk of loss. E.g. 1x Ceph,
> 3x Cassandra replication could store all 3 Cassandra replicas on the same
> Ceph node. 3x Ceph, 1x Cassandra would be safer, but presumably slower.
>
> Lastly Cassandra is designed around running against local disks, you will
> lose a lot of the advantages of this running it on Ceph.
>
> Daniel.
>
> On Mon, 2 Feb 2015 at 1:11 am Baskar Duraikannu <
> baskar.duraika...@outlook.com> wrote:
>
>> What is the reason for running Cassandra on Ceph? I have both running
>> in my environment but doing different things - Cassandra as transactional
>> store and Ceph as block storage for storing files.
>> ------------------------------
>> From: Jan <cne...@yahoo.com>
>> Sent: 2/1/2015 2:53 AM
>> To: user@cassandra.apache.org
>> Subject: Re: Cassandra on Ceph
>>
>> Colin;
>>
>> Ceph is a block based storage architecture based on RADOS.
>> It comes with its own replication & rebalancing along with a map of the
>> storage layer.
>>
>> Some more details & similarities:
>> a) Ceph stores a client’s data as objects within storage pools. (think
>> of C* partitions)
>> b) Using the CRUSH algorithm, Ceph calculates which placement group
>> should contain the object, (C* primary keys & vnode data distribution)
>> c) and further calculates which Ceph OSD Daemon should store the
>> placement group (C* node locality)
>> d) The CRUSH algorithm enables the Ceph Storage Cluster to scale,
>> rebalance, and recover dynamically (C* big table storage architecture).
>>
>> Summary:
>> C* comes with everything that Ceph provides (with the exception of block
>> storage).
>> There is no value add that Ceph brings to the table that C* does not
>> already provide.
>> I seriously doubt if C* could even work out of the box with yet another
>> level of replication & rebalancing.
>>
>> Hope this helps
>> Jan/
>>
>> C* Architect
>>
>> On Saturday, January 31, 2015 7:28 PM, Colin Taylor <
>> colin.tay...@gmail.com> wrote:
>>
>> I may be forced to run Cassandra on top of Ceph. Does anyone have
>> experience / tips with this. Or alternatively, strong reasons why this
>> won't work.
>>
>> cheers
>> Colin
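
For anyone reading the archive later, the write amplification Daniel describes in the quoted thread can be put into rough numbers. This is only a back-of-the-envelope sketch: the function name `physical_copies` is made up for illustration, and it assumes each Cassandra replica writes its data once to a Ceph-backed volume, which Ceph then copies according to its own replication setting. It ignores commit logs, compaction, and Ceph journaling, all of which would push the real figure higher.

```python
def physical_copies(cassandra_rf: int, ceph_replicas: int) -> int:
    """Rough count of on-disk copies of one logical write.

    Assumption (not from the thread): every Cassandra replica lands on a
    Ceph-backed volume, and Ceph replicates that volume's data
    `ceph_replicas` times. Commit logs, compaction, and journals are ignored.
    """
    if cassandra_rf < 1 or ceph_replicas < 1:
        raise ValueError("replication settings must be at least 1")
    return cassandra_rf * ceph_replicas


# The combinations mentioned in the thread, plus replicating at both layers.
for cassandra_rf, ceph_replicas in [(3, 3), (3, 1), (1, 3)]:
    copies = physical_copies(cassandra_rf, ceph_replicas)
    print(f"{cassandra_rf}x Cassandra on {ceph_replicas}x Ceph: {copies} copies")
```

Under these assumptions, replicating 3x at both layers stores 9 copies of every write, while turning either layer down to 1x brings that back to 3. The count says nothing about where those copies sit, which is the separate risk Daniel raises for the 1x Ceph, 3x Cassandra case.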