Colin, I'm not familiar with Ceph, but it sounds like it's a more sophisticated version of a SAN.
Be aware that running Cassandra on absolutely anything other than local disks is an anti-pattern. It will have a profound negative impact on the performance, scalability, and reliability of your cluster.

On Sun, Feb 1, 2015 at 8:13 PM, Colin Taylor <colin.tay...@gmail.com> wrote:

> Oops - Nonetheless in on my environments -> Nonetheless in *one of* my
> environments
>
> On 2 February 2015 at 16:12, Colin Taylor <colin.tay...@gmail.com> wrote:
>
>> Thanks all for your input.
>>
>> I'm aware of the overlap, I'm aware I need to turn Ceph replication off,
>> I'm aware this isn't ideal. Nonetheless in on my environments instead of
>> raw disk to install C* on, I'm likely to just have Ceph storage. This is a
>> fully managed environment (excepting for C*) and that's their standard.
>>
>> cheers
>> Colin
>>
>> On 2 February 2015 at 14:42, Daniel Compton <
>> daniel.compton.li...@gmail.com> wrote:
>>
>>> As Jan has already mentioned, Ceph and Cassandra do almost all of the
>>> same things. "Replicated self healing data storage on commodity hardware
>>> without a SPOF" describes both of these systems. If you did manage to get
>>> it running it would be a nightmare to reason about what's happening at the
>>> disk and network level.
>>>
>>> You're going to get write amplification by the replication factor of
>>> both Cassandra and Ceph unless you turn one of them down. This impacts
>>> disk I/O, disk space, CPU, and network bandwidth. If you turned down Ceph
>>> replication I think it would be possible for all of the replicated data for
>>> some chunk to be stored on one node and be at risk of loss. E.g. 1x Ceph,
>>> 3x Cassandra replication could store all 3 Cassandra replicas on the same
>>> Ceph node. 3x Ceph, 1x Cassandra would be safer, but presumably slower.
>>>
>>> Lastly, Cassandra is designed around running against local disks; you
>>> will lose a lot of the advantages of this by running it on Ceph.
>>>
>>> Daniel.
>>>
>>> On Mon, 2 Feb 2015 at 1:11 am Baskar Duraikannu <
>>> baskar.duraika...@outlook.com> wrote:
>>>
>>>> What is the reason for running Cassandra on Ceph? I have both running
>>>> in my environment but doing different things - Cassandra as transactional
>>>> store and Ceph as block storage for storing files.
>>>> ------------------------------
>>>> From: Jan <cne...@yahoo.com>
>>>> Sent: 2/1/2015 2:53 AM
>>>> To: user@cassandra.apache.org
>>>> Subject: Re: Cassandra on Ceph
>>>>
>>>> Colin;
>>>>
>>>> Ceph is a block based storage architecture based on RADOS.
>>>> It comes with its own replication & rebalancing along with a map of the
>>>> storage layer.
>>>>
>>>> Some more details & similarities:
>>>> a) Ceph stores a client’s data as objects within storage pools.
>>>> (think of C* partitions)
>>>> b) Using the CRUSH algorithm, Ceph calculates which placement group
>>>> should contain the object, (C* primary keys & vnode data distribution)
>>>> c) and further calculates which Ceph OSD Daemon should store the
>>>> placement group (C* node locality)
>>>> d) The CRUSH algorithm enables the Ceph Storage Cluster to scale,
>>>> rebalance, and recover dynamically (C* big table storage architecture).
>>>>
>>>> Summary:
>>>> C* comes with everything that Ceph provides (with the exception of
>>>> block storage).
>>>> There is no value add that Ceph brings to the table that C* does not
>>>> already provide.
>>>> I seriously doubt if C* could even work out of the box with yet
>>>> another level of replication & rebalancing.
>>>>
>>>> Hope this helps
>>>> Jan/
>>>>
>>>> C* Architect
>>>>
>>>>
>>>> On Saturday, January 31, 2015 7:28 PM, Colin Taylor <
>>>> colin.tay...@gmail.com> wrote:
>>>>
>>>>
>>>> I may be forced to run Cassandra on top of Ceph. Does anyone have
>>>> experience / tips with this. Or alternatively, strong reasons why this
>>>> won't work.
>>>>
>>>> cheers
>>>> Colin
>>>>
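
To put rough numbers on Daniel's write-amplification point quoted above, here is a minimal sketch. It assumes every Cassandra replica is written to a Ceph pool with a fixed replica count, and it ignores compaction, commit logs, and Ceph's own journaling, so real overhead would be higher. The function names are made up for illustration and are not part of either project.

```python
def physical_copies(cassandra_rf: int, ceph_replicas: int) -> int:
    """Copies of each logical write that end up on physical disks.

    Assumption: the two replication layers stack multiplicatively, because
    Ceph replicates every block Cassandra writes without knowing that the
    block is already one of several Cassandra replicas.
    """
    return cassandra_rf * ceph_replicas


def raw_storage_gb(logical_gb: float, cassandra_rf: int, ceph_replicas: int) -> float:
    """Raw disk consumed for a given amount of logical data, same assumption."""
    return logical_gb * physical_copies(cassandra_rf, ceph_replicas)


# The three combinations discussed in the thread.
for rf, ceph in [(3, 3), (3, 1), (1, 3)]:
    print(
        f"Cassandra RF={rf}, Ceph replicas={ceph}: "
        f"{physical_copies(rf, ceph)} physical copies, "
        f"{raw_storage_gb(100, rf, ceph):.0f} GB raw per 100 GB logical"
    )
```

Note that the 3x/1x and 1x/3x setups cost the same on disk; they differ in who decides where the copies go, which is the failure-domain concern Daniel raises.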