If you are optimizing for IO, use Cassandra's JBOD configuration (list each disk under data_file_directories in cassandra.yaml). Cassandra will then place sstables on the least-used disk. If you want to optimize for disk space, I'd go with RAID0. You will probably want to tune concurrent readers/writers, stream throughput (if you have the network for it), and compaction throughput if you end up with IO to spare. I generally would not recommend putting multiple C* instances on a single box.
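
As a rough sketch of the JBOD setup and the knobs mentioned above: the mount points and numbers below are made-up placeholders, not recommendations, and the setting names are the ones I'd expect in a 2.x-era cassandra.yaml, so check them against the file shipped with your version.

```yaml
# cassandra.yaml (fragment). Paths and values are illustrative assumptions.

# JBOD: one entry per SSD; Cassandra spreads sstables across them.
data_file_directories:
    - /mnt/ssd1/cassandra/data   # hypothetical mount point
    - /mnt/ssd2/cassandra/data   # hypothetical mount point
    - /mnt/ssd3/cassandra/data   # hypothetical mount point

# Raise these only if the disks show IO to spare under load.
concurrent_reads: 64
concurrent_writes: 64

# Raise only if the network can carry the extra streaming traffic.
stream_throughput_outbound_megabits_per_sec: 400

# Compaction throttle in MB/s.
compaction_throughput_mb_per_sec: 64
```

Change one setting at a time and watch disk utilization before moving on to the next.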
---
Chris Lohfink

On Thu, Nov 6, 2014 at 5:13 PM, Kevin Burton <[email protected]> wrote:

> I'm curious what people are doing with multiple SSDs per server.
>
> I think there are two main paths:
>
> - RAID 0 them… the problem here is that RAID0 is not a panacea and the
>   drives may or may not see better IO throughput.
>
> - use N cassandra instances per box (or containers) and have one C* node
>   accessing each SSD. The upside here is that Cassandra sees the drive
>   directly. The downside is that you would probably have to cheat and tell
>   C* that all the containers on that box are on the same "rack" so C* doesn't
>   schedule two replicas on the same box.
>
> Thoughts?
>
> Kevin
>
> --
> Founder/CEO Spinn3r.com
> Location: San Francisco, CA
> blog: http://burtonator.wordpress.com
