Re: Multiple large disks in server - setup considerations

Erik Forsberg Tue, 07 Jun 2011 04:35:00 -0700

On Tue, 31 May 2011 13:23:36 -0500
Jonathan Ellis <jbel...@gmail.com> wrote:

> Have you read http://wiki.apache.org/cassandra/CassandraHardware ?

I had, but it was a while ago so I guess I kind of deserved an RTFM! :-)

After re-reading it, I still want to know:

* If we disregard the performance hit caused by having the commitlog on
  the same physical device as parts of the data, are there any other
  grave effects on Cassandra's functionality with a setup like that?

* How does Cassandra handle a case where one of the disks in a striped
  RAID0 partition goes bad and is replaced? Is the only option to wipe
  everything from that node and reinit the node, or will it handle
  corrupt files? I.e., what's the recommended thing to do from an
  operations point of view when a disk dies on one of the nodes in a
  RAID0 Cassandra setup? What will cause the least risk of data loss?
  What will be the fastest way to get the node up to speed with the
  rest of the cluster?
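
To make the free-space question from my original mail (option 2, quoted
below) concrete, here is a rough Python sketch of the check I have in
mind. It assumes my guess is right that the output of a compaction has
to fit on one single data directory, which is exactly the thing I'd
like confirmed. The function names are made up for illustration and are
not part of Cassandra.

```python
import shutil


def largest_free_bytes(data_dirs, usage=shutil.disk_usage):
    """Return (directory, free_bytes) for the data directory with most free space."""
    # usage(d) must return an object with a .free attribute (bytes free on d).
    free = {d: usage(d).free for d in data_dirs}
    best = max(free, key=free.get)
    return best, free[best]


def compaction_fits(output_bytes, data_dirs, usage=shutil.disk_usage):
    """True if a compaction output of output_bytes fits on some single directory.

    Assumption: the output is written to one directory only and cannot be
    split across several of them.
    """
    _, free = largest_free_bytes(data_dirs, usage)
    return output_bytes <= free
```

With option 2 this would be called with /var/cassandra1, /var/cassandra2
and /var/cassandra3 as data_dirs. With option 1 there is only one
directory, so the limit is simply the free space on the striped volume.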

Thanks,
\EF

> 
> On Tue, May 31, 2011 at 7:47 AM, Erik Forsberg <forsb...@opera.com>
> wrote:
> > Hi!
> >
> > I'm considering setting up a small (4-6 nodes) Cassandra cluster on
> > machines that each have 3x2TB disks. There's no hardware RAID in the
> > machine, and if there were, it could only stripe single disks
> > together, not parts of disks.
> >
> > I'm planning RF=2 (or higher).
> >
> > I'm pondering what the best disk configuration is. Two alternatives:
> >
> > 1) Make a small partition on the first disk for the Linux
> > installation and commit log. Use Linux software RAID0 to stripe
> > the remaining space on disk1 + the two remaining disks into one
> > large XFS partition.
> >
> > 2) Make a small partition on the first disk for the Linux
> > installation and commit log. Mount the rest of disk1 as
> > /var/cassandra1, then disk2 as /var/cassandra2 and disk3 as
> > /var/cassandra3.
> >
> > Is it unwise to put the commit log on the same physical disk as
> > some of the data? I guess it could impact write performance, but
> > maybe it's bad from a data consistency point of view?
> >
> > How does Cassandra handle replacement of a bad disk in the two
> > alternatives? With option 1) I guess there's a risk of files being
> > corrupt. With option 2) they will simply be missing after replacing
> > the disk with a new one.
> >
> > With option 2) I guess I'm limiting the size of the total amount of
> > data in the largest CF at compaction to, hmm.. the free space on the
> > disk with most free space, correct?
> >
> > Comments welcome!
> >
> > Thanks,
> > \EF
> > --
> > Erik Forsberg <forsb...@opera.com>
> > Developer, Opera Software - http://www.opera.com/
> >
> 
> 
> 

-- 
Erik Forsberg <forsb...@opera.com>
Developer, Opera Software - http://www.opera.com/
