Re: cassandra hardware requirements (SATA/SSD)

2017-10-03 Thread Jeronimo de A. Barros
Hello,

It's a bit old but at least for me, still a great guide:
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html

My 2 cents: We deal with electronic invoices and our load is about 10,000
transactions/s during peak hours.

We are not located in the USA, so AWS would be a bit expensive for our
project. After a lot of research, tests and simulations (technical and
financial), I decided to purchase four Supermicro MicroCloud 5038ML-H12TRF
chassis with Xeon E3 series CPUs, 32GB RAM and 4 x 2.5" 1TB spinning disks
in each node. The chassis are divided between 2 DCs, with 2 x 1Gbps
redundant links for the DC/DC interconnection, and each chassis is divided
as follows: 3 x Cassandra 2.0 nodes for production, 3 x Cassandra 3.x nodes
(tests for migration), 3 x Spark / Hadoop nodes for analytics and 3 x
application servers. So, we have a 12-node Cassandra 2.0 cluster that has
been working very well for the last 3 years with very low latency and
overhead. There are some bumps here and there, but with proper management
and monitoring we can deal with almost everything.

Although we use 2.5" disks, we always check Backblaze's hard drive
reliability reports before purchasing any disks:
https://www.backblaze.com/blog/hard-drive-failure-rates-q1-2017/

On Cas 2.0 we started out using separate disks for the
data_file_directories. On Cas 3.x, following Al Tobey's guide, we're using
MD RAID0 with an XFS filesystem, and the performance is far better than on
Cas 2.0.

I hope it helps.

Jero


On Fri, Sep 29, 2017 at 3:19 AM, Peng Xiao <2535...@qq.com> wrote:

> Hi there,
> We are struggling with hardware selection. We all know that SSDs are good,
> and DataStax recommends them, but as Cassandra is a CPU-bound DB, we are
> considering SATA disks. We noticed that the normal I/O throughput is
> 7MB/s.
>
> Could anyone give some advice?
>
> Thanks,
> Peng Xiao
>
>


Re: cassandra hardware requirements (SATA/SSD)

2017-09-29 Thread Jeff Jirsa
Cassandra was designed for spinning disks -

commitlogs are (mostly) append-only, linear writes.
sstables are written exactly once, again with linear writes.
The index for finding the positions in sstables to start reads is cached in
RAM (and when it's not cached, it's a linear read through a file).
Most reads for sstables are linear seeks.
Cassandra tries to optimize for fewest possible sstable reads (using things
like bloom filters and compaction strategies that optimize for fewer files
touched per read - STCS row lifting, and LCS in general).
Some day we may have b/b+ tree indices, which will not be linear reads, but
right now everything is linear.
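
To illustrate the bloom filter point above, here is a minimal Python sketch
of how a per-sstable filter lets a read skip files that cannot contain a
key. This is not Cassandra's implementation: the class name, the bit-array
size and the hash scheme (salted SHA-256) are assumptions chosen only for
illustration.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(key))


def sstables_to_read(key, filters):
    """Return the names of the sstables whose filter cannot rule out the key."""
    return [name for name, bf in filters.items() if bf.might_contain(key)]
```

With one filter per sstable, a read only has to touch the files returned by
sstables_to_read(); an sstable whose filter answers "definitely absent" is
never opened, at the cost of an occasional false positive.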

All of that said: SSDs ARE faster, and you're not always CPU bound. You'll
get better latencies when you do have to go to disk if you're using SSDs,
but you can CERTAINLY make spinning disks work.
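
As a rough back-of-the-envelope sketch of why going to disk costs more on
spinning media: when every read pays a seek plus rotational latency, random
throughput is bounded by IOPS times read size. The function names and any
input figures are illustrative assumptions, not measurements from this
thread, and transfer time is ignored for simplicity.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def random_read_mb_per_s(rpm, avg_seek_ms, read_kb):
    """Rough random-read throughput when each read pays seek + rotation.

    Transfer time is ignored, so this is an optimistic upper bound.
    """
    service_ms = avg_seek_ms + rotational_latency_ms(rpm)
    iops = 1000.0 / service_ms
    return iops * read_kb / 1024.0


if __name__ == "__main__":
    # Hypothetical 7200 RPM drive with an assumed 8.5 ms average seek
    # and 64 KB reads; substitute your own drive's datasheet values.
    print(random_read_mb_per_s(rpm=7200, avg_seek_ms=8.5, read_kb=64))
```

Plugging in the datasheet values for a candidate drive gives a quick sanity
check on whether an observed throughput looks seek-bound rather than
bandwidth-bound.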



Re: cassandra hardware requirements (SATA/SSD)

2017-09-29 Thread daemeon reiydelle
Note to the AWS poster, you have some limited understanding of how disks
are presented to AWS compute nodes. As a result your post is not relevant,
and misleading.

When considering throughput, recall that disk IO is ideally parallel. While
C* handles IO across multiple devices nicely, the unit of storage is a very
large "block". Whether that serial read is adequate, or whether you do RAID
0 (max parallel, no checksum overhead, loss of one drive makes the whole
volume unavailable) is a performance vs. reliability tradeoff.
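
The reliability side of that RAID 0 tradeoff can be sketched with simple
arithmetic: a stripe is lost if any one member drive fails. The function
below is a hypothetical helper that assumes independent drive failures and a
per-drive annual failure rate you supply yourself (for example from a
published reliability report); neither assumption comes from this thread.

```python
def raid0_loss_probability(num_drives, annual_failure_rate):
    """Probability that a RAID 0 volume loses at least one drive in a year.

    Assumes drives fail independently with the same annual failure rate.
    """
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    return 1.0 - (1.0 - annual_failure_rate) ** num_drives


if __name__ == "__main__":
    # Illustrative input only: 4 drives at an assumed 2% annual failure rate.
    print(raid0_loss_probability(4, 0.02))
```

The risk grows roughly linearly with the number of striped drives for small
failure rates, which is the price paid for the extra parallelism.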


On Fri, Sep 29, 2017 at 6:11 AM, Lutaya Shafiq Holmes <
lutayasha...@gmail.com> wrote:

> Please try and USE AWS
>
> amazon web services on aws.amazon.com


Re: cassandra hardware requirements (SATA/SSD)

2017-09-29 Thread Lutaya Shafiq Holmes
Please try and USE AWS

amazon web services on aws.amazon.com

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



cassandra hardware requirements (SATA/SSD)

2017-09-29 Thread Peng Xiao
Hi there,
We are struggling with hardware selection. We all know that SSDs are good,
and DataStax recommends them, but as Cassandra is a CPU-bound DB, we are
considering SATA disks. We noticed that the normal I/O throughput is 7MB/s.


Could anyone give some advice?


Thanks,
Peng Xiao