Hello DuyHai,

I have no problem with performance even though I'm using 3 HDDs in RAID 0.
The last 4 years of data were imported in two weeks, which is acceptable for
the client.
Daily data will be much less intensive, and my client is more concerned with
storage price than with pure latency.
To be more precise, Cassandra was not chosen for its latency but because it
is a distributed, multi-datacenter database with no downtime.

Tests show that write amplification is not a problem in our case. So while
LCS may not be the best technical choice, it is a reasonably sound one as
long as the number of sstables doesn't explode.

The first thing I looked at was the compaction stats, and there are no
pending compactions.
Most sstables are 160 MB, the expected size.
If you do the math, 4 TB divided by 160 MB gives 26,214 sstables. With 8
files per sstable, that is 209,715 files.
With 6 TB, we get 39,321 sstables and 314,572 files.
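
For reference, here is a minimal Python sketch of that arithmetic (assuming
binary units and the 8 on-disk components per sstable counted above):

    # Rough file-count estimate; 8 components per sstable assumed
    # (Data, Index, Summary, Filter, CompressionInfo, Statistics, Digest, TOC).
    def sstable_files(data_tb, sstable_size_mb, components=8):
        sstables = data_tb * 1024 * 1024 / sstable_size_mb
        return int(sstables), int(sstables * components)

    print(sstable_files(4, 160))   # (26214, 209715)
    print(sstable_files(6, 160))   # (39321, 314572)
    print(sstable_files(6, 512))   # (12288, 98304)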

If I change sstable_size_in_mb to 512, it would end up with 12,288 sstables
and 98,304 files for 6 TB.
That seems to be a good compromise, provided there is no hidden trap.
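
For what it's worth, here is a sketch of how I would apply the change with
the DataStax Python driver; the keyspace and table names below are
placeholders, not our real schema:

    # Hypothetical example; 'ks.my_table' stands in for the real table.
    from cassandra.cluster import Cluster

    session = Cluster(['127.0.0.1']).connect()
    session.execute("""
        ALTER TABLE ks.my_table
        WITH compaction = {'class': 'LeveledCompactionStrategy',
                           'sstable_size_in_mb': 512}
    """)

As far as I understand, existing sstables would only move to the new size as
they get recompacted, so the file count would shrink progressively rather
than immediately.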

The only problem I can see would be a latency drop due to the size of the
index and summary.
But since the average row size is 70 KB, there should not be that many
entries per file.
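Back-of-the-envelope, assuming roughly one entry per 70 KB row:

    512 MB / 70 KB ≈ 7,500 entries per file

so the per-file index and summary should stay small.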

Am I missing something?


-- 
Jérôme Mainaud
jer...@mainaud.com

2016-08-31 13:28 GMT+02:00 DuyHai Doan <doanduy...@gmail.com>:

> Some random thoughts
>
> 1) Are they using SSDs?
>
> 2) If using SSDs, I remember that one recommendation is not to exceed
> ~3 TB/node, unless they're using DateTiered or, better, TimeWindow
> compaction strategy
>
> 3) LCS is very disk intensive and usually exacerbates write amplification
> the more data you have
>
> 4) The huge number of SSTables leads me to suspect some issue with
> compaction not keeping up. Can you post here a "nodetool tablestats" and
> "compactionstats"? Are there many pending compactions?
>
> 5) Last but not least, what does "dstat" show? Is there any frequent CPU
> wait?
>
> On Wed, Aug 31, 2016 at 12:34 PM, Jérôme Mainaud <jer...@mainaud.com>
> wrote:
>
>> Hello,
>>
>> My cluster uses LeveledCompactionStrategy on rather big nodes (9 TB of
>> disk per node with a target of 6 TB of data; the 3 remaining TB are
>> reserved for compaction and snapshots). There is only one table for this
>> application.
>>
>> With the default sstable_size_in_mb of 160 MB, we have a huge number of
>> sstables (25,000+ for the 4 TB already loaded), which leads to IO errors
>> due to the open files limit (set at 100,000).
>>
>> Increasing the open files limit could be a solution, but at this level I
>> would rather increase sstable_size to 500 MB, which would keep the file
>> count around 100,000.
>>
>> Could increasing the sstable size lead to any problem I don't see?
>> Do you have any advice about this?
>>
>> Thank you.
>>
>> --
>> Jérôme Mainaud
>> jer...@mainaud.com
>>
>
>