Re: Why so many SSTables?

2012-04-12 Thread Thorsten von Eicken
From my experience I would strongly advise against leveled compaction for your use-case. But you should certainly test and see for yourself! I have ~1TB on a node with ~13GB of heap. I ended up with 30k SSTables. I raised the SSTable size to 100MB but that didn't prove to be sufficient and I did i

Re: Why so many SSTables?

2012-04-12 Thread Romain HARDOUIN
I've just opened a new JIRA: CASSANDRA-4142. I've double-checked the numbers: 7747 seems to be the ArrayList's capacity (Eclipse Memory Analyzer displays "java.lang.Object[7747] @ 0x7d3f3f798"). Actually there are 5757 browsable entries in EMA, so each object is about 140 KB (size varies be
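A rough back-of-the-envelope check of the figures above, assuming every one of the 5757 entries retains about 140 KB (the message notes the size varies, so this is only an order-of-magnitude sketch). The function name is illustrative and not part of Cassandra or Eclipse Memory Analyzer.

```python
def retained_heap_mib(entries: int, kb_per_entry: float) -> float:
    """Approximate heap retained by a list of scanner objects, in MiB.

    Assumes every entry retains roughly the same amount of memory,
    which is a simplification of the numbers quoted in the thread.
    """
    return entries * kb_per_entry / 1024


# Figures quoted in the message: 5757 browsable entries, about 140 KB each.
print(f"~{retained_heap_mib(5757, 140):.0f} MiB retained by the scanner list")
```

Under that assumption the list alone accounts for a large share of the 1.5 GB heap mentioned elsewhere in the thread.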

Re: Why so many SSTables?

2012-04-11 Thread Watanabe Maki
If you increase sstable_size_in_mb to 200MB, you will need more IO for each compaction. For example, if a memtable is flushed and LCS needs to compact it with 10 overlapping L1 sstables, you will need almost 2GB of reads and 2GB of writes for that single compaction. From iPhone On 2012/04/11,
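The arithmetic behind that estimate can be sketched as follows. This is a simplified model, not Cassandra's actual compaction accounting: it assumes each overlapping sstable is exactly sstable_size_in_mb and that the compaction rewrites about as much data as it reads. The function name and the flushed_mb parameter are illustrative.

```python
def lcs_compaction_io_mb(sstable_size_mb: float,
                         overlapping_sstables: int,
                         flushed_mb: float = 0.0) -> float:
    """Approximate MB read by one compaction (writes are assumed similar).

    sstable_size_mb      -- value of sstable_size_in_mb
    overlapping_sstables -- number of sstables the new data overlaps
    flushed_mb           -- size of the freshly flushed data, if known
    """
    return flushed_mb + sstable_size_mb * overlapping_sstables


# Compare the 5MB default with the larger sizes discussed in the thread.
for size_mb in (5, 100, 200):
    io_mb = lcs_compaction_io_mb(size_mb, overlapping_sstables=10)
    print(f"sstable_size_in_mb={size_mb}: ~{io_mb / 1000:.2f} GB read, "
          f"about the same written")
```

With 200MB sstables and 10 overlapping sstables this gives about 2GB in each direction, which matches the figure in the message.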

Re: Why so many SSTables?

2012-04-11 Thread Ben Coverston
> In general I would limit the data load per node to 300 to 400GB. Otherwise > things can get painful when it comes time to run compaction / repair / move. +1 on more nodes of moderate size

Re: Why so many SSTables?

2012-04-11 Thread aaron morton
In general I would limit the data load per node to 300 to 400GB. Otherwise things can get painful when it comes time to run compaction / repair / move. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/04/2012, at 1:00 AM, Dave Brosius wrote

Re: Why so many SSTables?

2012-04-11 Thread Dave Brosius
It's easy to spend other people's money, but handling 1TB of data with a 1.5 GB heap? Memory is cheap, and just a little more will solve many problems. On 04/11/2012 08:43 AM, Romain HARDOUIN wrote: Thank you for your answers. I originally posted this question because we encountered an OOM Excep

Re: Why so many SSTables?

2012-04-11 Thread Sylvain Lebresne
On Wed, Apr 11, 2012 at 2:43 PM, Romain HARDOUIN wrote: > > Thank you for your answers. > > I originally posted this question because we encountered an OOM Exception on 2 > nodes during a repair session. > Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner > which contains as many

Re: Why so many SSTables?

2012-04-11 Thread Romain HARDOUIN
Thank you for your answers. I originally posted this question because we encountered an OOM Exception on 2 nodes during a repair session. Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner which contains as many objects as there are SSTables on disk (7747 objects at the time). Thi

Re: Why so many SSTables?

2012-04-10 Thread Maki Watanabe
You can configure the sstable size with the sstable_size_in_mb parameter for LCS. The default value is 5MB. You should also check with nodetool tpstats and compactionstats that you don't have many pending compaction tasks. If you have enough IO throughput, you can increase compaction_throughput_mb_per_s

Re: Why so many SSTables?

2012-04-10 Thread Jonathan Ellis
LCS explicitly tries to keep sstables under 5MB to minimize extra work done by compacting data that didn't really overlap across different levels. On Tue, Apr 10, 2012 at 9:24 AM, Romain HARDOUIN wrote: > > Hi, > > We are surprised by the number of files generated by Cassandra. > Our cluster cons

Why so many SSTables?

2012-04-10 Thread Romain HARDOUIN
Hi, We are surprised by the number of files generated by Cassandra. Our cluster consists of 9 nodes and each node handles about 35 GB. We're using Cassandra 1.0.6 with LeveledCompactionStrategy. We have 30 CF. We've got roughly 45,000 files under the keyspace directory on each node: ls -l /var/l
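The file count follows fairly directly from the sstable size. A minimal sketch, assuming the 5MB default sstable_size_in_mb mentioned in the replies and sstables that are all filled to the target size (real levels are rarely that tidy, so treat the result as a rough estimate). The function name is illustrative.

```python
import math


def expected_sstable_count(data_gb: float, sstable_size_mb: float) -> int:
    """Rough number of sstables needed to hold data_gb of data
    when every sstable is about sstable_size_mb in size."""
    return math.ceil(data_gb * 1024 / sstable_size_mb)


# About 35 GB per node, as in the original question.
for size_mb in (5, 100, 200):
    count = expected_sstable_count(35, size_mb)
    print(f"sstable_size_in_mb={size_mb}: ~{count} sstables per node")
```

At 5MB this gives roughly 7,000 sstables per node, and since each sstable is stored as several component files on disk, a file count in the tens of thousands is what one would expect.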