Re: Newsletter / Marketing: Re: Compaction Strategy

2018-09-21 Thread Ali Hubail
Hi Ali, Please find my answers: 1) The table holds customer history data. We receive transaction data every day for multiple vendors, and a batch job is executed which updates the data if the customer makes any transactions

Re: Compaction Strategy

2018-09-20 Thread rajasekhar kommineni
Hello, Can anyone respond to my questions? Is it a good idea to disable auto compaction and schedule it every 3 days? I am unable to control compaction and it is causing timeouts. Also, will reducing or increasing compaction_throughput_m

Re: Compaction Strategy

2018-09-20 Thread Ali Hubail

Re: Compaction Strategy

2018-09-19 Thread Nitan Kainth
It’s not recommended to disable compaction; you will end up with hundreds to thousands of sstables and increased read latency. If your data is immutable, meaning no updates/deletes, it will have the least impact. Decreasing compaction throughput will release resources for the application but don’t

Re: Compaction Strategy

2018-09-19 Thread rajasekhar kommineni
Hello, Can anyone respond to my questions? Is it a good idea to disable auto compaction and schedule it every 3 days? I am unable to control compaction and it is causing timeouts. Also, will reducing or increasing compaction_throughput_mb_per_sec eliminate timeouts? Thanks, > On Sep 17,

Re: Compaction strategy for update heavy workload

2018-06-13 Thread kurt greaves
> I wouldn't use TWCS if there's updates, you're going to risk having data that's never deleted and really small sstables sticking around forever.
How do you risk having data sticking around forever when everything is TTL'd?
> If you use really large buckets, what's the point of TWCS?
No

Re: Compaction strategy for update heavy workload

2018-06-13 Thread Jonathan Haddad
I wouldn't use TWCS if there are updates; you're going to risk having data that's never deleted and really small sstables sticking around forever. If you use really large buckets, what's the point of TWCS? Honestly, this is such a small workload you could easily use STCS or LCS and you'd likely

Re: Compaction strategy for update heavy workload

2018-06-13 Thread kurt greaves
TWCS is probably still worth trying. If you mean updating old rows, in TWCS "out of order updates" will only really mean you'll hit more SSTables on read. This might add a bit of complexity in your client if you're bucketing partitions (not strictly necessary), but that's about it. As long as you're

Re: Compaction Strategy guidance

2014-11-25 Thread Jean-Armel Luce
Hi Andrei, Hi Nicolai, Which version of C* are you using? There are some recommendations about the max storage per node: http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2 "For 1.0 we recommend 300-500GB. For 1.2 we are looking to be able to handle 10x (3-5TB)." I have

Re: Compaction Strategy guidance

2014-11-25 Thread Andrei Ivanov
Hi Jean-Armel, Nikolai, 1. Increasing sstable size doesn't work (well, I think, unless we overscale - add more nodes than really necessary, which is prohibitive for us in a way). Essentially there is no change. I gave up and will go for STCS;-( 2. We use 2.0.11 as of now 3. We are running on EC2

Re: Compaction Strategy guidance

2014-11-25 Thread Marcus Eriksson
If you are that write-heavy you should definitely go with STCS; LCS optimizes for reads by doing more compactions. /Marcus On Tue, Nov 25, 2014 at 11:22 AM, Andrei Ivanov aiva...@iponweb.net wrote: Hi Jean-Armel, Nikolai, 1. Increasing sstable size doesn't work (well, I think, unless we

Re: Compaction Strategy guidance

2014-11-25 Thread Andrei Ivanov
Yep, Marcus, I know. It's mainly a question of the cost of those extra x2 disks, you know. Our final setup will be more like 30TB, so doubling it is still some cost. But I guess we will have to live with it. On Tue, Nov 25, 2014 at 1:26 PM, Marcus Eriksson krum...@gmail.com wrote: If you are that

Re: Compaction Strategy guidance

2014-11-25 Thread Nikolai Grigoriev
Hi Jean-Armel, I am using the latest and greatest DSE 4.5.2 (4.5.3 in another cluster, but there are no relevant changes between 4.5.2 and 4.5.3) - thus, Cassandra 2.0.10. I have about 1.8 TB of data per node now in total, which falls into that range. As I said, it is really a problem with large

Re: Compaction Strategy guidance

2014-11-25 Thread Andrei Ivanov
Nikolai, Just in case you've missed my comment in the thread (guess you have) - increasing sstable size does nothing (in our case at least). That is, it's not worse but the load pattern is still the same - doing nothing most of the time. So, I switched to STCS and we will have to live with extra

Re: Compaction Strategy guidance

2014-11-25 Thread Nikolai Grigoriev
Andrei, Oh, yes, I have scanned the top of your previous email but overlooked the last part. I am using SSDs so I prefer to put extra work to keep my system performing and save expensive disk space. So far I've been able to size the system more or less correctly so these LCS limitations do not

Re: Compaction Strategy guidance

2014-11-25 Thread Andrei Ivanov
Ah, clear then. SSD usage imposes a different bias in terms of costs;-) On Tue, Nov 25, 2014 at 9:48 PM, Nikolai Grigoriev ngrigor...@gmail.com wrote: Andrei, Oh, yes, I have scanned the top of your previous email but overlooked the last part. I am using SSDs so I prefer to put extra work

Re: Compaction Strategy guidance

2014-11-24 Thread Nikolai Grigoriev
I DON'T WANT ANY MORE EMAILS, I AM FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good

Re: Compaction Strategy guidance

2014-11-24 Thread Andrei Ivanov
: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good, LCS comes at a certain price. LCS will put most load on your

Re: Compaction Strategy guidance

2014-11-24 Thread Nikolai Grigoriev
...@gmail.com] Sent: Saturday, November 22, 2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good, LCS comes at a certain price. LCS will put most load on your I/O system (if you use

Re: Compaction Strategy guidance

2014-11-24 Thread Andrei Ivanov
: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good, LCS comes at a certain price. LCS will put most load on your I/O system (if you use spindles - you may need to be careful about that) and on CPU

Re: Compaction Strategy guidance

2014-11-24 Thread Nikolai Grigoriev
. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good, LCS comes at a certain price. LCS will put most load on your I/O system (if you use spindles - you may need to be careful

Re: Compaction Strategy guidance

2014-11-24 Thread Andrei Ivanov
Muñoz G. smg...@gmail.com wrote: ABUSE I DON'T WANT ANY MORE EMAILS, I AM FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction

Re: Compaction Strategy guidance

2014-11-24 Thread Robert Coli
On Mon, Nov 24, 2014 at 6:48 AM, Nikolai Grigoriev ngrigor...@gmail.com wrote: One of the obvious recommendations I have received was to run more than one instance of C* per host. Makes sense - it will reduce the amount of data per node and will make better use of the resources. This is

Re: Compaction Strategy guidance

2014-11-23 Thread Andrei Ivanov
wrote: ABUSE I DON'T WANT ANY MORE EMAILS, I AM FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good

Re: Compaction Strategy guidance

2014-11-23 Thread Nikolai Grigoriev
2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good, LCS comes at a certain price. LCS will put most load on your I/O system (if you use spindles - you may need to be careful about

Re: Compaction Strategy guidance

2014-11-23 Thread Jean-Armel Luce
, Servando Muñoz G. smg...@gmail.com wrote: ABUSE I DON'T WANT ANY MORE EMAILS, I AM FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance

Re: Compaction Strategy guidance

2014-11-23 Thread Jean-Armel Luce
. smg...@gmail.com wrote: ABUSE I DON'T WANT ANY MORE EMAILS, I AM FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High

Re: Compaction Strategy guidance

2014-11-23 Thread Andrei Ivanov
@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good, LCS comes at a certain price. LCS will put most load on your I/O system (if you use spindles - you may need to be careful about that) and on CPU. Also LCS (by default) may fall

Re: Compaction Strategy guidance

2014-11-22 Thread Nikolai Grigoriev
Stephane, As with everything good, LCS comes at a certain price. LCS will put most load on your I/O system (if you use spindles - you may need to be careful about that) and on CPU. Also LCS (by default) may fall back to STCS if it is falling behind (which is very possible with heavy writing activity)

RE: Compaction Strategy guidance

2014-11-22 Thread Servando Muñoz G .
ABUSE I DON'T WANT ANY MORE EMAILS, I AM FROM MEXICO From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com] Sent: Saturday, November 22, 2014 07:13 p.m. To: user@cassandra.apache.org Subject: Re: Compaction Strategy guidance Importance: High Stephane, As with everything good, LCS comes

Re: compaction strategy

2011-05-11 Thread Terje Marthinussen
Not sure I follow you. 4 sstables is the minimum that compaction looks for (by default). If there are 30 sstables of ~20MB sitting there because compaction is behind, you will compact those 30 sstables together (unless there is not enough space for that and considering you haven't changed the
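
To make the threshold behavior described in this message concrete, here is a minimal sketch in Python of how a size-tiered minor compaction might pick its candidates. It is not Cassandra's actual code: the function name `pick_minor_compaction`, the bucketing bounds `bucket_low`/`bucket_high`, and the "smallest bucket first" choice are assumptions for illustration. Only the minimum of 4 comes from the message; the maximum of 32 is assumed as the configurable upper threshold.

```python
def pick_minor_compaction(sizes_mb, min_threshold=4, max_threshold=32,
                          bucket_low=0.5, bucket_high=1.5):
    """Return the sstable sizes chosen for one minor compaction, or [].

    Hypothetical simplification: sstables are grouped into buckets of
    similar size; a bucket is eligible once it holds min_threshold
    sstables, and at most max_threshold are merged in one go.
    """
    buckets = []  # each bucket is a list of sizes of similar magnitude
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])

    eligible = [b for b in buckets if len(b) >= min_threshold]
    if not eligible:
        return []
    # Assumption: merge the bucket of smallest sstables first (cheapest).
    chosen = min(eligible, key=lambda b: sum(b) / len(b))
    return chosen[:max_threshold]


# 30 sstables of ~20MB are merged together, as described above.
backlog = pick_minor_compaction([20] * 30)
print(len(backlog))
```

With fewer than 4 similar-sized sstables nothing is selected, and with more than the assumed maximum only the first 32 are taken in one pass.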

Re: compaction strategy

2011-05-11 Thread Jonathan Ellis
You are of course free to reduce the min per bucket to 2. The fundamental idea of sstables + compaction is to trade disk space for higher write performance. For most applications this is the right trade to make on modern hardware... I don't think you'll get very far trying to get the 2nd without

Re: compaction strategy

2011-05-10 Thread Terje Marthinussen
Everyone may be well aware of that, but I'll still remark that a minor compaction will try to merge as many 20MB sstables as it can up to the max compaction threshold (which is configurable). So if you do accumulate some newly created sstables at some point in time, the next minor compaction

Re: compaction strategy

2011-05-10 Thread Sylvain Lebresne
On Tue, May 10, 2011 at 6:20 PM, Terje Marthinussen tmarthinus...@gmail.com wrote: Everyone may be well aware of that, but I'll still remark that a minor compaction will try to merge as many 20MB sstables as it can up to the max compaction threshold (which is configurable). So if you do

Re: compaction strategy

2011-05-09 Thread Sylvain Lebresne
On Sat, May 7, 2011 at 7:20 PM, Terje Marthinussen tmarthinus...@gmail.com wrote: This is an all ssd system. I have no problems with read/write performance due to I/O. I do have a potential problem with the crazy explosion you can get in terms of disk use if compaction cannot keep up. As things

Re: compaction strategy

2011-05-09 Thread David Boxenhorn
I'm also not too much in favor of triggering major compactions, because it mostly has a nasty effect (create one huge sstable). If that is the case, why can't major compactions create many, non-overlapping SSTables? In general, it seems to me that non-overlapping SSTables have all the

Re: compaction strategy

2011-05-09 Thread David Boxenhorn
If they each have their own copy of the data, then they are *not* non-overlapping! If you have non-overlapping SSTables (and you know the min/max keys), it's like having one big SSTable because you know exactly where each row is, and it becomes easy to merge a new SSTable in small batches, rather
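
The claim that non-overlapping SSTables with known min/max keys behave like one big SSTable can be sketched as a simple range lookup. This is an illustrative Python sketch only: the function `find_sstable`, the tuple layout, and the use of plain string keys (rather than tokens) are assumptions, not anything from Cassandra itself.

```python
from bisect import bisect_right


def find_sstable(ranges, key):
    """Locate the single sstable that can contain `key`.

    `ranges` is a list of (min_key, max_key, name) tuples, sorted by
    min_key and assumed non-overlapping. Because ranges do not overlap,
    at most one sstable has to be consulted per lookup.
    """
    starts = [r[0] for r in ranges]
    # Index of the last range whose min_key is <= key.
    i = bisect_right(starts, key) - 1
    if i >= 0 and ranges[i][0] <= key <= ranges[i][1]:
        return ranges[i][2]
    return None  # key falls in a gap between ranges


# Hypothetical sstable names and key ranges, for illustration only.
sstables = [("a", "f", "sst-1"), ("g", "m", "sst-2"), ("n", "z", "sst-3")]
print(find_sstable(sstables, "h"))
```

The same index makes merging a new SSTable incremental: only the ranges it overlaps need to be rewritten, which is the "small batches" point made in the message.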

Re: compaction strategy

2011-05-09 Thread Terje Marthinussen
Sorry, I was referring to the claim that one big file was a problem, not the non-overlapping part. If you never compact to a single file, you never get rid of all generations/duplicates. With non-overlapping files covering small enough token ranges, compacting down to one file is not a big issue.

Re: compaction strategy

2011-05-07 Thread Jonathan Ellis
On Sat, May 7, 2011 at 2:01 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: 1. Would it make sense to make full compactions occur a bit more aggressively? I'd rather reduce the performance impact of being behind, than do more full compactions:

Re: compaction strategy

2011-05-07 Thread Edward Capriolo
On Sat, May 7, 2011 at 8:54 AM, Jonathan Ellis jbel...@gmail.com wrote: On Sat, May 7, 2011 at 2:01 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: 1. Would it make sense to make full compactions occur a bit more aggressively? I'd rather reduce the performance impact of being behind, than

Re: compaction strategy

2011-05-07 Thread Peter Schuller
If you are seeing 600 pending compaction tasks regularly you almost definitely need more hardware. Note that pending compactions is pretty misleading and you can't really draw conclusions just based on the pending compactions number/graph. For example, standard behavior during e.g. a long

Re: compaction strategy

2011-05-07 Thread Terje Marthinussen
This is an all ssd system. I have no problems with read/write performance due to I/O. I do have a potential problem with the crazy explosion you can get in terms of disk use if compaction cannot keep up. As things fall behind and you get many generations of data, yes, read performance becomes a problem due

Re: compaction strategy

2011-05-07 Thread Peter Schuller
It does not really make sense to me to go through all these minor merges when a full compaction will do a much faster and better job. In a system heavily reliant on caching (platter drives, large data sizes, much larger than RAM) major compactions can be very detrimental to performance due to