Re: Compaction Strategy guidance

Andrei Ivanov Mon, 24 Nov 2014 06:33:11 -0800

Nikolai,

This is more or less what I'm seeing on my cluster then. I'm trying to
switch to bigger sstables (1Gb) right now.
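
As a rough sanity check on the numbers quoted below, here is a minimal Python sketch of how the sstable size affects the sstable count and the number of LCS levels. The byte count and sstable count are copied from the cfstats output further down. The function names are hypothetical, and the model is a simplification: it assumes every sstable is exactly full, L0 is empty, and each level holds 10x the previous one (L1 = 10 sstables), which matches the /10, /100, /1000 thresholds in that output but is not taken from the Cassandra source.

```python
# Figures copied from the cfstats output quoted later in this thread.
SPACE_USED_LIVE_BYTES = 1266060391852
OBSERVED_SSTABLE_COUNT = 5195

MB = 1024 ** 2


def estimate_sstable_count(total_bytes, sstable_size_mb):
    """Idealized sstable count, assuming every sstable is exactly full."""
    size_bytes = sstable_size_mb * MB
    return -(-total_bytes // size_bytes)  # ceiling division


def fill_levels(sstable_count, fanout=10):
    """Spread sstables over L1, L2, ... assuming level N holds fanout**N
    sstables and L0 is empty (a simplification, not Cassandra's code)."""
    levels = []
    remaining = sstable_count
    capacity = fanout
    while remaining > 0:
        used = min(remaining, capacity)
        levels.append(used)
        remaining -= used
        capacity *= fanout
    return levels


if __name__ == "__main__":
    # Decimal terabytes: shows the table is ~1.27 Tb per node, not Gb.
    print("size per node, TB:", round(SPACE_USED_LIVE_BYTES / 1e12, 3))
    print("observed sstables:", OBSERVED_SSTABLE_COUNT)
    for size_mb in (256, 1024):
        count = estimate_sstable_count(SPACE_USED_LIVE_BYTES, size_mb)
        print(size_mb, "Mb sstables:", count, "levels:", fill_levels(count))
```

Under these assumptions, going from 256Mb to 1Gb sstables cuts the idealized sstable count by about 4x. The data still spills into a fourth level in this model, but that level holds far fewer sstables.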


On Mon, Nov 24, 2014 at 5:18 PM, Nikolai Grigoriev <ngrigor...@gmail.com> wrote:
> Andrei,
>
> Oh, Monday mornings...Tb :)
>
> On Mon, Nov 24, 2014 at 9:12 AM, Andrei Ivanov <aiva...@iponweb.net> wrote:
>>
>> Nikolai,
>>
>> Are you sure about 1.26Gb? Like it doesn't look right - 5195 tables
>> with 256Mb table size...
>>
>> Andrei
>>
>> On Mon, Nov 24, 2014 at 5:09 PM, Nikolai Grigoriev <ngrigor...@gmail.com>
>> wrote:
>> > Jean-Armel,
>> >
>> > I have only two large tables, the rest is super-small. In the test cluster
>> > of 15 nodes the largest table has about 110M rows. Its total size is about
>> > 1,26Gb per node (total disk space used per node for that CF). It's got about
>> > 5K sstables per node - the sstable size is 256Mb. cfstats on a "healthy"
>> > node looks like this:
>> >
>> >     Read Count: 8973748
>> >     Read Latency: 16.130059053251774 ms.
>> >     Write Count: 32099455
>> >     Write Latency: 1.6124713938912671 ms.
>> >     Pending Tasks: 0
>> >         Table: wm_contacts
>> >         SSTable count: 5195
>> >         SSTables in each level: [27/4, 11/10, 104/100, 1053/1000, 4000, 0, 0, 0, 0]
>> >         Space used (live), bytes: 1266060391852
>> >         Space used (total), bytes: 1266144170869
>> >         SSTable Compression Ratio: 0.32604853410787327
>> >         Number of keys (estimate): 25696000
>> >         Memtable cell count: 71402
>> >         Memtable data size, bytes: 26938402
>> >         Memtable switch count: 9489
>> >         Local read count: 8973748
>> >         Local read latency: 17.696 ms
>> >         Local write count: 32099471
>> >         Local write latency: 1.732 ms
>> >         Pending tasks: 0
>> >         Bloom filter false positives: 32248
>> >         Bloom filter false ratio: 0.50685
>> >         Bloom filter space used, bytes: 20744432
>> >         Compacted partition minimum bytes: 104
>> >         Compacted partition maximum bytes: 3379391
>> >         Compacted partition mean bytes: 172660
>> >         Average live cells per slice (last five minutes): 495.0
>> >         Average tombstones per slice (last five minutes): 0.0
>> >
>> > Another table of similar structure (same number of rows) is about 4x
>> > smaller. That table does not suffer from those issues - it compacts well
>> > and efficiently.
>> >
>> > On Mon, Nov 24, 2014 at 2:30 AM, Jean-Armel Luce <jaluc...@gmail.com>
>> > wrote:
>> >>
>> >> Hi Nikolai,
>> >>
>> >> Please could you clarify a little bit what you call "a large amount of
>> >> data" ?
>> >>
>> >> How many tables ?
>> >> How many rows in your largest table ?
>> >> How many GB in your largest table ?
>> >> How many GB per node ?
>> >>
>> >> Thanks.
>> >>
>> >>
>> >>
>> >> 2014-11-24 8:27 GMT+01:00 Jean-Armel Luce <jaluc...@gmail.com>:
>> >>>
>> >>> Hi Nikolai,
>> >>>
>> >>> Thanks for that information.
>> >>>
>> >>> Please could you clarify a little bit what you call "
>> >>>
>> >>> 2014-11-24 4:37 GMT+01:00 Nikolai Grigoriev <ngrigor...@gmail.com>:
>> >>>>
>> >>>> Just to clarify - when I was talking about the large amount of data I
>> >>>> really meant a large amount of data per node in a single CF (table). LCS
>> >>>> does not seem to like it when it gets thousands of sstables (makes 4-5
>> >>>> levels).
>> >>>>
>> >>>> When bootstrapping a new node you'd better enable that option from
>> >>>> CASSANDRA-6621 (the one that disables STCS in L0). But it will still be
>> >>>> a mess - I have a node that I bootstrapped ~2 weeks ago. Initially it
>> >>>> had 7,5K pending compactions, now it has almost stabilized at 4,6K. It
>> >>>> does not go down. The number of sstables at L0 is over 11K and it is
>> >>>> very slowly building the upper levels. The total number of sstables is
>> >>>> 4x the normal amount. Now I am not entirely sure if this node will ever
>> >>>> get back to normal life. And believe me - this is not because of I/O, I
>> >>>> have SSDs everywhere and 16 physical cores. This machine is barely
>> >>>> using 1-3 cores most of the time. The problem is that allowing STCS
>> >>>> fallback is not a good option either - it will quickly result in a few
>> >>>> 200Gb+ sstables in my configuration and then these sstables will never
>> >>>> be compacted. Plus, it will require close to 2x disk space on EVERY
>> >>>> disk in my JBOD configuration...this will kill the node sooner or
>> >>>> later. This is all because all sstables after bootstrap end up at L0
>> >>>> and then the process very slowly moves them to other levels. If you
>> >>>> have write traffic to that CF then the number of sstables and L0 will
>> >>>> grow quickly - like it happens in my case now.
>> >>>>
>> >>>> Once something like
>> >>>> https://issues.apache.org/jira/browse/CASSANDRA-8301
>> >>>> is implemented it may be better.
>> >>>>
>> >>>>
>> >>>> On Sun, Nov 23, 2014 at 4:53 AM, Andrei Ivanov <aiva...@iponweb.net>
>> >>>> wrote:
>> >>>>>
>> >>>>> Stephane,
>> >>>>>
>> >>>>> We are having a somewhat similar C* load profile. Hence some comments
>> >>>>> in addition to Nikolai's answer.
>> >>>>> 1. Fallback to STCS - you can disable it actually
>> >>>>> 2. Based on our experience, if you have a lot of data per node, LCS
>> >>>>> may work just fine. That is, till the moment you decide to join
>> >>>>> another node - chances are that the newly added node will not be able
>> >>>>> to compact what it gets from old nodes. In your case, if you switch
>> >>>>> strategy the same thing may happen. This is all due to limitations
>> >>>>> mentioned by Nikolai.
>> >>>>>
>> >>>>> Andrei
>> >>>>>
>> >>>>>
>> >>>>> On Sun, Nov 23, 2014 at 8:51 AM, Servando Muñoz G.
>> >>>>> <smg...@gmail.com>
>> >>>>> wrote:
>> >>>>> > ABUSE
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > I DON'T WANT ANY MORE EMAILS, I AM FROM MEXICO
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com]
>> >>>>> > Sent: Saturday, November 22, 2014 07:13 p.m.
>> >>>>> > To: user@cassandra.apache.org
>> >>>>> > Subject: Re: Compaction Strategy guidance
>> >>>>> > Importance: High
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > Stephane,
>> >>>>> >
>> >>>>> > As with everything good, LCS comes at a certain price.
>> >>>>> >
>> >>>>> > LCS will put the most load on your I/O system (if you use spindles -
>> >>>>> > you may need to be careful about that) and on CPU. Also LCS (by
>> >>>>> > default) may fall back to STCS if it is falling behind (which is very
>> >>>>> > possible with heavy writing activity) and this will result in higher
>> >>>>> > disk space usage. Also LCS has a certain limitation I have discovered
>> >>>>> > lately. Sometimes LCS may not be able to use all your node's resources
>> >>>>> > (algorithm limitations) and this reduces the overall compaction
>> >>>>> > throughput. This may happen if you have a large column family with
>> >>>>> > lots of data per node. STCS won't have this limitation.
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > By the way, the primary goal of LCS is to reduce the number of
>> >>>>> > sstables C* has to look at to find your data. With LCS properly
>> >>>>> > functioning this number will most likely be between something like 1
>> >>>>> > and 3 for most of the reads. But if you do few reads and are not
>> >>>>> > concerned about the latency today, most likely LCS may only save you
>> >>>>> > some disk space.
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > On Sat, Nov 22, 2014 at 6:25 PM, Stephane Legay
>> >>>>> > <sle...@looplogic.com>
>> >>>>> > wrote:
>> >>>>> >
>> >>>>> > Hi there,
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > use case:
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > - Heavy write app, few reads.
>> >>>>> >
>> >>>>> > - Lots of updates of rows / columns.
>> >>>>> >
>> >>>>> > - Current performance is fine, for both writes and reads.
>> >>>>> >
>> >>>>> > - Currently using SizeTieredCompactionStrategy
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > We're trying to limit the amount of storage used during compaction.
>> >>>>> > Should we switch to LeveledCompactionStrategy?
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > Thanks
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > --
>> >>>>> >
>> >>>>> > Nikolai Grigoriev
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Nikolai Grigoriev
>> >>>>
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > Nikolai Grigoriev
>> >
>
>
>
>
> --
> Nikolai Grigoriev
> (514) 772-5178
