Hi,

You can check whether the number of SSTables is decreasing. Look for the
"SSTable count" value for your tables in the output of "nodetool cfstats".
The compaction history can be viewed with "nodetool
compactionhistory".
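
If it helps, here is a small sketch of how the "SSTable count" lines can
be pulled out of the cfstats output. The heredoc below is a made-up
sample; in practice you would pipe the real "nodetool cfstats" command
into the awk filter:

```shell
# Made-up excerpt of "nodetool cfstats" output; with a live cluster run:
#   nodetool cfstats | awk -F': ' '/SSTable count/ {print $2}'
awk -F': ' '/SSTable count/ {print $2}' <<'EOF'
Keyspace: myks
	Table: events
	SSTable count: 118342
	Space used (live): 1540000000
EOF
# prints: 118342
```

Run it periodically (e.g. under watch) and the number should be going
down if compaction is catching up.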

About the timeouts, check this out:
http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
Also try running "nodetool tpstats" to see the thread pool statistics;
they can help you tell whether you are having performance problems. If
you have too many pending tasks or dropped messages, you may need to
tune your system (e.g. the driver's timeout, concurrent reads, and
so on).
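
As a rough sketch (the tpstats sample below is invented, and the real
output has more columns), you can filter for non-zero Pending and
Dropped counters like this:

```shell
# Invented excerpt of "nodetool tpstats"; with a live cluster pipe the
# real command into the same awk program.
awk '
  # Pool lines (name, Active, Pending, Completed, ...): report pending > 0
  NF >= 4 && $3 ~ /^[0-9]+$/ && $3 > 0 {print $1, "pending:", $3}
  # Dropped-message lines (type, count): report drops > 0
  NF == 2 && $2 ~ /^[0-9]+$/ && $2 > 0 {print $1, "dropped:", $2}
' <<'EOF'
Pool Name           Active   Pending   Completed   Blocked
CompactionExecutor       1       245     1403903         0
MutationStage            0         0    98713455         0

Message type        Dropped
MUTATION               1024
READ                      0
EOF
# prints: CompactionExecutor pending: 245
#         MUTATION dropped: 1024
```

Anything it prints is a thread pool falling behind or a message type
being dropped, which is where I would start looking.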

Regards,

Roni Balthazar

On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam <ptrstp...@gmail.com> wrote:
> Hi,
> Thanks for your tip - it looks like something changed, but I still don't
> know if it is OK.
>
> My nodes started doing more compactions, but some compactions look
> really slow.
> IO is idle and CPU is quite OK (30%-40%). We set compactionthroughput to
> 999, but I do not see a difference.
>
> Is there anything more we can check? Or do you have any method to monitor
> the progress with the small files?
>
> Regards
>
> On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar <ronibaltha...@gmail.com>
> wrote:
>>
>> Hi,
>>
>> Yes... I had the same issue, and setting cold_reads_to_omit to 0.0 was
>> the solution...
>> The number of SSTables decreased from many thousands to under a
>> hundred, and the SSTables are now much bigger (most of them several
>> gigabytes).
>>
>> Cheers,
>>
>> Roni Balthazar
>>
>>
>>
>> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>> > After some diagnostics (we haven't set cold_reads_to_omit yet):
>> > compactions are running, but VERY slowly, with "idle" IO.
>> >
>> > We have a lot of data files in Cassandra. DC_A has about ~120000
>> > (counting only xxx-Data.db), while DC_B has only ~4000.
>> >
>> > I don't know if this changes anything, but:
>> > 1) in DC_A the average Data.db file size is ~13 MB. There are a few
>> > really big ones, but most are really small (almost 10000 files are
>> > less than 100 MB).
>> > 2) in DC_B the average Data.db size is much bigger: ~260 MB.
>> >
>> > Do you think the above flag will help us?
>> >
>> >
>> > On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>> >>
>> >> I set setcompactionthroughput to 999 permanently and it doesn't change
>> >> anything. IO is still the same and the CPU is idle.
>> >>
>> >> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
>> >> <ronibaltha...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> You can run "nodetool compactionstats" to view statistics on
>> >>> compactions.
>> >>> Setting cold_reads_to_omit to 0.0 can help reduce the number of
>> >>> SSTables when you use Size-Tiered compaction.
>> >>> You can also create a cron job to increase the value of
>> >>> setcompactionthroughput during the night or when your IO is not busy.
>> >>>
>> >>> From http://wiki.apache.org/cassandra/NodeTool:
>> >>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
>> >>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
>> >>>
>> >>> Cheers,
>> >>>
>> >>> Roni Balthazar
>> >>>
>> >>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam <ptrstp...@gmail.com> wrote:
>> >>> > One thing I do not understand: in my case compaction is running
>> >>> > permanently.
>> >>> > Is there a way to check which compactions are pending? The only
>> >>> > information available is
>> >>> > the total count.
>> >>> >
>> >>> >
>> >>> > On Monday, February 16, 2015, Ja Sam <ptrstp...@gmail.com> wrote:
>> >>> >>
>> >>> >> Of course I made a mistake. I am using 2.1.2. Anyway, a nightly
>> >>> >> build is available from
>> >>> >> http://cassci.datastax.com/job/cassandra-2.1/
>> >>> >>
>> >>> >> I read about cold_reads_to_omit. It looks promising. Should I
>> >>> >> also
>> >>> >> set the compaction throughput?
>> >>> >>
>> >>> >> p.s. I am really sad that I didn't read this before:
>> >>> >>
>> >>> >>
>> >>> >> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On Monday, February 16, 2015, Carlos Rolo <r...@pythian.com> wrote:
>> >>> >>>
>> >>> >>> Hi, 100% in agreement with Roland.
>> >>> >>>
>> >>> >>> The 2.1.x series is a pain! I would never recommend the current
>> >>> >>> 2.1.x series
>> >>> >>> for production.
>> >>> >>>
>> >>> >>> Clocks are a pain, and check your connectivity! Also check tpstats
>> >>> >>> to
>> >>> >>> see
>> >>> >>> if your thread pools are being overrun.
>> >>> >>>
>> >>> >>> Regards,
>> >>> >>>
>> >>> >>> Carlos Juzarte Rolo
>> >>> >>> Cassandra Consultant
>> >>> >>>
>> >>> >>> Pythian - Love your data
>> >>> >>>
>> >>> >>> rolo@pythian | Twitter: cjrolo | Linkedin:
>> >>> >>> linkedin.com/in/carlosjuzarterolo
>> >>> >>> Tel: 1649
>> >>> >>> www.pythian.com
>> >>> >>>
>> >>> >>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
>> >>> >>> <r.etzenham...@t-online.de> wrote:
>> >>> >>>>
>> >>> >>>> Hi,
>> >>> >>>>
>> >>> >>>> 1) Currently Cassandra 2.1.3, upgraded from 2.1.0 (as suggested
>> >>> >>>> by
>> >>> >>>> Al Tobey from DataStax)
>> >>> >>>> 7) minimal reads (usually none, sometimes a few)
>> >>> >>>>
>> >>> >>>> Those two points keep me repeating an answer I got. First, where
>> >>> >>>> did
>> >>> >>>> you
>> >>> >>>> get 2.1.3 from? Maybe I missed it; I will have a look. But if it
>> >>> >>>> is
>> >>> >>>> 2.1.2,
>> >>> >>>> which is the latest released version, that version has many bugs -
>> >>> >>>> most of
>> >>> >>>> which I got bitten by while testing 2.1.2. I had many problems
>> >>> >>>> with
>> >>> >>>> compactions not being triggered on column families not being
>> >>> >>>> read,
>> >>> >>>> and compactions and repairs not being completed.  See
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
>> >>> >>>>
>> >>> >>>> Apart from that, how are the two datacenters connected? Maybe
>> >>> >>>> there
>> >>> >>>> is a bottleneck.
>> >>> >>>>
>> >>> >>>> Also, do you have NTP up and running on all nodes to keep all
>> >>> >>>> clocks
>> >>> >>>> in
>> >>> >>>> tight sync?
>> >>> >>>>
>> >>> >>>> Note: I'm no expert (yet) - just sharing my 2 cents.
>> >>> >>>>
>> >>> >>>> Cheers,
>> >>> >>>> Roland
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>> --
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >
>> >>
>> >>
>> >
>
>
