Hi Chris,

The compaction process finally finished. It took a long time, though.

Thank you very much for all your help.

Please let me know if you have any guidelines for making future compactions
faster.
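
For context, the only tuning knobs I have found so far are the compaction
throughput cap and concurrent_compactors; a rough sketch of what I would try,
assuming Cassandra 2.1 defaults (please correct me if these are wrong):

    # Raise or remove the compaction throughput cap (default 16 MB/s; 0 = unthrottled):
    nodetool setcompactionthroughput 0
    # And in cassandra.yaml, allow more compactions to run in parallel (restart required):
    # concurrent_compactors: 4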

Thanks & Regards,
Atul Atri.

On 5 July 2018 at 22:05, atul atri <atulatri2...@gmail.com> wrote:

> Hi Chris,
>
> Thank you for your reply.
>
> I have already tried running "nodetool stop COMPACTION", but it does not
> help. I have restarted each node in the cluster one by one, and compaction
> starts again. It gets stuck on the same table.
>
> Following is the 'nodetool compactionstats' output. It has been stuck at
> 1336035468 for at least 35 hours.
>
>
> pending tasks: 1
>           compaction type                keyspace              table    completed         total   unit  progress
>                Compaction  notification_system_v1  user_notification   1336035468    1660997721  bytes    80.44%
> Active compaction remaining time :   0h00m38s
>
>
> Following is the output of "nodetool cfstats".
>
> Table: user_notification
>         SSTable count: 18
>         Space used (live), bytes: 17247516201
>         Space used (total), bytes: 17316488652
>         SSTable Compression Ratio: 0.41922805938461566
>         Number of keys (estimate): 32556160
>         Memtable cell count: 44717
>         Memtable data size, bytes: 27705294
>         Memtable switch count: 5
>         Local read count: 0
>         Local read latency: 0.000 ms
>         Local write count: 236961
>         Local write latency: 0.047 ms
>         Pending tasks: 0
>         Bloom filter false positives: 0
>         Bloom filter false ratio: 0.00000
>         Bloom filter space used, bytes: 72414688
>         Compacted partition minimum bytes: 104
>         Compacted partition maximum bytes: 4966933177
>         Compacted partition mean bytes: 1183
>         Average live cells per slice (last five minutes): 0.0
>         Average tombstones per slice (last five minutes): 0.0
>
> Please let me know if you need any more information. I am really thankful
> to you for spending time on this investigation.
>
> Thanks & Regards,
> Atul Atri.
>
>
> On 5 July 2018 at 20:54, Chris Lohfink <clohf...@apple.com> wrote:
>
>> That looks a bit to me like it isn't stuck, just a long-running
>> compaction. Can you include the output of `nodetool compactionstats` and
>> the `nodetool cfstats` with the schema for the table that's being
>> compacted (redact names if necessary)?
>>
>> You can stop compaction with `nodetool stop COMPACTION` or by restarting
>> the node.
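>>
>> A sketch of both options (the restart command assumes a service-managed
>> install; adjust for your init system):
>>
>>     # Stop all currently running compactions on this node:
>>     nodetool stop COMPACTION
>>     # Or bounce the node entirely:
>>     sudo service cassandra restart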
>>
>> Chris
>>
>> On Jul 5, 2018, at 12:08 AM, atul atri <atulatri2...@gmail.com> wrote:
>>
>> Hi,
>>
>> We noticed that the compaction process is also hanging on a node in the
>> backup ring. Please find attached thread dumps for both servers. Recently,
>> we made a few changes to the cluster topology:
>>
>> a. Added a new server in the backup data-center and decommissioned the old
>> server. The backup ring has only 2 servers.
>> b. Added a new node in the primary data-center. It now has 4 nodes.
>>
>> Is there a way we can stop this compaction? We have added a new node to
>> this cluster and are waiting to run cleanup on the node on which
>> compaction is hanging. I am afraid that cleanup will not start until the
>> compaction job finishes.
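>>
>> (For reference, the pending step is `nodetool cleanup`, which removes
>> data a node no longer owns after a topology change.)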
>>
>> Attachments:
>> 1. cass-logg02.prod2.thread_dump.out: thread dump from the old node in the
>> primary datacenter
>> 2. cass-logg03.prod1.thread_dump.out: thread dump from the new node in the
>> backup datacenter. This node was added recently.
>>
>> Your help is much appreciated.
>>
>> Thanks & Regards,
>> Atul Atri.
>>
>>
>> On 4 July 2018 at 21:15, atul atri <atulatri2...@gmail.com> wrote:
>>
>>> Hi Chris,
>>> Thanks for the reply.
>>>
>>> Unfortunately, our servers do not have jstack installed.
>>> I tried the "kill -3 <PID>" option, but that is not generating a thread
>>> dump either.
>>>
>>> Is there any other way I can generate a thread dump?
>>>
>>> Thanks & Regards,
>>> Atul Atri.
>>>
>>> On 4 July 2018 at 20:32, Chris Lohfink <clohf...@apple.com> wrote:
>>>
>>>> Can you take a thread dump (jstack) and share the state of the
>>>> compaction threads? Also check for “Exception” in the logs.
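>>>>
>>>> Something like this should do it (the PID lookup and output path are
>>>> assumptions; adjust as needed):
>>>>
>>>>     # Dump the threads of the Cassandra JVM and pull out the compaction pool:
>>>>     jstack $(pgrep -f CassandraDaemon) > /tmp/cassandra_threads.txt
>>>>     grep -A 10 CompactionExecutor /tmp/cassandra_threads.txt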
>>>>
>>>> Chris
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Jul 4, 2018, at 8:37 AM, atul atri <atulatri2...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On one of our servers, the compaction process is hanging. It's stuck at
>>>> 80%, and has been for the last 3 days. Today we did a cluster restart
>>>> (one host at a time), and again it is stuck at the same 80%. CPU usage is
>>>> 100% and there seems to be no I/O issue. We are seeing the following kind
>>>> of WARNING in system.log:
>>>>
>>>> BatchStatement.java (line 226) Batch of prepared statements for [****,
>>>> *****] is of size 7557, exceeding specified threshold of 5120 by 2437.
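>>>>
>>>> (If I understand it correctly, the 5120-byte threshold comes from
>>>> batch_size_warn_threshold_in_kb in cassandra.yaml, which defaults to 5 KB.)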
>>>>
>>>>
>>>> Other than this, there seems to be no error. I have tried to stop the
>>>> compaction process, but it does not stop. The Cassandra version is 2.1.
>>>>
>>>> Can someone please guide us in solving this issue?
>>>>
>>>> Thanks & Regards,
>>>> Atul Atri.
>>>>
>>>>
>>>
>> <cass-logg02.prod2.thread_dump.out><cass-logg03.prod1.thread_dump.out>
>>
>>
>