You could also try stopping the compaction, but that'll probably take a very long time as well.
Manually stopping each node (one at a time) and removing the sstables from only system.hints may be a better option. You may want to take a snapshot first if you're very concerned about that data.

-- Jeff Jirsa

> On May 16, 2017, at 6:53 PM, varun saluja <saluj...@gmail.com> wrote:
>
> Hi,
>
> Truncatehints has now been running on the nodes for more than 7 hours. Nothing about it is mentioned in the system logs either.
>
> And compactionstats reports an increase in total hint bytes:
>
> pending tasks: 1
>    compaction type   keyspace   table   completed     total          unit    progress
>    Compaction        system     hints   12152557998   869257869352   bytes   1.40%
> Active compaction remaining time : 0h27m14s
>
> Can anything else be checked here? Will manually deleting the system.hints files and restarting the node fix this?
>
> Regards,
> Varun Saluja
>
>> On 16 May 2017 at 23:29, varun saluja <saluj...@gmail.com> wrote:
>> Hi Jeff,
>>
>> I ran nodetool truncatehints on all nodes. It has been running for more than 30 minutes now, and compactionstats reports the same status:
>>
>> pending tasks: 1
>>    compaction type   keyspace   table   completed     total          unit    progress
>>    Compaction        system     hints   11189118129   851658989612   bytes   1.31%
>> Active compaction remaining time : 0h26m43s
>>
>> Does truncatehints take time to complete? I could not see anything related to truncatehints in the system logs.
>>
>> Please let me know if anything else can be checked here.
>>
>> Regards,
>> Varun Saluja
>>
>>> On 16 May 2017 at 20:58, varun saluja <saluj...@gmail.com> wrote:
>>> Thanks a lot, Jeff.
>>>
>>> You have explained it very well here. Our consistency level is LOCAL_QUORUM, so I will truncate the hints and run a repair thereafter.
>>>
>>> I hope this brings the cluster back to a stable state.
>>>
>>> Thanks again.
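[The per-node cleanup Jeff describes can be sketched as a small script. The `system/hints-<table-id>/` directory layout and sstable file names are assumptions based on 2.1 defaults; verify the actual paths on your own install, and run this only against a stopped node. The demo below uses a throwaway directory rather than a real data directory.]

```shell
# Sketch of the per-node system.hints cleanup described above, for a
# *stopped* node. The system/hints-<id> directory layout is an assumption
# based on Cassandra 2.1 defaults; verify it on your own install first.
clean_hints() {
    data_dir="$1"
    snap_dir="$data_dir/hints-snapshot"
    mkdir -p "$snap_dir"
    for d in "$data_dir"/system/hints-*; do
        [ -d "$d" ] || continue
        cp -R "$d" "$snap_dir/"   # keep a copy in case the data is needed
        rm -f "$d"/*              # drop the hints sstables themselves
    done
}

# Demo against a throwaway layout (a real data_dir would be something
# like /var/lib/cassandra/data); the sstable name here is made up.
demo=$(mktemp -d)
mkdir -p "$demo/system/hints-0123abcd"
touch "$demo/system/hints-0123abcd/system-hints-ka-1-Data.db"
clean_hints "$demo"
```

[After restarting the node, a `nodetool repair` re-delivers whatever data the removed hints would have carried; as Jeff notes elsewhere in the thread, that is only safe for writes made at a consistency level stronger than ANY.]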
>>>
>>> Regards,
>>> Varun Saluja
>>>
>>> Sent from my iPhone
>>>
>>> > On 16-May-2017, at 8:42 PM, Jeff Jirsa <jji...@apache.org> wrote:
>>> >
>>> > In Cassandra versions up to 3.0, hints are stored within a table, where the partition key is the host ID of the server for which the hints are stored.
>>> >
>>> > In such a data model, accumulating 800 GB of hints is almost certain to cause very wide rows, which will in turn cause GC pressure when you attempt to read the hints for delivery. This will cause GC pauses, which will cause hints to fail to be delivered, which will cause more hints to be stored. This is bad.
>>> >
>>> > In 3.0, hints were rewritten to work around this design flaw. In 2.1, your most likely corrective course is to run 'nodetool truncatehints' on all servers, followed by 'nodetool repair' to re-deliver the data you lost by truncating the hints.
>>> >
>>> > NOTE: this is ONLY safe if you wrote with a consistency level stronger than CL:ANY. If you wrote this data with CL:ANY, you may lose data if you truncate hints.
>>> >
>>> > - Jeff
>>> >
>>> >> On 2017-05-16 06:50 (-0700), varun saluja <saluj...@gmail.com> wrote:
>>> >> Thanks for the update.
>>> >> I can see a lot of I/O waits, and this is causing GC and mutation drops. But as I mentioned, we do not have a high load right now; the hint replays are creating this much disk I/O.
>>> >> compactionstats shows very high hint bytes, around 780 GB. Is this normal?
>>> >>
>>> >> Just mentioning that we are using flash disks.
>>> >>
>>> >> In that case, if I run truncatehints, will it remove or decrease the hint bytes shown in compactionstats? I can trigger a repair thereafter. Please let me know if you have any recommendation on this.
>>> >>
>>> >> Also, the table we dumped from Kafka, which created all these hints and pending compactions, was itself dropped today.
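[The truncate-then-repair sequence Jeff describes can be sketched as a small wrapper script. The host names are placeholders, and `run` only echoes each command so the ordering is visible without touching a live cluster; on a real cluster you would replace `run` with direct or ssh execution.]

```shell
# Sketch of the corrective sequence described above: truncate hints on every
# node, then repair to re-deliver the lost data. Host names are placeholders;
# 'run' just echoes each command so the order is visible without a cluster.
run() { echo "+ $*"; }

HOSTS="node1 node2 node3"

# 1. Drop the accumulated hints on every node.
for h in $HOSTS; do run nodetool -h "$h" truncatehints; done

# 2. Re-deliver the data those hints would have carried. Only safe if the
#    writes used a consistency level stronger than ANY, per the note above.
for h in $HOSTS; do run nodetool -h "$h" repair; done
```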
>>> >> We will have to redump that table anyway once the cluster is stable.
>>> >>
>>> >> Regards,
>>> >> Varun
>>> >>
>>> >> Sent from my iPhone
>>> >>
>>> >>> On 16-May-2017, at 6:59 PM, Nitan Kainth <ni...@bamlabs.com> wrote:
>>> >>>
>>> >>> Yes, but it means the data has to be replicated using repair.
>>> >>>
>>> >>> Hints are an outcome of unhealthy nodes. Focus on finding out why you have mutation drops: is it the node, I/O, network, etc.? Ideally you shouldn't see hints increasing all the time.
>>> >>>
>>> >>> Sent from my iPhone
>>> >>>
>>> >>>> On May 16, 2017, at 7:58 AM, varun saluja <saluj...@gmail.com> wrote:
>>> >>>>
>>> >>>> Hi Nitan,
>>> >>>>
>>> >>>> Thanks for the response.
>>> >>>>
>>> >>>> Yes, I can see mutation drops and an increasing count in system.hints. Is there a way I can truncate the hints, for example with nodetool truncatehints?
>>> >>>>
>>> >>>> Regards,
>>> >>>> Varun Saluja
>>> >>>>
>>> >>>>> On 16 May 2017 at 17:52, Nitan Kainth <ni...@bamlabs.com> wrote:
>>> >>>>> Do you see mutation drops?
>>> >>>>> Is SELECT count(*) FROM system.hints increasing?
>>> >>>>>
>>> >>>>> Sent from my iPhone
>>> >>>>>
>>> >>>>>> On May 16, 2017, at 5:52 AM, varun saluja <saluj...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Hi Experts,
>>> >>>>>>
>>> >>>>>> We are facing an issue on a production cluster. A compaction on the system.hints table has been running for the last 2 days:
>>> >>>>>>
>>> >>>>>> pending tasks: 1
>>> >>>>>>    compaction type   keyspace   table   completed     total          unit    progress
>>> >>>>>>    Compaction        system     hints   20623021829   877874092407   bytes   2.35%
>>> >>>>>> Active compaction remaining time : 0h27m15s
>>> >>>>>>
>>> >>>>>> The active compaction remaining time shows minutes, but the job runs on as if indefinitely.
>>> >>>>>>
>>> >>>>>> We have a 3-node cluster on version 2.1.7, and we ran a write-intensive job last week on a particular table.
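[One way to watch the mutation drops Nitan asks about is to pull the dropped-message counters out of `nodetool tpstats`. The sample text below is a hand-written approximation of the 2.1 "Message type / Dropped" section, with a made-up count, not output captured from a real node; on a live node you would pipe `nodetool tpstats` in instead of the sample.]

```shell
# Extract the dropped MUTATION counter from `nodetool tpstats`-style output.
# 'sample' is a hand-written approximation of the 2.1 dropped-message table
# (the 51244 figure is invented); on a live node, pipe the real command in:
#   nodetool tpstats | awk '$1 == "MUTATION" { print $2 }'
sample='Message type           Dropped
RANGE_SLICE                  0
READ                         0
MUTATION                 51244
COUNTER_MUTATION             0'

dropped=$(printf '%s\n' "$sample" | awk '$1 == "MUTATION" { print $2 }')
echo "dropped mutations: $dropped"
```

[If that number climbs between runs, the node is still shedding writes, and the hints will keep growing no matter how often they are truncated.]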
>>> >>>>>> Compaction on that table finished, but the hints table size keeps growing continuously.
>>> >>>>>>
>>> >>>>>> Can someone please help me?
>>> >>>>>>
>>> >>>>>> Thanks & Regards,
>>> >>>>>> Varun Saluja
>>> >>>>>>
>>> >>>>
>>> >>
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> > For additional commands, e-mail: user-h...@cassandra.apache.org
>>> >
>>
>
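[For what it's worth, the progress column in the compactionstats output quoted above is simply completed/total, and the "remaining time" is extrapolated from current throughput, which is why it can sit near 27 minutes for days while the total keeps growing. A quick sanity check of the figures from the first message; the 16 MB/s figure assumed below is the 2.1 default compaction_throughput_mb_per_sec, and the estimate ignores ongoing hint writes, so treat it as a floor:]

```shell
# Sanity-check the compactionstats figures quoted above. progress is just
# completed/total; the time estimate assumes the default 16 MB/s compaction
# throughput cap and ignores new hints still arriving, so it is a floor.
completed=20623021829      # bytes, from the compactionstats output above
total=877874092407

awk -v c="$completed" -v t="$total" 'BEGIN {
    printf "progress: %.2f%%\n", 100 * c / t
    printf "hours remaining at 16 MB/s: %.1f\n", (t - c) / (16 * 1024 * 1024) / 3600
}'
```

[The mismatch between a multi-hour remaining estimate and the reported "0h27m" illustrates why the built-in estimate is not a useful signal here.]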