Thanks a lot Ben. Really appreciate your suggestions here.

Regards,
Varun Saluja
Sent from my iPhone

> On 21-May-2017, at 5:40 PM, Ben Slater <ben.sla...@instaclustr.com> wrote:
>
> My main suggestion would be to monitor the compaction backlog (pending
> compactions). If the backlog is growing, you need to either throttle writes,
> add more capacity to your cluster, or possibly tune things. There is no
> simple answer to tuning, but there are several good guides on the internet
> to help; this is my favourite:
> https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
>
> Unless there is something really badly set up with your cluster, I would
> guess that if it got into this state trying to handle your write load, then
> you’ll potentially need additional capacity as well as tuning to meet your
> needs.
>
> Cheers
> Ben
>
>> On Sun, 21 May 2017 at 21:47 varun saluja <saluj...@gmail.com> wrote:
>> Hi All,
>>
>> Can someone please suggest any recommendations for write-intensive jobs?
>>
>> Regards,
>> Varun Saluja
>> Sent from my iPhone
>>
>>> On 17-May-2017, at 3:52 PM, varun saluja <saluj...@gmail.com> wrote:
>>>
>>> Thanks Jeff.
>>>
>>> I took a backup and manually removed the hints with a rolling restart.
>>> This brought the cluster back to a stable state.
>>>
>>> Can you please share some recommendations for a write-intensive job?
>>> We need to load a dump from Kafka into a 3-node Cassandra cluster. Write
>>> TPS per node will be around 7k.
>>>
>>> Can you please suggest any parameter tuning for our use case? We do not
>>> want to get stuck in a similar situation, with large compactions of hints
>>> or of any other table where we are loading the dump.
>>>
>>> Regards,
>>> Varun
>>>
>>>> On 17 May 2017 at 09:17, Jeff Jirsa <jji...@gmail.com> wrote:
>>>> You could also try stopping compaction, but that'll probably take a very
>>>> long time as well.
>>>>
>>>> Manually stopping each node (one at a time) and removing the sstables
>>>> from only system.hints may be a better option.
>>>> May want to take a snapshot if you're very concerned with that data.
>>>>
>>>> --
>>>> Jeff Jirsa
>>>>
>>>>> On May 16, 2017, at 6:53 PM, varun saluja <saluj...@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Truncatehints on the nodes has been running for more than 7 hours now.
>>>>> Nothing is mentioned about it in the system logs either.
>>>>>
>>>>> And compactionstats reports an increase in hints total bytes.
>>>>>
>>>>> pending tasks: 1
>>>>>    compaction type   keyspace   table     completed          total    unit   progress
>>>>>         Compaction     system   hints   12152557998   869257869352   bytes      1.40%
>>>>> Active compaction remaining time :   0h27m14s
>>>>>
>>>>> Can anything else be checked here? Will manually deleting the
>>>>> system.hints files and restarting the node fix this?
>>>>>
>>>>> Regards,
>>>>> Varun Saluja
>>>>>
>>>>>> On 16 May 2017 at 23:29, varun saluja <saluj...@gmail.com> wrote:
>>>>>> Hi Jeff,
>>>>>>
>>>>>> I ran nodetool truncatehints on all nodes. It has been running for more
>>>>>> than 30 minutes now. compactionstats still reports the same status.
>>>>>>
>>>>>> pending tasks: 1
>>>>>>    compaction type   keyspace   table     completed          total    unit   progress
>>>>>>         Compaction     system   hints   11189118129   851658989612   bytes      1.31%
>>>>>> Active compaction remaining time :   0h26m43s
>>>>>>
>>>>>> Will truncatehints take time to complete? I could not see anything
>>>>>> related to truncatehints in the system logs.
>>>>>>
>>>>>> Please let me know if anything else can be checked here.
>>>>>>
>>>>>> Regards,
>>>>>> Varun Saluja
>>>>>>
>>>>>>> On 16 May 2017 at 20:58, varun saluja <saluj...@gmail.com> wrote:
>>>>>>> Thanks a lot Jeff.
>>>>>>>
>>>>>>> You have explained this very well. We use LOCAL_QUORUM consistency.
>>>>>>> I will truncate the hints and run repair thereafter.
>>>>>>>
>>>>>>> I hope this brings the cluster back to a stable state.
>>>>>>>
>>>>>>> Thanks again.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Varun Saluja
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> > On 16-May-2017, at 8:42 PM, Jeff Jirsa <jji...@apache.org> wrote:
>>>>>>> >
>>>>>>> > In Cassandra versions before 3.0, hints are stored within a table,
>>>>>>> > where the partition key is the host ID of the server for which the
>>>>>>> > hints are stored.
>>>>>>> >
>>>>>>> > In such a data model, accumulating 800GB of hints is almost certain
>>>>>>> > to cause very wide rows, which will in turn cause GC pressure when
>>>>>>> > you attempt to read the hints for delivery. This will cause GC
>>>>>>> > pauses, which will cause hints to fail to be delivered, which will
>>>>>>> > cause more hints to be stored. This is bad.
>>>>>>> >
>>>>>>> > In 3.0, hints were rewritten to work around this design flaw. In 2.1,
>>>>>>> > your most likely corrective course is to use 'nodetool truncatehints'
>>>>>>> > on all servers, followed by 'nodetool repair' to deliver the data you
>>>>>>> > lost by truncating the hints.
>>>>>>> >
>>>>>>> > NOTE: this is ONLY safe if you wrote with a consistency level
>>>>>>> > stronger than CL:ANY. If you wrote this data with CL:ANY, you may
>>>>>>> > lose data if you truncate hints.
>>>>>>> >
>>>>>>> > - Jeff
>>>>>>> >
>>>>>>> >> On 2017-05-16 06:50 (-0700), varun saluja <saluj...@gmail.com> wrote:
>>>>>>> >> Thanks for the update.
>>>>>>> >> I can see a lot of I/O wait. This is causing GC and mutation drops.
>>>>>>> >> But as I mentioned, we do not have high load for now. Hint replays
>>>>>>> >> are creating this high disk I/O.
>>>>>>> >> compactionstats shows very high hint bytes, around 780 GB. Is this
>>>>>>> >> normal?
>>>>>>> >>
>>>>>>> >> Just mentioning that we are using flash disks.
>>>>>>> >>
>>>>>>> >> In that case, if I run truncatehints, will it remove or decrease the
>>>>>>> >> size of hints bytes in compactionstats? I can trigger repair
>>>>>>> >> thereafter.
>>>>>>> >> Please let me know if you have any recommendation on this.
>>>>>>> >>
>>>>>>> >> Also, the table we loaded from Kafka, which created this many hints
>>>>>>> >> and pending compactions, was dropped today, because we have to
>>>>>>> >> reload the table once the cluster is stable.
>>>>>>> >>
>>>>>>> >> Regards,
>>>>>>> >> Varun
>>>>>>> >>
>>>>>>> >> Sent from my iPhone
>>>>>>> >>
>>>>>>> >>> On 16-May-2017, at 6:59 PM, Nitan Kainth <ni...@bamlabs.com> wrote:
>>>>>>> >>>
>>>>>>> >>> Yes, but it means data has to be replicated using repair.
>>>>>>> >>>
>>>>>>> >>> Hints are the outcome of unhealthy nodes. Focus on finding why you
>>>>>>> >>> have mutation drops: is it the node, I/O, network, etc.? Ideally
>>>>>>> >>> you shouldn't see hints increasing all the time.
>>>>>>> >>>
>>>>>>> >>> Sent from my iPhone
>>>>>>> >>>
>>>>>>> >>>> On May 16, 2017, at 7:58 AM, varun saluja <saluj...@gmail.com>
>>>>>>> >>>> wrote:
>>>>>>> >>>>
>>>>>>> >>>> Hi Nitan,
>>>>>>> >>>>
>>>>>>> >>>> Thanks for the response.
>>>>>>> >>>>
>>>>>>> >>>> Yes, I can see mutation drops and an increasing count in
>>>>>>> >>>> system.hints. Is there any way I can proceed to truncate the
>>>>>>> >>>> hints, such as using nodetool truncatehints?
>>>>>>> >>>>
>>>>>>> >>>> Regards,
>>>>>>> >>>> Varun Saluja
>>>>>>> >>>>
>>>>>>> >>>>> On 16 May 2017 at 17:52, Nitan Kainth <ni...@bamlabs.com> wrote:
>>>>>>> >>>>> Do you see mutation drops?
>>>>>>> >>>>> Select count from system.hints; is it increasing?
>>>>>>> >>>>>
>>>>>>> >>>>> Sent from my iPhone
>>>>>>> >>>>>
>>>>>>> >>>>>> On May 16, 2017, at 5:52 AM, varun saluja <saluj...@gmail.com>
>>>>>>> >>>>>> wrote:
>>>>>>> >>>>>>
>>>>>>> >>>>>> Hi Experts,
>>>>>>> >>>>>>
>>>>>>> >>>>>> We are facing an issue on our production cluster. Compaction on
>>>>>>> >>>>>> the system.hints table has been running for the last 2 days.
>>>>>>> >>>>>>
>>>>>>> >>>>>> pending tasks: 1
>>>>>>> >>>>>>    compaction type   keyspace   table     completed          total    unit   progress
>>>>>>> >>>>>>         Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>>>>>>> >>>>>> Active compaction remaining time :   0h27m15s
>>>>>>> >>>>>>
>>>>>>> >>>>>> Active compaction remaining time shows minutes, but this job
>>>>>>> >>>>>> seems to be running indefinitely.
>>>>>>> >>>>>>
>>>>>>> >>>>>> We have a 3-node cluster on version 2.1.7, and we ran a
>>>>>>> >>>>>> write-intensive job last week on a particular table.
>>>>>>> >>>>>> Compaction on that table finished, but the hints table size is
>>>>>>> >>>>>> growing continuously.
>>>>>>> >>>>>>
>>>>>>> >>>>>> Can someone please help me?
>>>>>>> >>>>>>
>>>>>>> >>>>>> Thanks & Regards,
>>>>>>> >>>>>> Varun Saluja
>>>>>>> >
>>>>>>> > ---------------------------------------------------------------------
>>>>>>> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>>>>>> > For additional commands, e-mail: user-h...@cassandra.apache.org
>
> --
> Ben Slater
> Chief Product Officer
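
Editor's note: the advice in this thread to monitor the compaction backlog can be partly automated. The following is a minimal Python sketch, not something posted by any participant. It assumes the text layout of the `nodetool compactionstats` output quoted in this thread (single-word columns, one compaction per row); other Cassandra versions may format it differently. The names `CompactionRow`, `parse_compactionstats`, and `backlog_is_growing` are hypothetical and exist only for this illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CompactionRow:
    """One active compaction, as printed by `nodetool compactionstats`."""
    compaction_type: str
    keyspace: str
    table: str
    completed: int
    total: int
    unit: str

    @property
    def progress_pct(self) -> float:
        # Recompute progress from the raw counters instead of trusting
        # the rounded percentage column.
        return 100.0 * self.completed / self.total if self.total else 0.0


# Assumption: every column is a single word, as in the thread's sample.
ROW_RE = re.compile(
    r"^\s*(\w+)\s+(\w+)\s+(\w+)\s+(\d+)\s+(\d+)\s+(\w+)\s+[\d.]+%\s*$"
)
PENDING_RE = re.compile(r"pending tasks:\s*(\d+)")


def parse_compactionstats(text: str) -> Tuple[Optional[int], List[CompactionRow]]:
    """Return (pending task count, active compaction rows) from raw output."""
    pending: Optional[int] = None
    rows: List[CompactionRow] = []
    for line in text.splitlines():
        m = PENDING_RE.search(line)
        if m:
            pending = int(m.group(1))
            continue
        m = ROW_RE.match(line)
        if m:
            ctype, ks, table, done, total, unit = m.groups()
            rows.append(CompactionRow(ctype, ks, table, int(done), int(total), unit))
    return pending, rows


def backlog_is_growing(samples: List[int]) -> bool:
    """True if pending counts never decrease and end higher than they began."""
    if len(samples) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_drops and samples[-1] > samples[0]


# Output copied from the first message in this thread.
SAMPLE = """\
pending tasks: 1
   compaction type   keyspace   table     completed          total    unit   progress
        Compaction     system   hints   20623021829   877874092407   bytes      2.35%
Active compaction remaining time :   0h27m15s
"""

if __name__ == "__main__":
    pending, rows = parse_compactionstats(SAMPLE)
    print("pending tasks:", pending)
    for row in rows:
        print(f"{row.keyspace}.{row.table}: {row.progress_pct:.2f}% of {row.total} {row.unit}")
```

In practice one would capture the command's output at a fixed interval, collect the pending-task counts, and pass them to `backlog_is_growing`. A steadily rising count is the signal described above for throttling writes, adding capacity, or tuning.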