Re: Long running compaction on huge hint table.

2017-05-21 Thread varun saluja
Thanks a lot Ben.

Really appreciate your suggestions here.

Regards,
Varun Saluja

Sent from my iPhone

> On 21-May-2017, at 5:40 PM, Ben Slater  wrote:
> 
> My main suggestion would be to monitor the compaction backlog (pending 
> compactions). If the backlog is growing you need to either throttle writes, 
> add more capacity to your cluster or possibly tune things. There is no simple 
> answer to tuning but several good guides on the internet to help - this is my 
> favourite: https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html. 
> 
> Unless there is something really badly set up with your cluster then I would 
> guess that if it got in this state trying to handle your write load then 
> you’ll potentially need additional capacity as well as tuning to meet your 
> needs.
> 
> Cheers
> Ben
> 
>> On Sun, 21 May 2017 at 21:47 varun saluja  wrote:
>> Hi All,
>>   
>> Can someone Please suggest any recommendations for write intensive jobs
>> 
>> Regards,
>> Varun Saluja
>> Sent from my iPhone
>> 
>>> On 17-May-2017, at 3:52 PM, varun saluja  wrote:
>>> 
>>> Thanks Jeff.
>>> 
>>> I have taken backup and did manual removal of hints with rolling restart. 
>>> This brought cluster back in stable state.
>>> 
>>> Can you Please share some recommendation for write intensive job . Actually 
>>> ,we need to load dump from kafka to 3 node cassandra cluster . Write TPS 
>>> per node will be around 7k.
>>> 
>>> Can you Please suggest any parameter tuning for our use case here. We do 
>>> not want to get stuck in similar situation of large compactions of hint or 
>>> any other table where we are loading dump.
>>> 
>>> 
>>> Regards,
>>> Varun
>>> 
>>>> On 17 May 2017 at 09:17, Jeff Jirsa  wrote:
>>>> You could also try stopping compaction, but that'll probably take a very 
>>>> long time as well
>>>> 
>>>> Manually stopping each node (one at a time) and removing the sstables from 
>>>> only system.hints may be a better option. May want to take a snapshot if 
>>>> you're very concerned with that data.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Jeff Jirsa
>>>> 
>>>> 
> On May 16, 2017, at 6:53 PM, varun saluja  wrote:
> 
> Hi,
> 
>  
>  Truncatehints on nodes is running for more than 7 hours now. Nothing 
> mentioned for same in sysemt logs even.
> 
> And compaction stats reports increase in hints total bytes.
> 
> pending tasks: 1
>    compaction type   keyspace   table     completed          total    unit   progress
>         Compaction     system   hints   12152557998   869257869352   bytes      1.40%
> Active compaction remaining time :   0h27m14s
> 
> Can anything else be checked here? Will manually deleting system.hint 
> files and restart node fix this.
> 
> 
> 
> Regards,
> Varun Saluja
> 
>> On 16 May 2017 at 23:29, varun saluja  wrote:
>> Hi Jeff,
>> 
>> I ran nodetool truncatehints  on all nodes. Its running for more than 30 
>> mins now. Status for compactstats reports same.
>> 
>> pending tasks: 1
>>    compaction type   keyspace   table     completed          total    unit   progress
>>         Compaction     system   hints   11189118129   851658989612   bytes      1.31%
>> Active compaction remaining time :   0h26m43s
>> 
>> Will truncatehints takes time for completion? Could not see anything 
>> related truncatehints in system logs.
>> 
>> Please let me know if anything else can be checked here.
>> 
>> Regards,
>> Varun Saluja 
>> 
>> 
>> 
>>> On 16 May 2017 at 20:58, varun saluja  wrote:
>>> Thanks a lot Jeff.
>>> 
>>> You have explaned very well here. We have consitency as local quorum. 
>>> Will follow truncate hints and repair therafter.
>>> 
>>> I hope this brings cluster in stable state
>>> 
>>> Thanks again.
>>> 
>>> Regards,
>>> Varun Saluja
>>> 
>>> Sent from my iPhone
>>> 
>>> > On 16-May-2017, at 8:42 PM, Jeff Jirsa  wrote:
>>> >
>>> >
>>> > In Cassandra versions up to 3.0, hints are stored within a table, 
>>> > where the partition key is the host ID of the server for which the 
>>> > hints are stored.
>>> >
>>> > In such a data model, accumulating 800GB of hints is almost certain 
>>> > to cause very wide rows, which will in turn cause GC pressure when 
>>> > you attempt to read the hints for delivery. This will cause GC 
>>> > pauses, which will cause hints to fail to be delivered, which will 
>>> > cause more hints to be stored. This is bad.
>>> >
>>> > In 3.0, hints were rewritten to work around this design flaw. In 2.1, 
>>> > your most likely corrective course is to use 'nodetool truncatehints' 

Re: Long running compaction on huge hint table.

2017-05-21 Thread Ben Slater
My main suggestion would be to monitor the compaction backlog (pending
compactions). If the backlog is growing you need to either throttle writes,
add more capacity to your cluster or possibly tune things. There is no
simple answer to tuning but several good guides on the internet to help -
this is my favourite:
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html.

Unless there is something really badly set up with your cluster then I
would guess that if it got in this state trying to handle your write load
then you’ll potentially need additional capacity as well as tuning to meet
your needs.

Cheers
Ben
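
A minimal sketch of the backlog monitoring described above, assuming you capture the text output of `nodetool compactionstats` at regular intervals. The function names and the "growing" rule are illustrative assumptions, not part of Cassandra or nodetool:

```python
import re
from typing import List


def pending_tasks(compactionstats_output: str) -> int:
    """Extract the 'pending tasks' count from compactionstats text output."""
    match = re.search(r"pending tasks:\s*(\d+)", compactionstats_output)
    if match is None:
        raise ValueError("no 'pending tasks' line found")
    return int(match.group(1))


def backlog_is_growing(samples: List[int]) -> bool:
    """True if the samples never decrease and the last exceeds the first."""
    if len(samples) < 2:
        return False
    non_decreasing = all(b >= a for a, b in zip(samples, samples[1:]))
    return non_decreasing and samples[-1] > samples[0]
```

If `backlog_is_growing` stays true across several samples taken during the load, that is the signal to throttle writes, add capacity, or tune.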

On Sun, 21 May 2017 at 21:47 varun saluja  wrote:

> Hi All,
>
> Can someone Please suggest any recommendations for write intensive jobs
>
> Regards,
> Varun Saluja
> Sent from my iPhone
>
> On 17-May-2017, at 3:52 PM, varun saluja  wrote:
>
> Thanks Jeff.
>
> I have taken backup and did manual removal of hints with rolling restart.
> This brought cluster back in stable state.
>
> Can you Please share some recommendation for write intensive job .
> Actually ,we need to load dump from kafka to 3 node cassandra cluster .
> Write TPS per node will be around 7k.
>
> Can you Please suggest any parameter tuning for our use case here. We do
> not want to get stuck in similar situation of large compactions of hint or
> any other table where we are loading dump.
>
>
> Regards,
> Varun
>
> On 17 May 2017 at 09:17, Jeff Jirsa  wrote:
>
>> You could also try stopping compaction, but that'll probably take a very
>> long time as well
>>
>> Manually stopping each node (one at a time) and removing the sstables
>> from only system.hints may be a better option. May want to take a snapshot
>> if you're very concerned with that data.
>>
>>
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On May 16, 2017, at 6:53 PM, varun saluja  wrote:
>>
>> Hi,
>>
>>
>>  Truncatehints on nodes is running for more than 7 hours now. Nothing
>> mentioned for same in sysemt logs even.
>>
>> And compaction stats reports increase in hints total bytes.
>>
>> pending tasks: 1
>>    compaction type   keyspace   table     completed          total    unit   progress
>>         Compaction     system   hints   12152557998   869257869352   bytes      1.40%
>> Active compaction remaining time :   0h27m14s
>>
>> Can anything else be checked here? Will manually deleting system.hint
>> files and restart node fix this.
>>
>>
>>
>> Regards,
>> Varun Saluja
>>
>> On 16 May 2017 at 23:29, varun saluja  wrote:
>>
>>> Hi Jeff,
>>>
>>> I ran nodetool truncatehints  on all nodes. Its running for more than
>>> 30 mins now. Status for compactstats reports same.
>>>
>>> pending tasks: 1
>>>    compaction type   keyspace   table     completed          total    unit   progress
>>>         Compaction     system   hints   11189118129   851658989612   bytes      1.31%
>>> Active compaction remaining time :   0h26m43s
>>>
>>> Will truncatehints takes time for completion? Could not see anything
>>> related truncatehints in system logs.
>>>
>>> Please let me know if anything else can be checked here.
>>>
>>> Regards,
>>> Varun Saluja
>>>
>>>
>>>
>>> On 16 May 2017 at 20:58, varun saluja  wrote:
>>>
>>>> Thanks a lot Jeff.
>>>>
>>>> You have explaned very well here. We have consitency as local quorum.
>>>> Will follow truncate hints and repair therafter.
>>>>
>>>> I hope this brings cluster in stable state
>>>>
>>>> Thanks again.
>>>>
>>>> Regards,
>>>> Varun Saluja
>>>>
>>>> Sent from my iPhone
>>>>
>>>> > On 16-May-2017, at 8:42 PM, Jeff Jirsa  wrote:
>>>> >
>>>> >
>>>> > In Cassandra versions up to 3.0, hints are stored within a table,
>>>> where the partition key is the host ID of the server for which the hints
>>>> are stored.
>>>> >
>>>> > In such a data model, accumulating 800GB of hints is almost certain
>>>> to cause very wide rows, which will in turn cause GC pressure when you
>>>> attempt to read the hints for delivery. This will cause GC pauses, which
>>>> will cause hints to fail to be delivered, which will cause more hints to be
>>>> stored. This is bad.
>>>> >
>>>> > In 3.0, hints were rewritten to work around this design flaw. In 2.1,
>>>> your most likely corrective course is to use 'nodetool truncatehints' on
>>>> all servers, followed by 'nodetool repair' to deliver the data you lost by
>>>> truncating the hints.
>>>> >
>>>> > NOTE: this is ONLY safe if you wrote with a consistency level
>>>> stronger than CL:ANY. If you wrote this data with CL:ANY, you may lose data
>>>> if you truncate hints.
>>>> >
>>>> > - Jeff
>>>> >
>>>> >> On 2017-05-16 06:50 (-0700), varun saluja
>>>> wrote:
>>>> >> Thanks for update.
>>>> >> I could see lot of io waits. This causing  Gc and mutation drops .
>>>> >> But as i mentioned we do not have high load for now. Hint replays
>>>> are creating such

Re: Long running compaction on huge hint table.

2017-05-21 Thread varun saluja
Hi All,
  
Can someone please suggest any recommendations for write-intensive jobs?

Regards,
Varun Saluja
Sent from my iPhone

> On 17-May-2017, at 3:52 PM, varun saluja  wrote:
> 
> Thanks Jeff.
> 
> I have taken backup and did manual removal of hints with rolling restart. 
> This brought cluster back in stable state.
> 
> Can you Please share some recommendation for write intensive job . Actually 
> ,we need to load dump from kafka to 3 node cassandra cluster . Write TPS per 
> node will be around 7k.
> 
> Can you Please suggest any parameter tuning for our use case here. We do not 
> want to get stuck in similar situation of large compactions of hint or any 
> other table where we are loading dump.
> 
> 
> Regards,
> Varun
> 
>> On 17 May 2017 at 09:17, Jeff Jirsa  wrote:
>> You could also try stopping compaction, but that'll probably take a very 
>> long time as well
>> 
>> Manually stopping each node (one at a time) and removing the sstables from 
>> only system.hints may be a better option. May want to take a snapshot if 
>> you're very concerned with that data.
>> 
>> 
>> 
>> 
>> -- 
>> Jeff Jirsa
>> 
>> 
>>> On May 16, 2017, at 6:53 PM, varun saluja  wrote:
>>> 
>>> Hi,
>>> 
>>>  
>>>  Truncatehints on nodes is running for more than 7 hours now. Nothing 
>>> mentioned for same in sysemt logs even.
>>> 
>>> And compaction stats reports increase in hints total bytes.
>>> 
>>> pending tasks: 1
>>>    compaction type   keyspace   table     completed          total    unit   progress
>>>         Compaction     system   hints   12152557998   869257869352   bytes      1.40%
>>> Active compaction remaining time :   0h27m14s
>>> 
>>> Can anything else be checked here? Will manually deleting system.hint files 
>>> and restart node fix this.
>>> 
>>> 
>>> 
>>> Regards,
>>> Varun Saluja
>>> 
>>>> On 16 May 2017 at 23:29, varun saluja  wrote:
>>>> Hi Jeff,
>>>> 
>>>> I ran nodetool truncatehints  on all nodes. Its running for more than 30 
>>>> mins now. Status for compactstats reports same.
>>>> 
>>>> pending tasks: 1
>>>>    compaction type   keyspace   table     completed          total    unit   progress
>>>>         Compaction     system   hints   11189118129   851658989612   bytes      1.31%
>>>> Active compaction remaining time :   0h26m43s
>>>> 
>>>> Will truncatehints takes time for completion? Could not see anything 
>>>> related truncatehints in system logs.
>>>> 
>>>> Please let me know if anything else can be checked here.
>>>> 
>>>> Regards,
>>>> Varun Saluja 
>>>> 
>>>> 
>>>> 
> On 16 May 2017 at 20:58, varun saluja  wrote:
> Thanks a lot Jeff.
> 
> You have explaned very well here. We have consitency as local quorum. 
> Will follow truncate hints and repair therafter.
> 
> I hope this brings cluster in stable state
> 
> Thanks again.
> 
> Regards,
> Varun Saluja
> 
> Sent from my iPhone
> 
> > On 16-May-2017, at 8:42 PM, Jeff Jirsa  wrote:
> >
> >
> > In Cassandra versions up to 3.0, hints are stored within a table, where 
> > the partition key is the host ID of the server for which the hints are 
> > stored.
> >
> > In such a data model, accumulating 800GB of hints is almost certain to 
> > cause very wide rows, which will in turn cause GC pressure when you 
> > attempt to read the hints for delivery. This will cause GC pauses, 
> > which will cause hints to fail to be delivered, which will cause more 
> > hints to be stored. This is bad.
> >
> > In 3.0, hints were rewritten to work around this design flaw. In 2.1, 
> > your most likely corrective course is to use 'nodetool truncatehints' 
> > on all servers, followed by 'nodetool repair' to deliver the data you 
> > lost by truncating the hints.
> >
> > NOTE: this is ONLY safe if you wrote with a consistency level stronger 
> > than CL:ANY. If you wrote this data with CL:ANY, you may lose data if 
> > you truncate hints.
> >
> > - Jeff
> >
> >> On 2017-05-16 06:50 (-0700), varun saluja  wrote:
> >> Thanks for update.
> >> I could see lot of io waits. This causing  Gc and mutation drops .
> >> But as i mentioned we do not have high load for now. Hint replays are 
> >> creating such high disk I/O.
> >> compactionstats show very high hint bytes like 780gb around. Is this 
> >> normal?
> >>
> >> Just mentioning we are using flash disks.
> >>
> >> In such case, if i run truncatehints , will it remove or decrease size 
> >> of hints bytes in compaction stats. I can trigger repair therafter.
> >> Please let me know if any recommendation on same.
> >>
> >> Also , table which we dumped from kafka which created this much hints 
> >> and compaction pendings is also dropped today. 

Re: Long running compaction on huge hint table.

2017-05-17 Thread varun saluja
Thanks Jeff.

I have taken a backup and manually removed the hints with a rolling restart.
This brought the cluster back to a stable state.

Can you please share some recommendations for a write-intensive job? We
need to load a dump from Kafka into a 3-node Cassandra cluster. Write TPS
per node will be around 7k.

Can you please suggest any parameter tuning for our use case? We do not
want to get stuck in a similar situation, with large compactions of hints or
of any other table we are loading the dump into.


Regards,
Varun

On 17 May 2017 at 09:17, Jeff Jirsa  wrote:

> You could also try stopping compaction, but that'll probably take a very
> long time as well
>
> Manually stopping each node (one at a time) and removing the sstables from
> only system.hints may be a better option. May want to take a snapshot if
> you're very concerned with that data.
>
>
>
>
> --
> Jeff Jirsa
>
>
> On May 16, 2017, at 6:53 PM, varun saluja  wrote:
>
> Hi,
>
>
>  Truncatehints on nodes is running for more than 7 hours now. Nothing
> mentioned for same in sysemt logs even.
>
> And compaction stats reports increase in hints total bytes.
>
> pending tasks: 1
>    compaction type   keyspace   table     completed          total    unit   progress
>         Compaction     system   hints   12152557998   869257869352   bytes      1.40%
> Active compaction remaining time :   0h27m14s
>
> Can anything else be checked here? Will manually deleting system.hint
> files and restart node fix this.
>
>
>
> Regards,
> Varun Saluja
>
> On 16 May 2017 at 23:29, varun saluja  wrote:
>
>> Hi Jeff,
>>
>> I ran nodetool truncatehints  on all nodes. Its running for more than 30
>> mins now. Status for compactstats reports same.
>>
>> pending tasks: 1
>>    compaction type   keyspace   table     completed          total    unit   progress
>>         Compaction     system   hints   11189118129   851658989612   bytes      1.31%
>> Active compaction remaining time :   0h26m43s
>>
>> Will truncatehints takes time for completion? Could not see anything
>> related truncatehints in system logs.
>>
>> Please let me know if anything else can be checked here.
>>
>> Regards,
>> Varun Saluja
>>
>>
>>
>> On 16 May 2017 at 20:58, varun saluja  wrote:
>>
>>> Thanks a lot Jeff.
>>>
>>> You have explaned very well here. We have consitency as local quorum.
>>> Will follow truncate hints and repair therafter.
>>>
>>> I hope this brings cluster in stable state
>>>
>>> Thanks again.
>>>
>>> Regards,
>>> Varun Saluja
>>>
>>> Sent from my iPhone
>>>
>>> > On 16-May-2017, at 8:42 PM, Jeff Jirsa  wrote:
>>> >
>>> >
>>> > In Cassandra versions up to 3.0, hints are stored within a table,
>>> where the partition key is the host ID of the server for which the hints
>>> are stored.
>>> >
>>> > In such a data model, accumulating 800GB of hints is almost certain to
>>> cause very wide rows, which will in turn cause GC pressure when you attempt
>>> to read the hints for delivery. This will cause GC pauses, which will cause
>>> hints to fail to be delivered, which will cause more hints to be stored.
>>> This is bad.
>>> >
>>> > In 3.0, hints were rewritten to work around this design flaw. In 2.1,
>>> your most likely corrective course is to use 'nodetool truncatehints' on
>>> all servers, followed by 'nodetool repair' to deliver the data you lost by
>>> truncating the hints.
>>> >
>>> > NOTE: this is ONLY safe if you wrote with a consistency level stronger
>>> than CL:ANY. If you wrote this data with CL:ANY, you may lose data if you
>>> truncate hints.
>>> >
>>> > - Jeff
>>> >
>>> >> On 2017-05-16 06:50 (-0700), varun saluja  wrote:
>>> >> Thanks for update.
>>> >> I could see lot of io waits. This causing  Gc and mutation drops .
>>> >> But as i mentioned we do not have high load for now. Hint replays are
>>> creating such high disk I/O.
>>> >> compactionstats show very high hint bytes like 780gb around. Is this
>>> normal?
>>> >>
>>> >> Just mentioning we are using flash disks.
>>> >>
>>> >> In such case, if i run truncatehints , will it remove or decrease
>>> size of hints bytes in compaction stats. I can trigger repair therafter.
>>> >> Please let me know if any recommendation on same.
>>> >>
>>> >> Also , table which we dumped from kafka which created this much hints
>>> and compaction pendings is also dropped today. Because we have to redump
>>> table again once cluster is stable.
>>> >>
>>> >> Regards,
>>> >> Varun
>>> >>
>>> >> Sent from my iPhone
>>> >>
>>> >>> On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
>>> >>>
>>> >>> Yes but it means data has to be replicated using repair.
>>> >>>
>>> >>> Hints are out come of unhealthy nodes, focus on finding why you have
>>> mutation drops, is it node, io or network etc. ideally you shouldn't see
>>> increasing hints all the time.
>>> >>>
>>> >>> Sent from my iPhone
>>> >>>
>>>  On May 16, 2017, 

Re: Long running compaction on huge hint table.

2017-05-16 Thread Jeff Jirsa
You could also try stopping compaction, but that'll probably take a very long 
time as well.

Manually stopping each node (one at a time) and removing the sstables from only 
system.hints may be a better option. May want to take a snapshot if you're very 
concerned with that data.




-- 
Jeff Jirsa
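
A cautious sketch for the manual-removal route, assuming a data layout of `<data_dir>/system/hints*/` (an assumption; check the `data_file_directories` setting in your own cassandra.yaml). It only lists candidate files on a stopped node so they can be reviewed or backed up first; it deletes nothing, and the function name is hypothetical:

```python
from pathlib import Path
from typing import List


def list_hint_sstables(data_dir: str) -> List[Path]:
    """List files directly under <data_dir>/system/hints*/ (dry run only).

    Subdirectories such as snapshots are skipped, and no file is removed.
    """
    system_dir = Path(data_dir) / "system"
    files: List[Path] = []
    for table_dir in sorted(system_dir.glob("hints*")):
        if table_dir.is_dir():
            files.extend(sorted(p for p in table_dir.iterdir() if p.is_file()))
    return files
```

Run it one node at a time, after that node has been stopped, and compare the listing against what you expect before removing anything by hand.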


> On May 16, 2017, at 6:53 PM, varun saluja  wrote:
> 
> Hi,
> 
>  
>  Truncatehints on nodes is running for more than 7 hours now. Nothing 
> mentioned for same in sysemt logs even.
> 
> And compaction stats reports increase in hints total bytes.
> 
> pending tasks: 1
>    compaction type   keyspace   table     completed          total    unit   progress
>         Compaction     system   hints   12152557998   869257869352   bytes      1.40%
> Active compaction remaining time :   0h27m14s
> 
> Can anything else be checked here? Will manually deleting system.hint files 
> and restart node fix this.
> 
> 
> 
> Regards,
> Varun Saluja
> 
>> On 16 May 2017 at 23:29, varun saluja  wrote:
>> Hi Jeff,
>> 
>> I ran nodetool truncatehints  on all nodes. Its running for more than 30 
>> mins now. Status for compactstats reports same.
>> 
>> pending tasks: 1
>>    compaction type   keyspace   table     completed          total    unit   progress
>>         Compaction     system   hints   11189118129   851658989612   bytes      1.31%
>> Active compaction remaining time :   0h26m43s
>> 
>> Will truncatehints takes time for completion? Could not see anything related 
>> truncatehints in system logs.
>> 
>> Please let me know if anything else can be checked here.
>> 
>> Regards,
>> Varun Saluja 
>> 
>> 
>> 
>>> On 16 May 2017 at 20:58, varun saluja  wrote:
>>> Thanks a lot Jeff.
>>> 
>>> You have explaned very well here. We have consitency as local quorum. Will 
>>> follow truncate hints and repair therafter.
>>> 
>>> I hope this brings cluster in stable state
>>> 
>>> Thanks again.
>>> 
>>> Regards,
>>> Varun Saluja
>>> 
>>> Sent from my iPhone
>>> 
>>> > On 16-May-2017, at 8:42 PM, Jeff Jirsa  wrote:
>>> >
>>> >
>>> > In Cassandra versions up to 3.0, hints are stored within a table, where 
>>> > the partition key is the host ID of the server for which the hints are 
>>> > stored.
>>> >
>>> > In such a data model, accumulating 800GB of hints is almost certain to 
>>> > cause very wide rows, which will in turn cause GC pressure when you 
>>> > attempt to read the hints for delivery. This will cause GC pauses, which 
>>> > will cause hints to fail to be delivered, which will cause more hints to 
>>> > be stored. This is bad.
>>> >
>>> > In 3.0, hints were rewritten to work around this design flaw. In 2.1, 
>>> > your most likely corrective course is to use 'nodetool truncatehints' on 
>>> > all servers, followed by 'nodetool repair' to deliver the data you lost 
>>> > by truncating the hints.
>>> >
>>> > NOTE: this is ONLY safe if you wrote with a consistency level stronger 
>>> > than CL:ANY. If you wrote this data with CL:ANY, you may lose data if you 
>>> > truncate hints.
>>> >
>>> > - Jeff
>>> >
>>> >> On 2017-05-16 06:50 (-0700), varun saluja  wrote:
>>> >> Thanks for update.
>>> >> I could see lot of io waits. This causing  Gc and mutation drops .
>>> >> But as i mentioned we do not have high load for now. Hint replays are 
>>> >> creating such high disk I/O.
>>> >> compactionstats show very high hint bytes like 780gb around. Is this 
>>> >> normal?
>>> >>
>>> >> Just mentioning we are using flash disks.
>>> >>
>>> >> In such case, if i run truncatehints , will it remove or decrease size 
>>> >> of hints bytes in compaction stats. I can trigger repair therafter.
>>> >> Please let me know if any recommendation on same.
>>> >>
>>> >> Also , table which we dumped from kafka which created this much hints 
>>> >> and compaction pendings is also dropped today. Because we have to redump 
>>> >> table again once cluster is stable.
>>> >>
>>> >> Regards,
>>> >> Varun
>>> >>
>>> >> Sent from my iPhone
>>> >>
>>> >>> On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
>>> >>>
>>> >>> Yes but it means data has to be replicated using repair.
>>> >>>
>>> >>> Hints are out come of unhealthy nodes, focus on finding why you have 
>>> >>> mutation drops, is it node, io or network etc. ideally you shouldn't 
>>> >>> see increasing hints all the time.
>>> >>>
>>> >>> Sent from my iPhone
>>> >>>
>>>  On May 16, 2017, at 7:58 AM, varun saluja  wrote:
>>> 
>>>  Hi Nitan,
>>> 
>>>  Thanks for response.
>>> 
>>>  Yes, I could see mutation drops and increase count in system.hints. Is 
>>>  there any way , i can proceed to truncate hints like using nodetool 
>>>  truncatehints.
>>> 
>>> 
>>>  Regards,
>>>  Varun Saluja
>>> 
>>> > On 16 May 2017 at 17:52, Nitan Kainth  wrote:
>>> > Do you see mutation drops?
>>> > 

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi,


truncatehints has been running on the nodes for more than 7 hours now.
Nothing about it is mentioned in the system logs either.

And compactionstats reports an increase in the hints total bytes.

pending tasks: 1
   compaction type   keyspace   table     completed          total    unit   progress
        Compaction     system   hints   12152557998   869257869352   bytes      1.40%
Active compaction remaining time :   0h27m14s

Can anything else be checked here? Will manually deleting the system.hints
files and restarting the node fix this?



Regards,
Varun Saluja
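
A small worked check of the two compactionstats samples reported in this thread (after about 30 minutes and after about 7 hours). It assumes both samples describe the same compaction task, which the thread does not confirm; the variable names are illustrative:

```python
def progress_percent(completed: int, total: int) -> float:
    """Progress as reported by compactionstats: completed / total, in percent."""
    return 100.0 * completed / total


# (completed bytes, total bytes) as quoted in the thread
after_30_min = (11189118129, 851658989612)
after_7_hours = (12152557998, 869257869352)

completed_delta = after_7_hours[0] - after_30_min[0]
total_delta = after_7_hours[1] - after_30_min[1]

print(f"progress after 30 min:  {progress_percent(*after_30_min):.2f}%")
print(f"progress after 7 hours: {progress_percent(*after_7_hours):.2f}%")
print(f"completed grew by {completed_delta} bytes, total grew by {total_delta} bytes")

# If the total grows faster than the completed bytes, the compaction
# cannot catch up at the observed rates.
total_outpaces_completed = total_delta > completed_delta
```

Under that assumption, the total grew by roughly 17.6 GB while only about 0.96 GB was compacted, which is consistent with a "remaining time" estimate that never counts down.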

On 16 May 2017 at 23:29, varun saluja  wrote:

> Hi Jeff,
>
> I ran nodetool truncatehints  on all nodes. Its running for more than 30
> mins now. Status for compactstats reports same.
>
> pending tasks: 1
>    compaction type   keyspace   table     completed          total    unit   progress
>         Compaction     system   hints   11189118129   851658989612   bytes      1.31%
> Active compaction remaining time :   0h26m43s
>
> Will truncatehints takes time for completion? Could not see anything
> related truncatehints in system logs.
>
> Please let me know if anything else can be checked here.
>
> Regards,
> Varun Saluja
>
>
>
> On 16 May 2017 at 20:58, varun saluja  wrote:
>
>> Thanks a lot Jeff.
>>
>> You have explaned very well here. We have consitency as local quorum.
>> Will follow truncate hints and repair therafter.
>>
>> I hope this brings cluster in stable state
>>
>> Thanks again.
>>
>> Regards,
>> Varun Saluja
>>
>> Sent from my iPhone
>>
>> > On 16-May-2017, at 8:42 PM, Jeff Jirsa  wrote:
>> >
>> >
>> > In Cassandra versions up to 3.0, hints are stored within a table, where
>> the partition key is the host ID of the server for which the hints are
>> stored.
>> >
>> > In such a data model, accumulating 800GB of hints is almost certain to
>> cause very wide rows, which will in turn cause GC pressure when you attempt
>> to read the hints for delivery. This will cause GC pauses, which will cause
>> hints to fail to be delivered, which will cause more hints to be stored.
>> This is bad.
>> >
>> > In 3.0, hints were rewritten to work around this design flaw. In 2.1,
>> your most likely corrective course is to use 'nodetool truncatehints' on
>> all servers, followed by 'nodetool repair' to deliver the data you lost by
>> truncating the hints.
>> >
>> > NOTE: this is ONLY safe if you wrote with a consistency level stronger
>> than CL:ANY. If you wrote this data with CL:ANY, you may lose data if you
>> truncate hints.
>> >
>> > - Jeff
>> >
>> >> On 2017-05-16 06:50 (-0700), varun saluja  wrote:
>> >> Thanks for update.
>> >> I could see lot of io waits. This causing  Gc and mutation drops .
>> >> But as i mentioned we do not have high load for now. Hint replays are
>> creating such high disk I/O.
>> >> compactionstats show very high hint bytes like 780gb around. Is this
>> normal?
>> >>
>> >> Just mentioning we are using flash disks.
>> >>
>> >> In such case, if i run truncatehints , will it remove or decrease size
>> of hints bytes in compaction stats. I can trigger repair therafter.
>> >> Please let me know if any recommendation on same.
>> >>
>> >> Also , table which we dumped from kafka which created this much hints
>> and compaction pendings is also dropped today. Because we have to redump
>> table again once cluster is stable.
>> >>
>> >> Regards,
>> >> Varun
>> >>
>> >> Sent from my iPhone
>> >>
>> >>> On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
>> >>>
>> >>> Yes but it means data has to be replicated using repair.
>> >>>
>> >>> Hints are out come of unhealthy nodes, focus on finding why you have
>> mutation drops, is it node, io or network etc. ideally you shouldn't see
>> increasing hints all the time.
>> >>>
>> >>> Sent from my iPhone
>> >>>
>>  On May 16, 2017, at 7:58 AM, varun saluja 
>> wrote:
>> 
>>  Hi Nitan,
>> 
>>  Thanks for response.
>> 
>>  Yes, I could see mutation drops and increase count in system.hints.
>> Is there any way , i can proceed to truncate hints like using nodetool
>> truncatehints.
>> 
>> 
>>  Regards,
>>  Varun Saluja
>> 
>> > On 16 May 2017 at 17:52, Nitan Kainth  wrote:
>> > Do you see mutation drops?
>> > SELECT count(*) FROM system.hints; is it increasing?
>> >
>> > Sent from my iPhone
>> >
>> >> On May 16, 2017, at 5:52 AM, varun saluja 
>> wrote:
>> >>
>> >> Hi Experts,
>> >>
>> >> We are facing issue on production cluster. Compaction on
>> system.hint table is running from last 2 days.
>> >>
>> >>
>> >> pending tasks: 1
>> >>    compaction type   keyspace   table     completed          total    unit   progress
>> >>         Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>> >> Active compaction 

Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi Jeff,

I ran nodetool truncatehints on all nodes. It has been running for more than 30
minutes now. compactionstats still reports the same status.

pending tasks: 1
   compaction type   keyspace   table     completed          total    unit   progress
        Compaction     system   hints   11189118129   851658989612   bytes      1.31%
Active compaction remaining time :   0h26m43s

Will truncatehints take time to complete? I could not see anything
related to truncatehints in the system logs.

Please let me know if anything else can be checked here.

Regards,
Varun Saluja



On 16 May 2017 at 20:58, varun saluja  wrote:

> Thanks a lot Jeff.
>
> You have explaned very well here. We have consitency as local quorum. Will
> follow truncate hints and repair therafter.
>
> I hope this brings cluster in stable state
>
> Thanks again.
>
> Regards,
> Varun Saluja
>
> Sent from my iPhone
>
> > On 16-May-2017, at 8:42 PM, Jeff Jirsa  wrote:
> >
> >
> > In Cassandra versions up to 3.0, hints are stored within a table, where
> the partition key is the host ID of the server for which the hints are
> stored.
> >
> > In such a data model, accumulating 800GB of hints is almost certain to
> cause very wide rows, which will in turn cause GC pressure when you attempt
> to read the hints for delivery. This will cause GC pauses, which will cause
> hints to fail to be delivered, which will cause more hints to be stored.
> This is bad.
> >
> > In 3.0, hints were rewritten to work around this design flaw. In 2.1,
> your most likely corrective course is to use 'nodetool truncatehints' on
> all servers, followed by 'nodetool repair' to deliver the data you lost by
> truncating the hints.
> >
> > NOTE: this is ONLY safe if you wrote with a consistency level stronger
> than CL:ANY. If you wrote this data with CL:ANY, you may lose data if you
> truncate hints.
> >
> > - Jeff
> >
> >> On 2017-05-16 06:50 (-0700), varun saluja  wrote:
> >> Thanks for update.
> >> I could see lot of io waits. This causing  Gc and mutation drops .
> >> But as i mentioned we do not have high load for now. Hint replays are
> creating such high disk I/O.
> >> compactionstats show very high hint bytes like 780gb around. Is this
> normal?
> >>
> >> Just mentioning we are using flash disks.
> >>
> >> In such case, if i run truncatehints , will it remove or decrease size
> of hints bytes in compaction stats. I can trigger repair therafter.
> >> Please let me know if any recommendation on same.
> >>
> >> Also , table which we dumped from kafka which created this much hints
> and compaction pendings is also dropped today. Because we have to redump
> table again once cluster is stable.
> >>
> >> Regards,
> >> Varun
> >>
> >> Sent from my iPhone
> >>
> >>> On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
> >>>
> >>> Yes but it means data has to be replicated using repair.
> >>>
> >>> Hints are out come of unhealthy nodes, focus on finding why you have
> mutation drops, is it node, io or network etc. ideally you shouldn't see
> increasing hints all the time.
> >>>
> >>> Sent from my iPhone
> >>>
>  On May 16, 2017, at 7:58 AM, varun saluja  wrote:
> 
>  Hi Nitan,
> 
>  Thanks for response.
> 
>  Yes, I could see mutation drops and increase count in system.hints.
> Is there any way , i can proceed to truncate hints like using nodetool
> truncatehints.
> 
> 
>  Regards,
>  Varun Saluja
> 
> > On 16 May 2017 at 17:52, Nitan Kainth  wrote:
> > Do you see mutation drops?
> > SELECT count(*) FROM system.hints; is it increasing?
> >
> > Sent from my iPhone
> >
> >> On May 16, 2017, at 5:52 AM, varun saluja 
> wrote:
> >>
> >> Hi Experts,
> >>
> >> We are facing issue on production cluster. Compaction on
> system.hint table is running from last 2 days.
> >>
> >>
> >> pending tasks: 1
> >>    compaction type   keyspace   table     completed          total    unit   progress
> >>         Compaction     system   hints   20623021829   877874092407   bytes      2.35%
> >> Active compaction remaining time :   0h27m15s
> >>
> >>
> >> Active compaction remaining time shows in minutes.  But, this is
> job is running like indefinitely.
> >>
> >> We have 3 node cluster V 2.1.7. And we ran  write intensive job
> last week on particular table.
> >> Compaction on this table finished but hint table size is growing
> continuously.
> >>
> >> Can someone Please help me.
> >>
> >>
> >> Thanks & Regards,
> >> Varun Saluja
> >>
> 
> >>
> >
> > -
> > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: user-h...@cassandra.apache.org
> >
>


Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Thanks a lot Jeff.

You have explained this very well. We write with consistency level LOCAL_QUORUM, 
so we will truncate the hints and run a repair thereafter.

I hope this brings the cluster back to a stable state.

Thanks again.

Regards,
Varun Saluja

Sent from my iPhone

> On 16-May-2017, at 8:42 PM, Jeff Jirsa  wrote:
> 
> 
> In Cassandra versions up to 3.0, hints are stored within a table, where the 
> partition key is the host ID of the server for which the hints are stored.
> 
> In such a data model, accumulating 800GB of hints is almost certain to cause 
> very wide rows, which will in turn cause GC pressure when you attempt to read 
> the hints for delivery. This will cause GC pauses, which will cause hints to 
> fail to be delivered, which will cause more hints to be stored. This is bad.
> 
> In 3.0, hints were rewritten to work around this design flaw. In 2.1, your 
> most likely corrective course is to use 'nodetool truncatehints' on all 
> servers, followed by 'nodetool repair' to deliver the data you lost by 
> truncating the hints.
> 
> NOTE: this is ONLY safe if you wrote with a consistency level stronger than 
> CL:ANY. If you wrote this data with CL:ANY, you may lose data if you truncate 
> hints.
> 
> - Jeff
> 
>> On 2017-05-16 06:50 (-0700), varun saluja  wrote: 
>> Thanks for update.
>> I could see lot of io waits. This causing  Gc and mutation drops .
>> But as i mentioned we do not have high load for now. Hint replays are 
>> creating such high disk I/O.
>> compactionstats show very high hint bytes like 780gb around. Is this normal?
>> 
>> Just mentioning we are using flash disks.
>> 
>> In such case, if i run truncatehints , will it remove or decrease size of 
>> hints bytes in compaction stats. I can trigger repair therafter.
>> Please let me know if any recommendation on same.
>> 
>> Also , table which we dumped from kafka which created this much hints and 
>> compaction pendings is also dropped today. Because we have to redump table 
>> again once cluster is stable.
>> 
>> Regards,
>> Varun
>> 
>> Sent from my iPhone
>> 
>>> On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
>>> 
>>> Yes but it means data has to be replicated using repair.
>>> 
>>> Hints are out come of unhealthy nodes, focus on finding why you have 
>>> mutation drops, is it node, io or network etc. ideally you shouldn't see 
>>> increasing hints all the time.
>>> 
>>> Sent from my iPhone
>>> 
 On May 16, 2017, at 7:58 AM, varun saluja  wrote:
 
 Hi Nitan,
 
 Thanks for response.
 
 Yes, I could see mutation drops and increase count in system.hints. Is 
 there any way , i can proceed to truncate hints like using nodetool 
 truncatehints.
 
 
 Regards,
 Varun Saluja
 
> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
> Do you see mutation drops?
> Select count from system.hints; is it increasing?
> 
> Sent from my iPhone
> 
>> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
>> 
>> Hi Experts,
>> 
>> We are facing issue on production cluster. Compaction on system.hint 
>> table is running from last 2 days.
>> 
>> 
>> pending tasks: 1
>> compaction type   keyspace   table     completed          total    unit   progress
>>      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>> Active compaction remaining time :   0h27m15s
>> 
>> 
>> Active compaction remaining time shows in minutes.  But, this is job is 
>> running like indefinitely.
>> 
>> We have 3 node cluster V 2.1.7. And we ran  write intensive job last 
>> week on particular table.
>> Compaction on this table finished but hint table size is growing 
>> continuously.
>> 
>> Can someone Please help me.
>> 
>> 
>> Thanks & Regards,
>> Varun Saluja
>> 
 
>> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 




Re: Long running compaction on huge hint table.

2017-05-16 Thread Jeff Jirsa

In Cassandra versions before 3.0, hints are stored within a table, where the 
partition key is the host ID of the server for which the hints are stored.

In such a data model, accumulating 800GB of hints is almost certain to cause 
very wide rows, which will in turn cause GC pressure when you attempt to read 
the hints for delivery. This will cause GC pauses, which will cause hints to 
fail to be delivered, which will cause more hints to be stored. This is bad.

In 3.0, hints were rewritten to work around this design flaw. In 2.1, your most 
likely corrective course is to use 'nodetool truncatehints' on all servers, 
followed by 'nodetool repair' to deliver the data you lost by truncating the 
hints.

NOTE: this is ONLY safe if you wrote with a consistency level stronger than 
CL:ANY. If you wrote this data with CL:ANY, you may lose data if you truncate 
hints.
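
A minimal sketch of the sequence above, written as a dry run in Python. The host names and the function name hint_cleanup_plan are made up for illustration; only the nodetool subcommands truncatehints and repair (and the -h host option) come from nodetool itself. It refuses to build a plan when the writes used CL:ANY, per the note above:

```python
from typing import List


def hint_cleanup_plan(hosts: List[str], write_cl: str) -> List[List[str]]:
    """Return the nodetool commands, in order, for truncate-then-repair."""
    if write_cl.upper() == "ANY":
        # With CL:ANY a hint may be the only copy of a write,
        # so truncating hints could lose data.
        raise ValueError("writes used CL:ANY; truncating hints may lose data")
    # Step 1: drop the stored hints on every node.
    plan = [["nodetool", "-h", host, "truncatehints"] for host in hosts]
    # Step 2: repair every node so replicas receive the writes
    # that the discarded hints were carrying.
    plan += [["nodetool", "-h", host, "repair"] for host in hosts]
    return plan


if __name__ == "__main__":
    # Hypothetical host names; this only prints the commands (dry run).
    for cmd in hint_cleanup_plan(["node1", "node2", "node3"], "LOCAL_QUORUM"):
        print(" ".join(cmd))
```

Running it only prints the commands; nothing is executed against the cluster.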

- Jeff

On 2017-05-16 06:50 (-0700), varun saluja  wrote: 
> Thanks for update.
> I could see lot of io waits. This causing  Gc and mutation drops .
> But as i mentioned we do not have high load for now. Hint replays are 
> creating such high disk I/O.
> compactionstats show very high hint bytes like 780gb around. Is this normal?
> 
> Just mentioning we are using flash disks.
> 
> In such case, if i run truncatehints , will it remove or decrease size of 
> hints bytes in compaction stats. I can trigger repair therafter.
> Please let me know if any recommendation on same.
> 
> Also , table which we dumped from kafka which created this much hints and 
> compaction pendings is also dropped today. Because we have to redump table 
> again once cluster is stable.
> 
> Regards,
> Varun
> 
> Sent from my iPhone
> 
> > On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
> > 
> > Yes but it means data has to be replicated using repair.
> > 
> > Hints are out come of unhealthy nodes, focus on finding why you have 
> > mutation drops, is it node, io or network etc. ideally you shouldn't see 
> > increasing hints all the time.
> > 
> > Sent from my iPhone
> > 
> >> On May 16, 2017, at 7:58 AM, varun saluja  wrote:
> >> 
> >> Hi Nitan,
> >> 
> >> Thanks for response.
> >> 
> >> Yes, I could see mutation drops and increase count in system.hints. Is 
> >> there any way , i can proceed to truncate hints like using nodetool 
> >> truncatehints.
> >> 
> >> 
> >> Regards,
> >> Varun Saluja
> >> 
> >>> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
> >>> Do you see mutation drops?
> >>> Select count from system.hints; is it increasing?
> >>> 
> >>> Sent from my iPhone
> >>> 
>  On May 16, 2017, at 5:52 AM, varun saluja  wrote:
>  
>  Hi Experts,
>  
>  We are facing issue on production cluster. Compaction on system.hint 
>  table is running from last 2 days.
>  
>  
>  pending tasks: 1
>  compaction type   keyspace   table     completed          total    unit   progress
>       Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>  Active compaction remaining time :   0h27m15s
>  
>  
>  Active compaction remaining time shows in minutes.  But, this is job is 
>  running like indefinitely.
>  
>  We have 3 node cluster V 2.1.7. And we ran  write intensive job last 
>  week on particular table.
>  Compaction on this table finished but hint table size is growing 
>  continuously.
>  
>  Can someone Please help me.
>  
>  
>  Thanks & Regards,
>  Varun Saluja
>  
> >> 
> 




Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
You can throttle compaction with nodetool setcompactionthroughput, but that will 
only slow compaction down and free resources for the application; it is not a fix.
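
To put rough numbers on that: a sketch, assuming compaction actually runs at the configured cap, of how long the remaining bytes from the compactionstats output quoted in this thread would take at a few throughput settings. The function name is hypothetical, and 16 is (as far as I know) the default compaction_throughput_mb_per_sec; the cap itself is changed with nodetool setcompactionthroughput followed by a value in MB/s.

```python
def hours_to_compact(remaining_bytes: int, throughput_mb_per_s: int) -> float:
    """Estimate hours needed to compact remaining_bytes at a given cap."""
    if throughput_mb_per_s <= 0:
        # In Cassandra a cap of 0 means "unthrottled", so no estimate applies.
        raise ValueError("throughput must be a positive number of MB/s")
    bytes_per_second = throughput_mb_per_s * 1024 * 1024
    return remaining_bytes / bytes_per_second / 3600


# total minus completed, both taken from the quoted compactionstats output
REMAINING = 877_874_092_407 - 20_623_021_829

if __name__ == "__main__":
    for cap in (16, 64, 500):
        print(f"{cap:>4} MB/s -> {hours_to_compact(REMAINING, cap):.1f} h")
```

At 500 MB/s this works out to about 27 minutes, which is numerically consistent with the 0h27m15s shown by compactionstats. That suggests, though it does not prove, that the displayed estimate is derived from the configured cap rather than from observed progress, which would explain why it looks short while the job runs for days.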

Sent from my iPhone

> On May 16, 2017, at 9:15 AM, varun saluja  wrote:
> 
> Thanks Nitan.
> Appreciate your help.
> 
> Can anyone suggest parameter change or something which can help in this 
> situation.
> 
> Regards,
> Varun 
> 
> Sent from my iPhone
> 
>> On 16-May-2017, at 7:31 PM, Nitan Kainth  wrote:
>> 
>> If target table is dropped then you can remove its hints but there could be 
>> more hints from other table. If it has tables of your interest , then I 
>> won't comment on truncating hints.
>> 
>> Size of hints depends on Kafka load , looks like you had overloaded the 
>> cluster during data load and not hints are just recovering from it. I would 
>> say wait until cluster comes to normal state. May be some other expert can 
>> suggest an alternate.
>> 
>> Sent from my iPhone
>> 
>>> On May 16, 2017, at 8:50 AM, varun saluja  wrote:
>>> 
>>> Thanks for update.
>>> I could see lot of io waits. This causing  Gc and mutation drops .
>>> But as i mentioned we do not have high load for now. Hint replays are 
>>> creating such high disk I/O.
>>> compactionstats show very high hint bytes like 780gb around. Is this normal?
>>> 
>>> Just mentioning we are using flash disks.
>>> 
>>> In such case, if i run truncatehints , will it remove or decrease size of 
>>> hints bytes in compaction stats. I can trigger repair therafter.
>>> Please let me know if any recommendation on same.
>>> 
>>> Also , table which we dumped from kafka which created this much hints and 
>>> compaction pendings is also dropped today. Because we have to redump table 
>>> again once cluster is stable.
>>> 
>>> Regards,
>>> Varun
>>> 
>>> Sent from my iPhone
>>> 
 On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
 
 Yes but it means data has to be replicated using repair.
 
 Hints are out come of unhealthy nodes, focus on finding why you have 
 mutation drops, is it node, io or network etc. ideally you shouldn't see 
 increasing hints all the time.
 
 Sent from my iPhone
 
> On May 16, 2017, at 7:58 AM, varun saluja  wrote:
> 
> Hi Nitan,
> 
> Thanks for response.
> 
> Yes, I could see mutation drops and increase count in system.hints. Is 
> there any way , i can proceed to truncate hints like using nodetool 
> truncatehints.
> 
> 
> Regards,
> Varun Saluja
> 
>> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
>> Do you see mutation drops?
>> Select count from system.hints; is it increasing?
>> 
>> Sent from my iPhone
>> 
>>> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
>>> 
>>> Hi Experts,
>>> 
>>> We are facing issue on production cluster. Compaction on system.hint 
>>> table is running from last 2 days.
>>> 
>>> 
>>> pending tasks: 1
>>> compaction type   keyspace   table     completed          total    unit   progress
>>>      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>>> Active compaction remaining time :   0h27m15s
>>> 
>>> 
>>> Active compaction remaining time shows in minutes.  But, this is job is 
>>> running like indefinitely.
>>> 
>>> We have 3 node cluster V 2.1.7. And we ran  write intensive job last 
>>> week on particular table.
>>> Compaction on this table finished but hint table size is growing 
>>> continuously.
>>> 
>>> Can someone Please help me.
>>> 
>>> 
>>> Thanks & Regards,
>>> Varun Saluja
>>> 
> 


Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Thanks Nitan.
Appreciate your help.

Can anyone suggest a parameter change or anything else that could help in this 
situation?

Regards,
Varun 

Sent from my iPhone

> On 16-May-2017, at 7:31 PM, Nitan Kainth  wrote:
> 
> If target table is dropped then you can remove its hints but there could be 
> more hints from other table. If it has tables of your interest , then I won't 
> comment on truncating hints.
> 
> Size of hints depends on Kafka load , looks like you had overloaded the 
> cluster during data load and not hints are just recovering from it. I would 
> say wait until cluster comes to normal state. May be some other expert can 
> suggest an alternate.
> 
> Sent from my iPhone
> 
>> On May 16, 2017, at 8:50 AM, varun saluja  wrote:
>> 
>> Thanks for update.
>> I could see lot of io waits. This causing  Gc and mutation drops .
>> But as i mentioned we do not have high load for now. Hint replays are 
>> creating such high disk I/O.
>> compactionstats show very high hint bytes like 780gb around. Is this normal?
>> 
>> Just mentioning we are using flash disks.
>> 
>> In such case, if i run truncatehints , will it remove or decrease size of 
>> hints bytes in compaction stats. I can trigger repair therafter.
>> Please let me know if any recommendation on same.
>> 
>> Also , table which we dumped from kafka which created this much hints and 
>> compaction pendings is also dropped today. Because we have to redump table 
>> again once cluster is stable.
>> 
>> Regards,
>> Varun
>> 
>> Sent from my iPhone
>> 
>>> On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
>>> 
>>> Yes but it means data has to be replicated using repair.
>>> 
>>> Hints are out come of unhealthy nodes, focus on finding why you have 
>>> mutation drops, is it node, io or network etc. ideally you shouldn't see 
>>> increasing hints all the time.
>>> 
>>> Sent from my iPhone
>>> 
 On May 16, 2017, at 7:58 AM, varun saluja  wrote:
 
 Hi Nitan,
 
 Thanks for response.
 
 Yes, I could see mutation drops and increase count in system.hints. Is 
 there any way , i can proceed to truncate hints like using nodetool 
 truncatehints.
 
 
 Regards,
 Varun Saluja
 
> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
> Do you see mutation drops?
> Select count from system.hints; is it increasing?
> 
> Sent from my iPhone
> 
>> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
>> 
>> Hi Experts,
>> 
>> We are facing issue on production cluster. Compaction on system.hint 
>> table is running from last 2 days.
>> 
>> 
>> pending tasks: 1
>> compaction type   keyspace   table     completed          total    unit   progress
>>      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>> Active compaction remaining time :   0h27m15s
>> 
>> 
>> Active compaction remaining time shows in minutes.  But, this is job is 
>> running like indefinitely.
>> 
>> We have 3 node cluster V 2.1.7. And we ran  write intensive job last 
>> week on particular table.
>> Compaction on this table finished but hint table size is growing 
>> continuously.
>> 
>> Can someone Please help me.
>> 
>> 
>> Thanks & Regards,
>> Varun Saluja
>> 
 


Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
If the target table has been dropped then you can remove its hints, but there 
could be more hints for other tables. If those include tables you care about, 
then I won't comment on truncating hints.

The size of the hints depends on the Kafka load. It looks like you overloaded the 
cluster during the data load and now the hints are just recovering from it. I 
would say wait until the cluster returns to a normal state. Maybe some other 
expert can suggest an alternative.

Sent from my iPhone

> On May 16, 2017, at 8:50 AM, varun saluja  wrote:
> 
> Thanks for update.
> I could see lot of io waits. This causing  Gc and mutation drops .
> But as i mentioned we do not have high load for now. Hint replays are 
> creating such high disk I/O.
> compactionstats show very high hint bytes like 780gb around. Is this normal?
> 
> Just mentioning we are using flash disks.
> 
> In such case, if i run truncatehints , will it remove or decrease size of 
> hints bytes in compaction stats. I can trigger repair therafter.
> Please let me know if any recommendation on same.
> 
> Also , table which we dumped from kafka which created this much hints and 
> compaction pendings is also dropped today. Because we have to redump table 
> again once cluster is stable.
> 
> Regards,
> Varun
> 
> Sent from my iPhone
> 
>> On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
>> 
>> Yes but it means data has to be replicated using repair.
>> 
>> Hints are out come of unhealthy nodes, focus on finding why you have 
>> mutation drops, is it node, io or network etc. ideally you shouldn't see 
>> increasing hints all the time.
>> 
>> Sent from my iPhone
>> 
>>> On May 16, 2017, at 7:58 AM, varun saluja  wrote:
>>> 
>>> Hi Nitan,
>>> 
>>> Thanks for response.
>>> 
>>> Yes, I could see mutation drops and increase count in system.hints. Is 
>>> there any way , i can proceed to truncate hints like using nodetool 
>>> truncatehints.
>>> 
>>> 
>>> Regards,
>>> Varun Saluja
>>> 
 On 16 May 2017 at 17:52, Nitan Kainth  wrote:
 Do you see mutation drops?
 Select count from system.hints; is it increasing?
 
 Sent from my iPhone
 
> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
> 
> Hi Experts,
> 
> We are facing issue on production cluster. Compaction on system.hint 
> table is running from last 2 days.
> 
> 
> pending tasks: 1
> compaction type   keyspace   table     completed          total    unit   progress
>      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
> Active compaction remaining time :   0h27m15s
> 
> 
> Active compaction remaining time shows in minutes.  But, this is job is 
> running like indefinitely.
> 
> We have 3 node cluster V 2.1.7. And we ran  write intensive job last week 
> on particular table.
> Compaction on this table finished but hint table size is growing 
> continuously.
> 
> Can someone Please help me.
> 
> 
> Thanks & Regards,
> Varun Saluja
> 
>>> 


Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi Nitan,

A rolling restart did not help. The compaction status is the same after the restart.
No other processes are running here. These are dedicated Cassandra nodes.
Sent from my iPhone

> On 16-May-2017, at 7:16 PM, Nitan Kainth  wrote:
> 
> Have you tried rolling restart?
> Any agent or other process hogging system?
> 
> Sent from my iPhone
> 
>> On May 16, 2017, at 7:58 AM, varun saluja  wrote:
>> 
>> Hi Nitan,
>> 
>> Thanks for response.
>> 
>> Yes, I could see mutation drops and increase count in system.hints. Is there 
>> any way , i can proceed to truncate hints like using nodetool truncatehints.
>> 
>> 
>> Regards,
>> Varun Saluja
>> 
>>> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
>>> Do you see mutation drops?
>>> Select count from system.hints; is it increasing?
>>> 
>>> Sent from my iPhone
>>> 
 On May 16, 2017, at 5:52 AM, varun saluja  wrote:
 
 Hi Experts,
 
 We are facing issue on production cluster. Compaction on system.hint table 
 is running from last 2 days.
 
 
 pending tasks: 1
 compaction type   keyspace   table     completed          total    unit   progress
      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
 Active compaction remaining time :   0h27m15s
 
 
 Active compaction remaining time shows in minutes.  But, this is job is 
 running like indefinitely.
 
 We have 3 node cluster V 2.1.7. And we ran  write intensive job last week 
 on particular table.
 Compaction on this table finished but hint table size is growing 
 continuously.
 
 Can someone Please help me.
 
 
 Thanks & Regards,
 Varun Saluja
 
>> 


Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Thanks for the update.
I can see a lot of I/O wait, which is causing GC and mutation drops.
But as I mentioned, we do not have high load for now. Hint replays are creating 
such high disk I/O.
compactionstats shows a very high hint byte count, around 780 GB. Is this normal?
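
For scale, a small Python sketch that converts the figures from the compactionstats output quoted below into GiB and recomputes the progress. The constants are copied from that output; the helper names are only for illustration.

```python
# Figures copied from the quoted `nodetool compactionstats` output.
COMPLETED_BYTES = 20_623_021_829
TOTAL_BYTES = 877_874_092_407


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to binary gigabytes (GiB)."""
    return n_bytes / 1024 ** 3


def progress_percent(completed: int, total: int) -> float:
    """Share of the compaction already done, in percent."""
    return 100.0 * completed / total


if __name__ == "__main__":
    print(f"total:    {to_gib(TOTAL_BYTES):.1f} GiB")
    print(f"progress: {progress_percent(COMPLETED_BYTES, TOTAL_BYTES):.2f}%")
```

The total comes to roughly 818 GiB (about 878 GB in decimal units), with about 2.35% done.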

Just mentioning that we are using flash disks.

In that case, if I run truncatehints, will it remove or decrease the size of the 
hint bytes in compactionstats? I can trigger a repair thereafter.
Please let me know if you have any recommendation on this.

Also, the table we loaded from Kafka, which created this many hints and pending 
compactions, was dropped today, because we have to reload the table once the 
cluster is stable.

Regards,
Varun

Sent from my iPhone

> On 16-May-2017, at 6:59 PM, Nitan Kainth  wrote:
> 
> Yes but it means data has to be replicated using repair.
> 
> Hints are out come of unhealthy nodes, focus on finding why you have mutation 
> drops, is it node, io or network etc. ideally you shouldn't see increasing 
> hints all the time.
> 
> Sent from my iPhone
> 
>> On May 16, 2017, at 7:58 AM, varun saluja  wrote:
>> 
>> Hi Nitan,
>> 
>> Thanks for response.
>> 
>> Yes, I could see mutation drops and increase count in system.hints. Is there 
>> any way , i can proceed to truncate hints like using nodetool truncatehints.
>> 
>> 
>> Regards,
>> Varun Saluja
>> 
>>> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
>>> Do you see mutation drops?
>>> Select count from system.hints; is it increasing?
>>> 
>>> Sent from my iPhone
>>> 
 On May 16, 2017, at 5:52 AM, varun saluja  wrote:
 
 Hi Experts,
 
 We are facing issue on production cluster. Compaction on system.hint table 
 is running from last 2 days.
 
 
 pending tasks: 1
 compaction type   keyspace   table     completed          total    unit   progress
      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
 Active compaction remaining time :   0h27m15s
 
 
 Active compaction remaining time shows in minutes.  But, this is job is 
 running like indefinitely.
 
 We have 3 node cluster V 2.1.7. And we ran  write intensive job last week 
 on particular table.
 Compaction on this table finished but hint table size is growing 
 continuously.
 
 Can someone Please help me.
 
 
 Thanks & Regards,
 Varun Saluja
 
>> 


Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
Have you tried a rolling restart?
Is any agent or other process hogging the system?

Sent from my iPhone

> On May 16, 2017, at 7:58 AM, varun saluja  wrote:
> 
> Hi Nitan,
> 
> Thanks for response.
> 
> Yes, I could see mutation drops and increase count in system.hints. Is there 
> any way , i can proceed to truncate hints like using nodetool truncatehints.
> 
> 
> Regards,
> Varun Saluja
> 
>> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
>> Do you see mutation drops?
>> Select count from system.hints; is it increasing?
>> 
>> Sent from my iPhone
>> 
>>> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
>>> 
>>> Hi Experts,
>>> 
>>> We are facing issue on production cluster. Compaction on system.hint table 
>>> is running from last 2 days.
>>> 
>>> 
>>> pending tasks: 1
>>> compaction type   keyspace   table     completed          total    unit   progress
>>>      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>>> Active compaction remaining time :   0h27m15s
>>> 
>>> 
>>> Active compaction remaining time shows in minutes.  But, this is job is 
>>> running like indefinitely.
>>> 
>>> We have 3 node cluster V 2.1.7. And we ran  write intensive job last week 
>>> on particular table.
>>> Compaction on this table finished but hint table size is growing 
>>> continuously.
>>> 
>>> Can someone Please help me.
>>> 
>>> 
>>> Thanks & Regards,
>>> Varun Saluja
>>> 
> 


Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
Yes, but it means the data has to be replicated using repair.

Hints are the outcome of unhealthy nodes. Focus on finding why you have mutation 
drops: is it the node, I/O, the network, etc.? Ideally you shouldn't see hints 
increasing all the time.

Sent from my iPhone

> On May 16, 2017, at 7:58 AM, varun saluja  wrote:
> 
> Hi Nitan,
> 
> Thanks for response.
> 
> Yes, I could see mutation drops and increase count in system.hints. Is there 
> any way , i can proceed to truncate hints like using nodetool truncatehints.
> 
> 
> Regards,
> Varun Saluja
> 
>> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
>> Do you see mutation drops?
>> Select count from system.hints; is it increasing?
>> 
>> Sent from my iPhone
>> 
>>> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
>>> 
>>> Hi Experts,
>>> 
>>> We are facing issue on production cluster. Compaction on system.hint table 
>>> is running from last 2 days.
>>> 
>>> 
>>> pending tasks: 1
>>> compaction type   keyspace   table     completed          total    unit   progress
>>>      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
>>> Active compaction remaining time :   0h27m15s
>>> 
>>> 
>>> Active compaction remaining time shows in minutes.  But, this is job is 
>>> running like indefinitely.
>>> 
>>> We have 3 node cluster V 2.1.7. And we ran  write intensive job last week 
>>> on particular table.
>>> Compaction on this table finished but hint table size is growing 
>>> continuously.
>>> 
>>> Can someone Please help me.
>>> 
>>> 
>>> Thanks & Regards,
>>> Varun Saluja
>>> 
> 


Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi,

I can see intermittent GCs and mutation drops.

*System log reports:*

INFO  [Service Thread]  GCInspector.java:252 - ParNew GC in 3816ms.  CMS Old Gen: 4663180720 -> 5520012520; Par Eden Space: 1718091776 -> 0; Par Survivor Space: 0 -> 214695936
INFO  [ScheduledTasks:1] MessagingService.java:888 - 228 MUTATION messages dropped in last 5000ms
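
A hedged sketch for tallying lines like these from system.log. The regular expressions are written against the two lines shown above (joined onto single lines) and are assumptions about the log format, which may differ between Cassandra versions; the function name is illustrative.

```python
import re
from typing import Dict, Iterable

# Patterns modelled on the two log lines in this message (assumed format).
GC_RE = re.compile(r"GCInspector\.java:\d+ - (\w+) GC in (\d+)ms")
DROP_RE = re.compile(r"(\d+) MUTATION messages\s+dropped in last (\d+)ms")


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Count GC pauses, the longest pause, and total dropped mutations."""
    summary = {"gc_pauses": 0, "max_gc_ms": 0, "dropped_mutations": 0}
    for line in lines:
        gc = GC_RE.search(line)
        if gc:
            summary["gc_pauses"] += 1
            summary["max_gc_ms"] = max(summary["max_gc_ms"], int(gc.group(2)))
        drop = DROP_RE.search(line)
        if drop:
            summary["dropped_mutations"] += int(drop.group(1))
    return summary


SAMPLE = [
    "INFO  [Service Thread]  GCInspector.java:252 - ParNew GC in 3816ms.  "
    "CMS Old Gen: 4663180720 -> 5520012520; Par Eden Space: 1718091776 -> 0; "
    "Par Survivor Space: 0 -> 214695936",
    "INFO  [ScheduledTasks:1] MessagingService.java:888 - 228 MUTATION "
    "messages dropped in last 5000ms",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Fed a whole system.log, the same function gives a quick view of whether long GC pauses and dropped mutations are rising together.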

PS: As of now, there is no significant load on our cluster. The only load
is from these hints being replayed.

Can you please help?

Regards,
Varun Saluja

On 16 May 2017 at 18:28, varun saluja  wrote:

> Hi Nitan,
>
> Thanks for response.
>
> Yes, I could see mutation drops and increase count in system.hints. Is
> there any way , i can proceed to truncate hints like using nodetool
> truncatehints.
>
>
> Regards,
> Varun Saluja
>
> On 16 May 2017 at 17:52, Nitan Kainth  wrote:
>
>> Do you see mutation drops?
>> Select count from system.hints; is it increasing?
>>
>> Sent from my iPhone
>>
>> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
>>
>> Hi Experts,
>>
>> We are facing issue on production cluster. Compaction on system.hint
>> table is running from last 2 days.
>>
>>
>> pending tasks: 1
>> compaction type   keyspace   table     completed            total    unit   progress
>>      Compaction     system   hints   20623021829   *877874092407*   bytes      2.35%
>> Active compaction remaining time :   0h27m15s
>>
>>
>> Active compaction remaining time shows in minutes.  But, this is job is
>> running like indefinitely.
>>
>> We have 3 node cluster V 2.1.7. And we ran  write intensive job last week
>> on particular table.
>> Compaction on this table finished but hint table size is growing
>> continuously.
>>
>> Can someone Please help me.
>>
>>
>> Thanks & Regards,
>> Varun Saluja
>>
>>
>


Re: Long running compaction on huge hint table.

2017-05-16 Thread varun saluja
Hi Nitan,

Thanks for the response.

Yes, I can see mutation drops and an increasing count in system.hints. Is
there any way I can proceed to truncate the hints, e.g. using nodetool
truncatehints?


Regards,
Varun Saluja

On 16 May 2017 at 17:52, Nitan Kainth  wrote:

> Do you see mutation drops?
> Select count from system.hints; is it increasing?
>
> Sent from my iPhone
>
> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
>
> Hi Experts,
>
> We are facing issue on production cluster. Compaction on system.hint table
> is running from last 2 days.
>
>
> pending tasks: 1
> compaction type   keyspace   table     completed            total    unit   progress
>      Compaction     system   hints   20623021829   *877874092407*   bytes      2.35%
> Active compaction remaining time :   0h27m15s
>
>
> Active compaction remaining time shows in minutes.  But, this is job is
> running like indefinitely.
>
> We have 3 node cluster V 2.1.7. And we ran  write intensive job last week
> on particular table.
> Compaction on this table finished but hint table size is growing
> continuously.
>
> Can someone Please help me.
>
>
> Thanks & Regards,
> Varun Saluja
>
>


Re: Long running compaction on huge hint table.

2017-05-16 Thread Jason Brown
Varun,

This message is better suited to the user@ ML.

Thanks,

-Jason

On Tue, May 16, 2017 at 3:41 AM, varun saluja  wrote:

> Hi Experts,
>
> We are facing issue on production cluster. Compaction on system.hint table
> is running from last 2 days.
>
>
> pending tasks: 1
> compaction type   keyspace   table     completed          total    unit   progress
>      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
> Active compaction remaining time :   0h27m15s
>
>
> Active compaction remaining time shows in minutes.  But, this is job is
> running like indefinitely.
>
> We have 3 node cluster V 2.1.7. And we ran  write intensive job last week
> on particular table.
> Compaction on this table finished but hint table size is growing
> continuously.
>
> Can someone Please help me.
>
>
> Thanks & Regards,
> Varun Saluja
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Long running compaction on huge hint table.

2017-05-16 Thread Nitan Kainth
Do you see mutation drops?
SELECT count(*) FROM system.hints; is it increasing?

Sent from my iPhone

> On May 16, 2017, at 5:52 AM, varun saluja  wrote:
> 
> Hi Experts,
> 
> We are facing issue on production cluster. Compaction on system.hint table is 
> running from last 2 days.
> 
> 
> pending tasks: 1
> compaction type   keyspace   table     completed          total    unit   progress
>      Compaction     system   hints   20623021829   877874092407   bytes      2.35%
> Active compaction remaining time :   0h27m15s
> 
> 
> Active compaction remaining time shows in minutes.  But, this is job is 
> running like indefinitely.
> 
> We have 3 node cluster V 2.1.7. And we ran  write intensive job last week on 
> particular table.
> Compaction on this table finished but hint table size is growing continuously.
> 
> Can someone Please help me.
> 
> 
> Thanks & Regards,
> Varun Saluja