Re: nodetool repair failing with "Validation failed in /X.X.X.X

2019-05-06 Thread Rhys Campbell
Hello Shalom,

Someone already tried a rolling restart of Cassandra. I will probably try
rebooting the OS.

Repair seems to work if you do it a keyspace at a time.
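In case it helps anyone else, the workaround is just a small loop along these
lines (the keyspace names below are placeholders for our real ones):

  # Repair one keyspace at a time so a failure in one keyspace
  # does not abort the rest.
  for ks in keyspace_one keyspace_two keyspace_three; do
      nodetool repair "$ks" || echo "repair of $ks failed" >&2
  done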

Thanks for your input.

Rhys

On Sun, May 5, 2019 at 2:14 PM shalom sagges  wrote:

> Hi Rhys,
>
> I encountered this error after adding new SSTables to a cluster and
> running nodetool refresh (v3.0.12).
> The refresh worked, but after starting repairs on the cluster, I got the
> "Validation failed in /X.X.X.X" error on the remote DC.
> A rolling restart solved the issue for me.
>
> Hope this helps!
>
>
>
> On Sat, May 4, 2019 at 3:58 PM Rhys Campbell
>  wrote:
>
>>
>> > Hello,
>> >
>> > I’m having issues running repair on an Apache Cassandra Cluster. I’m
>> getting "Failed creating a merkle tree“ errors on the replication partner
>> nodes. Anyone have any experience of this? I am running 2.2.13.
>> >
>> > Further details here…
>> https://issues.apache.org/jira/projects/CASSANDRA/issues/CASSANDRA-15109?filter=allopenissues
>> >
>> > Best,
>> >
>> > Rhys
>>
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>


Re: nodetool repair failing with "Validation failed in /X.X.X.X

2019-05-05 Thread shalom sagges
Hi Rhys,

I encountered this error after adding new SSTables to a cluster and running
nodetool refresh (v3.0.12).
The refresh worked, but after starting repairs on the cluster, I got the
"Validation failed in /X.X.X.X" error on the remote DC.
A rolling restart solved the issue for me.

Hope this helps!



On Sat, May 4, 2019 at 3:58 PM Rhys Campbell
 wrote:

>
> > Hello,
> >
> > I’m having issues running repair on an Apache Cassandra Cluster. I’m
> getting "Failed creating a merkle tree“ errors on the replication partner
> nodes. Anyone have any experience of this? I am running 2.2.13.
> >
> > Further details here…
> https://issues.apache.org/jira/projects/CASSANDRA/issues/CASSANDRA-15109?filter=allopenissues
> >
> > Best,
> >
> > Rhys
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: nodetool repair -pr

2018-06-08 Thread Arvinder Dhillon
It depends on your data model. -pr only repairs the primary range. So if there
is a keyspace with replication 'DC2:3' and you run repair -pr only on the
nodes of DC1, it is not going to repair the token ranges corresponding to DC2.
So you will have to run it on each node, in every DC.
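As a rough sketch (the host names are placeholders, and it assumes nodetool can
reach each node's JMX port remotely), that means something like:

  # -pr repairs only each node's primary ranges, so every node in every DC
  # has to be covered to repair the full token ring.
  for host in dc1-node-a dc1-node-b dc1-node-c dc2-node-d dc2-node-e dc2-node-f; do
      nodetool -h "$host" repair -pr
  done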

-Arvinder

On Fri, Jun 8, 2018, 8:42 PM Igor Zubchenok  wrote:

> According docs at
> http://cassandra.apache.org/doc/latest/operating/repair.html?highlight=single
>
>
> *The -pr flag will only repair the “primary” ranges on a node, so you can
> repair your entire cluster by running nodetool repair -pr on each node in
> a single datacenter.*
> But I saw many places, where it is noted that I should run it at ALL data
> centers.
>
> Looking for a qualified answer.
>
>
> On Fri, 8 Jun 2018 at 18:08 Igor Zubchenok  wrote:
>
>> I want to repair all nodes at all data centers.
>>
>> Example:
>> DC1
>>  nodeA
>>  nodeB
>>  nodeC
>> DC2
>>  node D
>>  node E
>>  node F
>>
>> If I run `nodetool repair -pr` at nodeA nodeB and nodeC, will all ranges
>> be repaired?
>>
>>
>> On Fri, 8 Jun 2018 at 17:57 Rahul Singh 
>> wrote:
>>
>>> From DS dox : "Do not use -pr with this option to repair only a local
>>> data center."
>>> On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote:
>>>
>>>
>>> *nodetool repair -pr*
>>>
>>>


Re: nodetool repair -pr

2018-06-08 Thread Igor Zubchenok
According to the docs at
http://cassandra.apache.org/doc/latest/operating/repair.html?highlight=single


*The -pr flag will only repair the “primary” ranges on a node, so you can
repair your entire cluster by running nodetool repair -pr on each node in
a single datacenter.*
But I have seen it noted in many places that I should run it in ALL data
centers.

Looking for a qualified answer.


On Fri, 8 Jun 2018 at 18:08 Igor Zubchenok  wrote:

> I want to repair all nodes at all data centers.
>
> Example:
> DC1
>  nodeA
>  nodeB
>  nodeC
> DC2
>  node D
>  node E
>  node F
>
> If I run `nodetool repair -pr` at nodeA nodeB and nodeC, will all ranges
> be repaired?
>
>
> On Fri, 8 Jun 2018 at 17:57 Rahul Singh 
> wrote:
>
>> From DS dox : "Do not use -pr with this option to repair only a local
>> data center."
>> On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote:
>>
>>
>> *nodetool repair -pr*
>>
>>


Re: nodetool repair -pr

2018-06-08 Thread Igor Zubchenok
I want to repair all nodes at all data centers.

Example:
DC1
 nodeA
 nodeB
 nodeC
DC2
 node D
 node E
 node F

If I run `nodetool repair -pr` at nodeA nodeB and nodeC, will all ranges be
repaired?


On Fri, 8 Jun 2018 at 17:57 Rahul Singh 
wrote:

> From DS dox : "Do not use -pr with this option to repair only a local
> data center."
> On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote:
>
>
> *nodetool repair -pr*
>
> --
Regards,
Igor Zubchenok

CTO at Multi Brains LLC
Founder of taxistartup.com saytaxi.com chauffy.com
Skype: igor.zubchenok


Re: nodetool repair -pr

2018-06-08 Thread Rahul Singh
From DS dox: "Do not use -pr with this option to repair only a local data
center."
On Jun 8, 2018, 10:42 AM -0400, user@cassandra.apache.org, wrote:
>
> nodetool repair -pr


Re: Nodetool repair multiple dc

2018-04-20 Thread Abdul Patel
One quick question on Reaper: what data is stored in the reaper_db keyspace,
and how much does it grow?
Do we have to clean it up frequently, or does Reaper have a mechanism to self-clean?

On Friday, April 13, 2018, Alexander Dejanovski 
wrote:

> Hi Abdul,
>
> Reaper has been used in production for several years now, by many
> companies.
> I've seen it handling 100s of clusters and 1000s of nodes with a single
> Reaper process.
> Check the docs on cassandra-reaper.io to see which architecture matches
> your cluster : http://cassandra-reaper.io/docs/usage/multi_dc/
>
> Cheers,
>
> On Fri, Apr 13, 2018 at 4:38 PM Rahul Singh 
> wrote:
>
>> Makes sense it takes a long time since it has to reconcile against
>> replicas in all DCs. I leverage commercial tools for production clusters,
>> but I’m pretty sure Reaper is the best open source option. Otherwise you’ll
>> waste a lot of time trying to figure it out own your own. No need to
>> reinvent the wheel.
>>
>> On Apr 12, 2018, 11:02 PM -0400, Abdul Patel ,
>> wrote:
>>
>> Hi All,
>>
>> I have an 18-node cluster across 3 DCs. If I try to run an incremental repair
>> on a single node it takes forever, sometimes 45 minutes to 1 hour, and
>> sometimes it times out, so I started running "nodetool repair -dc dc1" for
>> each DC one by one, which works fine. Do we have a better way to handle this?
>> I am thinking about exploring Cassandra Reaper. Has anyone used it in prod?
>>
>> --
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>


Re: Nodetool repair multiple dc

2018-04-13 Thread Alexander Dejanovski
Hi Abdul,

Reaper has been used in production for several years now, by many companies.
I've seen it handling 100s of clusters and 1000s of nodes with a single
Reaper process.
Check the docs on cassandra-reaper.io to see which architecture matches
your cluster : http://cassandra-reaper.io/docs/usage/multi_dc/

Cheers,

On Fri, Apr 13, 2018 at 4:38 PM Rahul Singh 
wrote:

> Makes sense it takes a long time since it has to reconcile against
> replicas in all DCs. I leverage commercial tools for production clusters,
> but I’m pretty sure Reaper is the best open source option. Otherwise you’ll
> waste a lot of time trying to figure it out own your own. No need to
> reinvent the wheel.
>
> On Apr 12, 2018, 11:02 PM -0400, Abdul Patel , wrote:
>
> Hi All,
>
> I have an 18-node cluster across 3 DCs. If I try to run an incremental repair
> on a single node it takes forever, sometimes 45 minutes to 1 hour, and
> sometimes it times out, so I started running "nodetool repair -dc dc1" for
> each DC one by one, which works fine. Do we have a better way to handle this?
> I am thinking about exploring Cassandra Reaper. Has anyone used it in prod?
>
> --
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Nodetool repair multiple dc

2018-04-13 Thread Rahul Singh
Makes sense it takes a long time since it has to reconcile against replicas in 
all DCs. I leverage commercial tools for production clusters, but I’m pretty 
sure Reaper is the best open source option. Otherwise you’ll waste a lot of 
time trying to figure it out own your own. No need to reinvent the wheel.

On Apr 12, 2018, 11:02 PM -0400, Abdul Patel , wrote:
> Hi All,
>
> I have an 18-node cluster across 3 DCs. If I try to run an incremental repair
> on a single node it takes forever, sometimes 45 minutes to 1 hour, and
> sometimes it times out, so I started running "nodetool repair -dc dc1" for
> each DC one by one, which works fine. Do we have a better way to handle this?
> I am thinking about exploring Cassandra Reaper. Has anyone used it in prod?


Re: nodetool repair and compact

2018-04-02 Thread Alain RODRIGUEZ
I have just been told that my first statement is inaccurate:

> If 'upgradesstables' is run as a routine operation, you might forget about
> it and suffer the consequences. 'upgradesstables' is not only doing the
> compaction.


I should probably have checked upgradesstables more closely before making this
statement, and I definitely will.

Yet, I believe the second point still holds: 'With UDC, you can
trigger the compaction of the sstables you want to remove the tombstones
from, instead of compacting *all* the sstables for a given table.'

C*heers,

2018-04-02 16:39 GMT+01:00 Alain RODRIGUEZ :

> Hi,
>
> it will re-write this table's sstable files to current version, while
>> re-writing, will evit droppable tombstones (expired +  gc_grace_seconds
>> (default 10 days) ), if partition cross different files, they will still
>> be kept, but most droppable tombstones gone and size reduced.
>>
>
> Nice tip James, I never thought about doing this, it could have been handy
> :).
>
> Now, these compactions can be automatically done using the proper
> tombstone compaction settings in most cases. Generally, tombstone
> compaction is enabled, but if tombstone eviction is still an issue, you
> might want to give a try enabling 'unchecked_tombstone_compaction' in the
> table options. This might claim quite a lot of disk space (depending on the
> sstable overlapping levels).
>
> In case manual action is really needed (even more if it is run
> automatically), I would recommend using 'User Defined Compactions' - UDC
> (accessible through JMX at least) instead of 'uprade sstable':
>
> - It will remove the tombstones the same way, but with no side effect if
> you are currently upgrading for example. If  'upgradesstable' is run as a
> routine operation, you might forget about it and suffer consequences.
> 'upgradesstable' is not only doing the compaction.
> - With UDC, you can trigger the compaction of the sstables you want to
> remove the tombstones from, instead of compacting *all* the sstables for a
> given table.
>
> This last point can prevent harming the cluster with useless compaction,
> and even allow the operator to do things like: 'Compact the 10% biggest
> sstables, that have an estimated tombstone ratio above 0.5, every day' or
> 'compact any sstable having more than 75% of tombstones' as you see fit,
> and using information such as the sstables sizes and sstablemetadata to get
> the tombstone ratio.
>
> C*heers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2018-04-02 14:55 GMT+01:00 James Shaw :
>
>> you may use:  nodetool upgradesstables -a keyspace_name table_name
>> it will re-write this table's sstable files to current version, while
>> re-writing, will evit droppable tombstones (expired +  gc_grace_seconds
>> (default 10 days) ), if partition cross different files, they will still
>> be kept, but most droppable tombstones gone and size reduced.
>> It works well for ours.
>>
>>
>>
>> On Mon, Apr 2, 2018 at 12:45 AM, Jon Haddad  wrote:
>>
>>> You’ll find the answers to your questions (and quite a bit more) in this
>>> blog post from my coworker: http://thelastpickle
>>> .com/blog/2016/07/27/about-deletes-and-tombstones.html
>>>
>>> Repair doesn’t clean up tombstones, they’re only removed through
>>> compaction.  I advise taking care with nodetool compact, most of the time
>>> it’s not a great idea for a variety of reasons.  Check out the above post,
>>> if you still have questions, ask away.
>>>
>>>
>>> On Apr 1, 2018, at 9:41 PM, Xiangfei Ni  wrote:
>>>
>>> Hi All,
>>>   I want to delete the expired tombstone, someone uses nodetool repair
>>> ,but someone uses compact,so I want to know which one is the correct way,
>>>   I have read the below pages from Datastax,but the page just tells us
>>> how to use the command,but doesn’t tell us what it is exactly dose,
>>>   https://docs.datastax.com/en/cassandra/3.0/cassandra/tools
>>> /toolsRepair.html
>>>could anybody tell me how to clean the tombstone and give me some
>>> materials include the detailed instruction about the nodetool command and
>>> options?Web link is also ok.
>>>   Thanks very much
>>> Best Regards,
>>>
>>> 倪项菲 / David Ni
>>> 中移德电网络科技有限公司
>>>
>>> Virtue Intelligent Network Ltd, co.
>>> Add: 2003, 20F, No.35 Luojia Creative City, Luoyu Road, Wuhan, HuBei
>>> Mob: +86 13797007811 | Tel: +86 27 5024 2516
>>>
>>>
>>>
>>
>


Re: nodetool repair and compact

2018-04-02 Thread Alain RODRIGUEZ
Hi,

it will re-write this table's sstable files to current version, while
> re-writing, will evit droppable tombstones (expired +  gc_grace_seconds
> (default 10 days) ), if partition cross different files, they will still
> be kept, but most droppable tombstones gone and size reduced.
>

Nice tip James, I never thought about doing this, it could have been handy
:).

Now, these compactions can usually be performed automatically with the proper
tombstone compaction settings. Tombstone compaction is generally enabled, but
if tombstone eviction is still an issue, you might want to try enabling
'unchecked_tombstone_compaction' in the table options. This might reclaim
quite a lot of disk space (depending on how much the sstables overlap).
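As an illustration only (the keyspace, table, strategy class and threshold are
placeholders; remember that ALTER TABLE replaces the whole compaction map, so
keep your table's existing class and options in it):

  cqlsh -e "ALTER TABLE my_keyspace.my_table
            WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                               'unchecked_tombstone_compaction': 'true',
                               'tombstone_threshold': '0.2'};"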

In case manual action is really needed (even more so if it is run
automatically), I would recommend using 'User Defined Compactions' - UDC
(accessible through JMX at least) instead of 'upgradesstables':

- It will remove the tombstones the same way, but with no side effects if
you are currently upgrading, for example. If 'upgradesstables' is run as a
routine operation, you might forget about it and suffer the consequences;
'upgradesstables' does more than just the compaction.
- With UDC, you can trigger the compaction of only the sstables you want to
remove the tombstones from, instead of compacting *all* the sstables for a
given table.

This last point can prevent harming the cluster with useless compaction,
and even allow the operator to do things like: 'Compact the 10% biggest
sstables, that have an estimated tombstone ratio above 0.5, every day' or
'compact any sstable having more than 75% of tombstones' as you see fit,
and using information such as the sstables sizes and sstablemetadata to get
the tombstone ratio.
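A rough sketch of that workflow (the paths, SSTable name and jmxterm jar
location are placeholders, and the exact argument format of
forceUserDefinedCompaction can vary a bit between versions):

  # Placeholder path to one SSTable data file of the table in question.
  SSTABLE=/var/lib/cassandra/data/my_keyspace/my_table-abc123/mc-1234-big-Data.db

  # Check its droppable tombstone estimate.
  sstablemetadata "$SSTABLE" | grep -i droppable

  # Trigger a user defined compaction of just that SSTable over JMX, here via jmxterm.
  echo "run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction $SSTABLE" \
      | java -jar jmxterm.jar -l localhost:7199 -n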

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-04-02 14:55 GMT+01:00 James Shaw :

> you may use:  nodetool upgradesstables -a keyspace_name table_name
> it will re-write this table's sstable files to current version, while
> re-writing, will evit droppable tombstones (expired +  gc_grace_seconds
> (default 10 days) ), if partition cross different files, they will still
> be kept, but most droppable tombstones gone and size reduced.
> It works well for ours.
>
>
>
> On Mon, Apr 2, 2018 at 12:45 AM, Jon Haddad  wrote:
>
>> You’ll find the answers to your questions (and quite a bit more) in this
>> blog post from my coworker: http://thelastpickle
>> .com/blog/2016/07/27/about-deletes-and-tombstones.html
>>
>> Repair doesn’t clean up tombstones, they’re only removed through
>> compaction.  I advise taking care with nodetool compact, most of the time
>> it’s not a great idea for a variety of reasons.  Check out the above post,
>> if you still have questions, ask away.
>>
>>
>> On Apr 1, 2018, at 9:41 PM, Xiangfei Ni  wrote:
>>
>> Hi All,
>>   I want to delete the expired tombstone, someone uses nodetool repair
>> ,but someone uses compact,so I want to know which one is the correct way,
>>   I have read the below pages from Datastax,but the page just tells us
>> how to use the command,but doesn’t tell us what it is exactly dose,
>>   https://docs.datastax.com/en/cassandra/3.0/cassandra/tools
>> /toolsRepair.html
>>could anybody tell me how to clean the tombstone and give me some
>> materials include the detailed instruction about the nodetool command and
>> options?Web link is also ok.
>>   Thanks very much
>> Best Regards,
>>
>> 倪项菲 / David Ni
>> 中移德电网络科技有限公司
>>
>> Virtue Intelligent Network Ltd, co.
>> Add: 2003, 20F, No.35 Luojia Creative City, Luoyu Road, Wuhan, HuBei
>> Mob: +86 13797007811 | Tel: +86 27 5024 2516
>>
>>
>>
>


Re: nodetool repair and compact

2018-04-02 Thread James Shaw
You may use: nodetool upgradesstables -a keyspace_name table_name
It will re-write this table's sstable files to the current version and, while
re-writing, will evict droppable tombstones (expired + gc_grace_seconds,
default 10 days). If a partition crosses different files, its tombstones will
still be kept, but most droppable tombstones are gone and the size is reduced.
It works well for us.



On Mon, Apr 2, 2018 at 12:45 AM, Jon Haddad  wrote:

> You’ll find the answers to your questions (and quite a bit more) in this
> blog post from my coworker: http://thelastpickle.com/blog/2016/
> 07/27/about-deletes-and-tombstones.html
>
> Repair doesn’t clean up tombstones, they’re only removed through
> compaction.  I advise taking care with nodetool compact, most of the time
> it’s not a great idea for a variety of reasons.  Check out the above post,
> if you still have questions, ask away.
>
>
> On Apr 1, 2018, at 9:41 PM, Xiangfei Ni  wrote:
>
> Hi All,
>   I want to delete the expired tombstone, someone uses nodetool repair
> ,but someone uses compact,so I want to know which one is the correct way,
>   I have read the below pages from Datastax,but the page just tells us how
> to use the command,but doesn’t tell us what it is exactly dose,
>   https://docs.datastax.com/en/cassandra/3.0/cassandra/
> tools/toolsRepair.html
>could anybody tell me how to clean the tombstone and give me some
> materials include the detailed instruction about the nodetool command and
> options?Web link is also ok.
>   Thanks very much
> Best Regards,
>
> 倪项菲 / David Ni
> 中移德电网络科技有限公司
>
> Virtue Intelligent Network Ltd, co.
> Add: 2003, 20F, No.35 Luojia Creative City, Luoyu Road, Wuhan, HuBei
> Mob: +86 13797007811 | Tel: +86 27 5024 2516
>
>
>


Re: nodetool repair and compact

2018-04-01 Thread Jon Haddad
You’ll find the answers to your questions (and quite a bit more) in this blog 
post from my coworker: 
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html 


Repair doesn’t clean up tombstones, they’re only removed through compaction.  I 
advise taking care with nodetool compact, most of the time it’s not a great 
idea for a variety of reasons.  Check out the above post, if you still have 
questions, ask away.  


> On Apr 1, 2018, at 9:41 PM, Xiangfei Ni  wrote:
> 
> Hi All,
>   I want to delete expired tombstones. Some people use nodetool repair, but
> others use compact, so I want to know which one is the correct way.
>   I have read the page below from Datastax, but it just tells us how to
> use the command; it doesn't tell us what it exactly does:
>   https://docs.datastax.com/en/cassandra/3.0/cassandra/tools/toolsRepair.html
> 
>   Could anybody tell me how to clean up the tombstones and give me some
> materials, including detailed instructions about the nodetool command and
> options? A web link is also OK.
>   Thanks very much
> Best Regards,
>  
> 倪项菲/ David Ni
> 中移德电网络科技有限公司
> Virtue Intelligent Network Ltd, co.
> 
> Add: 2003,20F No.35 Luojia creative city,Luoyu Road,Wuhan,HuBei
> Mob: +86 13797007811|Tel: + 86 27 5024 2516



Re: Nodetool Repair --full

2018-03-18 Thread kurt greaves
Worth noting that if you have racks == RF you only need to repair one rack
to repair all the data in the cluster if you *don't* use -pr. Also note
that full repairs on >=3.0 cause anti-compactions and will mark things as
repaired, so once you start repairs you need to keep repairing to ensure
you don't have any zombie data or other problems.
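If you want to check whether sstables have actually been marked repaired, one
way is sstablemetadata, roughly like this (the data path is a placeholder):

  # "Repaired at: 0" means the sstable is still unrepaired; any other value is
  # the time of the repair session that anticompacted it.
  for f in /var/lib/cassandra/data/my_keyspace/my_table-*/*-big-Data.db; do
      echo "$f: $(sstablemetadata "$f" | grep -i 'Repaired at')"
  done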

On 17 March 2018 at 15:52, Hannu Kröger  wrote:

> Hi Jonathan,
>
> If you want to repair just one node (for example if it has been down for
> more than 3h), run “nodetool repair -full” on that node. This will bring
> all data on that node up to date.
>
> If you want to repair all data on the cluster, run “nodetool repair -full
> -pr” on each node. This will run full repair on all nodes but it will do it
> so only the primary range for each node is fixed. If you do it on all
> nodes, effectively the whole token range is repaired. You can run the same
> without -pr to get the same effect but it’s not efficient because then you
> are doing the repair RF times on all data instead of just repairing the
> whole data once.
>
> I hope this clarifies,
> Hannu
>
> On 17 Mar 2018, at 17:20, Jonathan Baynes 
> wrote:
>
> Hi Community,
>
> Can someone confirm, as the documentation out on the web is so
> contradictory and vague.
>
> Nodetool repair –full if I call this, do I need to run this on ALL my
> nodes or is just the once sufficient?
>
> Thanks
> J
>
> *Jonathan Baynes*
> DBA
> Tradeweb Europe Limited
> Moor Place  •  1 Fore Street Avenue  •  London EC2Y 9DT
> P +44 (0)20 77760988  •  F +44 (0)20 7776 3201  •  M +44 (0)7884111546
> jonathan.bay...@tradeweb.com
> —
> A leading marketplace for electronic fixed income, derivatives and ETF trading
>
>
> 
>
> This e-mail may contain confidential and/or privileged information. If you
> are not the intended recipient (or have received this e-mail in error)
> please notify the sender immediately and destroy it. Any unauthorized
> copying, disclosure or distribution of the material in this e-mail is
> strictly forbidden. Tradeweb reserves the right to monitor all e-mail
> communications through its networks. If you do not wish to receive
> marketing emails about our products / services, please let us know by
> contacting us, either by email at contac...@tradeweb.com or by writing to
> us at the registered office of Tradeweb in the UK, which is: Tradeweb
> Europe Limited (company number 3912826), 1 Fore Street Avenue London EC2Y
> 9DT
> .
> To see our privacy policy, visit our website @ www.tradeweb.com.
>
>
>


Re: Nodetool Repair --full

2018-03-17 Thread Hannu Kröger
Hi Jonathan,

If you want to repair just one node (for example if it has been down for more 
than 3h), run “nodetool repair -full” on that node. This will bring all data on 
that node up to date.

If you want to repair all data on the cluster, run “nodetool repair -full -pr” 
on each node. This will run full repair on all nodes but it will do it so only 
the primary range for each node is fixed. If you do it on all nodes, 
effectively the whole token range is repaired. You can run the same without -pr 
to get the same effect but it’s not efficient because then you are doing the 
repair RF times on all data instead of just repairing the whole data once.
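As a sketch, the two cases look like this:

  # Bring a single node fully back in sync (run on that node only):
  nodetool repair -full

  # Repair the whole cluster exactly once: run this on every node, one after
  # another, so each node only repairs its primary ranges:
  nodetool repair -full -pr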

I hope this clarifies,
Hannu

> On 17 Mar 2018, at 17:20, Jonathan Baynes  
> wrote:
> 
> Hi Community,
>  
> Can someone confirm, as the documentation out on the web is so contradictory 
> and vague.
>  
> Nodetool repair –full if I call this, do I need to run this on ALL my nodes 
> or is just the once sufficient?
>  
> Thanks
> J
>  
> Jonathan Baynes
> DBA
> Tradeweb Europe Limited
> Moor Place  •  1 Fore Street Avenue  •  London EC2Y 9DT
> P +44 (0)20 77760988  •  F +44 (0)20 7776 3201  •  M +44 (0)7884111546
> jonathan.bay...@tradeweb.com 
>  
> —
> A leading marketplace  for 
> electronic fixed income, derivatives and ETF trading
>  
> 
> 
> This e-mail may contain confidential and/or privileged information. If you 
> are not the intended recipient (or have received this e-mail in error) please 
> notify the sender immediately and destroy it. Any unauthorized copying, 
> disclosure or distribution of the material in this e-mail is strictly 
> forbidden. Tradeweb reserves the right to monitor all e-mail communications 
> through its networks. If you do not wish to receive marketing emails about 
> our products / services, please let us know by contacting us, either by email 
> at contac...@tradeweb.com  or by writing to us 
> at the registered office of Tradeweb in the UK, which is: Tradeweb Europe 
> Limited (company number 3912826), 1 Fore Street Avenue London EC2Y 9DT. To 
> see our privacy policy, visit our website @ www.tradeweb.com 
> .
> 



Re: Nodetool repair on read only cluster

2017-11-29 Thread Jeff Jirsa
Over time the various nodes likely got slightly out of sync - dropped mutations
primarily, during long GC pauses or maybe network failures.

In that case, repair will make all of the data match - how long it takes 
depends on size of data (more data takes longer to validate), size of your 
partitions (big partitions are more work to repair), and how you invoke repair



-- 
Jeff Jirsa


> On Nov 29, 2017, at 5:42 PM, Roger Warner  wrote:
> 
>  
> What would running a repair on a cluster do when there are no deletes, nor
> have there ever been? I have no deletes yet on my data. Yet running a
> repair took over 9 hours on a 5 node cluster?
>  
> Roger?


Re: Nodetool repair on read only cluster

2017-11-29 Thread @Nandan@
Hi Roger,
You have provided incomplete information, which makes this tough to analyse.
But please check the JIRA link below and see whether it is useful:
https://issues.apache.org/jira/browse/CASSANDRA-6616

Thanks.

On Thu, Nov 30, 2017 at 9:42 AM, Roger Warner  wrote:

>
>
> What would running a repair on a cluster do when there are no deletes, nor
> have there ever been? I have no deletes yet on my data. Yet running a
> repair took over 9 hours on a 5 node cluster?
>
>
>
> Roger?
>


Re: Nodetool repair -pr

2017-09-29 Thread Blake Eggleston
It will on 2.2 and higher, yes.

Also, just want to point out that it would be worth it for you to compare how 
long incremental repairs take vs full repairs in your cluster. There are some 
problems (which are fixed in 4.0) that can cause significant overstreaming when 
using incremental repair.
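A simple way to compare the two, as a sketch (the -j value is just an example):

  # Incremental repair is the default from 2.2 onwards; -pr limits it to primary ranges.
  time nodetool repair -pr -j 2

  # The equivalent full repair, for a timing comparison:
  time nodetool repair -full -pr -j 2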

On September 28, 2017 at 11:46:47 AM, Dmitry Buzolin (dbuz5ga...@gmail.com) 
wrote:

Hi All, 

Can someone confirm if 

"nodetool repair -pr -j2" does run with -inc too? I see the docs mention -inc 
is set by default, but I am not sure if it is enabled when -pr option is used. 

Thanks! 
- 
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
For additional commands, e-mail: user-h...@cassandra.apache.org 



Re: nodetool repair failure

2017-08-31 Thread Fay Hou [Storage Service] ­
What is your GC_GRACE_SECONDS?
What kind of repair option do you use for nodetool repair on a keyspace?
Did you start the repair on one node? Did you use nodetool repair -pr, or
just "nodetool repair keyspace"? How many nodetool repair processes do you
run across the nodes?





On Sun, Jul 30, 2017 at 10:53 PM, Jeff Jirsa  wrote:

>
>
> On 2017-07-27 21:36 (-0700), Mitch Gitman  wrote:
> > Now, the particular symptom to which that response refers is not what I
> was
> > seeing, but the response got me thinking that perhaps the failures I was
> > getting were on account of attempting to run "nodetool repair
> > --partitioner-range" simultaneously on all the nodes in my cluster. These
> > are only three-node dev clusters, and what I would see is that the repair
> > would pass on one node but fail on the other two.
>
>
> Running nodetool repair --partitioner-range simultaneously on all nodes in
> the cluster will indeed be a problem, and the symptoms will vary widely
> based on node state / write load / compaction load. This is one of the
> times when the right answer is "don't do that" until the project comes up
> with a way to prevent you from doing it in order to protect you from
> yourself.
>
>
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: nodetool repair failure

2017-07-30 Thread Jeff Jirsa


On 2017-07-27 21:36 (-0700), Mitch Gitman  wrote: 
> Now, the particular symptom to which that response refers is not what I was
> seeing, but the response got me thinking that perhaps the failures I was
> getting were on account of attempting to run "nodetool repair
> --partitioner-range" simultaneously on all the nodes in my cluster. These
> are only three-node dev clusters, and what I would see is that the repair
> would pass on one node but fail on the other two.


Running nodetool repair --partitioner-range simultaneously on all nodes in the 
cluster will indeed be a problem, and the symptoms will vary widely based on 
node state / write load / compaction load. This is one of the times when the 
right answer is "don't do that" until the project comes up with a way to 
prevent you from doing it in order to protect you from yourself.




-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: nodetool repair failure

2017-07-30 Thread kurt greaves
You need to check the node that failed validation to find the relevant error.
The IP should be in the logs of the node you started the repair on.

You shouldn't run multiple repairs on the same table from multiple nodes
unless you really know what you're doing and not using vnodes. The failure
you are likely seeing is that multiple repairs are trying to occur on the
same SSTable, which will cause the repair to fail.


Re: nodetool repair failure

2017-07-27 Thread Mitch Gitman
Michael, thanks for the input. I don't think I'm going to need to upgrade
to 3.11 for the sake of getting nodetool repair working for me. Instead, I
have another plausible explanation and solution for my particular situation.

First, I should say that disk usage proved to be a red herring. There was
plenty of disk space available.

When I said that the error message I was seeing was no more precise than
"Some repair failed," I misstated things. Just above that error message was
another further detail: "Validation failed in /(IP address of host)." Of
course, that's still vague. What validation failed?

However, that extra information led me to this JIRA ticket:
https://issues.apache.org/jira/browse/CASSANDRA-10057. In particular this
comment: "If you invoke repair on multiple node at once, this can be
happen. Can you confirm? And once it happens, the error will continue
unless you restart the node since some resources remain due to the hang. I
will post the patch not to hang."

Now, the particular symptom to which that response refers is not what I was
seeing, but the response got me thinking that perhaps the failures I was
getting were on account of attempting to run "nodetool repair
--partitioner-range" simultaneously on all the nodes in my cluster. These
are only three-node dev clusters, and what I would see is that the repair
would pass on one node but fail on the other two.

So I tried running the repairs sequentially on each of the nodes. With this
change the repair works, and I have every expectation that it will continue
to work--that running repair sequentially is the solution to my particular
problem. If this is the case and repairs are intended to be run
sequentially, then that constitutes a contract change for nodetool repair.
This is the first time I'm running a repair on a multi-node cluster on
Cassandra 3.10, and only with 3.10 was I seeing this problem. I'd never
seen it previously running repairs on Cassandra 2.1 clusters, which is what
I was upgrading from.
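For what it's worth, the sequential run is now just a small loop (the host
names are placeholders for my dev nodes, and it assumes remote JMX access):

  # Run the primary-range repair on one node at a time instead of all at once.
  for host in dev-node-1 dev-node-2 dev-node-3; do
      nodetool -h "$host" repair --partitioner-range || break
  done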

The last comment in that particular JIRA ticket is coming from someone
reporting the same problem I'm seeing, and their experience indirectly
corroborates mine, or at least it doesn't contradict mine.

On Thu, Jul 27, 2017 at 10:26 AM, Michael Shuler 
wrote:

> On 07/27/2017 12:10 PM, Mitch Gitman wrote:
> > I'm using Apache Cassandra 3.10.
> 
> > this is a dev cluster I'm talking about.
> 
> > Further insights welcome...
>
> Upgrade and see if one of the many fixes for 3.11.0 helped?
>
> https://github.com/apache/cassandra/blob/cassandra-3.11.
> 0/CHANGES.txt#L1-L129
>
> If you can reproduce on 3.11.0, hit JIRA with the steps to repro. There
> are several bug fixes committed to the cassandra-3.11 branch, pending a
> 3.11.1 release, but I don't see one that's particularly relevant to your
> trace.
>
> https://github.com/apache/cassandra/blob/cassandra-3.11/CHANGES.txt
>
> --
> Kind regards,
> Michael
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: nodetool repair failure

2017-07-27 Thread Michael Shuler
On 07/27/2017 12:10 PM, Mitch Gitman wrote:
> I'm using Apache Cassandra 3.10.

> this is a dev cluster I'm talking about.

> Further insights welcome...

Upgrade and see if one of the many fixes for 3.11.0 helped?

https://github.com/apache/cassandra/blob/cassandra-3.11.0/CHANGES.txt#L1-L129

If you can reproduce on 3.11.0, hit JIRA with the steps to repro. There
are several bug fixes committed to the cassandra-3.11 branch, pending a
3.11.1 release, but I don't see one that's particularly relevant to your
trace.

https://github.com/apache/cassandra/blob/cassandra-3.11/CHANGES.txt

-- 
Kind regards,
Michael

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: nodetool repair failure

2017-07-27 Thread Mitch Gitman
I want to add an extra data point to this thread having encountered much
the same problem. I'm using Apache Cassandra 3.10. I attempted to run an
incremental repair that was optimized to take advantage of some downtime
where the cluster is not fielding traffic and only repair each node's
primary partitioner range:
nodetool repair --partitioner-range

On a couple nodes, I was seeing the repair fail with the vague "Some repair
failed" message:
[2017-07-27 15:30:59,283] Some repair failed
[2017-07-27 15:30:59,286] Repair command #2 finished in 10 seconds
error: Repair job has failed with the error message: [2017-07-27
15:30:59,283] Some repair failed
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message:
[2017-07-27 15:30:59,283] Some repair failed
at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
at
org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)

Running with the --trace option yielded no additional relevant information.

On one node where this was arising, I was able to run the repair again with
just the keyspace of interest, see that work, run the repair another time
across all keyspaces, and see that work as well.

On another node, just trying again did not work. What did work was running
a "nodetool compact". The subsequent repair on that node succeeded, even
though it took inordinately long. Strangely, another repair after that
failed. But then the next couple succeeded.

I proceeded to do a "df -h" on the Ubuntu hosts and noticed that the disk
usage was inordinately high. This is my hypothesis as to the underlying
cause. Fortunately for me, this is a dev cluster I'm talking about.

Pertinent troubleshooting steps:
* nodetool compact
* Check disk usage. Better yet, preemptively alert on disk usage exceeding
a certain threshold.

Further insights welcome...


Re: "nodetool repair -dc"

2017-07-11 Thread Anuj Wadehra
Hi,

I have not used DC-local repair specifically, but generally repair syncs all of
a node's local tokens with the other replicas (full repair), or a subset of the
local tokens (-pr and subrange). A full repair with the -dc option should only
sync the data for the tokens present on the node where the command is run with
the other replicas in the local DC.

You should run full repair on all nodes of the DC, unless the RF of all
keyspaces in the local DC equals the number of nodes in the DC. E.g. if you
have 3 nodes in dc1 and the RF is DC1:3, repairing a single node should sync
all data within the DC. This doesn't hold true if you have 5 nodes and no node
holds 100% of the data.

Running full repair on all nodes in a DC may lead to repairing every piece of
data RF times. Inefficient! And you can't use -pr with the -dc option. Even if
it were allowed, it wouldn't repair the entire ring, as a DC owns only a subset
of the entire token ring.
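As a sketch of what that looks like in practice (the node names and the DC name
are placeholders):

  # -dc/--in-dc limits repair to replicas in that data center, but it still has
  # to run on every node of the DC (and, as noted above, cannot be combined with -pr).
  for host in dc1-node-1 dc1-node-2 dc1-node-3; do
      nodetool -h "$host" repair -dc DC1
  done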
Thanks,
Anuj
 
 
On Tue, 11 Jul 2017 at 20:08, vasu gunja wrote:

Hi,
My question is specific to the -dc option.
Do we need to run this on all nodes that belong to that DC? Or only on one of
the nodes of that DC, and then it will repair all nodes?

On Sat, Jul 8, 2017 at 10:56 PM, Varun Gupta  wrote:

I do not see the need to run repair, as long as cluster was in healthy state on 
adding new nodes.
On Fri, Jul 7, 2017 at 8:37 AM, vasu gunja  wrote:

Hi,
I have a question regarding the "nodetool repair -dc" option. Recently we added
multiple nodes to one DC, and we want to perform repair only on the current DC.
Here is my question:
Do we need to perform "nodetool repair -dc" on all nodes belonging to that DC,
or only on one node of that DC?


thanks,
V



  


Re: "nodetool repair -dc"

2017-07-11 Thread vasu gunja
Hi,

My question is specific to the -dc option.

Do we need to run this on all nodes that belong to that DC?
Or only on one of the nodes of that DC, and then it will repair all
nodes?


On Sat, Jul 8, 2017 at 10:56 PM, Varun Gupta  wrote:

> I do not see the need to run repair, as long as cluster was in healthy
> state on adding new nodes.
>
> On Fri, Jul 7, 2017 at 8:37 AM, vasu gunja  wrote:
>
>> Hi ,
>>
>> I have a question regarding "nodetool repair -dc" option. recently we
>> added multiple nodes to one DC center, we want to perform repair only on
>> current DC.
>>
>> Here is my question.
>>
>> Do we need to perform "nodetool repair -dc" on all nodes belongs to that
>> DC ?
>> or only one node of that DC?
>>
>>
>>
>> thanks,
>> V
>>
>
>


Re: "nodetool repair -dc"

2017-07-08 Thread Varun Gupta
I do not see the need to run repair, as long as the cluster was in a healthy
state when the new nodes were added.

On Fri, Jul 7, 2017 at 8:37 AM, vasu gunja  wrote:

> Hi ,
>
> I have a question regarding "nodetool repair -dc" option. recently we
> added multiple nodes to one DC center, we want to perform repair only on
> current DC.
>
> Here is my question.
>
> Do we need to perform "nodetool repair -dc" on all nodes belongs to that
> DC ?
> or only one node of that DC?
>
>
>
> thanks,
> V
>


RE: nodetool repair failure

2017-06-30 Thread Anubhav Kale
If possible, simply read the table in question with consistency=ALL. This
will trigger a read repair on any mismatched data and is far more reliable than
the nodetool command.
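For example, in cqlsh (the keyspace, table and key below are placeholders);
note this only repairs the partitions that are actually read:

  cqlsh> CONSISTENCY ALL;
  cqlsh> SELECT * FROM my_keyspace.my_table WHERE id = 'some-key';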

From: Balaji Venkatesan [mailto:venkatesan.bal...@gmail.com]
Sent: Thursday, June 29, 2017 7:26 PM
To: user@cassandra.apache.org
Subject: Re: nodetool repair failure

It did not help much. But other issue or error I saw when I repair the keyspace 
was it says

"Sync failed between /xx.xx.xx.93 and /xx.xx.xx.94" this was run from .91 node.



On Thu, Jun 29, 2017 at 4:44 PM, Akhil Mehra 
<akhilme...@gmail.com<mailto:akhilme...@gmail.com>> wrote:
Run the following query and see if it gives you more information:

select * from system_distributed.repair_history;

Also is there any additional logging on the nodes where the error is coming 
from. Seems to be xx.xx.xx.94 for your last run.


On 30/06/2017, at 9:43 AM, Balaji Venkatesan 
<venkatesan.bal...@gmail.com<mailto:venkatesan.bal...@gmail.com>> wrote:

The verify and scrub went without any error on the keyspace. I ran it again 
with trace mode and still the same issue


[2017-06-29 21:37:45,578] Parsing UPDATE 
system_distributed.parent_repair_history SET finished_at = toTimestamp(now()), 
successful_ranges = {'} WHERE parent_id=f1f10af0-5d12-11e7-8df9-59d19ef3dd23
[2017-06-29 21:37:45,580] Preparing statement
[2017-06-29 21:37:45,580] Determining replicas for mutation
[2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.95
[2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.94
[2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.93
[2017-06-29 21:37:45,581] REQUEST_RESPONSE message received from /xx.xx.xx.93
[2017-06-29 21:37:45,581] REQUEST_RESPONSE message received from /xx.xx.xx.94
[2017-06-29 21:37:45,581] Processing response from /xx.xx.xx.93
[2017-06-29 21:37:45,581] /xx.xx.xx.94: MUTATION message received from 
/xx.xx.xx.91
[2017-06-29 21:37:45,582] Processing response from /xx.xx.xx.94
[2017-06-29 21:37:45,582] /xx.xx.xx.93: MUTATION message received from 
/xx.xx.xx.91
[2017-06-29 21:37:45,582] /xx.xx.xx.95: MUTATION message received from 
/xx.xx.xx.91
[2017-06-29 21:37:45,582] /xx.xx.xx.94: Appending to commitlog
[2017-06-29 21:37:45,582] /xx.xx.xx.94: Adding to parent_repair_history memtable
[2017-06-29 21:37:45,582] Some repair failed
[2017-06-29 21:37:45,582] Repair command #3 finished in 1 minute 44 seconds
error: Repair job has failed with the error message: [2017-06-29 21:37:45,582] 
Some repair failed
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message: 
[2017-06-29 21:37:45,582] Some repair failed
at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
at 
org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
at 
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
at 
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
at 
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
at 
com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)



On Thu, Jun 29, 2017 at 1:36 PM, Subroto Barua 
<sbarua...@yahoo.com.invalid<mailto:sbarua...@yahoo.com.invalid>> wrote:
Balaji,

Are you repairing a specific keyspace/table? if the failure is tied to a table, 
try 'verify' and 'scrub' options on .91...see if you get any errors.




On Thursday, June 29, 2017, 12:12:14 PM PDT, Balaji Venkatesan 
<venkatesan.bal...@gmail.com<mailto:venkatesan.bal...@gmail.com>> wrote:


Thanks. I tried with trace option and there is not much info. Here are the few 
log lines just before it failed.


[2017-06-29 19:01:54,969] /xx.xx.xx.93: Sending REPAIR_MESSAGE message to 
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: E

Re: nodetool repair failure

2017-06-29 Thread Balaji Venkatesan
It did not help much. But another error I saw when I repaired the
keyspace was:

"Sync failed between /xx.xx.xx.93 and /xx.xx.xx.94" - this was run from the .91
node.



On Thu, Jun 29, 2017 at 4:44 PM, Akhil Mehra  wrote:

> Run the following query and see if it gives you more information:
>
> select * from system_distributed.repair_history;
>
> Also is there any additional logging on the nodes where the error is
> coming from. Seems to be xx.xx.xx.94 for your last run.
>
>
> On 30/06/2017, at 9:43 AM, Balaji Venkatesan 
> wrote:
>
> The verify and scrub went without any error on the keyspace. I ran it
> again with trace mode and still the same issue
>
>
> [2017-06-29 21:37:45,578] Parsing UPDATE 
> system_distributed.parent_repair_history
> SET finished_at = toTimestamp(now()), successful_ranges = {'} WHERE
> parent_id=f1f10af0-5d12-11e7-8df9-59d19ef3dd23
> [2017-06-29 21:37:45,580] Preparing statement
> [2017-06-29 21:37:45,580] Determining replicas for mutation
> [2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.95
> [2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.94
> [2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.93
> [2017-06-29 21:37:45,581] REQUEST_RESPONSE message received from
> /xx.xx.xx.93
> [2017-06-29 21:37:45,581] REQUEST_RESPONSE message received from
> /xx.xx.xx.94
> [2017-06-29 21:37:45,581] Processing response from /xx.xx.xx.93
> [2017-06-29 21:37:45,581] /xx.xx.xx.94: MUTATION message received from
> /xx.xx.xx.91
> [2017-06-29 21:37:45,582] Processing response from /xx.xx.xx.94
> [2017-06-29 21:37:45,582] /xx.xx.xx.93: MUTATION message received from
> /xx.xx.xx.91
> [2017-06-29 21:37:45,582] /xx.xx.xx.95: MUTATION message received from
> /xx.xx.xx.91
> [2017-06-29 21:37:45,582] /xx.xx.xx.94: Appending to commitlog
> [2017-06-29 21:37:45,582] /xx.xx.xx.94: Adding to parent_repair_history
> memtable
> [2017-06-29 21:37:45,582] Some repair failed
> [2017-06-29 21:37:45,582] Repair command #3 finished in 1 minute 44 seconds
> error: Repair job has failed with the error message: [2017-06-29
> 21:37:45,582] Some repair failed
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error message:
> [2017-06-29 21:37:45,582] Some repair failed
> at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
> at org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListene
> r.handleNotification(JMXNotificationProgressListener.java:77)
> at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.
> dispatchNotification(ClientNotifForwarder.java:583)
> at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(
> ClientNotifForwarder.java:533)
> at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(
> ClientNotifForwarder.java:452)
> at com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(
> ClientNotifForwarder.java:108)
>
>
>
> On Thu, Jun 29, 2017 at 1:36 PM, Subroto Barua <
> sbarua...@yahoo.com.invalid> wrote:
>
>> Balaji,
>>
>> Are you repairing a specific keyspace/table? if the failure is tied to a
>> table, try 'verify' and 'scrub' options on .91...see if you get any errors.
>>
>>
>>
>>
>> On Thursday, June 29, 2017, 12:12:14 PM PDT, Balaji Venkatesan <
>> venkatesan.bal...@gmail.com> wrote:
>>
>>
>> Thanks. I tried with trace option and there is not much info. Here are
>> the few log lines just before it failed.
>>
>>
>> [2017-06-29 19:01:54,969] /xx.xx.xx.93: Sending REPAIR_MESSAGE message to
>> /xx.xx.xx.91
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
>> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message

Re: nodetool repair failure

2017-06-29 Thread Akhil Mehra
Run the following query and see if it gives you more information:

select * from system_distributed.repair_history;

Also is there any additional logging on the nodes where the error is coming 
from. Seems to be xx.xx.xx.94 for your last run.


> On 30/06/2017, at 9:43 AM, Balaji Venkatesan  
> wrote:
> 
> The verify and scrub went without any error on the keyspace. I ran it again 
> with trace mode and still the same issue
> 
> 
> [2017-06-29 21:37:45,578] Parsing UPDATE 
> system_distributed.parent_repair_history SET finished_at = 
> toTimestamp(now()), successful_ranges = {'} WHERE 
> parent_id=f1f10af0-5d12-11e7-8df9-59d19ef3dd23
> [2017-06-29 21:37:45,580] Preparing statement
> [2017-06-29 21:37:45,580] Determining replicas for mutation
> [2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.95
> [2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.94
> [2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.93
> [2017-06-29 21:37:45,581] REQUEST_RESPONSE message received from /xx.xx.xx.93
> [2017-06-29 21:37:45,581] REQUEST_RESPONSE message received from /xx.xx.xx.94
> [2017-06-29 21:37:45,581] Processing response from /xx.xx.xx.93
> [2017-06-29 21:37:45,581] /xx.xx.xx.94: MUTATION message received from 
> /xx.xx.xx.91
> [2017-06-29 21:37:45,582] Processing response from /xx.xx.xx.94
> [2017-06-29 21:37:45,582] /xx.xx.xx.93: MUTATION message received from 
> /xx.xx.xx.91
> [2017-06-29 21:37:45,582] /xx.xx.xx.95: MUTATION message received from 
> /xx.xx.xx.91
> [2017-06-29 21:37:45,582] /xx.xx.xx.94: Appending to commitlog
> [2017-06-29 21:37:45,582] /xx.xx.xx.94: Adding to parent_repair_history 
> memtable
> [2017-06-29 21:37:45,582] Some repair failed
> [2017-06-29 21:37:45,582] Repair command #3 finished in 1 minute 44 seconds
> error: Repair job has failed with the error message: [2017-06-29 
> 21:37:45,582] Some repair failed
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error message: 
> [2017-06-29 21:37:45,582] Some repair failed
>   at 
> org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
>   at 
> org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
>   at 
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
>   at 
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
>   at 
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
>   at 
> com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)
> 
> 
> 
> On Thu, Jun 29, 2017 at 1:36 PM, Subroto Barua  > wrote:
> Balaji,
> 
> Are you repairing a specific keyspace/table? if the failure is tied to a 
> table, try 'verify' and 'scrub' options on .91...see if you get any errors.
> 
> 
> 
> 
> On Thursday, June 29, 2017, 12:12:14 PM PDT, Balaji Venkatesan 
> > wrote:
> 
> 
> Thanks. I tried with trace option and there is not much info. Here are the 
> few log lines just before it failed.
> 
> 
> [2017-06-29 19:01:54,969] /xx.xx.xx.93: Sending REPAIR_MESSAGE message to 
> /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to 
> /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to 
> /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending 

Re: nodetool repair failure

2017-06-29 Thread Balaji Venkatesan
The verify and scrub went without any error on the keyspace. I ran it again
with trace mode and still the same issue


[2017-06-29 21:37:45,578] Parsing UPDATE
system_distributed.parent_repair_history SET finished_at =
toTimestamp(now()), successful_ranges = {'} WHERE
parent_id=f1f10af0-5d12-11e7-8df9-59d19ef3dd23
[2017-06-29 21:37:45,580] Preparing statement
[2017-06-29 21:37:45,580] Determining replicas for mutation
[2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.95
[2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.94
[2017-06-29 21:37:45,580] Sending MUTATION message to /xx.xx.xx.93
[2017-06-29 21:37:45,581] REQUEST_RESPONSE message received from
/xx.xx.xx.93
[2017-06-29 21:37:45,581] REQUEST_RESPONSE message received from
/xx.xx.xx.94
[2017-06-29 21:37:45,581] Processing response from /xx.xx.xx.93
[2017-06-29 21:37:45,581] /xx.xx.xx.94: MUTATION message received from
/xx.xx.xx.91
[2017-06-29 21:37:45,582] Processing response from /xx.xx.xx.94
[2017-06-29 21:37:45,582] /xx.xx.xx.93: MUTATION message received from
/xx.xx.xx.91
[2017-06-29 21:37:45,582] /xx.xx.xx.95: MUTATION message received from
/xx.xx.xx.91
[2017-06-29 21:37:45,582] /xx.xx.xx.94: Appending to commitlog
[2017-06-29 21:37:45,582] /xx.xx.xx.94: Adding to parent_repair_history
memtable
[2017-06-29 21:37:45,582] Some repair failed
[2017-06-29 21:37:45,582] Repair command #3 finished in 1 minute 44 seconds
error: Repair job has failed with the error message: [2017-06-29
21:37:45,582] Some repair failed
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message:
[2017-06-29 21:37:45,582] Some repair failed
at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
at
org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)



On Thu, Jun 29, 2017 at 1:36 PM, Subroto Barua 
wrote:

> Balaji,
>
> Are you repairing a specific keyspace/table? if the failure is tied to a
> table, try 'verify' and 'scrub' options on .91...see if you get any errors.
>
>
>
>
> On Thursday, June 29, 2017, 12:12:14 PM PDT, Balaji Venkatesan <
> venkatesan.bal...@gmail.com> wrote:
>
>
> Thanks. I tried with trace option and there is not much info. Here are the
> few log lines just before it failed.
>
>
> [2017-06-29 19:01:54,969] /xx.xx.xx.93: Sending REPAIR_MESSAGE message to
> /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message
> to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message
> to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message
> to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message
> to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message
> to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message
> to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message
> to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message
> to /xx.xx.xx.91
> [2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE 

Re: nodetool repair failure

2017-06-29 Thread Subroto Barua
Balaji,
Are you repairing a specific keyspace/table? if the failure is tied to a table, 
try 'verify' and 'scrub' options on .91...see if you get any errors.
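
For reference, a minimal sketch of those two checks (the keyspace and table
names below are placeholders, point them at the table the repair is failing on):

nodetool verify my_keyspace my_table
nodetool scrub my_keyspace my_table

verify checks the SSTable data checksums, while scrub rewrites the SSTables and
skips rows it cannot read, so an error from either usually points at the
replica holding corrupt data.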



On Thursday, June 29, 2017, 12:12:14 PM PDT, Balaji Venkatesan 
 wrote:

Thanks. I tried with trace option and there is not much info. Here are the few 
log lines just before it failed.

[2017-06-29 19:01:54,969] /xx.xx.xx.93: Sending REPAIR_MESSAGE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to /xx.xx.xx.91
[2017-06-29 19:02:04,842] Some repair failed
[2017-06-29 19:02:04,848] Repair command #1 finished in 1 minute 2 seconds
error: Repair job has failed with the error message: [2017-06-29 19:02:04,842] Some repair failed
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message: [2017-06-29 19:02:04,842] Some repair failed
at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
at org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
at com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)


FYI I am running repair from xx.xx.xx.91 node and its a 5 node cluster 
xx.xx.xx.91-xx.xx.xx.95
On Wed, Jun 28, 2017 at 5:16 PM, Akhil Mehra  wrote:

nodetool repair has a trace option 
nodetool repair -tr yourkeyspacename
see if that provides you with additional information.
Regards,
Akhil

On 28/06/2017, at 2:25 AM, Balaji Venkatesan  
wrote:

We use Apache Cassandra 3.10-13 

On Jun 26, 2017 8:41 PM, "Michael Shuler"  wrote:

What version of Cassandra?

--
Michael

On 06/26/2017 09:53 PM, Balaji Venkatesan wrote:
> Hi All,
>
> When I run nodetool repair on a keyspace I constantly get  "Some repair
> failed" error, there are no sufficient info to debug more. Any help?
>
> Here is the stacktrace
>
> == == ==
> [2017-06-27 02:44:34,275] Some repair failed
> [2017-06-27 02:44:34,279] Repair command #3 finished in 33 seconds
> error: Repair job has failed with the error message: [2017-06-27
> 02:44:34,275] Some repair failed
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error
> message: [2017-06-27 02:44:34,275] Some repair failed
> at org.apache.cassandra.tools.Rep 

Re: nodetool repair failure

2017-06-29 Thread Balaji Venkatesan
Thanks. I tried with trace option and there is not much info. Here are the
few log lines just before it failed.


[2017-06-29 19:01:54,969] /xx.xx.xx.93: Sending REPAIR_MESSAGE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Appending to commitlog
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Adding to repair_history memtable
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Enqueuing response to /xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:01:54,969] /xx.xx.xx.92: Sending REQUEST_RESPONSE message to
/xx.xx.xx.91
[2017-06-29 19:02:04,842] Some repair failed
[2017-06-29 19:02:04,848] Repair command #1 finished in 1 minute 2 seconds
error: Repair job has failed with the error message: [2017-06-29
19:02:04,842] Some repair failed
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message:
[2017-06-29 19:02:04,842] Some repair failed
at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
at org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListene
r.handleNotification(JMXNotificationProgressListener.java:77)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.
dispatchNotification(ClientNotifForwarder.java:583)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(
ClientNotifForwarder.java:533)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(
ClientNotifForwarder.java:452)
at com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(
ClientNotifForwarder.java:108)



FYI I am running repair from xx.xx.xx.91 node and its a 5 node cluster
xx.xx.xx.91-xx.xx.xx.95

On Wed, Jun 28, 2017 at 5:16 PM, Akhil Mehra  wrote:

> nodetool repair has a trace option
>
> nodetool repair -tr yourkeyspacename
>
> see if that provides you with additional information.
>
> Regards,
> Akhil
>
> On 28/06/2017, at 2:25 AM, Balaji Venkatesan 
> wrote:
>
>
> We use Apache Cassandra 3.10-13
>
> On Jun 26, 2017 8:41 PM, "Michael Shuler"  wrote:
>
> What version of Cassandra?
>
> --
> Michael
>
> On 06/26/2017 09:53 PM, Balaji Venkatesan wrote:
> > Hi All,
> >
> > When I run nodetool repair on a keyspace I constantly get  "Some repair
> > failed" error, there are no sufficient info to debug more. Any help?
> >
> > Here is the stacktrace
> >
> > ==
> > [2017-06-27 02:44:34,275] Some repair failed
> > [2017-06-27 02:44:34,279] Repair command #3 finished in 33 seconds
> > error: Repair job has failed with the error message: [2017-06-27
> > 02:44:34,275] Some repair failed
> > -- StackTrace --
> > java.lang.RuntimeException: Repair job has failed with the error
> > message: [2017-06-27 02:44:34,275] Some repair failed
> > at org.apache.cassandra.tools.RepairRunner.progress(RepairRunne
> r.java:116)
> > at
> > org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.
> handleNotification(JMXNotificationProgressListener.java:77)
> > at
> > 

Re: nodetool repair failure

2017-06-28 Thread Akhil Mehra
nodetool repair has a trace option 

nodetool repair -tr yourkeyspacename

see if that provides you with additional information.

Regards,
Akhil 

> On 28/06/2017, at 2:25 AM, Balaji Venkatesan  
> wrote:
> 
> 
> We use Apache Cassandra 3.10-13 
> 
> On Jun 26, 2017 8:41 PM, "Michael Shuler"  > wrote:
> What version of Cassandra?
> 
> --
> Michael
> 
> On 06/26/2017 09:53 PM, Balaji Venkatesan wrote:
> > Hi All,
> >
> > When I run nodetool repair on a keyspace I constantly get  "Some repair
> > failed" error, there are no sufficient info to debug more. Any help?
> >
> > Here is the stacktrace
> >
> > ==
> > [2017-06-27 02:44:34,275] Some repair failed
> > [2017-06-27 02:44:34,279] Repair command #3 finished in 33 seconds
> > error: Repair job has failed with the error message: [2017-06-27
> > 02:44:34,275] Some repair failed
> > -- StackTrace --
> > java.lang.RuntimeException: Repair job has failed with the error
> > message: [2017-06-27 02:44:34,275] Some repair failed
> > at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
> > at
> > org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
> > at
> > com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
> > at
> > com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
> > at
> > com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
> > at
> > com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)
> > ==
> >
> >
> > --
> > Thanks,
> > Balaji Venkatesan.
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
> 
> For additional commands, e-mail: user-h...@cassandra.apache.org 
> 
> 
> 



Re: nodetool repair failure

2017-06-27 Thread Balaji Venkatesan
We use Apache Cassandra 3.10-13

On Jun 26, 2017 8:41 PM, "Michael Shuler"  wrote:

What version of Cassandra?

--
Michael

On 06/26/2017 09:53 PM, Balaji Venkatesan wrote:
> Hi All,
>
> When I run nodetool repair on a keyspace I constantly get  "Some repair
> failed" error, there are no sufficient info to debug more. Any help?
>
> Here is the stacktrace
>
> ==
> [2017-06-27 02:44:34,275] Some repair failed
> [2017-06-27 02:44:34,279] Repair command #3 finished in 33 seconds
> error: Repair job has failed with the error message: [2017-06-27
> 02:44:34,275] Some repair failed
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error
> message: [2017-06-27 02:44:34,275] Some repair failed
> at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
> at
> org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListene
r.handleNotification(JMXNotificationProgressListener.java:77)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.
dispatchNotification(ClientNotifForwarder.java:583)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(
ClientNotifForwarder.java:533)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(
ClientNotifForwarder.java:452)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(
ClientNotifForwarder.java:108)
> ==
>
>
> --
> Thanks,
> Balaji Venkatesan.


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org


Re: nodetool repair failure

2017-06-26 Thread Michael Shuler
What version of Cassandra?

-- 
Michael

On 06/26/2017 09:53 PM, Balaji Venkatesan wrote:
> Hi All,
> 
> When I run nodetool repair on a keyspace I constantly get  "Some repair
> failed" error, there are no sufficient info to debug more. Any help? 
> 
> Here is the stacktrace
> 
> ==
> [2017-06-27 02:44:34,275] Some repair failed
> [2017-06-27 02:44:34,279] Repair command #3 finished in 33 seconds
> error: Repair job has failed with the error message: [2017-06-27
> 02:44:34,275] Some repair failed
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error
> message: [2017-06-27 02:44:34,275] Some repair failed
> at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
> at
> org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)
> ==
> 
> 
> -- 
> Thanks,
> Balaji Venkatesan.


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Nodetool repair

2016-09-29 Thread Li, Guangxing
Romain,

I was trying what you mentioned as below:

a. nodetool stop VALIDATION
b. echo run -b org.apache.cassandra.db:type=StorageService
forceTerminateAllRepairSessions | java -jar
/tmp/jmxterm/jmxterm-1.0-alpha-4-uber.jar
-l 127.0.0.1:7199

to stop a seemingly forever-going repair but seeing really odd behavior
with C* 2.0.9. Here is what I did:
1. First, I run 'nodetool tpstats' on all nodes in the cluster and seeing
only one node have 1 active pending AntiEntropySessions. All other nodes do
not have any pending or active AntiEntropySessions.
2. Then I grep 'Repair' on all logs on all nodes and seeing absolutely no
repair related activity in these logs for the past day.
3. Then on the node that has active AntiEntropySessions, I did steps 'a'
and 'b' above. Now, all of a sudden, I start seeing repair activity: on
nodes that did not have pending AntiEntropySessions, I am seeing the
following in their logs:
INFO [NonPeriodicTasks:1] 2016-09-29 17:12:53,469 StreamingRepairTask.java
(line 87) [repair #e80e17d0-8667-11e6-a801-e172d7a67134] streaming task
succeed, returning response to /10.253.2.166
On node 10.253.2.166 which has active pending AntiEntropySessions, I am
seeing the following in the log:
INFO [AntiEntropySessions:136] 2016-09-29 17:03:02,405 RepairSession.java
(line 282) [repair #812dafe0-8666-11e6-a801-e172d7a67134] session completed
successfully

So it seems to me that by doing forceTerminateAllRepairSessions, it
actually 'wakes up' the dormant repair so it goes again. So far, the only
way I have found to stop a repair is to restart the C* node where the
repair command was initiated.

Thanks.

George.

On Fri, Sep 23, 2016 at 6:20 AM, Romain Hardouin 
wrote:

> OK. If you still have issues after setting streaming_socket_timeout_in_ms
> != 0, consider increasing request_timeout_in_ms to a high value, say 1 or 2
> minutes. See comments in https://issues.apache.org/
> jira/browse/CASSANDRA-7904
> Regarding 2.1, be sure to test incremental repair on your data before
> running it in production ;-)
>
> Romain
>


Re: Nodetool repair

2016-09-23 Thread Romain Hardouin
OK. If you still have issues after setting streaming_socket_timeout_in_ms != 0, 
consider increasing request_timeout_in_ms to a high value, say 1 or 2 minutes. 
See comments in https://issues.apache.org/jira/browse/CASSANDRA-7904
Regarding 2.1, be sure to test incremental repair on your data before running
it in production ;-)
Romain

Re: Nodetool repair

2016-09-22 Thread Li, Guangxing
Thanks a lot, guys. That is lots of useful info to digest.
In my cassandra.yaml, request_timeout_in_ms is set to
1, streaming_socket_timeout_in_ms is not set hence takes the default of 0.
Looks like 2.1x has made quite some improvement on this area. Besides, I
can use incremental repair. So for right now, I will kill the repair using
JMX when it hangs. I am looking into upgrading to 2.1x.
Many thank again. Great stuff!

George.

On Thu, Sep 22, 2016 at 9:47 AM, Romain Hardouin 
wrote:

> Alain, you replied faster, I didn't see your answer :-D
>


Re: Nodetool repair

2016-09-22 Thread Romain Hardouin
Alain, you replied faster, I didn't see your answer :-D

Re: Nodetool repair

2016-09-22 Thread Alain RODRIGUEZ
As Matija mentioned, my coworker Alexander worked on Reaper. I believe the
branches of most interest would be:

Incremental repairs on Reaper:
https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-that-works
UI integration with incremental repairs on Reaper:
https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-support-with-ui

@George

When I check the log for pattern "session completed successfully" in
> system.log, I see the last finished range occurred in 14 hours ago. So I
> think it is safe to say that the repair has hanged somehow.
>

What is your current setting for 'streaming_socket_timeout_in_ms'. You
might want to be aware of
https://issues.apache.org/jira/browse/CASSANDRA-8611 and
https://issues.apache.org/jira/browse/CASSANDRA-11840

Depending on how long the streams are expected to be, you might want to try
'3600000 ms (1 hour)' if you are currently using 0, or increase the value it
is already set to if you think you might be hitting
https://issues.apache.org/jira/browse/CASSANDRA-11840

In order to start another repair, do we need to 'kill' this repair. If so,
> how do we do that?


Restarting the node is a straightforward way of doing that.

If you do not want to restart for some reason, you can use JMX (
forceTerminateAllRepairSessions). If you are going to use JMX and don't
know much about it, this video of the presentation done by Nate, another
coworker, at the Cassandra Summit 2016 might be of interest
https://www.youtube.com/watch?v=uiUThbonnpc=21=PLm-EPIkBI3YoiA-02vufoEj4CgYvIQgIk
.
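
As a minimal sketch, the JMX call can be issued from the shell with jmxterm
(the jar path, jmxterm version and the default JMX port 7199 are assumptions,
adjust them to your setup):

echo "run -b org.apache.cassandra.db:type=StorageService forceTerminateAllRepairSessions" | java -jar jmxterm-1.0-alpha-4-uber.jar -l 127.0.0.1:7199

Validation compactions already running on a node can be stopped separately
with nodetool stop VALIDATION.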

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2016-09-22 16:45 GMT+02:00 Li, Guangxing :

> Romain,
>
> I had another repair that seems to just hang last night. When I did 'nodetool
> tpstats' on nodes, I see the following in the node where I initiated the
> repair:
> AntiEntropySessions   1 1
> On all other nodes, I see:
> AntiEntropySessions   0 0
> When I check the log for pattern "session completed successfully" in
> system.log, I see the last finished range occurred in 14 hours ago. So I
> think it is safe to say that the repair has hanged somehow. In order to
> start another repair, do we need to 'kill' this repair. If so, how do we do
> that?
>
> Thanks.
>
> George.
>
> On Thu, Sep 22, 2016 at 6:23 AM, Romain Hardouin 
> wrote:
>
>> I meant that pending (and active) AntiEntropySessions are a simple way to
>> check if a repair is still running on a cluster. Also have a look at
>> Cassandra reaper:
>> - https://github.com/spotify/cassandra-reaper
>>
>> - https://github.com/spodkowinski/cassandra-reaper-ui
>>
>> Best,
>> Romain
>>
>>
>>
>> On Wednesday, September 21, 2016 at 10:32 PM, "Li, Guangxing" <
>> guangxing...@pearson.com> wrote:
>>
>> Romain,
>>
>> I started running a new repair. If I see such behavior again, I will try
>> what you mentioned.
>>
>> Thanks.
>>
>
>


Re: Nodetool repair

2016-09-22 Thread Li, Guangxing
Romain,

I had another repair that seems to just hang last night. When I did 'nodetool
tpstats' on nodes, I see the following in the node where I initiated the
repair:
AntiEntropySessions   1 1
On all other nodes, I see:
AntiEntropySessions   0 0
When I check the log for pattern "session completed successfully" in
system.log, I see the last finished range occurred in 14 hours ago. So I
think it is safe to say that the repair has hanged somehow. In order to
start another repair, do we need to 'kill' this repair. If so, how do we do
that?

Thanks.

George.

On Thu, Sep 22, 2016 at 6:23 AM, Romain Hardouin 
wrote:

> I meant that pending (and active) AntiEntropySessions are a simple way to
> check if a repair is still running on a cluster. Also have a look at
> Cassandra reaper:
> - https://github.com/spotify/cassandra-reaper
>
> - https://github.com/spodkowinski/cassandra-reaper-ui
>
> Best,
> Romain
>
>
>
> On Wednesday, September 21, 2016 at 10:32 PM, "Li, Guangxing" <
> guangxing...@pearson.com> wrote:
>
> Romain,
>
> I started running a new repair. If I see such behavior again, I will try
> what you mentioned.
>
> Thanks.
>


Re: Nodetool repair

2016-09-22 Thread Romain Hardouin
I meant that pending (and active) AntiEntropySessions are a simple way to check 
if a repair is still running on a cluster. Also have a look at Cassandra reaper:
- https://github.com/spotify/cassandra-reaper

- https://github.com/spodkowinski/cassandra-reaper-ui
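
As for the tpstats check itself, a quick sketch (run it on each node; the
AntiEntropySessions pool shows up in the 2.0/2.1 tpstats output used in this
thread):

nodetool tpstats | grep AntiEntropySessions

A node showing a non-zero active or pending count still has a repair session
running.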

Best,
Romain



On Wednesday, September 21, 2016 at 10:32 PM, "Li, Guangxing"
wrote:

Romain,

I started running a new repair. If I see such behavior again, I will try what 
you mentioned.

Thanks.


Re: Nodetool repair

2016-09-21 Thread Li, Guangxing
Romain,

I started running a new repair. If I see such behavior again, I will try
what you mentioned.

Thanks.

On Wed, Sep 21, 2016 at 9:51 AM, Romain Hardouin 
wrote:

> Do you see any pending AntiEntropySessions (not AntiEntropyStage) with
> nodetool tpstats on nodes?
>
> Romain
>
>
> On Wednesday, September 21, 2016 at 4:45 PM, "Li, Guangxing" <
> guangxing...@pearson.com> wrote:
>
>
> Alain,
>
> my script actually grep through all the log files, including those
> system.log.*. So it was probably due to a failed session. So now my script
> assumes the repair has finished (possibly due to failure) if it does not
> see any more repair related logs after 2 hours.
>
> Thanks.
>
> George.
>
> On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ 
> wrote:
>
> Hi George,
>
> That's the best way to monitor repairs "out of the box" I could think of.
> When you're not seeing 2048 (in your case), it might be due to log rotation
> or to a session failure. Have you had a look at repair failures?
>
> I am wondering why the implementor did not put something in the log (e.g.
> ... Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
>
> +1, and some informations about ranges successfully repaired and the
> ranges that failed could be a very good thing as well. It would be easy to
> then read the repair result and to know what to do next (re-run repair on
> some ranges, move to the next node, etc).
>
>
> 2016-09-20 17:00 GMT+02:00 Li, Guangxing :
>
> Hi,
>
> I am using version 2.0.9. I have been looking into the logs to see if a
> repair is finished. Each time a repair is started on a node, I am seeing
> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
> StorageService.java (line 2646) Starting repair command #41, repairing 2048
> ranges for keyspace groupmanager" in system.log. So I know that I am
> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
> #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
> Once I see 2048 such log lines, I know this repair has completed. But this
> is not dependable since sometimes I am seeing less than 2048 but I know
> there is no repair going on since I do not see any trace of repair in
> system.log for a long time. So it seems to me that there is a clear way to
> tell that a repair has started but there is no clear way to tell a repair
> has ended. The only thing you can do is to watch the log and if you do not
> see repair activity for a long time, the repair is done somehow. I am
> wondering why the implementor did not put something in the log (e.g. ...
> Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
> Thanks.
>
> George.
>
> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:
>
> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ 
> wrote:
>
> ...
>
> - The size of your data
> - The number of vnodes
> - The compaction throughput
> - The streaming throughput
> - The hardware available
> - The load of the cluster
> - ...
>
>
> I've also heard that the number of clustering keys per partition key could
> have an impact. Might be worth investigating.
>
> Cheers,
> Jens
> --
> Jens Rantil
> Backend Developer @ Tink
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
>
>
>
>
>
>
>


Re: Nodetool repair

2016-09-21 Thread Romain Hardouin
Do you see any pending AntiEntropySessions (not AntiEntropyStage) with nodetool 
tpstats on nodes?
Romain
 

On Wednesday, September 21, 2016 at 4:45 PM, "Li, Guangxing"
wrote:
 

 Alain,
my script actually grep through all the log files, including those 
system.log.*. So it was probably due to a failed session. So now my script 
assumes the repair has finished (possibly due to failure) if it does not see 
any more repair related logs after 2 hours.
Thanks.
George.
On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ  wrote:

Hi George,
That's the best way to monitor repairs "out of the box" I could think of. When 
you're not seeing 2048 (in your case), it might be due to log rotation or to a 
session failure. Have you had a look at repair failures?

I am wondering why the implementor did not put something in the log (e.g. ... 
Repair command #41 has ended...) to clearly state that the repair has completed.

+1, and some informations about ranges successfully repaired and the ranges 
that failed could be a very good thing as well. It would be easy to then read 
the repair result and to know what to do next (re-run repair on some ranges, 
move to the next node, etc).

2016-09-20 17:00 GMT+02:00 Li, Guangxing :

Hi,
I am using version 2.0.9. I have been looking into the logs to see if a repair 
is finished. Each time a repair is started on a node, I am seeing log line like 
"INFO [Thread-112920] 2016-09-16 19:00:43,805 StorageService.java (line 2646) 
Starting repair command #41, repairing 2048 ranges for keyspace groupmanager" 
in system.log. So I know that I am expecting to see 2048 log lines like "INFO 
[AntiEntropySessions:109] 2016-09-16 19:27:20,662 RepairSession.java (line 282) 
[repair #8b910950-7c43-11e6-88f3-f147ea74230b] session completed 
successfully". Once I see 2048 such log lines, I know this repair has 
completed. But this is not dependable since sometimes I am seeing less than 
2048 but I know there is no repair going on since I do not see any trace of 
repair in system.log for a long time. So it seems to me that there is a clear 
way to tell that a repair has started but there is no clear way to tell a 
repair has ended. The only thing you can do is to watch the log and if you do 
not see repair activity for a long time, the repair is done somehow. I am 
wondering why the implementor did not put something in the log (e.g. ... Repair 
command #41 has ended...) to clearly state that the repair has completed.
Thanks.
George.
On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:

On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ  wrote:

...
- The size of your data
- The number of vnodes
- The compaction throughput
- The streaming throughput
- The hardware available
- The load of the cluster
- ...

I've also heard that the number of clustering keys per partition key could have 
an impact. Might be worth investigating.
Cheers,
Jens
--
Jens Rantil
Backend Developer @ Tink
Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.







   

Re: Nodetool repair

2016-09-21 Thread Li, Guangxing
Alain,

my script actually grep through all the log files, including those
system.log.*. So it was probably due to a failed session. So now my script
assumes the repair has finished (possibly due to failure) if it does not
see any more repair related logs after 2 hours.
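
As a minimal sketch of that kind of log check (the log path is the default one
used elsewhere in this thread; adjust it to your install):

grep "session completed successfully" /var/log/cassandra/system.log* | wc -l

Comparing that count with the number of ranges reported by the "Starting
repair command" line gives a rough idea of how far the repair has progressed.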

Thanks.

George.

On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ  wrote:

> Hi George,
>
> That's the best way to monitor repairs "out of the box" I could think of.
> When you're not seeing 2048 (in your case), it might be due to log rotation
> or to a session failure. Have you had a look at repair failures?
>
> I am wondering why the implementor did not put something in the log (e.g.
>> ... Repair command #41 has ended...) to clearly state that the repair has
>> completed.
>
>
> +1, and some informations about ranges successfully repaired and the
> ranges that failed could be a very good thing as well. It would be easy to
> then read the repair result and to know what to do next (re-run repair on
> some ranges, move to the next node, etc).
>
>
> 2016-09-20 17:00 GMT+02:00 Li, Guangxing :
>
>> Hi,
>>
>> I am using version 2.0.9. I have been looking into the logs to see if a
>> repair is finished. Each time a repair is started on a node, I am seeing
>> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
>> StorageService.java (line 2646) Starting repair command #41, repairing 2048
>> ranges for keyspace groupmanager" in system.log. So I know that I am
>> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
>> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
>> #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
>> Once I see 2048 such log lines, I know this repair has completed. But this
>> is not dependable since sometimes I am seeing less than 2048 but I know
>> there is no repair going on since I do not see any trace of repair in
>> system.log for a long time. So it seems to me that there is a clear way to
>> tell that a repair has started but there is no clear way to tell a repair
>> has ended. The only thing you can do is to watch the log and if you do not
>> see repair activity for a long time, the repair is done somehow. I am
>> wondering why the implementor did not put something in the log (e.g. ...
>> Repair command #41 has ended...) to clearly state that the repair has
>> completed.
>>
>> Thanks.
>>
>> George.
>>
>> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:
>>
>>> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ 
>>> wrote:
>>>
>>> ...
>>>
 - The size of your data
 - The number of vnodes
 - The compaction throughput
 - The streaming throughput
 - The hardware available
 - The load of the cluster
 - ...

>>>
>>> I've also heard that the number of clustering keys per partition key
>>> could have an impact. Might be worth investigating.
>>>
>>> Cheers,
>>> Jens
>>> --
>>>
>>> Jens Rantil
>>> Backend Developer @ Tink
>>>
>>> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
>>> For urgent matters you can reach me at +46-708-84 18 32.
>>>
>>
>>
>


Re: Nodetool repair

2016-09-21 Thread Alain RODRIGUEZ
Hi George,

That's the best way to monitor repairs "out of the box" I could think of.
When you're not seeing 2048 (in your case), it might be due to log rotation
or to a session failure. Have you had a look at repair failures?

I am wondering why the implementor did not put something in the log (e.g.
> ... Repair command #41 has ended...) to clearly state that the repair has
> completed.


+1, and some informations about ranges successfully repaired and the ranges
that failed could be a very good thing as well. It would be easy to then
read the repair result and to know what to do next (re-run repair on some
ranges, move to the next node, etc).


2016-09-20 17:00 GMT+02:00 Li, Guangxing :

> Hi,
>
> I am using version 2.0.9. I have been looking into the logs to see if a
> repair is finished. Each time a repair is started on a node, I am seeing
> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
> StorageService.java (line 2646) Starting repair command #41, repairing 2048
> ranges for keyspace groupmanager" in system.log. So I know that I am
> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
> #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
> Once I see 2048 such log lines, I know this repair has completed. But this
> is not dependable since sometimes I am seeing less than 2048 but I know
> there is no repair going on since I do not see any trace of repair in
> system.log for a long time. So it seems to me that there is a clear way to
> tell that a repair has started but there is no clear way to tell a repair
> has ended. The only thing you can do is to watch the log and if you do not
> see repair activity for a long time, the repair is done somehow. I am
> wondering why the implementor did not put something in the log (e.g. ...
> Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
> Thanks.
>
> George.
>
> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:
>
>> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ 
>> wrote:
>>
>> ...
>>
>>> - The size of your data
>>> - The number of vnodes
>>> - The compaction throughput
>>> - The streaming throughput
>>> - The hardware available
>>> - The load of the cluster
>>> - ...
>>>
>>
>> I've also heard that the number of clustering keys per partition key
>> could have an impact. Might be worth investigating.
>>
>> Cheers,
>> Jens
>> --
>>
>> Jens Rantil
>> Backend Developer @ Tink
>>
>> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
>> For urgent matters you can reach me at +46-708-84 18 32.
>>
>
>


Re: Nodetool repair

2016-09-20 Thread Li, Guangxing
Hi,

I am using version 2.0.9. I have been looking into the logs to see if a
repair is finished. Each time a repair is started on a node, I am seeing
log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
StorageService.java (line 2646) Starting repair command #41, repairing 2048
ranges for keyspace groupmanager" in system.log. So I know that I am
expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
#8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
Once I see 2048 such log lines, I know this repair has completed. But this
is not dependable since sometimes I am seeing less than 2048 but I know
there is no repair going on since I do not see any trace of repair in
system.log for a long time. So it seems to me that there is a clear way to
tell that a repair has started but there is no clear way to tell a repair
has ended. The only thing you can do is to watch the log and if you do not
see repair activity for a long time, the repair is done somehow. I am
wondering why the implementor did not put something in the log (e.g. ...
Repair command #41 has ended...) to clearly state that the repair has
completed.

Thanks.

George.

On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:

> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ 
> wrote:
>
> ...
>
>> - The size of your data
>> - The number of vnodes
>> - The compaction throughput
>> - The streaming throughput
>> - The hardware available
>> - The load of the cluster
>> - ...
>>
>
> I've also heard that the number of clustering keys per partition key could
> have an impact. Might be worth investigating.
>
> Cheers,
> Jens
> --
>
> Jens Rantil
> Backend Developer @ Tink
>
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
>


Re: Nodetool repair

2016-09-20 Thread Jens Rantil
On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ  wrote:

...

> - The size of your data
> - The number of vnodes
> - The compaction throughput
> - The streaming throughput
> - The hardware available
> - The load of the cluster
> - ...
>

I've also heard that the number of clustering keys per partition key could
have an impact. Might be worth investigating.

Cheers,
Jens
-- 

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.


Re: Nodetool repair

2016-09-19 Thread Alain RODRIGUEZ
Hi Lokesh,

Repair is a regular, very common and yet non trivial operations in
Cassandra. A lot of people are struggling with it.

Some good talks were done about repairs during the summit, you might want
to have a look in the Datastax youtube channel in a few days :-).
https://www.youtube.com/user/DataStaxMedia

Is there a way to know in advance the ETA of manual repair before
> triggering it
>

There is not such a thing. And it is probably because the duration of the
repair is going to depend on:

- The size of your data
- The number of vnodes
- The compaction throughput
- The streaming throughput
- The hardware available
- The load of the cluster
- ...

So the best thing to do is to benchmark it in your own environment. You can
track repairs using logs. I used something like that in the past:

for i in $(echo "SELECT columnfamily_name FROM system.schema_columns WHERE
keyspace_name = ‘my_keyspace';" | cqlsh | uniq | tail -n +4 | head -n -2);
do echo Sessions synced for $i: $(grep -i "$i is fully synced"
/var/log/cassandra/system.log* | wc -l); done

Depending on your version of Cassandra - and the path to your logs - this
might work or not, you might need to adjust it. The number of "sessions"
depends on the number of nodes and of vnodes. But the number of session
will be the same on all the tables, from all the nodes if you are using the
same number of vnodes.

So you will soon have a good idea on how long it takes to repair a table /
a keyspace and some informations about the completeness of the repairs (be
aware of the rotations in the logs and of the previous repairs logs if
using the command above).

How fast repair can go will also depend on the options and techniques you
are using:

- Subranges: https://github.com/BrianGallew/cassandra_range_repair ?
- Incremental / Full repairs ?

I believe repair performs following operations -
>
> 1) Major compaction
> 2) Exchange of merkle trees with neighbouring nodes.
>

 AFAIK, a repair doesn't trigger a major compaction, but I might be wrong
> here.


Jens is right, no major compaction in there. This is how repairs (roughly)
works. There are 2 main steps:

- Compare / exchange merkle trees (done through a VALIDATION compaction,
like a compaction, but without the write phase)
- Streaming: Any mismatch detected in the previous validation is fixed by
streaming a larger block of data (read more about that:
http://www.datastax.com/dev/blog/advanced-repair-techniques)

To monitor those operations use

- validation: nodetool compactionstats -H (Look for "VALIDATION COMPACTION"
off the top of my head)
- streaming: watch -d 'nodetool netstats -H | grep -v 100%'

You should think about what would be a good repair strategy according to
your use case and workload (run repairs by night ? Use subranges ?). Keep
in mind that "nodetool repair" is useful to reduce entropy in your cluster,
and so reducing the risk of inconsistencies. Repair also prevents deleted
data from reappearing (Zombies) as long as it is run cluster-wide within
gc_grace_seconds (per table option).
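
For reference, gc_grace_seconds can be checked and adjusted per table; a
minimal sketch (keyspace/table names are placeholders, 864000 seconds / 10
days is the default):

cqlsh -e "ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = 864000;"

Whatever value is used, a full repair should complete across the cluster
within that window so tombstones reach every replica before they are purged.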

What if I kill the process in the middle?


This is safe, some parts of the data will not be repair on this node,
that's it. You can either restart the node or find the right JMX command.

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-19 11:18 GMT+02:00 Jens Rantil :

> Hi Lokesh,
>
> Which version of Cassandra are you using? Which compaction strategy are
> you using?
>
> AFAIK, a repair doesn't trigger a major compaction, but I might be wrong
> here.
>
> What you could do is to run a repair for a subset of the ring (see `-st`
> and `-et` `nodetool repair` parameters). If you repair 1/1000 of the ring,
> repairing the whole ring will take ~1000x longer than your sample.
>
> Also, you might want to look at incremental repairs.
>
> If you kill the process in the middle the repair will not start again. You
> will need to reissue it.
>
> Cheers,
> Jens
>
> On Sun, Sep 18, 2016 at 2:58 PM Lokesh Shrivastava <
> lokesh.shrivast...@gmail.com> wrote:
>
>> Hi,
>>
>> I tried to run nodetool repair command on one of my keyspaces and found
>> that it took lot more time than I anticipated. Is there a way to know in
>> advance the ETA of manual repair before triggering it? I believe repair
>> performs following operations -
>>
>> 1) Major compaction
>> 2) Exchange of merkle trees with neighbouring nodes.
>>
>> Is there any other operation performed during manual repair? What if I
>> kill the process in the middle?
>>
>> Thanks.
>> Lokesh
>>
> --
>
> Jens Rantil
> Backend Developer @ Tink
>
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
>


Re: Nodetool repair

2016-09-19 Thread Jens Rantil
Hi Lokesh,

Which version of Cassandra are you using? Which compaction strategy are you
using?

AFAIK, a repair doesn't trigger a major compaction, but I might be wrong
here.

What you could do is to run a repair for a subset of the ring (see `-st`
and `-et` `nodetool repair` parameters). If you repair 1/1000 of the ring,
repairing the whole ring will take ~1000x longer than your sample.
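
A minimal sketch of such a subrange run (the keyspace name and token values are
placeholders; the tokens must fall within ranges the node is a replica for):

nodetool repair -st 0 -et 100000000000000000 my_keyspace

Repeating this over consecutive token ranges until the whole ring is covered is
what subrange repair tools (like the cassandra_range_repair script mentioned
elsewhere in this thread) automate.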

Also, you might want to look at incremental repairs.

If you kill the process in the middle the repair will not start again. You
will need to reissue it.

Cheers,
Jens

On Sun, Sep 18, 2016 at 2:58 PM Lokesh Shrivastava <
lokesh.shrivast...@gmail.com> wrote:

> Hi,
>
> I tried to run nodetool repair command on one of my keyspaces and found
> that it took lot more time than I anticipated. Is there a way to know in
> advance the ETA of manual repair before triggering it? I believe repair
> performs following operations -
>
> 1) Major compaction
> 2) Exchange of merkle trees with neighbouring nodes.
>
> Is there any other operation performed during manual repair? What if I
> kill the process in the middle?
>
> Thanks.
> Lokesh
>
-- 

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.


Re: nodetool repair uses option '-local' and '-pr' togather

2016-09-05 Thread Paulo Motta
You're right Christopher, I missed the fact that with RF=3 NTS will always
place a replica on us-east-1d, so in this case repair on this node would be
sufficient. Thanks for clarifying!

2016-09-05 11:28 GMT-03:00 Christopher Bradford :

> If each AZ has a different rack identifier and the keyspace uses
> NetworkTopologyStrategy with a replication factor of 3 then the single host
> in us-east-1d *will receive 100% of the data*. This is due
> to NetworkTopologyStrategy's preference for placing replicas across
> different racks before placing a second replica in a rack where data
> already resides. Check it out with CCM:
>
> > ccm node1 status
>
> Datacenter: us-east-1
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  127.0.0.1  98.31 KiB  1       40.0%             a887ef23-c7ea-4f7a-94a4-1ed12b1caa38  us-east-1b
> UN  127.0.0.2  98.31 KiB  1       40.0%             30152aaa-cc5e-485d-9b98-1c51f1141155  us-east-1b
> UN  127.0.0.3  98.3 KiB   1       40.0%             8e1f68f7-571e-4479-bb1f-1ed526fefa9e  us-east-1c
> UN  127.0.0.4  98.31 KiB  1       40.0%             1c9b45ed-02ca-48b5-b619-a87107ff8eba  us-east-1c
> UN  127.0.0.5  98.31 KiB  1       40.0%             2a33751a-c718-44fc-8442-cce9996ebc0c  us-east-1d
>
> cqlsh> CREATE KEYSPACE replication_test WITH replication = {'class':
> 'NetworkTopologyStrategy', 'us-east-1': 3};
>
> > ccm node1 status replication_test
> Datacenter: us-east-1
> =====================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  127.0.0.1  88.38 KiB  1       80.0%             a887ef23-c7ea-4f7a-94a4-1ed12b1caa38  us-east-1b
> UN  127.0.0.2  98.31 KiB  1       20.0%             30152aaa-cc5e-485d-9b98-1c51f1141155  us-east-1b
> UN  127.0.0.3  98.3 KiB   1       80.0%             8e1f68f7-571e-4479-bb1f-1ed526fefa9e  us-east-1c
> UN  127.0.0.4  98.31 KiB  1       20.0%             1c9b45ed-02ca-48b5-b619-a87107ff8eba  us-east-1c
> UN  127.0.0.5  98.31 KiB  1       100.0%            2a33751a-c718-44fc-8442-cce9996ebc0c  us-east-1d
>
> This can be tested further with a simple table and nodetool getendpoints.
>
> > ccm node1 nodetool getendpoints replication_test sample bar
>
> 127.0.0.2
> 127.0.0.3
> 127.0.0.5
>
> > ccm node1 nodetool getendpoints replication_test sample baz
>
> 127.0.0.1
> 127.0.0.3
> 127.0.0.5
>
> > ccm node1 nodetool getendpoints replication_test sample bif
>
> 127.0.0.3
> 127.0.0.5
> 127.0.0.1
>
> > ccm node1 nodetool getendpoints replication_test sample biz
>
> 127.0.0.2
> 127.0.0.3
> 127.0.0.5
>
> On Fri, Sep 2, 2016 at 9:41 AM Paulo Motta 
> wrote:
>
>> If I understand the way replication is done, the node in us-east-1d has
>> all the (data) replicas, right?
>>
>> No, for this to be correct, you'd need to have one DC per AZ, which is
>> not this case since you have a single DC encompassing multiple AZs. Right
>> now, replicas will be spread in 3 distinct AZs, which are represented as
>> racks in the single NTS DC if you are using EC2*Snitch. So your best bet is
>> probably to run repair -pr in all nodes.
>>
>>
>> 2016-09-01 14:28 GMT-03:00 Li, Guangxing :
>>
>>> Thanks for the info, Paulo.
>>>
>>> My cluster is in AWS, the keyspace has replication factor 3 with
>>> NetworkTopologyStrategy in one DC which have 5 nodes: 2 in us-east-1b, 2 in
>>> us-east-1c and 1 in us-east-1d. If I understand the way replication is
>>> done, the node in us-east-1d has all the (data) replicas, right? If so, if
>>> I do not use '-pr' option, would it be enough to run 'nodetool repair' ONLY
>>> on the node in us-east-1d? In other words, does 'nodetool repair' started
>>> on node in us-east-1d also cause repairs on replicas on other nodes? I am
>>> seeing different answers in discussion like this
>>> http://dba.stackexchange.com/questions/82414/do-you-hav
>>> e-to-run-nodetool-repair-on-every-node.
>>>
>>> Thanks again.
>>>
>>> George
>>>
>>> On Thu, Sep 1, 2016 at 10:22 AM, Paulo Motta 
>>> wrote:
>>>
 https://issues.apache.org/jira/browse/CASSANDRA-7450

 2016-09-01 13:11 GMT-03:00 Li, Guangxing :

> Hi,
>
> I have a cluster running 2.0.9 with 2 data centers. I noticed that
> 'nodetool repair -pr keyspace cf' runs very slow (OpsCenter shows that the
> node's data size is 39 GB and the largest SSTable size is like 7 GB so the
> column family is not huge, SizeTieredCompactionStrategy is used). 
> Repairing
> a column family on a single node takes over 5 hours. So I am wondering if 
> I
> can use option '-local' and '-pr' together, hoping to get some speed up.
> But according to documentation at https://docs.datastax.com/e
> n/cassandra/2.0/cassandra/tools/toolsRepair.html '...Do not use -pr

Re: nodetool repair uses option '-local' and '-pr' togather

2016-09-05 Thread Christopher Bradford
If each AZ has a different rack identifier and the keyspace uses
NetworkTopologyStrategy with a replication factor of 3 then the single host
in us-east-1d *will receive 100% of the data*. This is due
to NetworkTopologyStrategy's preference for placing replicas across
different racks before placing a second replica in a rack where data
already resides. Check it out with CCM:

> ccm node1 status

Datacenter: us-east-1
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  98.31 KiB  1       40.0%             a887ef23-c7ea-4f7a-94a4-1ed12b1caa38  us-east-1b
UN  127.0.0.2  98.31 KiB  1       40.0%             30152aaa-cc5e-485d-9b98-1c51f1141155  us-east-1b
UN  127.0.0.3  98.3 KiB   1       40.0%             8e1f68f7-571e-4479-bb1f-1ed526fefa9e  us-east-1c
UN  127.0.0.4  98.31 KiB  1       40.0%             1c9b45ed-02ca-48b5-b619-a87107ff8eba  us-east-1c
UN  127.0.0.5  98.31 KiB  1       40.0%             2a33751a-c718-44fc-8442-cce9996ebc0c  us-east-1d

cqlsh> CREATE KEYSPACE replication_test WITH replication = {'class':
'NetworkTopologyStrategy', 'us-east-1': 3};

> ccm node1 status replication_test
Datacenter: us-east-1
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  88.38 KiB  1       80.0%             a887ef23-c7ea-4f7a-94a4-1ed12b1caa38  us-east-1b
UN  127.0.0.2  98.31 KiB  1       20.0%             30152aaa-cc5e-485d-9b98-1c51f1141155  us-east-1b
UN  127.0.0.3  98.3 KiB   1       80.0%             8e1f68f7-571e-4479-bb1f-1ed526fefa9e  us-east-1c
UN  127.0.0.4  98.31 KiB  1       20.0%             1c9b45ed-02ca-48b5-b619-a87107ff8eba  us-east-1c
UN  127.0.0.5  98.31 KiB  1       100.0%            2a33751a-c718-44fc-8442-cce9996ebc0c  us-east-1d

This can be tested further with a simple table and nodetool getendpoints.

> ccm node1 nodetool getendpoints replication_test sample bar

127.0.0.2
127.0.0.3
127.0.0.5

> ccm node1 nodetool getendpoints replication_test sample baz

127.0.0.1
127.0.0.3
127.0.0.5

> ccm node1 nodetool getendpoints replication_test sample bif

127.0.0.3
127.0.0.5
127.0.0.1

> ccm node1 nodetool getendpoints replication_test sample biz

127.0.0.2
127.0.0.3
127.0.0.5

On Fri, Sep 2, 2016 at 9:41 AM Paulo Motta  wrote:

> If I understand the way replication is done, the node in us-east-1d has
> all the (data) replicas, right?
>
> No, for this to be correct, you'd need to have one DC per AZ, which is not
> this case since you have a single DC encompassing multiple AZs. Right now,
> replicas will be spread in 3 distinct AZs, which are represented as racks
> in the single NTS DC if you are using EC2*Snitch. So your best bet is
> probably to run repair -pr in all nodes.
>
>
> 2016-09-01 14:28 GMT-03:00 Li, Guangxing :
>
>> Thanks for the info, Paulo.
>>
>> My cluster is in AWS, the keyspace has replication factor 3 with
>> NetworkTopologyStrategy in one DC which have 5 nodes: 2 in us-east-1b, 2 in
>> us-east-1c and 1 in us-east-1d. If I understand the way replication is
>> done, the node in us-east-1d has all the (data) replicas, right? If so, if
>> I do not use '-pr' option, would it be enough to run 'nodetool repair' ONLY
>> on the node in us-east-1d? In other words, does 'nodetool repair' started
>> on node in us-east-1d also cause repairs on replicas on other nodes? I am
>> seeing different answers in discussion like this
>> http://dba.stackexchange.com/questions/82414/do-you-have-to-run-nodetool-repair-on-every-node
>> .
>>
>> Thanks again.
>>
>> George
>>
>> On Thu, Sep 1, 2016 at 10:22 AM, Paulo Motta 
>> wrote:
>>
>>> https://issues.apache.org/jira/browse/CASSANDRA-7450
>>>
>>> 2016-09-01 13:11 GMT-03:00 Li, Guangxing :
>>>
 Hi,

 I have a cluster running 2.0.9 with 2 data centers. I noticed that
 'nodetool repair -pr keyspace cf' runs very slow (OpsCenter shows that the
 node's data size is 39 GB and the largest SSTable size is like 7 GB so the
 column family is not huge, SizeTieredCompactionStrategy is used). Repairing
 a column family on a single node takes over 5 hours. So I am wondering if I
 can use option '-local' and '-pr' together, hoping to get some speed up.
 But according to documentation at
 https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsRepair.html
 '...Do not use -pr with this option to repair only a local data
 center...'. Can someone tell me the reason why we should not use options
 '-local' and '-pr' together?

 Thanks.

 George

>>>
>>>
>>
>


Re: nodetool repair uses option '-local' and '-pr' togather

2016-09-02 Thread Paulo Motta
 If I understand the way replication is done, the node in us-east-1d has
all the (data) replicas, right?

No, for this to be correct, you'd need to have one DC per AZ, which is not
this case since you have a single DC encompassing multiple AZs. Right now,
replicas will be spread in 3 distinct AZs, which are represented as racks
in the single NTS DC if you are using EC2*Snitch. So your best bet is
probably to run repair -pr in all nodes.
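
A minimal sketch of that (the keyspace name is a placeholder; run it on every
node in the cluster, ideally one node at a time):

nodetool repair -pr my_keyspace

Since -pr limits each run to the node's primary ranges, the full token range is
only covered once every node has completed its run.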


2016-09-01 14:28 GMT-03:00 Li, Guangxing :

> Thanks for the info, Paulo.
>
> My cluster is in AWS, the keyspace has replication factor 3 with
> NetworkTopologyStrategy in one DC which have 5 nodes: 2 in us-east-1b, 2 in
> us-east-1c and 1 in us-east-1d. If I understand the way replication is
> done, the node in us-east-1d has all the (data) replicas, right? If so, if
> I do not use '-pr' option, would it be enough to run 'nodetool repair' ONLY
> on the node in us-east-1d? In other words, does 'nodetool repair' started
> on node in us-east-1d also cause repairs on replicas on other nodes? I am
> seeing different answers in discussion like this http://dba.stackexchange.
> com/questions/82414/do-you-have-to-run-nodetool-repair-on-every-node.
>
> Thanks again.
>
> George
>
> On Thu, Sep 1, 2016 at 10:22 AM, Paulo Motta 
> wrote:
>
>> https://issues.apache.org/jira/browse/CASSANDRA-7450
>>
>> 2016-09-01 13:11 GMT-03:00 Li, Guangxing :
>>
>>> Hi,
>>>
>>> I have a cluster running 2.0.9 with 2 data centers. I noticed that
>>> 'nodetool repair -pr keyspace cf' runs very slow (OpsCenter shows that the
>>> node's data size is 39 GB and the largest SSTable size is like 7 GB so the
>>> column family is not huge, SizeTieredCompactionStrategy is used). Repairing
>>> a column family on a single node takes over 5 hours. So I am wondering if I
>>> can use option '-local' and '-pr' together, hoping to get some speed up.
>>> But according to documentation at https://docs.datastax.com/e
>>> n/cassandra/2.0/cassandra/tools/toolsRepair.html '...Do not use -pr
>>> with this option to repair only a local data center...'. Can someone tell
>>> me the reason why we should not use options '-local' and '-pr' together?
>>>
>>> Thanks.
>>>
>>> George
>>>
>>
>>
>


Re: nodetool repair uses option '-local' and '-pr' together

2016-09-01 Thread Li, Guangxing
Thanks for the info, Paulo.

My cluster is in AWS; the keyspace has replication factor 3 with
NetworkTopologyStrategy in one DC, which has 5 nodes: 2 in us-east-1b, 2 in
us-east-1c and 1 in us-east-1d. If I understand the way replication is
done, the node in us-east-1d has all the (data) replicas, right? If so, if
I do not use '-pr' option, would it be enough to run 'nodetool repair' ONLY
on the node in us-east-1d? In other words, does 'nodetool repair' started
on node in us-east-1d also cause repairs on replicas on other nodes? I am
seeing different answers in discussion like this
http://dba.stackexchange.com/questions/82414/do-you-have-to-run-nodetool-repair-on-every-node
.

Thanks again.

George

On Thu, Sep 1, 2016 at 10:22 AM, Paulo Motta 
wrote:

> https://issues.apache.org/jira/browse/CASSANDRA-7450
>
> 2016-09-01 13:11 GMT-03:00 Li, Guangxing :
>
>> Hi,
>>
>> I have a cluster running 2.0.9 with 2 data centers. I noticed that
>> 'nodetool repair -pr keyspace cf' runs very slow (OpsCenter shows that the
>> node's data size is 39 GB and the largest SSTable size is like 7 GB so the
>> column family is not huge, SizeTieredCompactionStrategy is used). Repairing
>> a column family on a single node takes over 5 hours. So I am wondering if I
>> can use option '-local' and '-pr' together, hoping to get some speed up.
>> But according to documentation at https://docs.datastax.com/e
>> n/cassandra/2.0/cassandra/tools/toolsRepair.html '...Do not use -pr with
>> this option to repair only a local data center...'. Can someone tell me the
>> reason why we should not use options '-local' and '-pr' together?
>>
>> Thanks.
>>
>> George
>>
>
>


Re: nodetool repair uses option '-local' and '-pr' together

2016-09-01 Thread Paulo Motta
https://issues.apache.org/jira/browse/CASSANDRA-7450

2016-09-01 13:11 GMT-03:00 Li, Guangxing :

> Hi,
>
> I have a cluster running 2.0.9 with 2 data centers. I noticed that
> 'nodetool repair -pr keyspace cf' runs very slow (OpsCenter shows that the
> node's data size is 39 GB and the largest SSTable size is like 7 GB so the
> column family is not huge, SizeTieredCompactionStrategy is used). Repairing
> a column family on a single node takes over 5 hours. So I am wondering if I
> can use option '-local' and '-pr' together, hoping to get some speed up.
> But according to documentation at https://docs.datastax.com/
> en/cassandra/2.0/cassandra/tools/toolsRepair.html '...Do not use -pr with
> this option to repair only a local data center...'. Can someone tell me the
> reason why we should not use options '-local' and '-pr' together?
>
> Thanks.
>
> George
>


Re: nodetool repair with -pr and -dc

2016-08-19 Thread Jérôme Mainaud
Hi Romain,

Thank you for your answer, I will open a ticket soon.

Best

-- 
Jérôme Mainaud
jer...@mainaud.com

2016-08-19 12:16 GMT+02:00 Romain Hardouin :

> Hi Jérôme,
>
> The code in 2.2.6 allows -local and -pr:
> https://github.com/apache/cassandra/blob/cassandra-2.2.
> 6/src/java/org/apache/cassandra/service/StorageService.java#L2899
>
> But... the options validation introduced in CASSANDRA-6455 seems to break
> this feature!
> https://github.com/apache/cassandra/blob/cassandra-2.2.
> 6/src/java/org/apache/cassandra/repair/messages/RepairOption.java#L211
>
> I suggest to open a ticket https://issues.apache.org/
> jira/browse/cassandra/
>
> Best,
>
> Romain
>
>
> On Friday, 19 August 2016 at 11:47, Jérôme Mainaud  wrote:
>
>
> Hello,
>
> I've got a repair command with both -pr and -local rejected on an 2.2.6
> cluster.
> The exact command was : nodetool repair --full -par -pr -local -j 4
>
> The message is  “You need to run primary range repair on all nodes in the
> cluster”.
>
> Reading the code and previously cited CASSANDRA-7450, it should have been
> accepted.
>
> Did anyone meet this error before ?
>
> Thanks
>
>
> --
> Jérôme Mainaud
> jer...@mainaud.com
>
> 2016-08-12 1:14 GMT+02:00 kurt Greaves :
>
> -D does not do what you think it does. I've quoted the relevant
> documentation from the README:
>
>
> Multiple
> Datacenters
> If you have multiple datacenters in your ring, then you MUST specify the
> name of the datacenter containing the node you are repairing as part of the
> command-line options (--datacenter=DCNAME). Failure to do so will result in
> only a subset of your data being repaired (approximately
> data/number-of-datacenters). This is because nodetool has no way to
> determine the relevant DC on its own, which in turn means it will use the
> tokens from every ring member in every datacenter.
>
>
>
> On 11 August 2016 at 12:24, Paulo Motta  wrote:
>
> > if we want to use -pr option ( which i suppose we should to prevent
> duplicate checks) in 2.0 then if we run the repair on all nodes in a single
> DC then it should be sufficient and we should not need to run it on all
> nodes across DC's?
>
> No, because the primary ranges of the nodes in other DCs will be missing
> repair, so you should either run with -pr in all nodes in all DCs, or
> restrict repair to a specific DC with -local (and have duplicate checks).
> Combined -pr and -local are only supported on 2.1
>
>
> 2016-08-11 1:29 GMT-03:00 Anishek Agarwal :
>
> ok thanks, so if we want to use -pr option ( which i suppose we should to
> prevent duplicate checks) in 2.0 then if we run the repair on all nodes in
> a single DC then it should be sufficient and we should not need to run it
> on all nodes across DC's ?
>
>
>
> On Wed, Aug 10, 2016 at 5:01 PM, Paulo Motta 
> wrote:
>
> On 2.0 repair -pr option is not supported together with -local, -hosts or
> -dc, since it assumes you need to repair all nodes in all DCs and it will
> throw and error if you try to run with nodetool, so perhaps there's
> something wrong with range_repair options parsing.
>
> On 2.1 it was added support to simultaneous -pr and -local options on
> CASSANDRA-7450, so if you need that you can either upgade to 2.1 or
> backport that to 2.0.
>
>
> 2016-08-10 5:20 GMT-03:00 Anishek Agarwal :
>
> Hello,
>
> We have 2.0.17 cassandra cluster(*DC1*) with a cross dc setup with a
> smaller cluster(*DC2*).  After reading various blogs about
> scheduling/running repairs looks like its good to run it with the following
>
>
> -pr for primary range only
> -st -et for sub ranges
> -par for parallel
> -dc to make sure we can schedule repairs independently on each Data centre
> we have.
>
> i have configured the above using the repair utility @ 
> https://github.com/BrianGallew
> /cassandra_range_repair.git
> 
>
> which leads to the following command :
>
> ./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H localhost
> -p -D* DC1*
>
> but looks like the merkle tree is being calculated on nodes which are part
> of other *DC2.*
>
> why does this happen? i thought it should only look at the nodes in local
> cluster. however on nodetool the* -pr* option cannot be used with *-local* 
> according
> to docs @https://docs.datastax.com/en/ cassandra/2.0/cassandra/tools/
> toolsRepair.html
> 
>
> so i am may be missing something, can someone help explain this please.
>
> thanks
> anishek
>
>
>
>
>
>
>
> --
> Kurt Greaves
> k...@instaclustr.com
> www.instaclustr.com
>
>
>
>
>


Re: nodetool repair with -pr and -dc

2016-08-19 Thread Romain Hardouin
Hi Jérôme,
The code in 2.2.6 allows -local and -pr:
https://github.com/apache/cassandra/blob/cassandra-2.2.6/src/java/org/apache/cassandra/service/StorageService.java#L2899
But... the options validation introduced in CASSANDRA-6455 seems to break this feature!
https://github.com/apache/cassandra/blob/cassandra-2.2.6/src/java/org/apache/cassandra/repair/messages/RepairOption.java#L211
I suggest opening a ticket at https://issues.apache.org/jira/browse/cassandra/
Best,
Romain 
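
In the meantime, two workaround sketches consistent with the discussion above (the keyspace name is a placeholder; both assume 2.2.x option names and have to be run on every node concerned).

  # drop -local and repair primary ranges on every node of every DC:
  nodetool repair --full -pr my_keyspace
  # or drop -pr and repair only the local DC, accepting some duplicate work:
  nodetool repair --full -local my_keyspace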

On Friday, 19 August 2016 at 11:47, Jérôme Mainaud  wrote:
 

 Hello,

I've got a repair command with both -pr and -local rejected on an 2.2.6 cluster.
The exact command was : nodetool repair --full -par -pr -local -j 4
The message is  “You need to run primary range repair on all nodes in the 
cluster”.

Reading the code and previously cited CASSANDRA-7450, it should have been 
accepted.

Did anyone meet this error before ?

Thanks


-- 
Jérôme Mainaud
jer...@mainaud.com

2016-08-12 1:14 GMT+02:00 kurt Greaves :

-D does not do what you think it does. I've quoted the relevant documentation 
from the README:


Multiple Datacenters
If you have multiple datacenters in your ring, then you MUST specify the name 
of the datacenter containing the node you are repairing as part of the 
command-line options (--datacenter=DCNAME). Failure to do so will result in 
only a subset of your data being repaired (approximately 
data/number-of-datacenters). This is because nodetool has no way to determine 
the relevant DC on its own, which in turn means it will use the tokens from 
every ring member in every datacenter.


On 11 August 2016 at 12:24, Paulo Motta  wrote:

> if we want to use -pr option ( which i suppose we should to prevent duplicate 
> checks) in 2.0 then if we run the repair on all nodes in a single DC then it 
> should be sufficient and we should not need to run it on all nodes across 
> DC's?

No, because the primary ranges of the nodes in other DCs will be missing 
repair, so you should either run with -pr in all nodes in all DCs, or restrict 
repair to a specific DC with -local (and have duplicate checks). Combined -pr 
and -local are only supported on 2.1


2016-08-11 1:29 GMT-03:00 Anishek Agarwal :

ok thanks, so if we want to use -pr option ( which i suppose we should to 
prevent duplicate checks) in 2.0 then if we run the repair on all nodes in a 
single DC then it should be sufficient and we should not need to run it on all 
nodes across DC's ?


On Wed, Aug 10, 2016 at 5:01 PM, Paulo Motta  wrote:

On 2.0 the repair -pr option is not supported together with -local, -hosts or -dc,
since it assumes you need to repair all nodes in all DCs and it will throw an
error if you try to run it with nodetool, so perhaps there's something wrong with
range_repair options parsing.

On 2.1 support was added for simultaneous -pr and -local options in
CASSANDRA-7450, so if you need that you can either upgrade to 2.1 or backport
that to 2.0.

2016-08-10 5:20 GMT-03:00 Anishek Agarwal :

Hello,
We have a 2.0.17 cassandra cluster (DC1) with a cross-DC setup with a smaller
cluster (DC2). After reading various blogs about scheduling/running repairs it
looks like it's good to run it with the following:

-pr for primary range only
-st -et for sub ranges
-par for parallel
-dc to make sure we can schedule repairs independently on each data centre we have.

i have configured the above using the repair utility @
https://github.com/BrianGallew/cassandra_range_repair.git
which leads to the following command:

./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H localhost -p -D DC1

but looks like the merkle tree is being calculated on nodes which are part of
the other DC2.
why does this happen? i thought it should only look at the nodes in the local
cluster. however on nodetool the -pr option cannot be used with -local according
to docs @ https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsRepair.html
so i may be missing something, can someone help explain this please.
thanks
anishek









-- 
Kurt greavesk...@instaclustr.comwww.instaclustr.com



  

Re: nodetool repair with -pr and -dc

2016-08-19 Thread Jérôme Mainaud
Hello,

I've got a repair command with both -pr and -local rejected on a 2.2.6
cluster.
The exact command was: nodetool repair --full -par -pr -local -j 4

The message is “You need to run primary range repair on all nodes in the
cluster”.

Reading the code and the previously cited CASSANDRA-7450, it should have been
accepted.

Has anyone met this error before?

Thanks


-- 
Jérôme Mainaud
jer...@mainaud.com

2016-08-12 1:14 GMT+02:00 kurt Greaves :

> -D does not do what you think it does. I've quoted the relevant
> documentation from the README:
>
>>
>> Multiple
>> Datacenters
>>
>> If you have multiple datacenters in your ring, then you MUST specify the
>> name of the datacenter containing the node you are repairing as part of the
>> command-line options (--datacenter=DCNAME). Failure to do so will result in
>> only a subset of your data being repaired (approximately
>> data/number-of-datacenters). This is because nodetool has no way to
>> determine the relevant DC on its own, which in turn means it will use the
>> tokens from every ring member in every datacenter.
>>
>
>
> On 11 August 2016 at 12:24, Paulo Motta  wrote:
>
>> > if we want to use -pr option ( which i suppose we should to prevent
>> duplicate checks) in 2.0 then if we run the repair on all nodes in a single
>> DC then it should be sufficient and we should not need to run it on all
>> nodes across DC's?
>>
>> No, because the primary ranges of the nodes in other DCs will be missing
>> repair, so you should either run with -pr in all nodes in all DCs, or
>> restrict repair to a specific DC with -local (and have duplicate checks).
>> Combined -pr and -local are only supported on 2.1
>>
>>
>> 2016-08-11 1:29 GMT-03:00 Anishek Agarwal :
>>
>>> ok thanks, so if we want to use -pr option ( which i suppose we should
>>> to prevent duplicate checks) in 2.0 then if we run the repair on all nodes
>>> in a single DC then it should be sufficient and we should not need to run
>>> it on all nodes across DC's ?
>>>
>>>
>>>
>>> On Wed, Aug 10, 2016 at 5:01 PM, Paulo Motta 
>>> wrote:
>>>
 On 2.0 repair -pr option is not supported together with -local, -hosts
 or -dc, since it assumes you need to repair all nodes in all DCs and it
 will throw and error if you try to run with nodetool, so perhaps there's
 something wrong with range_repair options parsing.

 On 2.1 it was added support to simultaneous -pr and -local options on
 CASSANDRA-7450, so if you need that you can either upgade to 2.1 or
 backport that to 2.0.


 2016-08-10 5:20 GMT-03:00 Anishek Agarwal :

> Hello,
>
> We have 2.0.17 cassandra cluster(*DC1*) with a cross dc setup with a
> smaller cluster(*DC2*).  After reading various blogs about
> scheduling/running repairs looks like its good to run it with the 
> following
>
>
> -pr for primary range only
> -st -et for sub ranges
> -par for parallel
> -dc to make sure we can schedule repairs independently on each Data
> centre we have.
>
> i have configured the above using the repair utility @
> https://github.com/BrianGallew/cassandra_range_repair.git
>
> which leads to the following command :
>
> ./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H
> localhost -p -D* DC1*
>
> but looks like the merkle tree is being calculated on nodes which are
> part of other *DC2.*
>
> why does this happen? i thought it should only look at the nodes in
> local cluster. however on nodetool the* -pr* option cannot be used
> with *-local* according to docs @https://docs.datastax.com/en/
> cassandra/2.0/cassandra/tools/toolsRepair.html
>
> so i am may be missing something, can someone help explain this please.
>
> thanks
> anishek
>


>>>
>>
>
>
> --
> Kurt Greaves
> k...@instaclustr.com
> www.instaclustr.com
>


Re: nodetool repair with -pr and -dc

2016-08-11 Thread kurt Greaves
-D does not do what you think it does. I've quoted the relevant
documentation from the README:

>
> Multiple
> Datacenters
>
> If you have multiple datacenters in your ring, then you MUST specify the
> name of the datacenter containing the node you are repairing as part of the
> command-line options (--datacenter=DCNAME). Failure to do so will result in
> only a subset of your data being repaired (approximately
> data/number-of-datacenters). This is because nodetool has no way to
> determine the relevant DC on its own, which in turn means it will use the
> tokens from every ring member in every datacenter.
>
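
Based only on the README excerpt above, a sketch of the invocation with the datacenter named explicitly (the keyspace, table, host and DC name are placeholders).

  ./src/range_repair.py -k my_keyspace -c my_table -H node1 --datacenter=DC1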


On 11 August 2016 at 12:24, Paulo Motta  wrote:

> > if we want to use -pr option ( which i suppose we should to prevent
> duplicate checks) in 2.0 then if we run the repair on all nodes in a single
> DC then it should be sufficient and we should not need to run it on all
> nodes across DC's?
>
> No, because the primary ranges of the nodes in other DCs will be missing
> repair, so you should either run with -pr in all nodes in all DCs, or
> restrict repair to a specific DC with -local (and have duplicate checks).
> Combined -pr and -local are only supported on 2.1
>
>
> 2016-08-11 1:29 GMT-03:00 Anishek Agarwal :
>
>> ok thanks, so if we want to use -pr option ( which i suppose we should to
>> prevent duplicate checks) in 2.0 then if we run the repair on all nodes in
>> a single DC then it should be sufficient and we should not need to run it
>> on all nodes across DC's ?
>>
>>
>>
>> On Wed, Aug 10, 2016 at 5:01 PM, Paulo Motta 
>> wrote:
>>
>>> On 2.0 repair -pr option is not supported together with -local, -hosts
>>> or -dc, since it assumes you need to repair all nodes in all DCs and it
>>> will throw and error if you try to run with nodetool, so perhaps there's
>>> something wrong with range_repair options parsing.
>>>
>>> On 2.1 it was added support to simultaneous -pr and -local options on
>>> CASSANDRA-7450, so if you need that you can either upgade to 2.1 or
>>> backport that to 2.0.
>>>
>>>
>>> 2016-08-10 5:20 GMT-03:00 Anishek Agarwal :
>>>
 Hello,

 We have 2.0.17 cassandra cluster(*DC1*) with a cross dc setup with a
 smaller cluster(*DC2*).  After reading various blogs about
 scheduling/running repairs looks like its good to run it with the following


 -pr for primary range only
 -st -et for sub ranges
 -par for parallel
 -dc to make sure we can schedule repairs independently on each Data
 centre we have.

 i have configured the above using the repair utility @
 https://github.com/BrianGallew/cassandra_range_repair.git

 which leads to the following command :

 ./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H
 localhost -p -D* DC1*

 but looks like the merkle tree is being calculated on nodes which are
 part of other *DC2.*

 why does this happen? i thought it should only look at the nodes in
 local cluster. however on nodetool the* -pr* option cannot be used
 with *-local* according to docs @https://docs.datastax.com/en/
 cassandra/2.0/cassandra/tools/toolsRepair.html

 so i am may be missing something, can someone help explain this please.

 thanks
 anishek

>>>
>>>
>>
>


-- 
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com


Re: nodetool repair with -pr and -dc

2016-08-11 Thread Paulo Motta
> if we want to use -pr option ( which i suppose we should to prevent
duplicate checks) in 2.0 then if we run the repair on all nodes in a single
DC then it should be sufficient and we should not need to run it on all
nodes across DC's?

No, because the primary ranges of the nodes in other DCs will be missing
repair, so you should either run with -pr in all nodes in all DCs, or
restrict repair to a specific DC with -local (and have duplicate checks).
Combined -pr and -local are only supported on 2.1


2016-08-11 1:29 GMT-03:00 Anishek Agarwal :

> ok thanks, so if we want to use -pr option ( which i suppose we should to
> prevent duplicate checks) in 2.0 then if we run the repair on all nodes in
> a single DC then it should be sufficient and we should not need to run it
> on all nodes across DC's ?
>
>
>
> On Wed, Aug 10, 2016 at 5:01 PM, Paulo Motta 
> wrote:
>
>> On 2.0 repair -pr option is not supported together with -local, -hosts or
>> -dc, since it assumes you need to repair all nodes in all DCs and it will
>> throw and error if you try to run with nodetool, so perhaps there's
>> something wrong with range_repair options parsing.
>>
>> On 2.1 it was added support to simultaneous -pr and -local options on
>> CASSANDRA-7450, so if you need that you can either upgade to 2.1 or
>> backport that to 2.0.
>>
>>
>> 2016-08-10 5:20 GMT-03:00 Anishek Agarwal :
>>
>>> Hello,
>>>
>>> We have 2.0.17 cassandra cluster(*DC1*) with a cross dc setup with a
>>> smaller cluster(*DC2*).  After reading various blogs about
>>> scheduling/running repairs looks like its good to run it with the following
>>>
>>>
>>> -pr for primary range only
>>> -st -et for sub ranges
>>> -par for parallel
>>> -dc to make sure we can schedule repairs independently on each Data
>>> centre we have.
>>>
>>> i have configured the above using the repair utility @
>>> https://github.com/BrianGallew/cassandra_range_repair.git
>>>
>>> which leads to the following command :
>>>
>>> ./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H
>>> localhost -p -D* DC1*
>>>
>>> but looks like the merkle tree is being calculated on nodes which are
>>> part of other *DC2.*
>>>
>>> why does this happen? i thought it should only look at the nodes in
>>> local cluster. however on nodetool the* -pr* option cannot be used with
>>> *-local* according to docs @https://docs.datastax.com/en/
>>> cassandra/2.0/cassandra/tools/toolsRepair.html
>>>
>>> so i am may be missing something, can someone help explain this please.
>>>
>>> thanks
>>> anishek
>>>
>>
>>
>


Re: nodetool repair with -pr and -dc

2016-08-10 Thread Anishek Agarwal
ok thanks, so if we want to use -pr option ( which i suppose we should to
prevent duplicate checks) in 2.0 then if we run the repair on all nodes in
a single DC then it should be sufficient and we should not need to run it
on all nodes across DC's ?



On Wed, Aug 10, 2016 at 5:01 PM, Paulo Motta 
wrote:

> On 2.0 repair -pr option is not supported together with -local, -hosts or
> -dc, since it assumes you need to repair all nodes in all DCs and it will
> throw and error if you try to run with nodetool, so perhaps there's
> something wrong with range_repair options parsing.
>
> On 2.1 it was added support to simultaneous -pr and -local options on
> CASSANDRA-7450, so if you need that you can either upgade to 2.1 or
> backport that to 2.0.
>
>
> 2016-08-10 5:20 GMT-03:00 Anishek Agarwal :
>
>> Hello,
>>
>> We have 2.0.17 cassandra cluster(*DC1*) with a cross dc setup with a
>> smaller cluster(*DC2*).  After reading various blogs about
>> scheduling/running repairs looks like its good to run it with the following
>>
>>
>> -pr for primary range only
>> -st -et for sub ranges
>> -par for parallel
>> -dc to make sure we can schedule repairs independently on each Data
>> centre we have.
>>
>> i have configured the above using the repair utility @
>> https://github.com/BrianGallew/cassandra_range_repair.git
>>
>> which leads to the following command :
>>
>> ./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H
>> localhost -p -D* DC1*
>>
>> but looks like the merkle tree is being calculated on nodes which are
>> part of other *DC2.*
>>
>> why does this happen? i thought it should only look at the nodes in local
>> cluster. however on nodetool the* -pr* option cannot be used with
>> *-local* according to docs @https://docs.datastax.com/en/
>> cassandra/2.0/cassandra/tools/toolsRepair.html
>>
>> so i am may be missing something, can someone help explain this please.
>>
>> thanks
>> anishek
>>
>
>


Re: nodetool repair with -pr and -dc

2016-08-10 Thread Paulo Motta
On 2.0 the repair -pr option is not supported together with -local, -hosts or
-dc, since it assumes you need to repair all nodes in all DCs and it will
throw an error if you try to run it with nodetool, so perhaps there's
something wrong with range_repair options parsing.

On 2.1 support was added for simultaneous -pr and -local options in
CASSANDRA-7450, so if you need that you can either upgrade to 2.1 or
backport that to 2.0.
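
In command form, a sketch of the two approaches described above (the keyspace name is a placeholder).

  # 2.0: -pr cannot be combined with -local/-dc, so repair primary ranges everywhere
  nodetool repair -pr my_keyspace           # run on every node, in every DC
  # 2.1+ (CASSANDRA-7450): primary-range repair restricted to replicas in the local DC
  nodetool repair -pr -local my_keyspace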

2016-08-10 5:20 GMT-03:00 Anishek Agarwal :

> Hello,
>
> We have 2.0.17 cassandra cluster(*DC1*) with a cross dc setup with a
> smaller cluster(*DC2*).  After reading various blogs about
> scheduling/running repairs looks like its good to run it with the following
>
>
> -pr for primary range only
> -st -et for sub ranges
> -par for parallel
> -dc to make sure we can schedule repairs independently on each Data centre
> we have.
>
> i have configured the above using the repair utility @
> https://github.com/BrianGallew/cassandra_range_repair.git
>
> which leads to the following command :
>
> ./src/range_repair.py -k [keyspace] -c [columnfamily name] -v -H localhost
> -p -D* DC1*
>
> but looks like the merkle tree is being calculated on nodes which are part
> of other *DC2.*
>
> why does this happen? i thought it should only look at the nodes in local
> cluster. however on nodetool the* -pr* option cannot be used with *-local* 
> according
> to docs @https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/
> toolsRepair.html
>
> so i am may be missing something, can someone help explain this please.
>
> thanks
> anishek
>


Re: Nodetool repair inconsistencies

2016-06-08 Thread Jason Kania
Hi Paul,
I have tried running 'nodetool compact' and the situation remains the same 
after I deleted the files that caused 'nodetool compact' to generate an 
exception in the first place.
My concern is that if I delete some sstable sets from a directory or even if I 
completely eliminate the sstables in a directory on one machine, run 'nodetool 
repair' followed by 'nodetool compact', that directory remains empty. My 
understanding has been that these equivalently named directories should contain 
roughly the same amount of content.
Thanks,
Jason

  From: Paul Fife <paulf...@gmail.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Wednesday, June 8, 2016 12:55 PM
 Subject: Re: Nodetool repair inconsistencies
   
Hi Jason -
Did you run a major compaction after the repair completed? Do you have other 
reasons besides the number/size of sstables to believe all nodes don't have a 
copy of the current data at the end of the repair operation?
Thanks,Paul
On Wed, Jun 8, 2016 at 8:12 AM, Jason Kania <jason.ka...@ymail.com> wrote:

Hi Romain,
The problem is that there is no error to share. I am focusing on the
inconsistency that when I run nodetool repair, I get no errors and yet the
content in the same directory on the different nodes is vastly different. This
lack of an error is the nature of my question, not the nodetool compact error.
Thanks,
Jason
  From: Romain Hardouin <romainh...@yahoo.fr>
 To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Jason Kania 
<jason.ka...@ymail.com> 
 Sent: Wednesday, June 8, 2016 8:30 AM
 Subject: Re: Nodetool repair inconsistencies
  
Hi Jason,
It's difficult for the community to help you if you don't share the error 
;-)What the logs said when you ran a major compaction? (i.e. the first error 
you encountered) 
Best,
Romain

On Wednesday, 8 June 2016 at 3:34 AM, Jason Kania <jason.ka...@ymail.com> wrote:
 

 I am running a 3 node cluster of 3.0.6 instances and encountered an error when 
running nodetool compact. I then ran nodetool repair. No errors were returned.
I then attempted to run nodetool compact again, but received the same error so 
the repair made no correction and reported no errors.
After that, I moved the problematic files out of the directory, restarted 
cassandra and attempted the repair again. The repair again completed without 
errors, however, no files were added to the directory that had contained the 
corrupt files. So nodetool repair does not seem to be making actual repairs.
I started looking around and numerous directories have vastly different amounts 
of content across the 3 nodes. There are 3 replicas so I would expect to find 
similar amounts of content in the same data directory on the different nodes.

Is there any way to dig deeper into this? I don't want to be caught because 
replication/repair is silently failing. I noticed that there is always an "some 
repair failed" amongst the repair output but that is so completely unhelpful 
and has always been present.

Thanks,
Jason


   

   



  

Re: Nodetool repair inconsistencies

2016-06-08 Thread Paul Fife
Hi Jason -

Did you run a major compaction after the repair completed? Do you have
other reasons besides the number/size of sstables to believe all nodes
don't have a copy of the current data at the end of the repair operation?

Thanks,
Paul
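
One rough way to compare what each replica actually holds for a given table, run on every node (the keyspace, table and data path below are placeholders).

  nodetool cfstats my_keyspace.my_table                    # SSTable count and live space used
  du -sh /var/lib/cassandra/data/my_keyspace/my_table-*    # raw on-disk size, snapshots included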

On Wed, Jun 8, 2016 at 8:12 AM, Jason Kania <jason.ka...@ymail.com> wrote:

> Hi Romain,
>
> The problem is that there is no error to share. I am focusing on the
> inconsistency that when I run nodetool repair, get no errors and yet the
> content in the same directory on the different nodes is vastly different.
> This lack of an error is nature of my question, not the nodetool compact
> error.
>
> Thanks,
>
> Jason
>
> --
> *From:* Romain Hardouin <romainh...@yahoo.fr>
> *To:* "user@cassandra.apache.org" <user@cassandra.apache.org>; Jason
> Kania <jason.ka...@ymail.com>
> *Sent:* Wednesday, June 8, 2016 8:30 AM
> *Subject:* Re: Nodetool repair inconsistencies
>
> Hi Jason,
>
> It's difficult for the community to help you if you don't share the error
> ;-)
> What the logs said when you ran a major compaction? (i.e. the first error
> you encountered)
>
> Best,
>
> Romain
>
> On Wednesday, 8 June 2016 at 3:34 AM, Jason Kania <jason.ka...@ymail.com> wrote:
>
>
> I am running a 3 node cluster of 3.0.6 instances and encountered an error
> when running nodetool compact. I then ran nodetool repair. No errors were
> returned.
>
> I then attempted to run nodetool compact again, but received the same
> error so the repair made no correction and reported no errors.
>
> After that, I moved the problematic files out of the directory, restarted
> cassandra and attempted the repair again. The repair again completed
> without errors, however, no files were added to the directory that had
> contained the corrupt files. So nodetool repair does not seem to be making
> actual repairs.
>
> I started looking around and numerous directories have vastly different
> amounts of content across the 3 nodes. There are 3 replicas so I would
> expect to find similar amounts of content in the same data directory on the
> different nodes.
>
> Is there any way to dig deeper into this? I don't want to be caught
> because replication/repair is silently failing. I noticed that there is
> always an "some repair failed" amongst the repair output but that is so
> completely unhelpful and has always been present.
>
> Thanks,
>
> Jason
>
>
>
>
>


Re: Nodetool repair inconsistencies

2016-06-08 Thread Jason Kania
Hi Romain,
The problem is that there is no error to share. I am focusing on the
inconsistency that when I run nodetool repair, I get no errors and yet the
content in the same directory on the different nodes is vastly different. This
lack of an error is the nature of my question, not the nodetool compact error.
Thanks,
Jason
  From: Romain Hardouin <romainh...@yahoo.fr>
 To: "user@cassandra.apache.org" <user@cassandra.apache.org>; Jason Kania 
<jason.ka...@ymail.com> 
 Sent: Wednesday, June 8, 2016 8:30 AM
 Subject: Re: Nodetool repair inconsistencies
   
Hi Jason,
It's difficult for the community to help you if you don't share the error 
;-)What the logs said when you ran a major compaction? (i.e. the first error 
you encountered) 
Best,
Romain

On Wednesday, 8 June 2016 at 3:34 AM, Jason Kania <jason.ka...@ymail.com> wrote:
 

 I am running a 3 node cluster of 3.0.6 instances and encountered an error when 
running nodetool compact. I then ran nodetool repair. No errors were returned.
I then attempted to run nodetool compact again, but received the same error so 
the repair made no correction and reported no errors.
After that, I moved the problematic files out of the directory, restarted 
cassandra and attempted the repair again. The repair again completed without 
errors, however, no files were added to the directory that had contained the 
corrupt files. So nodetool repair does not seem to be making actual repairs.
I started looking around and numerous directories have vastly different amounts 
of content across the 3 nodes. There are 3 replicas so I would expect to find 
similar amounts of content in the same data directory on the different nodes.

Is there any way to dig deeper into this? I don't want to be caught because 
replication/repair is silently failing. I noticed that there is always an "some 
repair failed" amongst the repair output but that is so completely unhelpful 
and has always been present.

Thanks,
Jason


   

  

Re: Nodetool repair inconsistencies

2016-06-08 Thread Romain Hardouin
Hi Jason,
It's difficult for the community to help you if you don't share the error ;-)
What did the logs say when you ran a major compaction? (i.e. the first error
you encountered)
Best,
Romain

On Wednesday, 8 June 2016 at 3:34 AM, Jason Kania  wrote:
 

 I am running a 3 node cluster of 3.0.6 instances and encountered an error when 
running nodetool compact. I then ran nodetool repair. No errors were returned.
I then attempted to run nodetool compact again, but received the same error so 
the repair made no correction and reported no errors.
After that, I moved the problematic files out of the directory, restarted 
cassandra and attempted the repair again. The repair again completed without 
errors, however, no files were added to the directory that had contained the 
corrupt files. So nodetool repair does not seem to be making actual repairs.
I started looking around and numerous directories have vastly different amounts 
of content across the 3 nodes. There are 3 replicas so I would expect to find 
similar amounts of content in the same data directory on the different nodes.

Is there any way to dig deeper into this? I don't want to be caught because 
replication/repair is silently failing. I noticed that there is always an "some 
repair failed" amongst the repair output but that is so completely unhelpful 
and has always been present.

Thanks,
Jason


  

Re: Nodetool repair question

2016-05-10 Thread Joel Knighton
No - repair does not change token ownership. The up/down state of a node is
not related to token ownership.
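
A quick way to confirm this on a running cluster is to capture the ring before and after a repair and compare (the keyspace name is a placeholder).

  nodetool ring > ring_before.txt
  nodetool repair my_keyspace
  nodetool ring > ring_after.txt
  diff ring_before.txt ring_after.txt   # Load/Owns may differ, token assignments should not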

On Tue, May 10, 2016 at 3:26 PM, Anubhav Kale 
wrote:

> Hello,
>
>
>
> Suppose I have 3 nodes, and stop Cassandra on one of them. Then I run a
> repair. Will repair move the token ranges from down node to other node ? In
> other words in any situation, does repair operation *ever* change token
> ownership ?
>
>
>
> Thanks !
>



-- 



Joel Knighton
Cassandra Developer | joel.knigh...@datastax.com


 

 




Re: Nodetool repair with Load times 5

2015-08-19 Thread Jean Tremblay
Dear Alain,

Thanks again for your precious help.

I might help, but I need to know what you have done recently (change the RF, 
Add remove node, cleanups, anything else as much as possible...)

I have a cluster of 5 nodes all running Cassandra 2.1.8.
I have a fixed schema which never changes. I have not changed RF, it is 3. I 
have not remove nodes, no cleanups.

Basically here are the important operations I have done:

- Install Cassandra 2.1.7 on a cluster of 5 nodes with RF 3 using Sized-Tiered 
compaction.
- Insert 2 billion rows. (bulk load)
- Made loads of selects statements… Verified that the data is good.
- Did some deletes and a bit more inserts.
- Eventually migrated to 2.1.8
- Then only very few delete/inserts.
- Did a few snapshots.

When I was doing “nodetool status” I always got a load of about 200 GB on 
**all** nodes.

- Then I did a “nodetool -h node0 repair -par -pr -inc” and after that I had a 
completely different picture.

nodetool -h zennode0 status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns  Host ID                               Rack
UN  192.168.2.104  941.49 GB  256     ?     c13e0858-091c-47c4-8773-6d6262723435  rack1
UN  192.168.2.100  1.07 TB    256     ?     c32a9357-e37e-452e-8eb1-57d86314b419  rack1
UN  192.168.2.101  189.72 GB  256     ?     9af90dea-90b3-4a8a-b88a-0aeabe3cea79  rack1
UN  192.168.2.102  948.61 GB  256     ?     8eb7a5bb-6903-4ae1-a372-5436d0cc170c  rack1
UN  192.168.2.103  197.27 GB  256     ?     9efc6f13-2b02-4400-8cde-ae831feb86e9  rack1


Also, could you please do the nodetool status myks for your keyspace(s) ? We 
will then be able to know the theoretical ownership of each node on your 
distinct (or unique) keyspace(s) ?

nodetool -h zennode0 status XYZdata
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.2.104  941.49 GB  256     62.5%             c13e0858-091c-47c4-8773-6d6262723435  rack1
UN  192.168.2.100  1.07 TB    256     58.4%             c32a9357-e37e-452e-8eb1-57d86314b419  rack1
UN  192.168.2.101  189.72 GB  256     58.4%             9af90dea-90b3-4a8a-b88a-0aeabe3cea79  rack1
UN  192.168.2.102  948.61 GB  256     60.1%             8eb7a5bb-6903-4ae1-a372-5436d0cc170c  rack1
UN  192.168.2.103  197.27 GB  256     60.6%             9efc6f13-2b02-4400-8cde-ae831feb86e9  rack1


Some ideas:

You repaired only a primary range (-pr) of one node, with a RF of 3 and have 
3 big nodes, if not using vnodes, this would be almost normal (excepted for the 
gap 200 GB -- 1 TB, this is huge, unless you messed up with RF). So are you 
using them ?

My schema is totally fixed and I have used RF 3 since the beginning. Sorry, I’m not
too acquainted with vnodes. I have not changed anything in cassandra.yaml
except the seeds and the name of the cluster.

2/ Load is barely the size of the data on each node

If it is the size of the data how can it fit on the disk?
My 5 nodes have an SSD drive of 1 TB and here is the disk usage for each of 
them:

node0: 25%
node1: 25%
node2: 24%
node3: 26%
node4: 29%

nodetool status says that the load for node0 is 1.07 TB. That is more than the
capacity of its disk, and yet the disk usage for node0 is 25%.

This is not clear to me… the Load in the nodetool status output seems to be more
than “the size of the data on a node”.


On 18 Aug 2015, at 19:29 , Alain RODRIGUEZ 
arodr...@gmail.com wrote:

Hi Jean,

I might help, but I need to know what you have done recently (change the RF, 
Add remove node, cleanups, anything else as much as possible...)

Also, could you please do the nodetool status myks for your keyspace(s) ? We 
will then be able to know the theoretical ownership of each node on your 
distinct (or unique) keyspace(s) ?

Some ideas:

You repaired only a primary range (-pr) of one node, with a RF of 3 and have 
3 big nodes, if not using vnodes, this would be almost normal (excepted for the 
gap 200 GB -- 1 TB, this is huge, unless you messed up with RF). So are you 
using them ?

Answers:

1/ It depends on what happen to this cluster (see my questions above)
2/ Load is barely the size of the data on each node
3/ No, this is not a normal nor stable situation.
4/ No, pr means you repaired only the partition that node is responsible for 
(depends on token), you have to run this on all nodes. But I would wait to find 
out first what's happening to avoid hitting the threshold on disk space or 
whatever.

I guess I was confused by the -par switch, which suggested to me that the
work would be done in parallel and therefore on all nodes.

So if I understand right, one should do a “nodetool repair -par -pr -inc” on 
all nodes one after the other? Is this correct?
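
A hedged diagnostic sketch for the Load gap described above, to run on one of the large nodes (the data path is a placeholder; XYZdata is the keyspace from this thread).

  nodetool status XYZdata          # effective ownership per keyspace
  du -sh /var/lib/cassandra/data   # total on-disk size under the data directory
  nodetool listsnapshots           # snapshot names and sizes, if your version supports it
  nodetool compactionstats         # pending compactions/anticompactions after the -inc repair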


I have a second cluster, a smaller one, 

Re: Nodetool repair with Load times 5

2015-08-18 Thread Mark Greene
Hey Jean,

Did you try running a nodetool cleanup on all your nodes, perhaps one at a
time?

On Tue, Aug 18, 2015 at 3:59 AM, Jean Tremblay 
jean.tremb...@zen-innovations.com wrote:

 Hi,

 I have a phenomena I cannot explain, and I would like to understand.

 I’m running Cassandra 2.1.8 on a cluster of 5 nodes.
 I’m using replication factor 3, with most default settings.

 Last week I done a nodetool status which gave me on each node a load of
 about 200 GB.
 Since then there was no deletes no inserts.

 This weekend I did a nodetool -h 192.168.2.100 repair -pr -par -inc

 And now when I make a nodetool status I see completely a new picture!!

 nodetool -h zennode0 status
 Datacenter: datacenter1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  AddressLoad   Tokens  OwnsHost ID
   Rack
 UN  192.168.2.104  940.73 GB  256 ?
 c13e0858-091c-47c4-8773-6d6262723435  rack1
 UN  192.168.2.100  1.07 TB256 ?
 c32a9357-e37e-452e-8eb1-57d86314b419  rack1
 UN  192.168.2.101  189.03 GB  256 ?
 9af90dea-90b3-4a8a-b88a-0aeabe3cea79  rack1
 UN  192.168.2.102  951.28 GB  256 ?
 8eb7a5bb-6903-4ae1-a372-5436d0cc170c  rack1
 UN  192.168.2.103  196.54 GB  256 ?
 9efc6f13-2b02-4400-8cde-ae831feb86e9  rack1

 The nodes 192.168.2.101 and 103 are about what they were last week, but
 now the three other nodes have a load which is about 5 times bigger!

 1) Is this normal?
 2) What is the meaning of the column Load?
 3) Is there anything to fix? Can I leave it like that?

 Strange I’m asking to fix after I did a *repair*.

 Thanks a lot for your help.

 Kind regards

 Jean



Re: Nodetool repair with Load times 5

2015-08-18 Thread Jean Tremblay
No. I did not try.
I would like to understand what is going on before I make my problem, maybe 
even worse.

I really would like to understand:

1) Is this normal?
2) What is the meaning of the column Load?
3) Is there anything to fix? Can I leave it like that?
  4) Did I do something wrong? When you use -par you only need to run 
repair from one node right? E.g.  nodetool -h 192.168.2.100 repair -pr -par -inc

Thanks for your feedback.

Jean

On 18 Aug 2015, at 14:33 , Mark Greene 
green...@gmail.com wrote:

Hey Jean,

Did you try running a nodetool cleanup on all your nodes, perhaps one at a time?

On Tue, Aug 18, 2015 at 3:59 AM, Jean Tremblay 
jean.tremb...@zen-innovations.com
wrote:
Hi,

I have a phenomena I cannot explain, and I would like to understand.

I’m running Cassandra 2.1.8 on a cluster of 5 nodes.
I’m using replication factor 3, with most default settings.

Last week I done a nodetool status which gave me on each node a load of about 
200 GB.
Since then there was no deletes no inserts.

This weekend I did a nodetool -h 192.168.2.100 repair -pr -par -inc

And now when I make a nodetool status I see completely a new picture!!

nodetool -h zennode0 status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  AddressLoad   Tokens  OwnsHost ID   
Rack
UN  192.168.2.104  940.73 GB  256 ?   
c13e0858-091c-47c4-8773-6d6262723435  rack1
UN  192.168.2.100  1.07 TB256 ?   
c32a9357-e37e-452e-8eb1-57d86314b419  rack1
UN  192.168.2.101  189.03 GB  256 ?   
9af90dea-90b3-4a8a-b88a-0aeabe3cea79  rack1
UN  192.168.2.102  951.28 GB  256 ?   
8eb7a5bb-6903-4ae1-a372-5436d0cc170c  rack1
UN  192.168.2.103  196.54 GB  256 ?   
9efc6f13-2b02-4400-8cde-ae831feb86e9  rack1

The nodes 192.168.2.101 and 103 are about what they were last week, but now the 
three other nodes have a load which is about 5 times bigger!

1) Is this normal?
2) What is the meaning of the column Load?
3) Is there anything to fix? Can I leave it like that?

Strange I’m asking to fix after I did a *repair*.

Thanks a lot for your help.

Kind regards

Jean




Re: Nodetool repair with Load times 5

2015-08-18 Thread Alain RODRIGUEZ
Hi Jean,

I might be able to help, but I need to know what you have done recently (changed the
RF, added/removed nodes, cleanups, anything else, in as much detail as possible...)

Also, could you please do the nodetool status *myks* for your keyspace(s)
? We will then be able to know the theoretical ownership of each node on
your distinct (or unique) keyspace(s) ?

Some ideas:

You repaired only the primary range (-pr) of one node. With an RF of 3 and
3 big nodes, if you are not using vnodes this would be almost normal
(except for the gap from 200 GB to 1 TB, which is huge, unless you messed up
the RF). So are you using vnodes?

Answers:

1/ It depends on what happened to this cluster (see my questions above)
2/ Load is basically the size of the data on each node
3/ No, this is neither a normal nor a stable situation.
4/ No, -pr means you repaired only the partition range that node is responsible
for (it depends on the token), so you have to run this on all nodes. But I would wait
to find out first what's happening, to avoid hitting the threshold on disk
space or whatever.

Anyway, see if you can give us more info related to this.

C*heers,

Alain



2015-08-18 14:40 GMT+02:00 Jean Tremblay jean.tremb...@zen-innovations.com
:

 No. I did not try.
 I would like to understand what is going on before I make my problem,
 maybe even worse.

 I really would like to understand:

 1) Is this normal?
 2) What is the meaning of the column Load?
 3) Is there anything to fix? Can I leave it like that?

   4) Did I do something wrong? When you use -par you only need to run
 repair from one node right? E.g.  nodetool -h 192.168.2.100 repair -pr -par
 -inc

 Thanks for your feedback.

 Jean

 On 18 Aug 2015, at 14:33 , Mark Greene green...@gmail.com wrote:

 Hey Jean,

 Did you try running a nodetool cleanup on all your nodes, perhaps one at a
 time?

 On Tue, Aug 18, 2015 at 3:59 AM, Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

 Hi,

 I have a phenomena I cannot explain, and I would like to understand.

 I’m running Cassandra 2.1.8 on a cluster of 5 nodes.
 I’m using replication factor 3, with most default settings.

 Last week I done a nodetool status which gave me on each node a load of
 about 200 GB.
 Since then there was no deletes no inserts.

 This weekend I did a nodetool -h 192.168.2.100 repair -pr -par -inc

 And now when I make a nodetool status I see completely a new picture!!

 nodetool -h zennode0 status
 Datacenter: datacenter1
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  AddressLoad   Tokens  OwnsHost ID
   Rack
 UN  192.168.2.104  940.73 GB  256 ?
 c13e0858-091c-47c4-8773-6d6262723435  rack1
 UN  192.168.2.100  1.07 TB256 ?
 c32a9357-e37e-452e-8eb1-57d86314b419  rack1
 UN  192.168.2.101  189.03 GB  256 ?
 9af90dea-90b3-4a8a-b88a-0aeabe3cea79  rack1
 UN  192.168.2.102  951.28 GB  256 ?
 8eb7a5bb-6903-4ae1-a372-5436d0cc170c  rack1
 UN  192.168.2.103  196.54 GB  256 ?
 9efc6f13-2b02-4400-8cde-ae831feb86e9  rack1

 The nodes 192.168.2.101 and 103 are about what they were last week, but
 now the three other nodes have a load which is about 5 times bigger!

 1) Is this normal?
 2) What is the meaning of the column Load?
 3) Is there anything to fix? Can I leave it like that?

 Strange I’m asking to fix after I did a *repair*.

 Thanks a lot for your help.

 Kind regards

 Jean






RE: nodetool repair

2015-06-19 Thread Jens Rantil
Hi,


For the record, I've successfully used
https://github.com/BrianGallew/cassandra_range_repair to make repairing smooth.
It could also be of interest, I don't know...




Cheers,

Jens





–
Sent from Mailbox

On Fri, Jun 19, 2015 at 8:36 PM, sean_r_dur...@homedepot.com wrote:

 It seems to me that running repair on any given node may also induce repairs 
 to related replica nodes. For example, if I run repair on node A and node B 
 has some replicas, data might stream from A to B (assuming A has newer/more 
 data). Now, that does NOT mean that node B will be fully repaired. You still 
 need to run repair -pr on all nodes before gc_grace_seconds.
 You can run repairs on multiple nodes at the same time. However, you might 
 end up with a large amount of streaming, if many repairs are needed. So, you 
 should be aware of a performance impact.
 I run weekly repairs on one node at a time, if possible. On larger rings,
 though, I run repairs on multiple nodes staggered by a few hours. Once your
 routine maintenance is established, repairs will not run for very long. But,
 if you have a large ring that hasn’t been repaired, those first repairs may
 take days (but should get faster as you get further through the ring).
 Sean Durity
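
A sketch of the staggering described above, as one crontab entry per node with the start hour offset between nodes (the keyspace, schedule and log path are placeholders; it assumes nodetool is on cron's PATH).

  # node1 crontab: Saturday 01:00
  0 1 * * 6  nodetool repair -pr my_keyspace >> /var/log/cassandra/repair.log 2>&1
  # node2 crontab: Saturday 05:00
  0 5 * * 6  nodetool repair -pr my_keyspace >> /var/log/cassandra/repair.log 2>&1
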
 From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
 Sent: Friday, June 19, 2015 3:56 AM
 To: user@cassandra.apache.org
 Subject: Re: nodetool repair
 Hi,
 This is not necessarily true. Repair will induce compactions only if you have 
 entropy in your cluster. If not it will just read your data to compare all 
 the replica of each piece of data (using indeed cpu and disk IO).
 If there is some data missing it will repair it. Though, due to merkle tree 
 size, you will generally stream more data than just the data needed. To limit 
 this downside and the compactions amount, use range repairs -- 
 http://www.datastax.com/dev/blog/advanced-repair-techniques.
 About tombstones, they will be evicted only after gc_grace_period and only if 
 all the parts of the row are part of the compaction.
 C*heers,
 Alain
 2015-06-19 9:08 GMT+02:00 arun sirimalla 
 arunsi...@gmail.commailto:arunsi...@gmail.com:
 Yes compactions will remove tombstones
 On Thu, Jun 18, 2015 at 11:46 PM, Jean Tremblay 
 jean.tremb...@zen-innovations.commailto:jean.tremb...@zen-innovations.com 
 wrote:
 Perfect thank you.
 So making a weekly nodetool repair -pr”  on all nodes one after the other 
 will repair my cluster. That is great.
 If it does a compaction, does it mean that it would also clean up my 
 tombstone from my LeveledCompactionStrategy tables at the same time?
 Thanks for your help.
 On 19 Jun 2015, at 07:56 , arun sirimalla 
 arunsi...@gmail.commailto:arunsi...@gmail.com wrote:
 Hi Jean,
 Running nodetool repair on a node will repair only that node in the cluster. 
 It is recommended to run nodetool repair on one node at a time.
 Few things to keep in mind while running repair
1. Running repair will trigger compactions
2. Increase in CPU utilization.
 Run node tool repair with -pr option, so that it will repair only the range 
 that node is responsible for.
 On Thu, Jun 18, 2015 at 10:50 PM, Jean Tremblay 
 jean.tremb...@zen-innovations.commailto:jean.tremb...@zen-innovations.com 
 wrote:
 Thanks Jonathan.
 But I need to know the following:
 If you issue a “nodetool repair” on one node will it repair all the nodes in 
 the cluster or only the one on which we issue the command?
 If it repairs only one node, do I have to wait that the nodetool repair ends, 
 and only then issue another “nodetool repair” on the next node?
 Kind regards
 On 18 Jun 2015, at 19:19 , Jonathan Haddad 
 j...@jonhaddad.commailto:j...@jonhaddad.com wrote:
 If you're using DSE, you can schedule it automatically using the repair 
 service.  If you're open source, check out Spotify cassandra reaper, it'll 
 manage it for you.
 https://github.com/spotify/cassandra-reaper
 On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
 jean.tremb...@zen-innovations.commailto:jean.tremb...@zen-innovations.com 
 wrote:
 Hi,
 I want to make on a regular base repairs on my cluster as suggested by the 
 documentation.
 I want to do this in a way that the cluster is still responding to read 
 requests.
 So I understand that I should not use the -par switch for that as it will do 
 the repair in parallel and consume all available resources.
 If you issue a “nodetool repair” on one node will it repair all the nodes in 
 the cluster or only the one on which we issue the command?
 If it repairs only one node, do I have to wait that the nodetool repair ends, 
 and only then issue another “nodetool repair” on the next node?
 If we had down time periods I would issue a nodetool -par, but we don’t have 
 down time periods.
 Sorry for the stupid questions.
 Thanks for your help.
 --
 Arun
 Senior Hadoop/Cassandra Engineer
 Cloudwick
 2014 Data Impact Award Winner (Cloudera)
 http://www.cloudera.com/content/cloudera/en/campaign/data-impact

Re: nodetool repair

2015-06-19 Thread Jean Tremblay
Perfect, thank you.
So making a weekly “nodetool repair -pr” on all nodes one after the other will
repair my cluster. That is great.

If it does a compaction, does it mean that it would also clean up my tombstones
from my LeveledCompactionStrategy tables at the same time?

Thanks for your help.

On 19 Jun 2015, at 07:56 , arun sirimalla 
arunsi...@gmail.com wrote:

Hi Jean,

Running nodetool repair on a node will repair only that node in the cluster. It 
is recommended to run nodetool repair on one node at a time.

Few things to keep in mind while running repair
   1. Running repair will trigger compactions
   2. Increase in CPU utilization.


Run node tool repair with -pr option, so that it will repair only the range 
that node is responsible for.

On Thu, Jun 18, 2015 at 10:50 PM, Jean Tremblay 
jean.tremb...@zen-innovations.com
wrote:
Thanks Jonathan.

But I need to know the following:

If you issue a “nodetool repair” on one node will it repair all the nodes in 
the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool repair ends, 
and only then issue another “nodetool repair” on the next node?

Kind regards

On 18 Jun 2015, at 19:19 , Jonathan Haddad 
j...@jonhaddad.com wrote:

If you're using DSE, you can schedule it automatically using the repair 
service.  If you're open source, check out Spotify cassandra reaper, it'll 
manage it for you.

https://github.com/spotify/cassandra-reaper



On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
jean.tremb...@zen-innovations.com
wrote:
Hi,

I want to make on a regular base repairs on my cluster as suggested by the 
documentation.
I want to do this in a way that the cluster is still responding to read 
requests.
So I understand that I should not use the -par switch for that as it will do 
the repair in parallel and consume all available resources.

If you issue a “nodetool repair” on one node will it repair all the nodes in 
the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool repair ends, 
and only then issue another “nodetool repair” on the next node?

If we had down time periods I would issue a nodetool -par, but we don’t have 
down time periods.

Sorry for the stupid questions.
Thanks for your help.




--
Arun
Senior Hadoop/Cassandra Engineer
Cloudwick


2014 Data Impact Award Winner (Cloudera)
http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html




Re: nodetool repair

2015-06-19 Thread arun sirimalla
Yes compactions will remove tombstones

On Thu, Jun 18, 2015 at 11:46 PM, Jean Tremblay 
jean.tremb...@zen-innovations.com wrote:

  Perfect thank you.
 So making a weekly nodetool repair -pr”  on all nodes one after the other
 will repair my cluster. That is great.

  If it does a compaction, does it mean that it would also clean up my
 tombstone from my LeveledCompactionStrategy tables at the same time?

  Thanks for your help.

  On 19 Jun 2015, at 07:56 , arun sirimalla arunsi...@gmail.com wrote:

  Hi Jean,

  Running nodetool repair on a node will repair only that node in the
 cluster. It is recommended to run nodetool repair on one node at a time.

  Few things to keep in mind while running repair
1. Running repair will trigger compactions
2. Increase in CPU utilization.


  Run node tool repair with -pr option, so that it will repair only the
 range that node is responsible for.

 On Thu, Jun 18, 2015 at 10:50 PM, Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

 Thanks Jonathan.

  But I need to know the following:

  If you issue a “nodetool repair” on one node will it repair all the
 nodes in the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool
 repair ends, and only then issue another “nodetool repair” on the next node?

  Kind regards

  On 18 Jun 2015, at 19:19 , Jonathan Haddad j...@jonhaddad.com wrote:

  If you're using DSE, you can schedule it automatically using the repair
 service.  If you're open source, check out Spotify cassandra reaper, it'll
 manage it for you.

  https://github.com/spotify/cassandra-reaper



  On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

 Hi,

 I want to make on a regular base repairs on my cluster as suggested by
 the documentation.
 I want to do this in a way that the cluster is still responding to read
 requests.
 So I understand that I should not use the -par switch for that as it
 will do the repair in parallel and consume all available resources.

 If you issue a “nodetool repair” on one node will it repair all the
 nodes in the cluster or only the one on which we issue the command?

 If it repairs only one node, do I have to wait that the nodetool repair
 ends, and only then issue another “nodetool repair” on the next node?

 If we had down time periods I would issue a nodetool -par, but we don’t
 have down time periods.

 Sorry for the stupid questions.
 Thanks for your help.





  --
 Arun
 Senior Hadoop/Cassandra Engineer
 Cloudwick


  2014 Data Impact Award Winner (Cloudera)

 http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html





-- 
Arun
Senior Hadoop/Cassandra Engineer
Cloudwick


2014 Data Impact Award Winner (Cloudera)
http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html


Re: nodetool repair

2015-06-19 Thread Alain RODRIGUEZ
Hi,

This is not necessarily true. Repair will induce compactions only if you
have entropy in your cluster. If not it will just read your data to compare
all the replicas of each piece of data (which does use CPU and disk IO).

If there is some data missing it will repair it. Though, due to merkle
tree size, you will generally stream more data than just the data needed.
To limit this downside and the compactions amount, use range repairs --
http://www.datastax.com/dev/blog/advanced-repair-techniques.
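
For reference, a single subrange can be repaired directly with nodetool using
the start/end token options, a minimal sketch where the tokens and
keyspace/table names are placeholders:

    nodetool repair -st <start_token> -et <end_token> my_keyspace my_cf

Scripting that over a list of subranges is the range-repair approach described
in the blog post above.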

About tombstones, they will be evicted only after gc_grace_period and only
if all the parts of the row are part of the compaction.

C*heers,

Alain

2015-06-19 9:08 GMT+02:00 arun sirimalla arunsi...@gmail.com:

 Yes compactions will remove tombstones

 On Thu, Jun 18, 2015 at 11:46 PM, Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

  Perfect thank you.
 So making a weekly nodetool repair -pr”  on all nodes one after the
 other will repair my cluster. That is great.

  If it does a compaction, does it mean that it would also clean up my
 tombstone from my LeveledCompactionStrategy tables at the same time?

  Thanks for your help.

  On 19 Jun 2015, at 07:56 , arun sirimalla arunsi...@gmail.com wrote:

  Hi Jean,

  Running nodetool repair on a node will repair only that node in the
 cluster. It is recommended to run nodetool repair on one node at a time.

  Few things to keep in mind while running repair
1. Running repair will trigger compactions
2. Increase in CPU utilization.


  Run node tool repair with -pr option, so that it will repair only the
 range that node is responsible for.

 On Thu, Jun 18, 2015 at 10:50 PM, Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

 Thanks Jonathan.

  But I need to know the following:

  If you issue a “nodetool repair” on one node will it repair all the
 nodes in the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool
 repair ends, and only then issue another “nodetool repair” on the next node?

  Kind regards

  On 18 Jun 2015, at 19:19 , Jonathan Haddad j...@jonhaddad.com wrote:

  If you're using DSE, you can schedule it automatically using the
 repair service.  If you're open source, check out Spotify cassandra reaper,
 it'll manage it for you.

  https://github.com/spotify/cassandra-reaper



  On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

 Hi,

 I want to make on a regular base repairs on my cluster as suggested by
 the documentation.
 I want to do this in a way that the cluster is still responding to read
 requests.
 So I understand that I should not use the -par switch for that as it
 will do the repair in parallel and consume all available resources.

 If you issue a “nodetool repair” on one node will it repair all the
 nodes in the cluster or only the one on which we issue the command?

 If it repairs only one node, do I have to wait that the nodetool repair
 ends, and only then issue another “nodetool repair” on the next node?

 If we had down time periods I would issue a nodetool -par, but we don’t
 have down time periods.

 Sorry for the stupid questions.
 Thanks for your help.





  --
 Arun
 Senior Hadoop/Cassandra Engineer
 Cloudwick


  2014 Data Impact Award Winner (Cloudera)

 http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html





 --
 Arun
 Senior Hadoop/Cassandra Engineer
 Cloudwick


 2014 Data Impact Award Winner (Cloudera)

 http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html




Re: nodetool repair

2015-06-19 Thread Jean Tremblay
Thank you for your reply.



On 19 Jun 2015, at 20:36, 
sean_r_dur...@homedepot.com wrote:

It seems to me that running repair on any given node may also induce repairs to 
related replica nodes. For example, if I run repair on node A and node B has 
some replicas, data might stream from A to B (assuming A has newer/more data). 
Now, that does NOT mean that node B will be fully repaired. You still need to 
run repair -pr on all nodes before gc_grace_seconds.

You can run repairs on multiple nodes at the same time. However, you might end 
up with a large amount of streaming, if many repairs are needed. So, you should 
be aware of a performance impact.

I run weekly repairs on one node at a time, if possible. On larger rings, 
though, I run repairs on multiple nodes staggered by a few hours. Once your 
routine maintenance is established, repairs will not run for very long. But, if 
you have a large ring that hasn’t been repaired, those first repairs may take 
days (but should get faster as you get further through the ring).


Sean Durity

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Friday, June 19, 2015 3:56 AM
To: user@cassandra.apache.org
Subject: Re: nodetool repair

Hi,

This is not necessarily true. Repair will induce compactions only if you have 
entropy in your cluster. If not it will just read your data to compare all the 
replica of each piece of data (using indeed cpu and disk IO).

If there is some data missing it will repair it. Though, due to merkle tree 
size, you will generally stream more data than just the data needed. To limit 
this downside and the compactions amount, use range repairs -- 
http://www.datastax.com/dev/blog/advanced-repair-techniques.

About tombstones, they will be evicted only after gc_grace_period and only if 
all the parts of the row are part of the compaction.

C*heers,

Alain

2015-06-19 9:08 GMT+02:00 arun sirimalla 
arunsi...@gmail.com:
Yes compactions will remove tombstones

On Thu, Jun 18, 2015 at 11:46 PM, Jean Tremblay 
jean.tremb...@zen-innovations.com 
wrote:
Perfect thank you.
So making a weekly nodetool repair -pr”  on all nodes one after the other will 
repair my cluster. That is great.

If it does a compaction, does it mean that it would also clean up my tombstone 
from my LeveledCompactionStrategy tables at the same time?

Thanks for your help.

On 19 Jun 2015, at 07:56 , arun sirimalla 
arunsi...@gmail.com wrote:

Hi Jean,

Running nodetool repair on a node will repair only that node in the cluster. It 
is recommended to run nodetool repair on one node at a time.

Few things to keep in mind while running repair
   1. Running repair will trigger compactions
   2. Increase in CPU utilization.


Run node tool repair with -pr option, so that it will repair only the range 
that node is responsible for.

On Thu, Jun 18, 2015 at 10:50 PM, Jean Tremblay 
jean.tremb...@zen-innovations.com 
wrote:
Thanks Jonathan.

But I need to know the following:

If you issue a “nodetool repair” on one node will it repair all the nodes in 
the cluster or only the one on which we issue the command?
If it repairs only one node, do I have to wait that the nodetool repair ends, 
and only then issue another “nodetool repair” on the next node?

Kind regards

On 18 Jun 2015, at 19:19 , Jonathan Haddad 
j...@jonhaddad.com wrote:

If you're using DSE, you can schedule it automatically using the repair 
service.  If you're open source, check out Spotify cassandra reaper, it'll 
manage it for you.

https://github.com/spotify/cassandra-reaper



On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
jean.tremb...@zen-innovations.com 
wrote:
Hi,

I want to make on a regular base repairs on my cluster as suggested by the 
documentation.
I want to do this in a way that the cluster is still responding to read 
requests.
So I understand that I should not use the -par switch for that as it will do 
the repair in parallel and consume all available resources.

If you issue a “nodetool repair” on one node will it repair all the nodes in 
the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool repair ends, 
and only then issue another “nodetool repair” on the next node?

If we had down time periods I would issue a nodetool -par, but we don’t have 
down time periods.

Sorry for the stupid questions.
Thanks for your help.




--
Arun
Senior Hadoop/Cassandra Engineer
Cloudwick


2014 Data Impact Award Winner (Cloudera)
http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html





--
Arun
Senior Hadoop/Cassandra Engineer
Cloudwick


2014 Data Impact Award Winner (Cloudera)
http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html

RE: nodetool repair

2015-06-19 Thread SEAN_R_DURITY
It seems to me that running repair on any given node may also induce repairs to 
related replica nodes. For example, if I run repair on node A and node B has 
some replicas, data might stream from A to B (assuming A has newer/more data). 
Now, that does NOT mean that node B will be fully repaired. You still need to 
run repair -pr on all nodes before gc_grace_seconds.

You can run repairs on multiple nodes at the same time. However, you might end 
up with a large amount of streaming, if many repairs are needed. So, you should 
be aware of a performance impact.

I run weekly repairs on one node at a time, if possible. On larger rings, 
though, I run repairs on multiple nodes staggered by a few hours. Once your 
routine maintenance is established, repairs will not run for very long. But, if 
you have a large ring that hasn’t been repaired, those first repairs may take 
days (but should get faster as you get further through the ring).
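
As a rough illustration, a staggered weekly schedule can be as simple as one
crontab entry per node with different start times (the days, times and log
path below are arbitrary examples, not a recommendation):

    # node1 crontab:  0 1 * * 6  nodetool repair -pr >> /var/log/cassandra/repair.log 2>&1
    # node2 crontab:  0 5 * * 6  nodetool repair -pr >> /var/log/cassandra/repair.log 2>&1
    # node3 crontab:  0 9 * * 6  nodetool repair -pr >> /var/log/cassandra/repair.log 2>&1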


Sean Durity

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Friday, June 19, 2015 3:56 AM
To: user@cassandra.apache.org
Subject: Re: nodetool repair

Hi,

This is not necessarily true. Repair will induce compactions only if you have 
entropy in your cluster. If not it will just read your data to compare all the 
replica of each piece of data (using indeed cpu and disk IO).

If there is some data missing it will repair it. Though, due to merkle tree 
size, you will generally stream more data than just the data needed. To limit 
this downside and the compactions amount, use range repairs -- 
http://www.datastax.com/dev/blog/advanced-repair-techniques.

About tombstones, they will be evicted only after gc_grace_period and only if 
all the parts of the row are part of the compaction.

C*heers,

Alain

2015-06-19 9:08 GMT+02:00 arun sirimalla 
arunsi...@gmail.com:
Yes compactions will remove tombstones

On Thu, Jun 18, 2015 at 11:46 PM, Jean Tremblay 
jean.tremb...@zen-innovations.com 
wrote:
Perfect thank you.
So making a weekly nodetool repair -pr”  on all nodes one after the other will 
repair my cluster. That is great.

If it does a compaction, does it mean that it would also clean up my tombstone 
from my LeveledCompactionStrategy tables at the same time?

Thanks for your help.

On 19 Jun 2015, at 07:56 , arun sirimalla 
arunsi...@gmail.com wrote:

Hi Jean,

Running nodetool repair on a node will repair only that node in the cluster. It 
is recommended to run nodetool repair on one node at a time.

Few things to keep in mind while running repair
   1. Running repair will trigger compactions
   2. Increase in CPU utilization.


Run node tool repair with -pr option, so that it will repair only the range 
that node is responsible for.

On Thu, Jun 18, 2015 at 10:50 PM, Jean Tremblay 
jean.tremb...@zen-innovations.com 
wrote:
Thanks Jonathan.

But I need to know the following:

If you issue a “nodetool repair” on one node will it repair all the nodes in 
the cluster or only the one on which we issue the command?
If it repairs only one node, do I have to wait that the nodetool repair ends, 
and only then issue another “nodetool repair” on the next node?

Kind regards

On 18 Jun 2015, at 19:19 , Jonathan Haddad 
j...@jonhaddad.com wrote:

If you're using DSE, you can schedule it automatically using the repair 
service.  If you're open source, check out Spotify cassandra reaper, it'll 
manage it for you.

https://github.com/spotify/cassandra-reaper



On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
jean.tremb...@zen-innovations.com 
wrote:
Hi,

I want to make on a regular base repairs on my cluster as suggested by the 
documentation.
I want to do this in a way that the cluster is still responding to read 
requests.
So I understand that I should not use the -par switch for that as it will do 
the repair in parallel and consume all available resources.

If you issue a “nodetool repair” on one node will it repair all the nodes in 
the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool repair ends, 
and only then issue another “nodetool repair” on the next node?

If we had down time periods I would issue a nodetool -par, but we don’t have 
down time periods.

Sorry for the stupid questions.
Thanks for your help.




--
Arun
Senior Hadoop/Cassandra Engineer
Cloudwick


2014 Data Impact Award Winner (Cloudera)
http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html





--
Arun
Senior Hadoop/Cassandra Engineer
Cloudwick


2014 Data Impact Award Winner (Cloudera)
http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html






Re: nodetool repair

2015-06-18 Thread Jonathan Haddad
If you're using DSE, you can schedule it automatically using the repair
service.  If you're open source, check out Spotify cassandra reaper, it'll
manage it for you.

https://github.com/spotify/cassandra-reaper



On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
jean.tremb...@zen-innovations.com wrote:

 Hi,

 I want to make on a regular base repairs on my cluster as suggested by the
 documentation.
 I want to do this in a way that the cluster is still responding to read
 requests.
 So I understand that I should not use the -par switch for that as it will
 do the repair in parallel and consume all available resources.

 If you issue a “nodetool repair” on one node will it repair all the nodes
 in the cluster or only the one on which we issue the command?

 If it repairs only one node, do I have to wait that the nodetool repair
 ends, and only then issue another “nodetool repair” on the next node?

 If we had down time periods I would issue a nodetool -par, but we don’t
 have down time periods.

 Sorry for the stupid questions.
 Thanks for your help.


Re: nodetool repair

2015-06-18 Thread Jean Tremblay
Thanks Jonathan.

But I need to know the following:

If you issue a “nodetool repair” on one node will it repair all the nodes in 
the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool repair ends, 
and only then issue another “nodetool repair” on the next node?

Kind regards

On 18 Jun 2015, at 19:19 , Jonathan Haddad 
j...@jonhaddad.com wrote:

If you're using DSE, you can schedule it automatically using the repair 
service.  If you're open source, check out Spotify cassandra reaper, it'll 
manage it for you.

https://github.com/spotify/cassandra-reaper



On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
jean.tremb...@zen-innovations.com 
wrote:
Hi,

I want to make on a regular base repairs on my cluster as suggested by the 
documentation.
I want to do this in a way that the cluster is still responding to read 
requests.
So I understand that I should not use the -par switch for that as it will do 
the repair in parallel and consume all available resources.

If you issue a “nodetool repair” on one node will it repair all the nodes in 
the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool repair ends, 
and only then issue another “nodetool repair” on the next node?

If we had down time periods I would issue a nodetool -par, but we don’t have 
down time periods.

Sorry for the stupid questions.
Thanks for your help.



Re: nodetool repair

2015-06-18 Thread arun sirimalla
Hi Jean,

Running nodetool repair on a node will repair only that node in the
cluster. It is recommended to run nodetool repair on one node at a time.

Few things to keep in mind while running repair
   1. Running repair will trigger compactions
   2. Increase in CPU utilization.


Run nodetool repair with the -pr option, so that it will repair only the range
that node is responsible for.
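
A minimal example, where the keyspace name is a placeholder:

    nodetool repair -pr my_keyspace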

On Thu, Jun 18, 2015 at 10:50 PM, Jean Tremblay 
jean.tremb...@zen-innovations.com wrote:

  Thanks Jonathan.

  But I need to know the following:

  If you issue a “nodetool repair” on one node will it repair all the
 nodes in the cluster or only the one on which we issue the command?

If it repairs only one node, do I have to wait that the nodetool
 repair ends, and only then issue another “nodetool repair” on the next node?

  Kind regards

  On 18 Jun 2015, at 19:19 , Jonathan Haddad j...@jonhaddad.com wrote:

  If you're using DSE, you can schedule it automatically using the repair
 service.  If you're open source, check out Spotify cassandra reaper, it'll
 manage it for you.

  https://github.com/spotify/cassandra-reaper



  On Thu, Jun 18, 2015 at 12:36 PM Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

 Hi,

 I want to make on a regular base repairs on my cluster as suggested by
 the documentation.
 I want to do this in a way that the cluster is still responding to read
 requests.
 So I understand that I should not use the -par switch for that as it will
 do the repair in parallel and consume all available resources.

 If you issue a “nodetool repair” on one node will it repair all the nodes
 in the cluster or only the one on which we issue the command?

 If it repairs only one node, do I have to wait that the nodetool repair
 ends, and only then issue another “nodetool repair” on the next node?

 If we had down time periods I would issue a nodetool -par, but we don’t
 have down time periods.

 Sorry for the stupid questions.
 Thanks for your help.





-- 
Arun
Senior Hadoop/Cassandra Engineer
Cloudwick


2014 Data Impact Award Winner (Cloudera)
http://www.cloudera.com/content/cloudera/en/campaign/data-impact-awards.html


Re: nodetool repair options

2015-01-23 Thread Robert Coli
On Fri, Jan 23, 2015 at 10:03 AM, Robert Wille rwi...@fold3.com wrote:

 The docs say “Use -pr to repair only the first range returned by the
 partitioner”. What does this mean? Why would I only want to repair the
 first range?


If you're repairing the whole cluster, repairing only the primary range on
each node avoids repairing the same data once per replication factor.
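
So a whole-cluster repair can be sketched as a loop over every node, e.g. via
ssh (host names are placeholders, and this assumes the iterations run
serially, one node at a time):

    for host in node1 node2 node3; do
        ssh "$host" nodetool repair -pr
    done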


 What are the tradeoffs of a parallel versus serial repair?


Parallel repair affects all replicas simultaneously and can thereby degrade
latency for that replica set. Serial repair doesn't, but is serial and
intensely slower. Serial repair is probably not usable at all with RF 5 or
so, unless you set an extremely long gc_grace_seconds.
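
With the 2.0.x nodetool syntax, the choice is just a flag (keyspace name is a
placeholder):

    nodetool repair -pr -par my_keyspace   # parallel across replicas
    nodetool repair -pr my_keyspace        # sequential/snapshot-based, the default since 2.0.2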


 What are the recommended options for regular, periodic repair?


(Snapshot/incremental repair, default IIRC in newer Cassandra, changes many
of these assumptions. I refer to old-style nodetool repair with my
statements.)

The canonical response is repair the entire cluster with -pr once per
gc_grace_seconds.

Regarding frequent repair... consider your RF, CL and whether you actually
care about consistency and durability for any given columnfamily. If you
never do DELETE-like-operations (in CQL, this includes things other than
DELETE statements) in the CF, probably don't repair it just for consistency
purposes.

Then, consider how long you can tolerate DELETEd data sticking around. If
you can tolerate it because you don't DELETE much data, set
gc_grace_seconds to at least 34 days. With 34 days, you can begin a repair
on the first of the month and have between 3 and 7 days for it to complete.
You repair for up to a few days in order to repair a month's data. With
shorter repair cycles, you pay the relatively high cost of repair
repeatedly.
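
For example, moving a table to the 34-day setting mentioned above looks like
this (keyspace/table names are placeholders; 34 days = 2937600 seconds).
Depending on your cqlsh version you may need to paste the statement into an
interactive session instead of using -e:

    cqlsh -e "ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = 2937600;"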

Last, consider your Cassandra version. Newer versions have had significant
focus on streaming and repair stability and performance. Upgrade to the
HEAD of 2.0.x if possible.

There's this thing I jokingly call the Coli Conjecture, which says that if
you're in a good case for Cassandra you probably don't actually care
about consistency or durability, even if you think you do. This comes from
years of observing consistency edge cases in Cassandra and noticing that
even the very few people who detected and reported them did not seem to
experience very negative results from the perspective of their application.
I think it is an interesting observation and a different mindset for many
people coming from the non-distributed, normalized, relational world.

=Rob


Re: nodetool repair

2015-01-09 Thread Robert Coli
On Fri, Jan 9, 2015 at 8:01 AM, Adil adil.cha...@gmail.com wrote:

 We have two DCs; we are planning to schedule nodetool repair to run
  weekly. My question is: is nodetool repair cross-cluster or not? Is it
  sufficient to run it without options on one node, or should it be scheduled on
  every node with the host option?


The latter, generally. Use -pr when repairing one's whole cluster.

=Rob


Re: nodetool repair exception

2014-12-06 Thread Eric Stevens
The official recommendation is 100k:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html

I wonder if there's an advantage to this over unlimited if you're running
servers which are dedicated to your Cassandra cluster (which you should be
for anything production).
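
A quick way to confirm what the running process actually got, plus an
illustrative limits.d entry matching the 100k recommendation (the user name
and paths are assumptions and vary by install):

    grep "open files" /proc/$(pgrep -f CassandraDaemon)/limits
    # /etc/security/limits.d/cassandra.conf:
    #   cassandra - nofile 100000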

On Fri Dec 05 2014 at 2:39:24 PM Robert Coli rc...@eventbrite.com wrote:

 On Wed, Dec 3, 2014 at 6:37 AM, Rafał Furmański rfurman...@opera.com
 wrote:

 I see “Too many open files” exception in logs, but I’m sure that my limit
 is now 150k.
 Should I increase it? What’s the reasonable limit of open files for
 cassandra?


 Why provide any limit? ulimit allows unlimited?

 =Rob




Re: nodetool repair exception

2014-12-06 Thread Tim Heckman
On Sat, Dec 6, 2014 at 8:05 AM, Eric Stevens migh...@gmail.com wrote:
 The official recommendation is 100k:
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html

 I wonder if there's an advantage to this over unlimited if you're running
 servers which are dedicated to your Cassandra cluster (which you should be
 for anything production).

There is the potential to have monitoring systems, and other small
agents, running on systems in production. I could see this simply as a
stop-gap to prevent Cassandra from being able to starve the system of
free file descriptors. In theory, if there's not a proper watchdog on
your monitors, this could prevent an issue from causing an alert. That is
just one potential advantage I could think of, though.

Cheers!
-Tim

 On Fri Dec 05 2014 at 2:39:24 PM Robert Coli rc...@eventbrite.com wrote:

 On Wed, Dec 3, 2014 at 6:37 AM, Rafał Furmański rfurman...@opera.com
 wrote:

 I see “Too many open files” exception in logs, but I’m sure that my limit
 is now 150k.
 Should I increase it? What’s the reasonable limit of open files for
 cassandra?


 Why provide any limit? ulimit allows unlimited?

 =Rob



Re: nodetool repair exception

2014-12-05 Thread Robert Coli
On Wed, Dec 3, 2014 at 6:37 AM, Rafał Furmański rfurman...@opera.com
wrote:

 I see “Too many open files” exception in logs, but I’m sure that my limit
 is now 150k.
 Should I increase it? What’s the reasonable limit of open files for
 cassandra?


Why provide any limit? ulimit allows unlimited?

=Rob


Re: nodetool repair exception

2014-12-03 Thread Yuki Morishita
As the exception indicates, nodetool just lost communication with the
Cassandra node and cannot print progress any further.
Check your system.log on the node, and see if your repair was
completed. If there is no error, then it should be fine.
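
Something along these lines is usually enough to check (the log path is the
package default and may differ in your install):

    grep -i repair /var/log/cassandra/system.log | tail -n 100
    grep -i error  /var/log/cassandra/system.log | tail -n 50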

On Wed, Dec 3, 2014 at 5:08 AM, Rafał Furmański rfurman...@opera.com wrote:
 Hi All!

 We have an 8-node cluster in 2 DCs (4 per DC, RF=3) running Cassandra 2.1.2 on 
 Linux Debian Wheezy.
 I executed “nodetool repair” on one of the nodes, and this command returned 
 following error:

 Exception occurred during clean-up. 
 java.lang.reflect.UndeclaredThrowableException
 error: JMX connection closed. You should check server log for repair status 
 of keyspace sync(Subsequent keyspaces are not going to be repaired).
 -- StackTrace --
 java.io.IOException: JMX connection closed. You should check server log for 
 repair status of keyspace sync(Subsequent keyspaces are not going to be 
 repaired).
at 
 org.apache.cassandra.tools.RepairRunner.handleNotification(NodeProbe.java:1351)
at 
 javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:274)
at 
 javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:339)
at 
 javax.management.NotificationBroadcasterSupport$1.execute(NotificationBroadcasterSupport.java:324)
at 
 javax.management.NotificationBroadcasterSupport.sendNotification(NotificationBroadcasterSupport.java:247)
at 
 javax.management.remote.rmi.RMIConnector.sendNotification(RMIConnector.java:441)
at 
 javax.management.remote.rmi.RMIConnector.access$1100(RMIConnector.java:121)
at 
 javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException(RMIConnector.java:1505)
at 
 javax.management.remote.rmi.RMIConnector$RMINotifClient.fetchNotifs(RMIConnector.java:1350)
at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.fetchNotifs(ClientNotifForwarder.java:587)
at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:470)
at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:451)
at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:107)

 This error was followed by lots of “Lost Notification” messages.
 Node became unusable and I had to restart it. Is this an issue?

 Rafal



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: nodetool repair exception

2014-12-03 Thread Rafał Furmański
I see “Too many open files” exception in logs, but I’m sure that my limit is 
now 150k.
Should I increase it? What’s the reasonable limit of open files for cassandra?

On 3 gru 2014, at 15:02, Yuki Morishita mor.y...@gmail.com wrote:

 As the exception indicates, nodetool just lost communication with the
 Cassandra node and cannot print progress any further.
 Check your system.log on the node, and see if your repair was
 completed. If there is no error, then it should be fine.
 
 On Wed, Dec 3, 2014 at 5:08 AM, Rafał Furmański rfurman...@opera.com wrote:
 Hi All!
 
 We have a 8 nodes cluster in 2 DC (4 per DC, RF=3) running Cassandra 2.1.2 
 on Linux Debian Wheezy.
 I executed “nodetool repair” on one of the nodes, and this command returned 
 following error:
 
 Exception occurred during clean-up. 
 java.lang.reflect.UndeclaredThrowableException
 error: JMX connection closed. You should check server log for repair status 
 of keyspace sync(Subsequent keyspaces are not going to be repaired).
 -- StackTrace --
 java.io.IOException: JMX connection closed. You should check server log for 
 repair status of keyspace sync(Subsequent keyspaces are not going to be 
 repaired).
   at 
 org.apache.cassandra.tools.RepairRunner.handleNotification(NodeProbe.java:1351)
   at 
 javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:274)
   at 
 javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:339)
   at 
 javax.management.NotificationBroadcasterSupport$1.execute(NotificationBroadcasterSupport.java:324)
   at 
 javax.management.NotificationBroadcasterSupport.sendNotification(NotificationBroadcasterSupport.java:247)
   at 
 javax.management.remote.rmi.RMIConnector.sendNotification(RMIConnector.java:441)
   at 
 javax.management.remote.rmi.RMIConnector.access$1100(RMIConnector.java:121)
   at 
 javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException(RMIConnector.java:1505)
   at 
 javax.management.remote.rmi.RMIConnector$RMINotifClient.fetchNotifs(RMIConnector.java:1350)
   at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.fetchNotifs(ClientNotifForwarder.java:587)
   at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:470)
   at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:451)
   at 
 com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:107)
 
 This error was followed by lots of “Lost Notification” messages.
 Node became unusable and I had to restart it. Is this an issue?
 
 Rafal
 
 
 
 -- 
 Yuki Morishita
 t:yukim (http://twitter.com/yukim)





Re: nodetool repair stalled

2014-11-12 Thread Eric Stevens
Wouldn't it be a better idea to issue removenode on the crashed node, wipe
the whole data directory (including system) and let it bootstrap cleanly so
that it's not part of the cluster while it gets back up to speed?
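
A rough sketch of that sequence (the host ID and data paths are placeholders;
adjust to your install):

    nodetool status                      # note the Host ID of the dead node
    nodetool removenode <host-id>        # run from a live node
    # on the rebuilt node, before starting Cassandra again:
    rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
    # then start Cassandra and let it bootstrap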

On Tue, Nov 11, 2014, 12:32 PM Robert Coli rc...@eventbrite.com wrote:

 On Tue, Nov 11, 2014 at 10:48 AM, venkat sam samvenkat...@outlook.com
 wrote:


 I have a 5 node cluster. In one node one of the data directory partition
 got crashed. After disk replacement I restarted the Cassandra daemon and
 gave nodetool repair to restore the missing replica’s. But nodetool repair
 is getting stuck after syncing one of the columnfamily


 Yes, nodetool repair often hangs. Search through the archives, but the
 summary is:

 1) try to repair CFs one at a time (see the sketch below)
 2) it's worse with vnodes
 3) try tuning the phi detector or network stream timeouts
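
 For point 1, the per-table loop can be as simple as this (keyspace/table
 names are placeholders):

    for cf in cf1 cf2 cf3; do
        nodetool repair -pr my_keyspace "$cf"
    done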

 =Rob




Re: nodetool repair stalled

2014-11-12 Thread Robert Coli
On Wed, Nov 12, 2014 at 6:50 AM, Eric Stevens migh...@gmail.com wrote:

 Wouldn't it be a better idea to issue removenode on the crashed node, wipe
 the whole data directory (including system) and let it bootstrap cleanly so
 that it's not part of the cluster while it gets back up to

Yes, with replace_node. I missed that the entire data dir was lost in my
first response.
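
A minimal sketch of that, assuming the replace_address startup option (the
name of this mechanism in recent 2.x releases), set in cassandra-env.sh on the
replacement node; the IP is a placeholder:

    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<ip_of_dead_node>"

Remove the option again once the node has finished bootstrapping.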

=Rob


Re: nodetool repair stalled

2014-11-12 Thread venkat sam
Hi Eric,

The data are stored in JBOD. Only one of the disks crashed; the other 3 disks 
still hold the old data. That's why I didn't wipe the whole node and do a fresh 
restart.


Thanks Rob. Will try it that way.






From: Eric Stevens
Sent: ‎Wednesday‎, ‎November‎ ‎12‎, ‎2014 ‎8‎:‎21‎ ‎PM
To: user@cassandra.apache.org





Wouldn't it be a better idea to issue removenode on the crashed node, wipe the 
whole data directory (including system) and let it bootstrap cleanly so that 
it's not part of the cluster while it gets back up to speed?



On Tue, Nov 11, 2014, 12:32 PM Robert Coli rc...@eventbrite.com wrote:




On Tue, Nov 11, 2014 at 10:48 AM, venkat sam samvenkat...@outlook.com wrote:







I have a 5 node cluster. In one node one of the data directory partition got 
crashed. After disk replacement I restarted the Cassandra daemon and gave 
nodetool repair to restore the missing replica’s. But nodetool repair is 
getting stuck after syncing one of the columnfamily




Yes, nodetool repair often hangs. Search through the archives, but the summary 
is.




1) try to repair CFs one at a time

2) it's worse with vnodes

3) try tuning the phi detector or network stream timeouts




=Rob

Re: Nodetool Repair questions

2014-08-12 Thread Mark Reddy
Hi Vish,

1. This tool repairs inconsistencies across replicas of the row. Since
 latest update always wins, I dont see inconsistencies other than ones
 resulting from the combination of deletes, tombstones, and crashed nodes.
 Technically, if data is never deleted from cassandra, then nodetool repair
 does not need to be run at all. Is this understanding correct? If wrong,
 can anyone provide other ways inconsistencies could occur?


Even if you never delete data you should run repairs occasionally to ensure
overall consistency. While hinted handoffs and read repairs do lead to
better consistency, they are only helpers/optimization and are not regarded
as operations that ensure consistency.

2. Want to understand the performance of 'nodetool repair' in a Cassandra
 multi data center setup. As we add nodes to the cluster in various data
 centers, does the performance of nodetool repair on each node increase
 linearly, or is it quadratic ?


It's difficult to calculate the performance of a repair; I've seen the time
to completion fluctuate between 4hrs and 10hrs+ on the same node. However, in
theory adding more nodes would spread the data and free up machine
resources, thus resulting in more performant repairs.

The essence of this question is: If I have a keyspace with x number of
 replicas in each data center, do I have to deal with an upper limit on the
 number of data centers/nodes?


Could you expand on why you believe there would be an upper limit of
dc/nodes due to running repairs?


Mark


On Tue, Aug 12, 2014 at 10:06 PM, Viswanathan Ramachandran 
vish.ramachand...@gmail.com wrote:

 Some questions on nodetool repair.

 1. This tool repairs inconsistencies across replicas of the row. Since
 latest update always wins, I dont see inconsistencies other than ones
 resulting from the combination of deletes, tombstones, and crashed nodes.
 Technically, if data is never deleted from cassandra, then nodetool repair
 does not need to be run at all. Is this understanding correct? If wrong,
 can anyone provide other ways inconsistencies could occur?

 2. Want to understand the performance of 'nodetool repair' in a Cassandra
 multi data center setup. As we add nodes to the cluster in various data
 centers, does the performance of nodetool repair on each node increase
 linearly, or is it quadratic ? The essence of this question is: If I have a
 keyspace with x number of replicas in each data center, do I have to deal
 with an upper limit on the number of data centers/nodes?


 Thanks

 Vish



Re: Nodetool Repair questions

2014-08-12 Thread Andrey Ilinykh
1. You don't have to repair if you use QUORUM consistency and you don't
delete data.
2. Performance depends on the size of data each node has. It's very difficult to
predict. It may take days.

Thank you,
  Andrey


On Tue, Aug 12, 2014 at 2:06 PM, Viswanathan Ramachandran 
vish.ramachand...@gmail.com wrote:

 Some questions on nodetool repair.

 1. This tool repairs inconsistencies across replicas of the row. Since
 latest update always wins, I dont see inconsistencies other than ones
 resulting from the combination of deletes, tombstones, and crashed nodes.
 Technically, if data is never deleted from cassandra, then nodetool repair
 does not need to be run at all. Is this understanding correct? If wrong,
 can anyone provide other ways inconsistencies could occur?

 2. Want to understand the performance of 'nodetool repair' in a Cassandra
 multi data center setup. As we add nodes to the cluster in various data
 centers, does the performance of nodetool repair on each node increase
 linearly, or is it quadratic ? The essence of this question is: If I have a
 keyspace with x number of replicas in each data center, do I have to deal
 with an upper limit on the number of data centers/nodes?


 Thanks

 Vish



Re: Nodetool Repair questions

2014-08-12 Thread Viswanathan Ramachandran
Thanks Mark,
Since we have replicas in each data center, addition of a new data center
(and new replicas) has a performance implication on nodetool repair.
I do understand that adding nodes without increasing number of replicas may
improve repair performance, but in this case we are adding a new data center
and additional replicas, which is an added overhead for nodetool repair.
Hence the thinking that we may reach an upper limit, which would be the
point at which the nodetool repair costs are way too high.


On Tue, Aug 12, 2014 at 2:59 PM, Mark Reddy mark.re...@boxever.com wrote:

 Hi Vish,

 1. This tool repairs inconsistencies across replicas of the row. Since
 latest update always wins, I dont see inconsistencies other than ones
 resulting from the combination of deletes, tombstones, and crashed nodes.
 Technically, if data is never deleted from cassandra, then nodetool repair
 does not need to be run at all. Is this understanding correct? If wrong,
 can anyone provide other ways inconsistencies could occur?


 Even if you never delete data you should run repairs occasionally to
 ensure overall consistency. While hinted handoffs and read repairs do lead
 to better consistency, they are only helpers/optimization and are not
 regarded as operations that ensure consistency.

 2. Want to understand the performance of 'nodetool repair' in a Cassandra
 multi data center setup. As we add nodes to the cluster in various data
 centers, does the performance of nodetool repair on each node increase
 linearly, or is it quadratic ?


 Its difficult to calculate the performance of a repair, I've seen the time
 to completion fluctuate between 4hrs to 10hrs+ on the same node. However in
 theory adding more nodes would spread the data and free up machine
 resources, thus resulting in more performant repairs.

 The essence of this question is: If I have a keyspace with x number of
 replicas in each data center, do I have to deal with an upper limit on the
 number of data centers/nodes?


 Could you expand on why you believe there would be an upper limit of
 dc/nodes due to running repairs?


 Mark


 On Tue, Aug 12, 2014 at 10:06 PM, Viswanathan Ramachandran 
 vish.ramachand...@gmail.com wrote:

  Some questions on nodetool repair.

 1. This tool repairs inconsistencies across replicas of the row. Since
 latest update always wins, I dont see inconsistencies other than ones
 resulting from the combination of deletes, tombstones, and crashed nodes.
 Technically, if data is never deleted from cassandra, then nodetool repair
 does not need to be run at all. Is this understanding correct? If wrong,
 can anyone provide other ways inconsistencies could occur?

 2. Want to understand the performance of 'nodetool repair' in a Cassandra
 multi data center setup. As we add nodes to the cluster in various data
 centers, does the performance of nodetool repair on each node increase
 linearly, or is it quadratic ? The essence of this question is: If I have a
 keyspace with x number of replicas in each data center, do I have to deal
 with an upper limit on the number of data centers/nodes?


 Thanks

 Vish





Re: Nodetool Repair questions

2014-08-12 Thread Viswanathan Ramachandran
Andrey, QUORUM consistency and no deletes makes perfect sense.
I believe we could modify that to EACH_QUORUM or QUORUM consistency and no
deletes - isn't that right?

Thanks


On Tue, Aug 12, 2014 at 3:10 PM, Andrey Ilinykh ailin...@gmail.com wrote:

 1. You don't have to repair if you use QUORUM consistency and you don't
 delete data.
 2.Performance depends on size of data each node has. It's very difficult
 to predict. It may take days.

 Thank you,
   Andrey



 On Tue, Aug 12, 2014 at 2:06 PM, Viswanathan Ramachandran 
 vish.ramachand...@gmail.com wrote:

 Some questions on nodetool repair.

 1. This tool repairs inconsistencies across replicas of the row. Since
 latest update always wins, I dont see inconsistencies other than ones
 resulting from the combination of deletes, tombstones, and crashed nodes.
 Technically, if data is never deleted from cassandra, then nodetool repair
 does not need to be run at all. Is this understanding correct? If wrong,
 can anyone provide other ways inconsistencies could occur?

 2. Want to understand the performance of 'nodetool repair' in a Cassandra
 multi data center setup. As we add nodes to the cluster in various data
 centers, does the performance of nodetool repair on each node increase
 linearly, or is it quadratic ? The essence of this question is: If I have a
 keyspace with x number of replicas in each data center, do I have to deal
 with an upper limit on the number of data centers/nodes?


 Thanks

 Vish





Re: Nodetool Repair questions

2014-08-12 Thread Andrey Ilinykh
On Tue, Aug 12, 2014 at 4:46 PM, Viswanathan Ramachandran 
vish.ramachand...@gmail.com wrote:

 Andrey, QUORUM consistency and no deletes makes perfect sense.
 I believe we could modify that to EACH_QUORUM or QUORUM consistency and no
 deletes - isn't that right?


 yes.


Re: nodetool repair -snapshot option?

2014-07-01 Thread Ken Hancock
I also expanded on a script originally written by Matt Stump @ Datastax.
The readme has the reasoning behind requiring sub-range repairs.

https://github.com/hancockks/cassandra_range_repair




On Mon, Jun 30, 2014 at 10:20 PM, Phil Burress philburress...@gmail.com
wrote:

 @Paulo, this is very cool! Thanks very much for the link!


 On Mon, Jun 30, 2014 at 9:37 PM, Paulo Ricardo Motta Gomes 
 paulo.mo...@chaordicsystems.com wrote:

 If you find it useful, I created a tool where you input the node IP,
 keyspace, column family, and optionally the number of partitions (default:
 32K), and it outputs the list of subranges for that node, CF, partition
 size: https://github.com/pauloricardomg/cassandra-list-subranges

 So you can basically iterate over the output of that and do subrange
 repair for each node and cf, maybe in parallel. :)


 On Mon, Jun 30, 2014 at 10:26 PM, Phil Burress philburress...@gmail.com
 wrote:

 One last question. Any tips on scripting a subrange repair?


 On Mon, Jun 30, 2014 at 7:12 PM, Phil Burress philburress...@gmail.com
 wrote:

 We are running repair -pr. We've tried subrange manually and that seems
 to work ok. I guess we'll go with that going forward. Thanks for all the
 info!


 On Mon, Jun 30, 2014 at 6:52 PM, Jaydeep Chovatia 
 chovatia.jayd...@gmail.com wrote:

 Are you running full repair or on subset? If you are running full
 repair then try running on sub-set of ranges which means less data to 
 worry
 during repair and that would help JAVA heap in general. You will have to 
 do
 multiple iterations to complete entire range but at-least it will work.

 -jaydeep


 On Mon, Jun 30, 2014 at 3:22 PM, Robert Coli rc...@eventbrite.com
 wrote:

 On Mon, Jun 30, 2014 at 3:08 PM, Yuki Morishita mor.y...@gmail.com
 wrote:

 Repair uses snapshot option by default since 2.0.2 (see NEWS.txt).


 As a general meta comment, the process by which operationally
 important defaults change in Cassandra seems ad-hoc and sub-optimal.

 For the record, my view was that this change, which makes repair even
 slower than it previously was, was probably overly optimistic.

 It's also weird in that it changes default behavior which has been
 unchanged since the start of Cassandra time and is therefore probably
 automated against. Why was it so critically important to switch to 
 snapshot
 repair that it needed to be shotgunned as a new default in 2.0.2?

 =Rob








 --
 *Paulo Motta*

 Chaordic | *Platform*
 *www.chaordic.com.br http://www.chaordic.com.br/*
 +55 48 3232.3200





-- 
*Ken Hancock *| System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
http://www.schange.com/en-US/Company/InvestorRelations.aspx
Office: +1 (978) 889-3329 | Google Talk: ken.hanc...@schange.com |
Skype: hancockks | Yahoo IM: hancockks | LinkedIn:
http://www.linkedin.com/in/kenhancock



Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Kevin Burton
if the boxes are idle, you could use jstack and look at the stack… perhaps
it's locked somewhere.

Worth a shot.
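
Something like this grabs a few thread dumps to compare (the pid lookup is
illustrative; any way of finding the Cassandra pid works):

    pid=$(pgrep -f CassandraDaemon)
    for i in 1 2 3; do jstack "$pid" > /tmp/cassandra-jstack.$i.txt; sleep 10; done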


On Tue, Jul 1, 2014 at 9:24 AM, Brian Tarbox tar...@cabotresearch.com
wrote:

 I have a six node cluster in AWS (repl:3) and recently noticed that repair
 was hanging.  I've run with the -pr switch.

 I see this output in the nodetool command line (and also in that node's
 system.log):
  Starting repair command #9, repairing 256 ranges for keyspace dev_a

 but then no other output.  And I see nothing in any of the other node's
 log files.

 Right now the application using C* is turned off so there is zero activity.
 I've let it be in this state for up to 24 hours with nothing more logged.

 Any suggestions?




-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
https://plus.google.com/102718274791889610666/posts
http://spinn3r.com


Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Robert Coli
On Tue, Jul 1, 2014 at 9:24 AM, Brian Tarbox tar...@cabotresearch.com
wrote:

 I have a six node cluster in AWS (repl:3) and recently noticed that repair
 was hanging.  I've run with the -pr switch.


It'll do that.

What version of Cassandra?

=Rob


Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Brian Tarbox
We're running 1.2.13.

Any chance that doing a rolling-restart would help?

Would running without the -pr improve the odds?

Thanks.


On Tue, Jul 1, 2014 at 1:40 PM, Robert Coli rc...@eventbrite.com wrote:

 On Tue, Jul 1, 2014 at 9:24 AM, Brian Tarbox tar...@cabotresearch.com
 wrote:

 I have a six node cluster in AWS (repl:3) and recently noticed that
 repair was hanging.  I've run with the -pr switch.


 It'll do that.

 What version of Cassandra?

 =Rob



