Re: Hundreds of sstables after every Repair

2015-06-10 Thread Anuj Wadehra
NTP output attached. Any other comments on the two queries?


Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Tue, 9 Jun, 2015 at 10:59 pm
Subject:Re: Hundreds of sstables after every Repair

Yes, we use NTP. We also suspected that clock drift was creating problems. Our 
NTP output is as follows:


[root@node1 ~]# ntpq -p
     remote         refid      st t  when poll reach   delay   offset  jitter
==============================================================================
+10.x.x.x        10.x.x.x      2 u   237 1024   377    1.199    0.062   0.554
*10.x.x.x        10.x.x.x      2 u   178 1024   377    0.479   -0.350   0.626

[root@node2 ~]# ntpq -p
     remote         refid      st t  when poll reach   delay   offset  jitter
==============================================================================
+10.x.x.x        10.x.x.x      2 u   124 1024   377    0.939   -0.001   0.614
*10.x.x.x        10.x.x.x      2 u   722 1024   377    0.567   -0.241   0.585

[root@node3 ~]# ntpq -p
     remote         refid      st t  when poll reach   delay   offset  jitter
==============================================================================
+10.x.x.x        10.x.x.x      2 u   514 1024   377    0.716   -0.103   1.315
*10.x.x.x        10.x.x.x      2 u    21 1024   377    0.402   -0.262   1.070

*** IPs are masked


Thanks

Anuj Wadehra




On Tuesday, 9 June 2015 9:12 PM, Carlos Rolo r...@pythian.com wrote:



Hello,

Do you have your clocks synced across your cluster? Are you using NTP, and is 
it properly configured?

Sometimes clocks that are out of sync can trigger weird behaviour.


Regards,


Carlos Juzarte Rolo

Cassandra Consultant

 

Pythian - Love your data


rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo

Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649

www.pythian.com


On Tue, Jun 9, 2015 at 5:11 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

We were facing dropped mutations earlier and we increased flush writers. Now 
there are no dropped mutations in tpstats. To repair the damaged vnodes / 
inconsistent data we executed repair -pr on all nodes. Still, we see the same 
problem.


When we analyze the repair logs, we see two strange things:


1. Out-of-sync ranges for CFs which are not actively being written/updated 
while the repair is going on. When we repaired all data with repair -pr on all 
nodes, why is data still out of sync?


2. For some CFs, the repair logs show that all ranges are consistent. Still, 
many sstables get created during repair. When everything is in sync, why does 
repair create tiny sstables to repair data?


Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:Ken Hancock ken.hanc...@schange.com
Date:Tue, 9 Jun, 2015 at 8:24 pm
Subject:Re: Hundreds of sstables after every Repair

I think this came up recently in another thread. If you're getting large 
numbers of SSTables after repairs, that means that your nodes are diverging 
from the data they're supposed to have. Likely you're dropping mutations. Do a 
nodetool tpstats on each of your nodes and look at the mutation dropped 
counters. If you're seeing dropped messages, my money is on a non-zero 
FlushWriter "All time blocked" stat, which is causing mutations to be dropped.



On Tue, Jun 9, 2015 at 10:35 AM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

Any suggestions or comments on this one?


Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Sun, 7 Jun, 2015 at 1:54 am
Subject:Hundreds of sstables after every Repair

Hi,


We are using 2.0.3 and vnodes. After every repair -pr operation, 50+ tiny 
sstables (< 10K) get created, and these sstables never get compacted due to 
the coldness issue. I have raised 
https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but I have 
been told to upgrade. Until we upgrade to the latest 2.0.x we are stuck; 
upgrades take time, testing and planning in production systems :(


I have observed that even if vnodes are NOT damaged, hundreds of tiny sstables 
are created during repair for a wide-row CF. This is beyond my understanding. 
If everything is consistent, and for the entire repair process Cassandra is 
saying "Endpoints /x.x.x.x and /x.x.x.y are consistent" for the CF, what is 
the need of creating sstables?


Is there any alternative to regular major compaction to deal with this 
situation?



Thanks

Anuj Wadehra

















Re: Hundreds of sstables after every Repair

2015-06-10 Thread Ken Hancock
Perhaps running sstable2json on some of the small sstables would shed some
light.  I was going to suggest the anticompaction feature of C* 2.1
(which I'm not familiar with), but you're on 2.0.
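
A minimal sketch of that check, assuming default data directories and
placeholder keyspace/table/generation names (point it at one of the tiny
Data.db files written during the repair):

# Hypothetical path; dump one of the small post-repair sstables and inspect
# which row keys and columns it actually contains.
sstable2json /var/lib/cassandra/data/myks/mycf/myks-mycf-jb-12345-Data.db | head -n 40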

On Tue, Jun 9, 2015 at 11:11 AM, Anuj Wadehra anujw_2...@yahoo.co.in
wrote:

 We were facing dropped mutations earlier and we increased flush writers.
 Now there are no dropped mutations in tpstats. To repair the damaged vnodes
 / inconsistent data we executed repair -pr on all nodes. Still, we see the
 same problem.

 When we analyze the repair logs, we see two strange things:

 1. Out-of-sync ranges for CFs which are not actively being written/updated
 while the repair is going on. When we repaired all data with repair -pr on
 all nodes, why is data still out of sync?

 2. For some CFs, the repair logs show that all ranges are consistent.
 Still, many sstables get created during repair. When everything is in sync,
 why does repair create tiny sstables to repair data?

 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 --
   *From*:Ken Hancock ken.hanc...@schange.com
 *Date*:Tue, 9 Jun, 2015 at 8:24 pm
 *Subject*:Re: Hundreds of sstables after every Repair

 I think this came up recently in another thread. If you're getting large
 numbers of SSTables after repairs, that means that your nodes are diverging
 from the data they're supposed to have. Likely you're dropping mutations.
 Do a nodetool tpstats on each of your nodes and look at the mutation
 dropped counters. If you're seeing dropped messages, my money is on a
 non-zero FlushWriter "All time blocked" stat, which is causing mutations
 to be dropped.



 On Tue, Jun 9, 2015 at 10:35 AM, Anuj Wadehra anujw_2...@yahoo.co.in
 wrote:

 Any suggestions or comments on this one?

 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 --
   *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Sun, 7 Jun, 2015 at 1:54 am
 *Subject*:Hundreds of sstables after every Repair

 Hi,

 We are using 2.0.3 and vnodes. After every repair -pr operation, 50+ tiny
 sstables (< 10K) get created, and these sstables never get compacted due to
 the coldness issue. I have raised
 https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but I
 have been told to upgrade. Until we upgrade to the latest 2.0.x we are
 stuck; upgrades take time, testing and planning in production systems :(

 I have observed that even if vnodes are NOT damaged, hundreds of tiny
 sstables are created during repair for a wide-row CF. This is beyond my
 understanding. If everything is consistent, and for the entire repair
 process Cassandra is saying "Endpoints /x.x.x.x and /x.x.x.y are
 consistent" for the CF, what is the need of creating sstables?

 Is there any alternative to regular major compaction to deal with this
 situation?


 Thanks
 Anuj Wadehra










Re: Hundreds of sstables after every Repair

2015-06-09 Thread Carlos Rolo
Hello,

Do you have your clocks synced across your cluster? Are you using NTP, and
is it properly configured?

Sometimes clocks that are out of sync can trigger weird behaviour.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Tue, Jun 9, 2015 at 5:11 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

 We were facing dropped mutations earlier and we increased flush writers.
 Now there are no dropped mutations in tpstats. To repair the damaged vnodes
 / inconsistent data we executed repair -pr on all nodes. Still, we see the
 same problem.

 When we analyze the repair logs, we see two strange things:

 1. Out-of-sync ranges for CFs which are not actively being written/updated
 while the repair is going on. When we repaired all data with repair -pr on
 all nodes, why is data still out of sync?

 2. For some CFs, the repair logs show that all ranges are consistent.
 Still, many sstables get created during repair. When everything is in sync,
 why does repair create tiny sstables to repair data?

 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 --
   *From*:Ken Hancock ken.hanc...@schange.com
 *Date*:Tue, 9 Jun, 2015 at 8:24 pm
 *Subject*:Re: Hundreds of sstables after every Repair

 I think this came up recently in another thread. If you're getting large
 numbers of SSTables after repairs, that means that your nodes are diverging
 from the data they're supposed to have. Likely you're dropping mutations.
 Do a nodetool tpstats on each of your nodes and look at the mutation
 dropped counters. If you're seeing dropped messages, my money is on a
 non-zero FlushWriter "All time blocked" stat, which is causing mutations
 to be dropped.



 On Tue, Jun 9, 2015 at 10:35 AM, Anuj Wadehra anujw_2...@yahoo.co.in
 wrote:

 Any suggestions or comments on this one?

 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 --
   *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Sun, 7 Jun, 2015 at 1:54 am
 *Subject*:Hundreds of sstables after every Repair

 Hi,

 We are using 2.0.3 and vnodes. After every repair -pr operation, 50+ tiny
 sstables (< 10K) get created, and these sstables never get compacted due to
 the coldness issue. I have raised
 https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but I
 have been told to upgrade. Until we upgrade to the latest 2.0.x we are
 stuck; upgrades take time, testing and planning in production systems :(

 I have observed that even if vnodes are NOT damaged, hundreds of tiny
 sstables are created during repair for a wide-row CF. This is beyond my
 understanding. If everything is consistent, and for the entire repair
 process Cassandra is saying "Endpoints /x.x.x.x and /x.x.x.y are
 consistent" for the CF, what is the need of creating sstables?

 Is there any alternative to regular major compaction to deal with this
 situation?


 Thanks
 Anuj Wadehra














Re: Hundreds of sstables after every Repair

2015-06-09 Thread Anuj Wadehra
Any suggestions or comments on this one?


Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Sun, 7 Jun, 2015 at 1:54 am
Subject:Hundreds of sstables after every Repair

Hi,


We are using 2.0.3 and vnodes. After every repair -pr operation, 50+ tiny 
sstables (< 10K) get created, and these sstables never get compacted due to 
the coldness issue. I have raised 
https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but I have 
been told to upgrade. Until we upgrade to the latest 2.0.x we are stuck; 
upgrades take time, testing and planning in production systems :(


I have observed that even if vnodes are NOT damaged, hundreds of tiny sstables 
are created during repair for a wide-row CF. This is beyond my understanding. 
If everything is consistent, and for the entire repair process Cassandra is 
saying "Endpoints /x.x.x.x and /x.x.x.y are consistent" for the CF, what is 
the need of creating sstables?


Is there any alternative to regular major compaction to deal with this 
situation?



Thanks

Anuj Wadehra




Re: Hundreds of sstables after every Repair

2015-06-09 Thread Ken Hancock
I think this came up recently in another thread. If you're getting large
numbers of SSTables after repairs, that means that your nodes are diverging
from the data they're supposed to have. Likely you're dropping mutations.
Do a nodetool tpstats on each of your nodes and look at the mutation dropped
counters. If you're seeing dropped messages, my money is on a non-zero
FlushWriter "All time blocked" stat, which is causing mutations to be dropped.
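
A minimal sketch of that check, run on each node (the grep pattern is only a
convenience against the 2.0-era tpstats layout; read the full output if in
doubt):

# Watch the FlushWriter pool's "All time blocked" column and the MUTATION row
# in the dropped-message section of nodetool tpstats.
nodetool tpstats | grep -i -E 'flushwriter|dropped|mutation'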



On Tue, Jun 9, 2015 at 10:35 AM, Anuj Wadehra anujw_2...@yahoo.co.in
wrote:

 Any suggestions or comments on this one?

 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 --
   *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Sun, 7 Jun, 2015 at 1:54 am
 *Subject*:Hundreds of sstables after every Repair

 Hi,

 We are using 2.0.3 and vnodes. After every repair -pr operation, 50+ tiny
 sstables (< 10K) get created, and these sstables never get compacted due to
 the coldness issue. I have raised
 https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but I
 have been told to upgrade. Until we upgrade to the latest 2.0.x we are
 stuck; upgrades take time, testing and planning in production systems :(

 I have observed that even if vnodes are NOT damaged, hundreds of tiny
 sstables are created during repair for a wide-row CF. This is beyond my
 understanding. If everything is consistent, and for the entire repair
 process Cassandra is saying "Endpoints /x.x.x.x and /x.x.x.y are
 consistent" for the CF, what is the need of creating sstables?

 Is there any alternative to regular major compaction to deal with this
 situation?


 Thanks
 Anuj Wadehra




Re: Hundreds of sstables after every Repair

2015-06-09 Thread Anuj Wadehra
We were facing dropped mutations earlier and we increased flush writers. Now 
there are no dropped mutations in tpstats. To repair the damaged vnodes / 
inconsistent data we executed repair -pr on all nodes. Still, we see the same 
problem.


When we analyze the repair logs, we see two strange things:


1. Out-of-sync ranges for CFs which are not actively being written/updated 
while the repair is going on. When we repaired all data with repair -pr on all 
nodes, why is data still out of sync?


2. For some CFs, the repair logs show that all ranges are consistent. Still, 
many sstables get created during repair. When everything is in sync, why does 
repair create tiny sstables to repair data?
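
A minimal sketch for cross-checking this from the logs (default log path
assumed; adjust for your install). During repair, Cassandra logs one line per
column family and endpoint pair saying either that the ranges are consistent
or how many ranges are out of sync:

# Summarise what the Merkle tree comparison decided during recent repairs.
grep -E 'are consistent for|out of sync for' /var/log/cassandra/system.log | tail -n 50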


Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:Ken Hancock ken.hanc...@schange.com
Date:Tue, 9 Jun, 2015 at 8:24 pm
Subject:Re: Hundreds of sstables after every Repair

I think this came up recently in another thread. If you're getting large 
numbers of SSTables after repairs, that means that your nodes are diverging 
from the data they're supposed to have. Likely you're dropping mutations. Do a 
nodetool tpstats on each of your nodes and look at the mutation dropped 
counters. If you're seeing dropped messages, my money is on a non-zero 
FlushWriter "All time blocked" stat, which is causing mutations to be dropped.



On Tue, Jun 9, 2015 at 10:35 AM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

Any suggestions or comments on this one?


Thanks

Anuj Wadehra

Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Sun, 7 Jun, 2015 at 1:54 am
Subject:Hundreds of sstables after every Repair

Hi,


We are using 2.0.3 and vnodes. After every repair -pr operation, 50+ tiny 
sstables (< 10K) get created, and these sstables never get compacted due to 
the coldness issue. I have raised 
https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but I have 
been told to upgrade. Until we upgrade to the latest 2.0.x we are stuck; 
upgrades take time, testing and planning in production systems :(


I have observed that even if vnodes are NOT damaged, hundreds of tiny sstables 
are created during repair for a wide-row CF. This is beyond my understanding. 
If everything is consistent, and for the entire repair process Cassandra is 
saying "Endpoints /x.x.x.x and /x.x.x.y are consistent" for the CF, what is 
the need of creating sstables?


Is there any alternative to regular major compaction to deal with this 
situation?



Thanks

Anuj Wadehra










Re: Hundreds of sstables after every Repair

2015-06-09 Thread Anuj Wadehra
Yes, we use NTP. We also suspected that clock drift was creating problems. Our 
NTP output is as follows:

[root@node1 ~]# ntpq -p
     remote         refid      st t  when poll reach   delay   offset  jitter
==============================================================================
+10.x.x.x        10.x.x.x      2 u   237 1024   377    1.199    0.062   0.554
*10.x.x.x        10.x.x.x      2 u   178 1024   377    0.479   -0.350   0.626

[root@node2 ~]# ntpq -p
     remote         refid      st t  when poll reach   delay   offset  jitter
==============================================================================
+10.x.x.x        10.x.x.x      2 u   124 1024   377    0.939   -0.001   0.614
*10.x.x.x        10.x.x.x      2 u   722 1024   377    0.567   -0.241   0.585

[root@node3 ~]# ntpq -p
     remote         refid      st t  when poll reach   delay   offset  jitter
==============================================================================
+10.x.x.x        10.x.x.x      2 u   514 1024   377    0.716   -0.103   1.315
*10.x.x.x        10.x.x.x      2 u    21 1024   377    0.402   -0.262   1.070

*** IPs are masked

Thanks
Anuj Wadehra
  


 On Tuesday, 9 June 2015 9:12 PM, Carlos Rolo r...@pythian.com wrote:
   

Hello,

Do you have your clocks synced across your cluster? Are you using NTP, and is 
it properly configured?

Sometimes clocks that are out of sync can trigger weird behaviour.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com
On Tue, Jun 9, 2015 at 5:11 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

We were facing dropped mutations earlier and we increased flush writers. Now 
there are no dropped mutations in tpstats. To repair the damaged vnodes / 
inconsistent data we executed repair -pr on all nodes. Still, we see the same 
problem.

When we analyze the repair logs, we see two strange things:

1. Out-of-sync ranges for CFs which are not actively being written/updated 
while the repair is going on. When we repaired all data with repair -pr on all 
nodes, why is data still out of sync?

2. For some CFs, the repair logs show that all ranges are consistent. Still, 
many sstables get created during repair. When everything is in sync, why does 
repair create tiny sstables to repair data?

Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android

From:Ken Hancock ken.hanc...@schange.com
Date:Tue, 9 Jun, 2015 at 8:24 pm
Subject:Re: Hundreds of sstables after every Repair

I think this came up recently in another thread. If you're getting large 
numbers of SSTables after repairs, that means that your nodes are diverging 
from the data they're supposed to have. Likely you're dropping mutations. Do a 
nodetool tpstats on each of your nodes and look at the mutation dropped 
counters. If you're seeing dropped messages, my money is on a non-zero 
FlushWriter "All time blocked" stat, which is causing mutations to be dropped.

On Tue, Jun 9, 2015 at 10:35 AM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

Any suggestions or comments on this one?

Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android

From:Anuj Wadehra anujw_2...@yahoo.co.in
Date:Sun, 7 Jun, 2015 at 1:54 am
Subject:Hundreds of sstables after every Repair

Hi,

We are using 2.0.3 and vnodes. After every repair -pr operation, 50+ tiny 
sstables (< 10K) get created, and these sstables never get compacted due to 
the coldness issue. I have raised 
https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but I have 
been told to upgrade. Until we upgrade to the latest 2.0.x we are stuck; 
upgrades take time, testing and planning in production systems :(

I have observed that even if vnodes are NOT damaged, hundreds of tiny sstables 
are created during repair for a wide-row CF. This is beyond my understanding. 
If everything is consistent, and for the entire repair process Cassandra is 
saying "Endpoints /x.x.x.x and /x.x.x.y are consistent" for the CF, what is 
the need of creating sstables?

Is there any alternative to regular major compaction to deal with this 
situation?

Thanks
Anuj Wadehra




  

Hundreds of sstables after every Repair

2015-06-06 Thread Anuj Wadehra
Hi,
We are using 2.0.3 and vnodes. After every repair -pr operation, 50+ tiny 
sstables (< 10K) get created, and these sstables never get compacted due to 
the coldness issue. I have raised 
https://issues.apache.org/jira/browse/CASSANDRA-9146 for this issue but I have 
been told to upgrade. Until we upgrade to the latest 2.0.x we are stuck; 
upgrades take time, testing and planning in production systems :(

I have observed that even if vnodes are NOT damaged, hundreds of tiny sstables 
are created during repair for a wide-row CF. This is beyond my understanding. 
If everything is consistent, and for the entire repair process Cassandra is 
saying "Endpoints /x.x.x.x and /x.x.x.y are consistent" for the CF, what is 
the need of creating sstables?

Is there any alternative to regular major compaction to deal with this 
situation?


Thanks
Anuj Wadehra
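
A commonly discussed workaround for the 2.0.x coldness behaviour (a hedged
sketch only: the keyspace/table names are placeholders, and the sub-option
should be verified against the exact 2.0.x release in use) is to disable the
coldness check on the affected table so the tiny post-repair sstables become
eligible for normal size-tiered compaction:

# Feed the ALTER TABLE statement to cqlsh; cold_reads_to_omit: 0 tells STCS
# not to skip "cold" (rarely read) sstables when choosing compaction candidates.
echo "ALTER TABLE myks.mycf WITH compaction = {'class': 'SizeTieredCompactionStrategy', 'cold_reads_to_omit': 0.0};" | cqlsh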