Re: nodes are always out of sync
Btw: I created an issue for that some months ago: https://issues.apache.org/jira/browse/CASSANDRA-12991

2017-04-01 22:25 GMT+02:00 Roland Otta <roland.o...@willhaben.at>:
> thank you both chris and benjamin for taking time to clarify that. [...]
Re: nodes are always out of sync
thank you both chris and benjamin for taking the time to clarify that.

On Sat, 2017-04-01 at 21:17 +0200, benjamin roth wrote:
> Tl;Dr: there are race conditions in a repair and it is not trivial to fix
> them. [...]
Re: nodes are always out of sync
Tl;dr: there are race conditions in a repair, and it is not trivial to fix them, so we live with them. In practice they don't really hurt; the worst case is that ranges get repaired that didn't actually need it.

On 2017-04-01 at 21:14, "Chris Lohfink" <clohfin...@gmail.com> wrote:
> Repairs do not have an ability to instantly build a perfect view of its
> data between your 3 nodes at an exact time. [...]
Re: nodes are always out of sync
I think your way of communicating needs work. No one forces you to answer questions.

On 2017-04-01 at 21:09, "daemeon reiydelle" <daeme...@gmail.com> wrote:
> What you are doing is correctly going to result in this, IF there is
> substantial backlog/network/disk or whatever pressure. [...]
Re: nodes are always out of sync
A repair cannot instantly build a perfect view of the data across your 3 nodes at an exact point in time. When a piece of data is written, there is a delay before it is applied on all nodes, even if it's just 500 ms. So if the requests that read the data and build the Merkle trees finish on node1 at 12:01 but on node2 at 12:02, that delta of a minute or so (or even a few seconds, or when using snapshot repairs) can make the partition/range hashes in the two trees differ. On a moving data set it is almost impossible to have the replicas perfectly in sync for a repair. I wouldn't worry about that log message. If you are worried about consistency between your reads and writes, use EACH_QUORUM or LOCAL_QUORUM for both.

Chris

On Thu, Mar 30, 2017 at 1:22 AM, Roland Otta <roland.o...@willhaben.at> wrote:
> hi,
>
> we see the following behaviour in our environment: [...]
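Chris's timing argument can be illustrated with a toy sketch (names and data are hypothetical, and this is only the leaf-hashing idea, not Cassandra's actual Merkle-tree code): if a write lands on one replica after the other replica has already hashed its copy of the range, the two range hashes differ and the repair reports the range as out of sync, even though replication will catch up moments later.

```python
import hashlib

def range_hash(rows):
    """Hash all rows in one token range, like a Merkle-tree leaf."""
    h = hashlib.sha256()
    for key in sorted(rows):
        h.update(f"{key}={rows[key]}".encode())
    return h.hexdigest()

# Two replicas of the same range: one write reached replica_b only after
# replica_a had already built its tree.
replica_a = {"k1": "v1", "k2": "v2"}
replica_b = {"k1": "v1", "k2": "v2", "k3": "late-write"}

print(range_hash(replica_a) == range_hash(replica_b))  # False -> range flagged out of sync
```

The mismatch says nothing about lost data; it only records that the two snapshots were taken at slightly different moments on a moving data set.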
Re: nodes are always out of sync
What you are doing is predictably going to result in this, IF there is substantial backlog, network, disk, or other pressure.

What do you think will happen when you write with a replication factor greater than the consistency level of the write? Perhaps your mental model of how C* works needs work?

*Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872*

On Sat, Apr 1, 2017 at 11:09 AM, Vladimir Yudovin <vla...@winguzone.com> wrote:
> Hi,
>
> did you try to read data with consistency ALL immediately after write
> with consistency ONE? Does it succeed? [...]
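The rhetorical question above is the standard R + W > RF overlap rule: with RF=3, writes at ONE leave a window in which even a QUORUM read can miss the newest value, whereas QUORUM on both sides guarantees the read and write sets intersect. A minimal sketch of the arithmetic (illustration only, not Cassandra code):

```python
def quorum(rf):
    """Replicas needed for a QUORUM operation: a strict majority."""
    return rf // 2 + 1

rf = 3
w, r = quorum(rf), quorum(rf)  # QUORUM writes and reads: 2 of 3 each
print(w + r > rf)              # True  -> every read overlaps the latest acked write
print(1 + quorum(rf) > rf)     # False -> CL=ONE writes + QUORUM reads can miss data
```

This is why the earlier advice in the thread is to use LOCAL_QUORUM (or EACH_QUORUM) for both reads and writes when read-your-writes consistency matters.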
Re: nodes are always out of sync
Hi,

did you try to read data with consistency ALL immediately after a write with consistency ONE? Does it succeed?

Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting

On Thu, 30 Mar 2017 04:22:28 -0400, Roland Otta <roland.o...@willhaben.at> wrote:
> hi,
>
> we see the following behaviour in our environment: [...]
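The suggested check can be modelled with a toy in-memory sketch (an illustration of the consistency-level semantics, not driver code; all names are made up): a CL=ONE write is acknowledged as soon as a single replica has applied it, so a CL=ALL read issued before replication catches up will see replicas that are still missing the row.

```python
# Toy model of RF=3 replicas; not the Cassandra driver API.
replicas = [dict(), dict(), dict()]

def write_cl_one(key, value):
    # CL=ONE: the coordinator acks once one replica has applied the write;
    # in this toy model the other replicas have not yet received it.
    replicas[0][key] = value

def read_cl_all(key):
    # CL=ALL: every replica must answer; None marks a replica that lags.
    return [r.get(key) for r in replicas]

write_cl_one("evt-1", "payload")
print(read_cl_all("evt-1"))  # ['payload', None, None] until replication completes
```

In a real cluster the lagging replicas usually catch up within milliseconds, which is exactly the window the repair log messages in this thread keep observing.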
nodes are always out of sync
hi,

we see the following behaviour in our environment:

cluster consists of 6 nodes (cassandra version 3.0.7). keyspace has a replication factor of 3. clients are writing data to the keyspace with consistency ONE.

we are doing parallel, incremental repairs with cassandra reaper.

even if a repair has just finished and we start a new one immediately, we can see the following entries in our logs:

INFO [RepairJobTask:1] 2017-03-30 10:14:00,782 SyncTask.java:73 - [repair #d0f651f6-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.188 and /192.168.0.191 have 1 range(s) out of sync for ad_event_history
INFO [RepairJobTask:2] 2017-03-30 10:14:00,782 SyncTask.java:73 - [repair #d0f651f6-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.188 and /192.168.0.189 have 1 range(s) out of sync for ad_event_history
INFO [RepairJobTask:4] 2017-03-30 10:14:00,782 SyncTask.java:73 - [repair #d0f651f6-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.189 and /192.168.0.191 have 1 range(s) out of sync for ad_event_history
INFO [RepairJobTask:2] 2017-03-30 10:14:03,997 SyncTask.java:73 - [repair #d0fa70a1-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.26 and /192.168.0.189 have 2 range(s) out of sync for ad_event_history
INFO [RepairJobTask:1] 2017-03-30 10:14:03,997 SyncTask.java:73 - [repair #d0fa70a1-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.26 and /192.168.0.191 have 2 range(s) out of sync for ad_event_history
INFO [RepairJobTask:4] 2017-03-30 10:14:03,997 SyncTask.java:73 - [repair #d0fa70a1-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.189 and /192.168.0.191 have 2 range(s) out of sync for ad_event_history
INFO [RepairJobTask:1] 2017-03-30 10:14:05,375 SyncTask.java:73 - [repair #d0fbd033-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.189 and /192.168.0.191 have 1 range(s) out of sync for ad_event_history
INFO [RepairJobTask:2] 2017-03-30 10:14:05,375 SyncTask.java:73 - [repair #d0fbd033-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.189 and /192.168.0.190 have 1 range(s) out of sync for ad_event_history
INFO [RepairJobTask:4] 2017-03-30 10:14:05,375 SyncTask.java:73 - [repair #d0fbd033-1520-11e7-a443-d9f5b942818e] Endpoints /192.168.0.190 and /192.168.0.191 have 1 range(s) out of sync for ad_event_history

we can't see any hints on the systems, so we thought everything was running smoothly with the writes.

do we have to be concerned about the nodes always being out of sync, or is this normal behaviour for a write-intensive table (as the tables will never be 100% in sync for the latest inserts)?

bg,
roland