Most welcome, hopefully the bug is easy to find and kill :)

On Tue, Nov 8, 2011 at 3:28 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
> Sylvain, here is my ticket, but I guess you already know it since you are
> the assignee :) --> https://issues.apache.org/jira/browse/CASSANDRA-3465
>
> Riyad, thanks for your help.
>
> Alain
>
> 2011/11/7 Riyad Kalla <rka...@gmail.com>
>
>> Alain, thank you for all the clarification. I understand exactly what you
>> meant now... and as a result am just as confused as you are :)
>>
>> What version of Cassandra are you using? Can you share the important
>> parts of your config? (You double-checked that your replication factor is
>> set to "3" on all 3?)
>>
>> Also, out of curiosity, if you keep querying for up to 5 mins (say every
>> 10 seconds), do counter1, 2 and 3 still show the same wrong values for
>> getValue, or do the values eventually converge on the correct amounts?
>>
>> (I assume 5 mins is a long enough window to test; maybe I'm wrong and
>> another Cassandra dev can correct me here.)
>>
>> -R
>>
>>
>> On Mon, Nov 7, 2011 at 9:57 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>
>>> I retried it after restarting all the servers.
>>>
>>> I still have wrong results (I simulated an event 5 times and it was
>>> counted 3 times by some counters, 4 or 5 times by others).
>>>
>>> What I meant by "but now every request returns me always the same count
>>> value..." will be easier to explain with an example:
>>>
>>> event 1:
>>>
>>> counter1.increment
>>> counter2.increment
>>> counter3.increment
>>>
>>> .
>>> .
>>> .
>>>
>>> event 5:
>>>
>>> counter1.increment
>>> counter2.increment
>>> counter3.increment
>>>
>>> Show results:
>>>
>>> counter1.getValue = returns 4
>>> counter2.getValue = returns 3
>>> counter3.getValue = returns 5
>>>
>>> counter1.getValue = returns 5
>>> counter2.getValue = returns 3
>>> counter3.getValue = returns 5
>>>
>>> counter1.getValue = returns 4
>>> counter2.getValue = returns 4
>>> counter3.getValue = returns 5
>>>
>>> ...
>>>
>>> So I've got wrong values, and not always the same ones.
>>> In my previous email I tried to tell you, by saying "but now every request
>>> returns me always the same count value...", that I had the same wrong
>>> values all the time, let us say:
>>>
>>> counter1.getValue = returns 4
>>> counter2.getValue = returns 3
>>> counter3.getValue = returns 5
>>>
>>> counter1.getValue = returns 4
>>> counter2.getValue = returns 3
>>> counter3.getValue = returns 5
>>>
>>> counter1.getValue = returns 4
>>> counter2.getValue = returns 3
>>> counter3.getValue = returns 5
>>>
>>> But that is not true: I still have some "random" wrong values. Maybe
>>> I didn't query the counter values often enough to see it last time.
>>>
>>> Sorry for not being clearer; this is not easy to explain, nor for me to
>>> understand.
>>>
>>> Thanks for your help.
>>>
>>> Alain
>>>
>>>
>>> 2011/11/7 Riyad Kalla <rka...@gmail.com>
>>>
>>>> Alain,
>>>>
>>>> When you tried CL.All, was that only after you had made the change to
>>>> ReplicationFactor=3 and restarted all the servers?
>>>>
>>>> If you hadn't restarted the servers with the new RF, I am not sure that
>>>> CL.All would have the intended effect.
>>>>
>>>> Also, I wasn't sure what you meant by "but now every request returns
>>>> me always the same count value..." -- didn't you want the requests to
>>>> always return the same values?
>>>>
>>>> Or maybe you are saying that it always returns the same *wrong* value?
>>>> Like you do:
>>>>
>>>> counter.increment (v=1)
>>>> counter.increment (v=2)
>>>> counter.increment (v=3)
>>>>
>>>> counter.getValue = returns 7
>>>> counter.getValue = returns 7
>>>> counter.getValue = returns 7
>>>>
>>>> or something inconsistent like that?
>>>>
>>>> On Mon, Nov 7, 2011 at 9:09 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>>>
>>>>> I've tried with CL.All, but it doesn't work any better. I still have
>>>>> strange values (between 4 and 10 events counted instead of 10), but now
>>>>> every request returns me always the same count value...
>>>>>
>>>>> It's very strange.
>>>>>
>>>>> Any other ideas?
>>>>>
>>>>> Alain
>>>>>
>>>>>
>>>>> 2011/11/7 Riyad Kalla <rka...@gmail.com>
>>>>>
>>>>>> Alain,
>>>>>>
>>>>>> Try using a CL of 3 or "ALL" and see if the problem goes away.
>>>>>>
>>>>>> Your replication factor (as I just learned) dictates how many nodes
>>>>>> each piece of data is replicated to; by using a RF of 3 you are saying
>>>>>> "replicate all my data to all my nodes" (in this case counters).
>>>>>>
>>>>>> This doesn't happen immediately, but you can *force* it to happen on
>>>>>> write by specifying a CL of "ALL". If you specify "1" then your counter
>>>>>> value is written to one member of the ring, then your command returns.
>>>>>>
>>>>>> If you keep querying you will bounce around your ring, reading the
>>>>>> values from the different nodes until a future date at *which point* all
>>>>>> the values will likely agree.
>>>>>>
>>>>>> If you keep all the code you have now exactly the same, and just change
>>>>>> the code at the end where you read the counter value back so that it
>>>>>> keeps reading the counter value back every second for 60 seconds, you
>>>>>> can see if all the values eventually match up -- they should (as the
>>>>>> counter value is replicated to all the nodes and their old values
>>>>>> discarded).
>>>>>>
>>>>>> -R
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 7, 2011 at 8:15 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm trying to switch from RF = 1 to RF = 3, but I get wrong values
>>>>>>> from counters when doing so...
>>>>>>>
>>>>>>> I have a CF that contains many counters of some events. When I'm at
>>>>>>> RF = 1 and simulate 10 events, they are counted correctly.
>>>>>>> However, when I switch to RF = 3, my counters show a wrong value
>>>>>>> that sometimes changes when requested twice (it can return 7, then 5,
>>>>>>> instead of 10 all the time).
>>>>>>>
>>>>>>> I first thought that it was a problem of CL, because I seem to
>>>>>>> remember reading once that I had to use CL.One for reads and writes
>>>>>>> with counters. So I tried with CL.One, without success...
>>>>>>>
>>>>>>> What am I doing wrong? Is there some precaution to take when
>>>>>>> replicating counters?
>>>>>>>
>>>>>>> Alain
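
For anyone wanting to run the check Riyad suggests in the thread (read the counters back every second for 60 seconds and see whether the values converge), here is a rough sketch in Python. It is not tied to any particular Cassandra client: read_counter is a hypothetical callable that you would back with your own client's counter read, and poll_counters, converged and their parameters are names made up for this illustration only.

```python
import time
from typing import Callable, Dict, List


def poll_counters(
    read_counter: Callable[[str], int],  # hypothetical: wraps your client's counter read
    names: List[str],
    expected: int,
    attempts: int = 60,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, int]]:
    """Read every counter once per interval and record each snapshot.

    Stops early as soon as all counters report the expected value.
    """
    history: List[Dict[str, int]] = []
    for attempt in range(attempts):
        # One snapshot = one read of every counter in this round.
        snapshot = {name: read_counter(name) for name in names}
        history.append(snapshot)
        if all(value == expected for value in snapshot.values()):
            break
        # Do not wait after the final attempt.
        if attempt < attempts - 1:
            sleep(interval_s)
    return history


def converged(history: List[Dict[str, int]], expected: int) -> bool:
    """True if the last snapshot shows the expected value for every counter."""
    return bool(history) and all(v == expected for v in history[-1].values())
```

With the example from the thread, you would call poll_counters(read_counter, ["counter1", "counter2", "counter3"], expected=5) and then look at the returned history: if the snapshots keep changing and converged(...) is still False after the full window, that would match the behaviour Alain describes rather than a simple replication delay.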