It looks like nobody has run into this kind of trouble before, or even has a clue about it.
Under heavy load this causes high latency (because of iowait) in my production app, and we can't live with it much longer. If nothing new comes up in the next few days I think I'll drop this node and replace it, hoping that will fix my issue... I'll wait a bit longer because I am hoping we will find out what the problem is, which would help the C* community.

2012/12/20 Alain RODRIGUEZ <arodr...@gmail.com>

> "routing more traffic to it?"
>
> So shouldn't I see more "network in" on that node in the AWS console?
>
> It seems that each node is receiving and sending an equal amount of data.
>
> What value should I use for dynamic-snitch-badness-threshold to give it a
> try?
> On Dec 20, 2012 00:37, "Bryan Talbot" <btal...@aeriagames.com> wrote:
>
>> Oh, you're on ec2. Maybe the dynamic snitch is detecting that one node is
>> performing better than the others so is routing more traffic to it?
>>
>> http://www.datastax.com/docs/1.1/configuration/node_configuration#dynamic-snitch-badness-threshold
>>
>> -Bryan
>>
>> On Wed, Dec 19, 2012 at 2:30 PM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>
>>> @Aaron
>>> "Is there a sustained difference or did it settle back ? "
>>>
>>> Sustained, clearly. During the day all nodes read at about 6 MB/s while
>>> this one reads at 30-40 MB/s. At night, while the others read 2 MB/s, the
>>> "broken" node reads at 8-10 MB/s.
>>>
>>> "Could this have been compaction or repair or upgrade tables working ? "
>>>
>>> That was my first thought, but definitely not. This occurs continuously.
>>>
>>> "Do the read / write counts available in nodetool cfstats show anything
>>> different ? "
>>>
>>> The cfstats show different counts (a lot fewer reads/writes for the
>>> "bad" node), but the nodes didn't join the ring at the same time. I'm
>>> attaching the cfstats just in case they help somehow.
>>>
>>> Node 38: http://pastebin.com/ViS1MR8d (bad one)
>>> Node 32: http://pastebin.com/MrSTHH9F
>>> Node 154: http://pastebin.com/7p0Usvwd
>>>
>>> @Bryan
>>>
>>> "clients always connect to that server"
>>>
>>> I didn't include it in the screenshot from the AWS console, but AWS
>>> reports (almost) equal network traffic across the nodes (same for output
>>> and CPU). The CPU load is a lot higher on the broken node, as shown by
>>> OpsCenter, but that's due to the high iowait...
>>>
>>
>> --
>> Bryan Talbot
>> Architect / Platform team lead, Aeria Games and Entertainment
>> Silicon Valley | Berlin | Tokyo | Sao Paulo
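
For anyone who wants to check their own cluster for the same symptom, here is a minimal sketch of the comparison described in this thread: flag any node whose disk read throughput is far above the cluster median. Everything in it is an assumption for illustration only: the function name, the node labels, the 3x threshold, and the idea that the figures are collected by hand (for example from iostat or OpsCenter). The sample values are the daytime numbers quoted above, with 35 MB/s standing in for the "30-40 MB/s" range, and only the three nodes with pasted cfstats are listed.

```python
from statistics import median


def find_read_outliers(read_mb_per_s, factor=3.0):
    """Return the names of nodes reading more than `factor` times the median.

    read_mb_per_s: dict mapping a node label to its disk read rate in MB/s.
    factor: arbitrary threshold chosen for this sketch, not a Cassandra setting.
    """
    baseline = median(read_mb_per_s.values())
    return sorted(
        node
        for node, rate in read_mb_per_s.items()
        if baseline > 0 and rate > factor * baseline
    )


# Daytime figures from this thread; node labels are hypothetical.
daytime = {"node-32": 6.0, "node-154": 6.0, "node-38": 35.0}
print(find_read_outliers(daytime))
```

The median is used instead of the mean so that the one misbehaving node does not pull the baseline up and hide itself.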