Hey all.

We've been seeing this warning on one of our clusters:

2015-10-18 14:28:52,898 WARN  [ValidationExecutor:14]
org.apache.cassandra.db.context.CounterContext invalid global counter shard
detected; (4aa69016-4cf8-4585-8f23-e59af050d174, 1, 67158) and
(4aa69016-4cf8-4585-8f23-e59af050d174, 1, 21486) differ only in count; will
pick highest to self-heal on compaction
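
For what it's worth, here is how I currently read that message, written
out as a small Python sketch.  The (id, clock, count) tuple layout is
taken from the warning itself; the Shard type, the merge_shards name and
the "higher clock wins" branch are my own assumptions for illustration,
not Cassandra's actual code.

```python
from typing import NamedTuple


class Shard(NamedTuple):
    # Field order follows the tuples printed in the warning: (id, clock, count).
    counter_id: str
    clock: int
    count: int


def merge_shards(a: Shard, b: Shard) -> Shard:
    """Hypothetical merge rule inferred from the warning text."""
    if a.counter_id != b.counter_id:
        raise ValueError("shards belong to different counter ids")
    if a.clock != b.clock:
        # Assumption: the shard with the newer clock supersedes the older one.
        return a if a.clock > b.clock else b
    # Same id and same clock: the counts are expected to match.  When they
    # do not (the case the warning reports), keep the highest count.
    return a if a.count >= b.count else b


left = Shard("4aa69016-4cf8-4585-8f23-e59af050d174", 1, 67158)
right = Shard("4aa69016-4cf8-4585-8f23-e59af050d174", 1, 21486)
print(merge_shards(left, right).count)
```

If that reading is right, the two shards in our log would resolve to the
one with count 67158, and what I don't understand is how two shards with
the same id and clock ended up with different counts in the first place.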


From what I've read and heard in the IRC channel, this warning could be
related to not running upgradesstables after upgrading from 2.0.x to
2.1.x.  I don't think we ran it at the time, but we've been on 2.1 since
last November.  Looking back, the warnings start appearing around June,
when no maintenance had been performed on the cluster.  At that point, we
had been on 2.1.3 for a couple of months.  We've been on 2.1.10 for the
last week (that upgrade is when we first noticed the warning).

From a suggestion in IRC, I went ahead and ran upgradesstables on all the
nodes.  Our weekly repair also ran this morning.  But the warnings still
show up throughout the day.

So, we have many questions:

   - How much should we be freaking out?
   - Why is this recurring?  If I understand what's happening, this is a
   self-healing process.  So, why would it keep happening?  Are we possibly
   using counters incorrectly?
   - What does it even mean that there were multiple shards for the same
   counter?  How does that situation even occur?

We're pretty lost here, so any help would be greatly appreciated.

Thanks!
