Thanks for the response.

Yes, I went through 1.1, 1.2, and 2.0 as rolling upgrades (the entire cluster for each version) and ran upgradesstables each time.

Yes, the nodes are using the same tokens. I can see the tokens when running nodetool ring, and they're consistent with what we had before.

Repairs were very infrequent because we do not delete data: once every 2 or 3 months, with a forced repair if a node ever went down for more than a few hours.
I'll keep an eye on the number of repaired rows. How should I go about inspecting SSTables?

Thanks again.

On Thu, Nov 20, 2014 at 11:15 AM, Robert Coli <[email protected]> wrote:

> On Thu, Nov 20, 2014 at 8:36 AM, Stephane Legay <[email protected]>
> wrote:
>
>> I upgraded a 2 node cluster with RF = 2 from 1.0.9 to 2.0.11. I did
>> rolling upgrades and upgradesstables after each upgrade.
>>
>
> To be clear, did you go through 1.1, and 1.2, or did you go directly from
> 1.0 to 2.0?
>
>
>> We then moved our data to new hardware by shutting down each node, moving
>> data to new machine, and starting up with auto_bootstrap = false.
>>
>
> This should not be implicated, especially if you verified the upgraded
> nodes came up with the same tokens they had before.
>
>
>> When all was done I ran a repair. Data went from 250GB to 400 GB per
>> node. A week later, I am doing another repair, data filling the 800GB drive
>> on each machine. Huge compaction on each node, constantly.
>>
>
> How frequently had you been running repair in 1.0.9? How often do you
> DELETE?
>
>
>> Where should I go from here? Will scrubbing fix the issue?
>>
>
> I would inspect the newly created SSTables from a repair and see what they
> contain. I would also look at log lines which indicate how many rows are
> being repaired, with a special eye towards whether the number of rows
> repaired each time you repair is decreasing.
>
> Also note that repair in 2.0 is serial by default, you probably want the
> old behavior, which you can get with "-par" flag.
>
> =Rob
> http://twitter.com/rcolidba
>

-- 
Stephane Legay
Co-founder and CTO
LoopLogic, LLC
[email protected]
480-326-4080
