In your data directory, for each keyspace there is a solr.json file; when using leveled compaction, Cassandra stores the SSTables it knows about there. Take a look at that file and see if it looks accurate. If it doesn't, this is a bug in Cassandra that we are looking into as well.
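In case it helps with eyeballing that file, here is a rough Python sketch of the check. It is only a sketch under the assumptions spelled out in the comments (a 1.1-era LCS manifest with a "generations"/"members" JSON layout, and one -Data.db file per live SSTable); adjust it to whatever your manifest actually contains.

    #!/usr/bin/env python
    # Quick sanity check of the leveled-compaction manifest against what is on disk.
    # ASSUMPTIONS (verify against your own file before trusting the output):
    #   - the manifest is JSON with a "generations" list, each entry carrying a
    #     "members" list of the SSTables in that level (1.1-era LCS layout)
    #   - one "-Data.db" file on disk per live SSTable
    import json
    import os
    import sys

    manifest_path, data_dir = sys.argv[1], sys.argv[2]

    with open(manifest_path) as f:
        manifest = json.load(f)

    # How many SSTables the manifest thinks exist, per level and in total.
    per_level = [(g.get("generation"), len(g.get("members", [])))
                 for g in manifest.get("generations", [])]
    in_manifest = sum(count for _, count in per_level)

    # How many SSTables are actually sitting in the data directory.
    on_disk = len([name for name in os.listdir(data_dir) if name.endswith("-Data.db")])

    print("manifest per level (level, sstables):", per_level)
    print("sstables in manifest: %d, -Data.db files on disk: %d" % (in_manifest, on_disk))

Run it with the manifest path and the column family's data directory; if the two counts are wildly different (on the scale of the 20k sstables / 140k files mentioned below), that would point at the manifest being out of date.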
On Thu, Dec 6, 2012 at 7:38 PM, aaron morton <aa...@thelastpickle.com> wrote:

> The log message matches what I would expect to see for nodetool -pr.
>
> Not using -pr means repair all the ranges the node is a replica for. If you
> have RF == number of nodes, then it will repair all the data.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6/12/2012, at 9:42 PM, Andras Szerdahelyi <andras.szerdahe...@ignitionone.com> wrote:
>
> Thanks!
>
> I'm also thinking a repair run without -pr could have caused this, maybe?
>
> Andras Szerdahelyi
> Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
> M: +32 493 05 50 88 | Skype: sandrew84
>
> On 06 Dec 2012, at 04:05, aaron morton <aa...@thelastpickle.com> wrote:
>
> - how do i stop repair before i run out of storage? ( can't let this finish )
>
> To stop the validation part of the repair…
>
> nodetool -h localhost stop VALIDATION
>
> The only way I know to stop streaming is to restart the node; there may be
> a better way though.
>
> INFO [AntiEntropySessions:3] 2012-12-05 02:15:02,301 AntiEntropyService.java (line 666)
> [repair #7c7665c0-3eab-11e2-0000-dae6667065ff] new session: will sync /X.X.1.113,
> /X.X.0.71 on range (85070591730234615865843651857942052964,0] for ( .. )
>
> I am assuming this was run on the first node in DC west with -pr, as you said.
> The log message is saying this is going to repair the primary range for the node.
> The repair is then actually performed one CF at a time.
>
> You should also see log messages ending with "range(s) out of sync",
> which will say how out of sync the data is.
>
> - how do i clean up my sstables ( grew from 6k to 20k since this started,
> while i shut writes off completely )
>
> Sounds like repair is streaming a lot of differences.
> If you have the space I would give levelled compaction time to take care of it.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6/12/2012, at 1:32 AM, Andras Szerdahelyi <andras.szerdahe...@ignitionone.com> wrote:
>
> hi list,
>
> AntiEntropyService started syncing ranges of entire nodes ( ?! ) across
> my data centres and I'd like to understand why.
>
> I see log lines like this on all my nodes in my two ( east/west ) data centres:
>
> INFO [AntiEntropySessions:3] 2012-12-05 02:15:02,301 AntiEntropyService.java (line 666)
> [repair #7c7665c0-3eab-11e2-0000-dae6667065ff] new session: will sync /X.X.1.113,
> /X.X.0.71 on range (85070591730234615865843651857942052964,0] for ( .. )
>
> ( this is around 80-100 GB of data for a single node. )
>
> - i did not observe any network failures or nodes falling off the ring
> - good distribution of data ( load is equal on all nodes )
> - hinted handoff is on
> - read repair chance is 0.1 on the CF
> - 2 replicas in each data centre ( which is also the number of nodes in each ) with NetworkTopologyStrategy
> - repair -pr is scheduled to run off-peak hours, daily
> - leveled compaction with sstable max size 256MB ( i have found this to trigger compaction
>   in acceptable intervals while still keeping the sstable count down )
> - i am on 1.1.6
> - java heap 10G
> - max memtables 2G
> - 1G row cache
> - 256M key cache
>
> my nodes' ranges are:
>
> DC west
> 0
> 85070591730234615865843651857942052864
>
> DC east
> 100
> 85070591730234615865843651857942052964
>
> symptoms are:
> - logs show sstables being streamed over to other nodes
> - 140k files in data dir of CF on all nodes
> - cfstats reports 20k sstables, up from 6k, on all nodes
> - compaction continuously running with no results whatsoever ( number of sstables growing )
>
> i tried the following:
> - offline scrub ( has gone OOM; i noticed the script in the debian package specifies 256MB heap? )
> - online scrub ( no effect )
> - repair ( no effect )
> - cleanup ( no effect )
>
> my questions are:
> - how do i stop repair before i run out of storage? ( can't let this finish )
> - how do i clean up my sstables ( grew from 6k to 20k since this started, while i shut writes off completely )
>
> thanks,
> Andras
>
> Andras Szerdahelyi
> Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
> M: +32 493 05 50 88 | Skype: sandrew84
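For reference, a small illustrative Python snippet (tokens copied from the thread above, RandomPartitioner assumed) showing why that logged range is exactly the wrap-around primary range of the first node in DC west, which is what -pr repairs. Without -pr the node would repair every range it replicates, and with RF equal to the node count per DC that is effectively all of the data.

    # Tokens copied from the thread (RandomPartitioner ring). Purely illustrative:
    # computes each node's primary range, i.e. the (predecessor_token, own_token]
    # slice that "nodetool repair -pr" works on.
    tokens = {
        "west-1": 0,
        "west-2": 85070591730234615865843651857942052864,
        "east-1": 100,
        "east-2": 85070591730234615865843651857942052964,
    }

    ring = sorted(tokens.items(), key=lambda kv: kv[1])
    for i, (node, token) in enumerate(ring):
        predecessor = ring[i - 1][1]  # wraps around to the highest token for the first node
        print("%s primary range: (%d, %d]" % (node, predecessor, token))

    # The output includes:
    #   west-1 primary range: (85070591730234615865843651857942052964, 0]
    # which matches the range in the AntiEntropyService log line above.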