On Wed, May 23, 2012 at 8:09 PM, aaron morton <aa...@thelastpickle.com> wrote:

> We were thinking of doing a major compaction after each year is 'closed
> off'.
>
> Not a terrible idea. Years tend to happen annually, so their growth
> pattern is well understood.
>
> This would mean that compactions for the current year were dealing with a
> smaller amount of data and hence be faster and have less impact on a
> day-to-day basis.
>
> Older data is compacted into higher tiers / generations so will not be
> included when compacting new data (background
> http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra). That
> said, there is a chance that at some point the big older files get
> compacted, i.e. if you get (by default) 4 X 100GB files they will get
> compacted into 1.
>

I'm a bit nervous about leveled compaction as it's new(ish).

> It feels a bit like a premature optimisation.
>

Yep, that's certainly possible - it's a habit I tend towards ;-(

cheers

> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/05/2012, at 1:52 PM, Franc Carter wrote:
>
> On Wed, May 23, 2012 at 7:42 AM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> 1 KS with 24 CF's will use roughly the same resources as 24 KS's with 1
>> CF. Each CF:
>>
>> * loads the bloom filter for each SSTable
>> * samples the index for each SSTable
>> * uses row and key cache
>> * has a current memtable and potentially memtables waiting to flush
>> * has secondary index CF's
>>
>> I would generally avoid a data model that calls for CF's to be added in
>> response to new entities or new data. Older data will be moved to larger
>> files, and not included in compaction for newer data.
>>
>
> We were thinking of doing a major compaction after each year is 'closed
> off'. This would mean that compactions for the current year were dealing
> with a smaller amount of data and hence be faster and have less impact on a
> day-to-day basis. Our query patterns will only infrequently cross year
> boundaries.
>
> Are we being naive?
>
> cheers
>
>>
>> Hope that helps.
>>
>> On 23/05/2012, at 3:31 AM, Luís Ferreira wrote:
>>
>> I have 24 keyspaces, each with a column family, and am considering
>> changing it to 1 keyspace with 24 CFs. Would this be beneficial?
>>
>> On May 22, 2012, at 12:56 PM, samal wrote:
>>
>> Not ideally; Cassandra now has global memtable tuning. Each CF corresponds
>> to memory in RAM. A year-wise CF means it will be in a read-only state for
>> the next year, but its memtable will still consume RAM.
>>
>> On 22-May-2012 5:01 PM, "Franc Carter" <franc.car...@sirca.org.au> wrote:
>>
>>> On Tue, May 22, 2012 at 9:19 PM, aaron morton
>>> <aa...@thelastpickle.com> wrote:
>>>
>>>> It's more the number of CF's than keyspaces.
>>>>
>>>
>>> Oh - does increasing the number of Column Families affect performance?
>>>
>>> The design we are working on at the moment is considering using a Column
>>> Family per year. We were thinking this would isolate compactions to a more
>>> manageable size as we don't update previous years.
>>>
>>> cheers
>>>
>>>>
>>>> Cheers
>>>>
>>>> On 22/05/2012, at 6:58 PM, R. Verlangen wrote:
>>>>
>>>> Yes, it does. However, there's no real answer as to what the limit is: it
>>>> depends on your hardware and cluster configuration.
>>>>
>>>> You might even want to search the archives of this mailing list; I
>>>> remember this has been asked before.
>>>>
>>>> Cheers!
>>>>
>>>> 2012/5/21 Luís Ferreira <zamith...@gmail.com>
>>>>
>>>>> Hi,
>>>>>
>>>>> Does the number of keyspaces affect the overall Cassandra performance?
>>>>>
>>>>> Regards,
>>>>> Luís Ferreira
>>>>>
>>>>
>>>> --
>>>> With kind regards,
>>>>
>>>> Robin Verlangen
>>>> www.robinverlangen.nl
>>>>
>>>
>>> --
>>> *Franc Carter* | Systems architect | Sirca Ltd
>>> <marc.zianideferra...@sirca.org.au>
>>> franc.car...@sirca.org.au | www.sirca.org.au
>>> Tel: +61 2 9236 9118
>>> Level 9, 80 Clarence St, Sydney NSW 2000
>>> PO Box H58, Australia Square, Sydney NSW 1215
>>>

--
*Franc Carter* | Systems architect | Sirca Ltd
<marc.zianideferra...@sirca.org.au>
franc.car...@sirca.org.au | www.sirca.org.au
Tel: +61 2 9236 9118
Level 9, 80 Clarence St, Sydney NSW 2000
PO Box H58, Australia Square, Sydney NSW 1215
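
For reference, the tiered behaviour discussed in this thread (files of similar size are merged once, by default, four of them accumulate, so old large files are left alone while new data is compacted) can be sketched as a toy model. This is only a minimal Python sketch under stated assumptions, not Cassandra's implementation: the function names, the 0.5x to 1.5x "similar size" window, and the idea that a merged file is exactly the sum of its inputs are all simplifications made up for illustration.

```python
# Toy model of size-tiered compaction. Sizes are in GB.
# Assumptions (not taken from Cassandra's code): files within 0.5x..1.5x of a
# bucket's average size count as "similar", and merging loses no data.

def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group file sizes into buckets of roughly similar size."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough, so start a new one.
            buckets.append([size])
    return buckets


def compact_once(sizes, min_threshold=4):
    """Merge the first bucket holding at least min_threshold files."""
    for bucket in bucket_by_size(sizes):
        if len(bucket) >= min_threshold:
            remaining = list(sizes)
            for size in bucket:
                remaining.remove(size)
            remaining.append(sum(bucket))
            return sorted(remaining)
    return sorted(sizes)


def compact_until_stable(sizes, min_threshold=4):
    """Repeat compaction rounds until no bucket reaches the threshold."""
    current = sorted(sizes)
    while True:
        merged = compact_once(current, min_threshold)
        if merged == current:
            return current
        current = merged


if __name__ == "__main__":
    # Four small new files are merged; three old 100GB files are untouched.
    print(compact_until_stable([1, 1, 1, 1, 100, 100, 100]))
    # A fourth 100GB file arrives, so the big files finally get merged.
    print(compact_until_stable([4, 100, 100, 100, 100]))
```

In this model the large files only take part in a compaction once a fourth file of similar size appears, which is the "4 X 100GB files" case mentioned above.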