Guys, thank you very much. For my scenario, I'm going to change my data model a bit by splitting my row into N pieces and adding some further control over them. That should mitigate the problem. I'll also try LeveledCompaction afterwards.
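To make the "split my row into N pieces" idea concrete, here is a minimal sketch (my own illustration, not code from this thread): bucket the row key by a time window, so each row stays small and old buckets, full of tombstones after the messages are consumed, are simply never read again. The names `bucket_row_key`, `keys_for_range`, and the 10-minute window are assumptions for the example.

```python
BUCKET_SECONDS = 600  # assumption: one row per 10-minute window

def bucket_row_key(queue_name: str, ts: float,
                   bucket_seconds: int = BUCKET_SECONDS) -> str:
    """Derive the row key for a message inserted at unix time `ts`."""
    bucket = int(ts) // bucket_seconds
    return f"{queue_name}:{bucket}"

def keys_for_range(queue_name: str, start_ts: float, end_ts: float,
                   bucket_seconds: int = BUCKET_SECONDS) -> list[str]:
    """All row keys a reader must slice to cover [start_ts, end_ts]."""
    first = int(start_ts) // bucket_seconds
    last = int(end_ts) // bucket_seconds
    return [f"{queue_name}:{b}" for b in range(first, last + 1)]

# Messages written 30 minutes apart land in different rows:
print(bucket_row_key("jobs", 0))     # jobs:0
print(bucket_row_key("jobs", 1800))  # jobs:3
print(keys_for_range("jobs", 0, 1800))
# ['jobs:0', 'jobs:1', 'jobs:2', 'jobs:3']
```

Readers only ever touch the buckets for the window they care about, which is effectively Aaron's "don't use the same row all day" suggestion applied at a finer granularity.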
Thanks!

On Mon, Mar 4, 2013 at 3:25 AM, aaron morton <aa...@thelastpickle.com> wrote:

> I need something to keep the deleted columns away from my query fetch. Not
> only the tombstones.
> It looks like the min compaction might help on this. But I'm not sure yet
> what would be a reasonable value for its threshold.
>
> Your tombstones will not be purged in a compaction until after gc_grace,
> and only if all fragments of the row are in the compaction. You are right
> that you would probably want to run repair during the day if you are going
> to dramatically reduce gc_grace, to avoid deleted data coming back to life.
>
> If you are using a single Cassandra row as a queue, you are going to have
> trouble. Levelled compaction may help a little.
>
> If you are reading the "most recent" entries in the row, assuming the
> columns are sorted by some timestamp, use the Reverse Comparator and issue
> slice commands to get the first X cols. That will remove tombstones from
> the problem. (Am guessing this is not something you do, just mentioning
> it.)
>
> Your next option is to change the data model so you don't use the same row
> all day.
>
> After that, consider a message queue.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 2/03/2013, at 12:03 PM, Víctor Hugo Oliveira Molinar <
> vhmoli...@gmail.com> wrote:
>
>> Tombstones stay around until gc grace so you could lower that to see if
>> that fixes the performance issues.
>
> If the tombstones get collected, the column will live again, causing data
> inconsistency, since I can't run a repair during the regular operations.
> Not sure if I got your thoughts on this.
>
>> Size tiered or leveled compaction?
>
> I'm actually running Size Tiered Compaction, but I've been looking into
> changing it for Leveled. It seems to be the case.
> Although even if I achieve some performance, I would still have the same
> problem with the deleted columns.
>
> I need something to keep the deleted columns away from my query fetch. Not
> only the tombstones.
> It looks like the min compaction might help on this. But I'm not sure yet
> what would be a reasonable value for its threshold.
>
> On Sat, Mar 2, 2013 at 4:22 PM, Michael Kjellman
> <mkjell...@barracuda.com> wrote:
>
>> Tombstones stay around until gc grace so you could lower that to see if
>> that fixes the performance issues.
>>
>> Size tiered or leveled compaction?
>>
>> On Mar 2, 2013, at 11:15 AM, "Víctor Hugo Oliveira Molinar" <
>> vhmoli...@gmail.com> wrote:
>>
>> What is your gc_grace set to? Sounds like as the number of tombstone
>> records increases your performance decreases. (Which I would expect)
>>
>> gc_grace is default.
>>
>> Cassandra's data files are write once. Deletes are another write. Until
>> compaction they all live on disk. Making really big rows has these
>> problems.
>>
>> Oh, so it looks like I should lower the min_compaction_threshold for this
>> column family. Right?
>> What does this threshold value really mean?
>>
>> Guys, thanks for the help so far.
>>
>> On Sat, Mar 2, 2013 at 3:42 PM, Michael Kjellman <mkjell...@barracuda.com>
>> wrote:
>>
>>> What is your gc_grace set to? Sounds like as the number of tombstone
>>> records increases your performance decreases. (Which I would expect)
>>>
>>> On Mar 2, 2013, at 10:28 AM, "Víctor Hugo Oliveira Molinar" <
>>> vhmoli...@gmail.com> wrote:
>>>
>>> I have a daily maintenance of my cluster where I truncate this column
>>> family, because its data doesn't need to be kept for more than a day.
>>> Since all the regular operations on it finish around 4 hours before the
>>> end of the day, I regularly run a truncate on it followed by a repair at
>>> the end of the day.
>>>
>>> And every day, when the operations are started (when there are only a
>>> few deleted columns), the performance looks pretty good.
>>> Unfortunately it degrades along the day.
>>>
>>> On Sat, Mar 2, 2013 at 2:54 PM, Michael Kjellman <
>>> mkjell...@barracuda.com> wrote:
>>>
>>>> When is the last time you did a cleanup on the cf?
>>>>
>>>> On Mar 2, 2013, at 9:48 AM, "Víctor Hugo Oliveira Molinar" <
>>>> vhmoli...@gmail.com> wrote:
>>>>
>>>> > Hello guys.
>>>> > I'm investigating the reasons for performance degradation in my
>>>> > scenario, which follows:
>>>> >
>>>> > - I have a column family which is filled with thousands of columns
>>>> > inside a unique row (varies between 10k ~ 200k). And I also have
>>>> > thousands of rows, not much more than 15k.
>>>> > - These rows are constantly updated, but the write load is not that
>>>> > intensive. I estimate it at 100 writes/sec on the column family.
>>>> > - Each column represents a message which is read and processed by
>>>> > another process. After reading it, the column is marked for deletion
>>>> > in order to keep it out of the next query on this row.
>>>> >
>>>> > OK, so, I've figured out that after many insertions plus deletion
>>>> > updates, my queries (column slice queries) are taking more time to be
>>>> > performed, even if there are only a few columns, fewer than 100.
>>>> >
>>>> > So it looks like the larger the number of columns being deleted, the
>>>> > longer the time spent on a query.
>>>> > -> Internally in C*, does a column slice query range over deleted
>>>> > columns?
>>>> > If so, how can I mitigate the impact on my queries? Or, how can I
>>>> > avoid those deleted columns?
>>>>
>>>> Copy, by Barracuda, helps you store, protect, and share all your
>>>> amazing things. Start today: www.copy.com.
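The question that started the thread, whether a column slice query ranges over deleted columns, can be illustrated with a toy model. This is a rough sketch of the idea, an assumption for illustration and not Cassandra's actual read path: a row is a sorted list of cells where `None` marks a tombstone, a forward slice must step over every tombstone it meets before collecting the requested live columns, and Aaron's reverse-comparator trick reads the newest cells first so consumed (deleted) columns are never touched.

```python
def slice_live(row, count, reverse=False):
    """Return up to `count` live columns plus how many cells were examined.

    `row` is a list of (column_name, value) pairs sorted by column name;
    value None represents a tombstone left by a delete.
    """
    cells = reversed(row) if reverse else iter(row)
    out, examined = [], 0
    for name, value in cells:
        examined += 1
        if value is not None:          # tombstones are skipped, but still scanned
            out.append((name, value))
            if len(out) == count:
                break
    return out, examined

# 10,000 consumed (deleted) messages followed by 5 fresh ones:
row = [(i, None) for i in range(10_000)] + \
      [(i, f"msg{i}") for i in range(10_000, 10_005)]

live_fwd, cost_fwd = slice_live(row, 5)                # wades through tombstones
live_rev, cost_rev = slice_live(row, 5, reverse=True)  # newest-first
print(cost_fwd, cost_rev)  # 10005 5
```

The forward slice examines 10,005 cells to return 5 columns, while the reversed slice examines only 5, which is why performance degrades over the day as deletes accumulate, and why reading newest-first (or retiring the row entirely, as in the bucketing approach) sidesteps the problem.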