Okay, in that case it doesn't hurt to update just in case, but I think you're going to need that test case. :)
On Fri, Dec 4, 2009 at 2:45 PM, Ramzi Rabah <[email protected]> wrote:
> I have a two-week-old version of trunk. I probably need to update it to
> the latest build.
>
> On Fri, Dec 4, 2009 at 12:34 PM, Jonathan Ellis <[email protected]> wrote:
>> Are you testing trunk? If not, you should check that first to see if
>> it's already fixed.
>>
>> On Fri, Dec 4, 2009 at 1:55 PM, Ramzi Rabah <[email protected]> wrote:
>>> Just to be clear, what I meant is that I ran the deletions and
>>> compaction with GCGraceSeconds set to 1 hour, so there was enough time
>>> for the tombstones to expire.
>>> Anyway, I will try to make a simpler test case to hopefully reproduce
>>> this, and I will share the code if I can reproduce it.
>>>
>>> Ray
>>>
>>> On Fri, Dec 4, 2009 at 11:04 AM, Ramzi Rabah <[email protected]> wrote:
>>>> Hi Jonathan, I have changed that to 3600 (one hour) based on your
>>>> earlier recommendation.
>>>>
>>>> On Fri, Dec 4, 2009 at 11:01 AM, Jonathan Ellis <[email protected]> wrote:
>>>>> This is what I was referring to by "the period specified in your
>>>>> config file":
>>>>>
>>>>> <!--
>>>>>  ~ Time to wait before garbage-collecting deletion markers. Set this to
>>>>>  ~ a large enough value that you are confident that the deletion marker
>>>>>  ~ will be propagated to all replicas by the time this many seconds has
>>>>>  ~ elapsed, even in the face of hardware failures. The default value is
>>>>>  ~ ten days.
>>>>> -->
>>>>> <GCGraceSeconds>864000</GCGraceSeconds>
>>>>>
>>>>> On Fri, Dec 4, 2009 at 12:51 PM, Ramzi Rabah <[email protected]> wrote:
>>>>>> I think there might be a bug in the deletion logic. I removed all the
>>>>>> data on the cluster by running remove on every single key I entered,
>>>>>> and then I ran a major compaction (nodeprobe -host hostname compact)
>>>>>> on a certain node. After the compaction was over, I was left with one
>>>>>> data file, one index file, and the bloom filter file, and they held
>>>>>> the same amount of data as before I started doing the deletes.
>>>>>>
>>>>>> On Thu, Dec 3, 2009 at 6:09 PM, Jonathan Ellis <[email protected]> wrote:
>>>>>>> Cassandra never modifies data in place, so it writes tombstones to
>>>>>>> suppress the older writes, and when compaction occurs the data and
>>>>>>> tombstones get GC'd (after the period specified in your config file).
>>>>>>>
>>>>>>> On Thu, Dec 3, 2009 at 8:07 PM, Ramzi Rabah <[email protected]> wrote:
>>>>>>>> Looking at jconsole, I see a high number of writes when I do
>>>>>>>> removes, so I am guessing these are tombstones being written? If
>>>>>>>> that's the case, is the data being removed and replaced by
>>>>>>>> tombstones? And will they all be deleted eventually when compaction
>>>>>>>> runs?
>>>>>>>>
>>>>>>>> On Thu, Dec 3, 2009 at 3:18 PM, Ramzi Rabah <[email protected]> wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I ran a test where I inserted about 1.2 gigabytes worth of data
>>>>>>>>> into each node of a 4-node cluster.
>>>>>>>>> I then ran a script that first calls a get on each column inserted,
>>>>>>>>> followed by a remove. Since I was basically removing every entry I
>>>>>>>>> had inserted before, I expected that the disk space occupied by the
>>>>>>>>> nodes would go down and eventually become 0. Instead, the disk
>>>>>>>>> space actually goes up to about 1.8 gigs per node when I do the
>>>>>>>>> bulk removes. Am I missing something here?
>>>>>>>>>
>>>>>>>>> Thanks a lot for your help.
>>>>>>>>> Ray
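
For reference, a simpler reproduction along the lines Ray describes (insert a batch of columns, then get and remove each one) could look roughly like the sketch below. This is an illustration only, not the script from the thread: it assumes the 0.4/0.5-era Thrift API (insert/get/remove taking a keyspace, key, ColumnPath, timestamp, and ConsistencyLevel), Python bindings generated from cassandra.thrift into a local "cassandra" package, the stock Keyspace1/Standard1 definitions from the default storage-conf.xml, and the default Thrift port 9160; the key count, column name, and payload size are made up for the example.

#!/usr/bin/env python
# Sketch: insert a batch of columns, then get and remove each one, against a
# single Cassandra node. Assumes 0.4/0.5-era Thrift bindings (see note above).
import time

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from cassandra import Cassandra
from cassandra.ttypes import ColumnPath, ConsistencyLevel

HOST, PORT = 'localhost', 9160   # default Thrift port
KEYSPACE = 'Keyspace1'           # stock keyspace in the default config
NUM_KEYS = 10000                 # arbitrary size for the example

def ts():
    # Timestamps only need to be increasing; microseconds is conventional.
    return int(time.time() * 1e6)

socket = TSocket.TSocket(HOST, PORT)
transport = TTransport.TBufferedTransport(socket)
client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))
transport.open()

path = ColumnPath(column_family='Standard1', column='payload')

# 1. Insert one 1 KB column per key.
for i in range(NUM_KEYS):
    client.insert(KEYSPACE, 'key%d' % i, path, 'x' * 1024,
                  ts(), ConsistencyLevel.QUORUM)

# 2. Read each column back, then remove it (mirrors the get-then-remove script).
for i in range(NUM_KEYS):
    key = 'key%d' % i
    client.get(KEYSPACE, key, path, ConsistencyLevel.QUORUM)
    client.remove(KEYSPACE, key, path, ts(), ConsistencyLevel.QUORUM)

transport.close()
print('inserted and removed %d columns' % NUM_KEYS)

To turn this into a check of tombstone garbage collection: set GCGraceSeconds in storage-conf.xml to a small value (the thread used 3600), run the script, wait until the memtables holding the deletes have been flushed to SSTables and the grace period has passed, then run a major compaction with nodeprobe -host <hostname> compact and compare the size of the data directory before and after. If the SSTables come out of the major compaction at roughly the same size, the tombstones were not collected.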
