This is what I was referring to by "the period specified in your config file":
<!--
 ~ Time to wait before garbage-collecting deletion markers. Set this to
 ~ a large enough value that you are confident that the deletion marker
 ~ will be propagated to all replicas by the time this many seconds has
 ~ elapsed, even in the face of hardware failures. The default value is
 ~ ten days.
-->
<GCGraceSeconds>864000</GCGraceSeconds>

On Fri, Dec 4, 2009 at 12:51 PM, Ramzi Rabah <[email protected]> wrote:
> I think there might be a bug in the deletion logic. I removed all the
> data on the cluster by running remove on every single key I entered,
> and I ran a major compaction
> (nodeprobe -host hostname compact) on a certain node. After the
> compaction is over, I am left with one data file, one index file and
> the bloom filter file,
> and they are the same size as before I started doing the deletes.
>
> On Thu, Dec 3, 2009 at 6:09 PM, Jonathan Ellis <[email protected]> wrote:
>> Cassandra never modifies data in-place. So it writes tombstones to
>> suppress the older writes, and when compaction occurs the data and
>> tombstones get GC'd (after the period specified in your config file).
>>
>> On Thu, Dec 3, 2009 at 8:07 PM, Ramzi Rabah <[email protected]> wrote:
>>> Looking at jconsole I see a high number of writes when I do removes,
>>> so I am guessing these are tombstones being written? If that's the
>>> case, is the data being removed and replaced by tombstones? And will
>>> they all be deleted eventually when compaction runs?
>>>
>>> On Thu, Dec 3, 2009 at 3:18 PM, Ramzi Rabah <[email protected]> wrote:
>>>> Hi all,
>>>>
>>>> I ran a test where I inserted about 1.2 gigabytes worth of data into
>>>> each node of a 4-node cluster.
>>>> I ran a script that first calls a get on each column inserted,
>>>> followed by a remove. Since I was basically removing every entry
>>>> I inserted before, I expected that the disk space occupied by the
>>>> nodes would go down and eventually reach 0. The disk space
>>>> actually goes up when I do the bulk removes, to about 1.8 gigs per
>>>> node. Am I missing something here?
>>>>
>>>> Thanks a lot for your help
>>>> Ray
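
To make the timeline concrete, here is a rough sketch of the sequence (it reuses the same nodeprobe invocation Ramzi already ran; "hostname" is a placeholder, and 864000 is just the default shown in the config snippet above):

    # bulk removes through the client API write tombstones, which is why
    # disk usage goes up right after the deletes rather than down
    nodeprobe -host hostname compact    # run before GCGraceSeconds has
                                        # elapsed: space is not reclaimed yet
    # ... wait at least GCGraceSeconds (864000 s = 10 days by default) ...
    nodeprobe -host hostname compact    # now the deleted data and its
                                        # tombstones get GC'd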
