LevelDB, as written by Google, does not actively clean up delete "tombstones" or prior data records with the same key. The old data and tombstones stay on disk until they happen to participate in compaction at the highest "level". The cleanup can therefore happen days, weeks, or even months later, depending upon the size of your dataset, the speed of incoming writes, and the distribution of new keys versus deleted keys.
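To make that concrete, here is a toy Python model of the behaviour described above. It is only a sketch under simplifying assumptions, not LevelDB's actual code: the ToyLSM class, its three levels, and the compact() method are all made up for illustration, and each level is just an in-memory dict.

```python
TOMBSTONE = object()  # marker record written in place of a value on delete


class ToyLSM:
    """Hypothetical, heavily simplified model of a leveled store."""

    def __init__(self, num_levels=3):
        # levels[0] holds the newest records, levels[-1] the oldest.
        self.levels = [dict() for _ in range(num_levels)]

    def put(self, key, value):
        self.levels[0][key] = value

    def delete(self, key):
        # A delete is just another write: it adds a tombstone record.
        self.levels[0][key] = TOMBSTONE

    def get(self, key):
        # The newest level that knows the key wins.
        for level in self.levels:
            if key in level:
                value = level[key]
                return None if value is TOMBSTONE else value
        return None

    def compact(self, i):
        """Merge level i into level i + 1; newer records replace older ones."""
        src, dst = self.levels[i], self.levels[i + 1]
        dst.update(src)
        src.clear()
        if i + 1 == len(self.levels) - 1:
            # Only at the highest level is it safe to drop tombstones,
            # because no older copy of the key can exist below it.
            for key in [k for k, v in dst.items() if v is TOMBSTONE]:
                del dst[key]

    def records_on_disk(self):
        return sum(len(level) for level in self.levels)


db = ToyLSM()
for k in range(5):
    db.put(k, "value")
db.compact(0)
db.compact(1)                        # the data now sits in the highest level

for k in range(5):
    db.delete(k)
after_delete = db.records_on_disk()  # 10: five old values plus five tombstones

db.compact(0)
after_partial = db.records_on_disk() # 10: tombstones moved down one level only

db.compact(1)
after_full = db.records_on_disk()    # 0: tombstones met the old data and both went away
```

In this model the keys read as deleted immediately, yet the record count only drops once the tombstones reach the level holding the original values, which is the delay in freeing disk space described above.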
Basho has added code to leveldb in Riak 2.0 to more aggressively free up disk space. Details on this 2.0 feature are here: https://github.com/basho/leveldb/wiki/mv-aggressive-delete Matthew Von-Maszewski On Mar 22, 2014, at 1:53, István <[email protected]> wrote: > All good, all the keys are gone! :) > > I am just waiting Riak to free up the space. It seems it is not > instant... Or I am missing something. I need to read up on how LevelDB > actually frees up space. I have updated the code to stop on {ReqID, > done}. I think you get this only when you have no keys left. I have > verified that that there are no keys left in the bucket. > > > # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream > HTTP/1.1 200 OK > Vary: Accept-Encoding > Transfer-Encoding: chunked > Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact) > Date: Sat, 22 Mar 2014 05:51:48 GMT > Content-Type: application/json > > {"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]}{"keys":[]} > > Thanks Evan! > I. > > On Fri, Mar 21, 2014 at 9:44 PM, Evan Vigil-McClanahan > <[email protected]> wrote: >> Did some double checking on the off chance that I gave you some bad >> advice. 
Here's the function that the erlang client uses to accumulate >> the outcome of stream_list_keys et al: >> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L2146-L2155 >> >> here is how you get the request id: >> https://github.com/basho/riak-erlang-client/blob/master/src/riakc_pb_socket.erl#L490-L494 >> >> On Fri, Mar 21, 2014 at 9:29 PM, Evan Vigil-McClanahan >> <[email protected]> wrote: >>> You don't want to recurse when you get the {ReqID, done} message, you >>> should just stop there. >>> >>> On Fri, Mar 21, 2014 at 6:20 PM, István <[email protected]> wrote: >>>> With help of Evan (evanmcc) on the IRC channel I was able to kick off >>>> the clean up job using riak-erlang-client. >>>> >>>> Here is the code: >>>> >>>> https://gist.github.com/l1x/9698847 >>>> >>>> It sometimes behaves a bit weirdly, the PB client returns {40127151, >>>> done} or something similar, that I can't recognize why but it >>>> definitely deleted some of the keys so far. I am letting it run for a >>>> while and see what happens. >>>> >>>> Regards, >>>> Istvan >>>> >>>> >>>> On Wed, Mar 19, 2014 at 1:02 AM, Christian Dahlqvist >>>> <[email protected]> wrote: >>>>> Hi Istvan, >>>>> >>>>> Did you run the Basho Bench clean-up job with the following settings? >>>>> >>>>> {driver, basho_bench_driver_riakc_pb}. >>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 10000000}}}. >>>>> {operations, [{delete, 1}]}. >>>>> >>>>> Also, how did you verify that the data was not deleted? >>>>> >>>>> Best regards, >>>>> >>>>> Christian >>>>> >>>>> >>>>> >>>>> >>>>> On Wed, Mar 19, 2014 at 6:49 AM, István <[email protected]> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I was trying to delete all of the keys generated with the following: >>>>>> >>>>>> {key_generator, {int_to_bin, {uniform_int, 10000000}}}. >>>>>> >>>>>> I have used this for the deletion: >>>>>> >>>>>> {key_generator, {int_to_bin, {partitioned_sequential_int, 10000000}}}. 
>>>>>> >>>>>> I has completed but unfortunately was not deleting any data.... >>>>>> >>>>>> Next is to use the Erlang client and see if I can list the keys and >>>>>> delete them, or try to use the Erlang interface for MR. >>>>>> >>>>>> Regards, >>>>>> Istvan >>>>>> >>>>>> >>>>>> >>>>>> On Sat, Mar 15, 2014 at 1:38 PM, Christian Dahlqvist >>>>>> <[email protected]> wrote: >>>>>>> Hi Istvan, >>>>>>> >>>>>>> Depending on how you have run your Basho Bench job(s), you could try >>>>>>> deleting the generated keys by running a separate Basho Bench job based >>>>>>> on a >>>>>>> partitioned_sequential_int key generator and only delete operations. >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Christian >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Mar 14, 2014 at 5:00 PM, István <[email protected]> wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I am trying to clean up some of the test data that was inserted by >>>>>>>> basho_bench. The first approach to use curl and streaming the keys >>>>>>>> fails like this: >>>>>>>> >>>>>>>> # curl -XGET -i http://127.0.0.1:8098/buckets/test/keys?keys=stream >>>>>>>> HTTP/1.1 200 OK >>>>>>>> Vary: Accept-Encoding >>>>>>>> Transfer-Encoding: chunked >>>>>>>> Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact) >>>>>>>> Date: Fri, 14 Mar 2014 16:59:08 GMT >>>>>>>> Content-Type: application/json >>>>>>>> >>>>>>>> curl: (18) transfer closed with outstanding read data remaining >>>>>>>> >>>>>>>> When I am trying to the same thing with MapReduce it fails like this: >>>>>>>> >>>>>>>> curl -X POST "http://localhost:8098/mapred" -H "Content-Type: >>>>>>>> application/json" -d '{ >>>>>>>> "inputs": "test", >>>>>>>> "query": [ >>>>>>>> { >>>>>>>> "map": { >>>>>>>> "language": "javascript", >>>>>>>> "source": "function(riakObject) { return >>>>>>>> [riakObject.key]; >>>>>>>> }" >>>>>>>> } >>>>>>>> } >>>>>>>> ] >>>>>>>> }' >>>>>>>> >>>>>>>> Error: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 
{"phase":0,"error":"bad_utf8_character_code","input":"{ok,{r_object,<<\"test\">>,<<0,116,71,0>>,[{r_content,{dict,3,16,16,8,80,48,{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},{{[],[],[],[],[],[],[],[],[],[],[[<<\"X-Riak-VTag\">>,71,81,80,81,87,76,105,54,113,120,97,116,114,106,51,86,72,53,67,50,82]],[[<<\"index\">>]],[],[[<<\"X-Riak-Last-Modified\">>|{1391,27501,255280}]],[],[]}}},<<75,191,51,171,193,113,206,163,24,68,247,188,84,72,5,72,179,195,99,44,202,122,136,31,250,94,166,5,160,199,182,137,40,6,253,115,100,4,34,67,64,10,25,210,58,23,104,97,228,...>>}],...},...}"} >>>>>>>> >>>>>>>> I am wondering how else could I just get a list of keys in that >>>>>>>> bucket. The ultimate goal is to be able to delete them all. >>>>>>>> >>>>>>>> Thank you in advance, >>>>>>>> Istvan >>>>>>>> >>>>>>>> -- >>>>>>>> the sun shines for all >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> riak-users mailing list >>>>>>>> [email protected] >>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> the sun shines for all >>>>> >>>>> >>>> >>>> >>>> >>>> -- >>>> the sun shines for all >>>> >>>> _______________________________________________ >>>> riak-users mailing list >>>> [email protected] >>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > -- > the sun shines for all > > _______________________________________________ > riak-users mailing list > [email protected] > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com _______________________________________________ riak-users mailing list [email protected] http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
