Edgar,

Today we disclosed a new feature for Riak's leveldb, Tiered Storage.  The 
details are here:

https://github.com/basho/leveldb/wiki/mv-tiered-options

This feature might give you another option in managing your storage volume. 

Matthew

> On Apr 8, 2014, at 11:07 AM, Edgar Veiga <[email protected]> wrote:
> 
>> It makes sense, I do a lot, and I really mean a LOT, of updates per key, 
>> maybe thousands a day! The cluster is experiencing a lot more updates per 
>> key than new keys being inserted.
>> 
>> The hash trees will rebuild during the next weekend (normally it takes about 
>> two days to complete the operation) so I'll come back and give you some 
>> feedback (hopefully good) next Monday!
>> 
>> Again, thanks a lot, you've been very helpful.
>> Edgar
>> 
>> 
>> On 8 April 2014 15:47, Matthew Von-Maszewski <[email protected]> wrote:
>> Edgar,
>> 
>> The test I have running currently has reached 1 Billion keys.  It is running 
>> against a single node with N=1.  It has 42G of AAE data.  Here is my 
>> extrapolation to compare your numbers:
>> 
>> You have ~2.5 Billion keys.  I assume you are running N=3 (the default).  
>> AAE therefore is actually tracking ~7.5 Billion keys.  You have six nodes, 
>> therefore tracking ~1.25 Billion keys per node.
>> 
>> Raw math would suggest that my 42G of AAE data for 1 billion keys would 
>> extrapolate to 52.5G of AAE data for you.  Yet you have ~120G of AAE data.  
>> Is something wrong?  No.  My data is still loading and has experienced zero 
>> key/value updates/edits.
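
The extrapolation above is easy to check by hand. As a rough sketch, here is the same arithmetic in Python; the variable names are illustrative only, and every number is one quoted in this thread (N=3 is the assumed default, as stated above).

```python
# Figures quoted in the thread; variable names are illustrative only.
total_keys = 2.5e9        # approximate key count in the production cluster
n_val = 3                 # assumed replication factor (the default)
nodes = 6                 # machines in the production cluster

test_keys = 1.0e9         # keys loaded in the single-node, N=1 test
test_aae_gb = 42.0        # AAE data observed in that test

tracked_keys = total_keys * n_val        # replicas AAE tracks cluster-wide
keys_per_node = tracked_keys / nodes     # replicas AAE tracks on each node
expected_aae_gb = test_aae_gb * keys_per_node / test_keys

observed_aae_gb = 120.0   # AAE data reported per node
ratio = observed_aae_gb / expected_aae_gb

print(f"keys tracked per node: {keys_per_node:.3g}")
print(f"expected AAE data:     {expected_aae_gb:.1f}G")
print(f"observed / expected:   {ratio:.2f}x")
```

The gap between the expected and observed figures is what the next paragraph attributes to rewritten hashes and tombstones that have not yet been compacted away.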
>> 
>> AAE hashes get rewritten every time a user updates the value of a key.  
>> AAE's leveldb is just like the user leveldb: all prior values of a key 
>> accumulate in the .sst table files until compaction removes duplicates.  
>> Similarly, a user delete of a key causes a delete tombstone in the AAE hash 
>> tree.  Those delete tombstones have to await compactions too before leveldb 
>> recovers the disk space.
>> 
>> AAE's hash trees rebuild weekly.  I am told that the rebuild operation will 
>> actually destroy the existing files and start over.  That is when you should 
>> see AAE space usage dropping dramatically.
>> 
>> Matthew
>> 
>> 
>> On Apr 8, 2014, at 9:31 AM, Edgar Veiga <[email protected]> wrote:
>> 
>>> Thanks a lot Matthew!
>>> 
>>> A little more info: I've gathered a sample of the contents of the 
>>> anti-entropy data on one of my machines:
>>> - 44 folders, named the same as the folders in the level-db dir 
>>> (e.g. 393920363186844927172086927568060657641638068224/)
>>> - each folder has 5 files (log, current, log, etc) and 5 sst_* folders.
>>> - The biggest sst folder is sst_3 with 4.3G
>>> - Inside the sst_3 folder there are 1219 files named 00****.sst.
>>> - Each of the 00*****.sst files is ~3.7M
>>> 
>>> Hope this info gives you some more help! 
>>> 
>>> Best regards, and again, thanks a lot
>>> Edgar
>>> 
>>> 
>>> On 8 April 2014 13:24, Matthew Von-Maszewski <[email protected]> wrote:
>>> Argh. Missed where you said you had upgraded. Ok, I will proceed with 
>>> getting you comparison numbers. 
>>> 
>>> Sent from my iPhone
>>> 
>>> On Apr 8, 2014, at 6:51 AM, Edgar Veiga <[email protected]> wrote:
>>> 
>>>> Thanks again Matthew, you've been very helpful!
>>>> 
>>>> Maybe you can give me some advice on this issue I've been having since 
>>>> I upgraded to 1.4.8.
>>>> 
>>>> Since the upgrade, my anti-entropy data has been growing a lot and has 
>>>> only stabilised at very high values... Right now my cluster has 6 machines, 
>>>> each one with ~120G of anti-entropy data and 600G of level-db data. This 
>>>> seems to be quite a lot, no? My total number of keys is ~2.5 Billion.
>>>> 
>>>> Best regards,
>>>> Edgar
>>>> 
>>>> On 6 April 2014 23:30, Matthew Von-Maszewski <[email protected]> wrote:
>>>> Edgar,
>>>> 
>>>> This is indirectly related to your key deletion discussion.  I made changes 
>>>> recently to the aggressive delete code.  The second section of the 
>>>> following (updated) web page discusses the adjustments:
>>>> 
>>>>     https://github.com/basho/leveldb/wiki/Mv-aggressive-delete
>>>> 
>>>> Matthew
>>>> 
>>>> 
>>>> On Apr 6, 2014, at 4:29 PM, Edgar Veiga <[email protected]> wrote:
>>>> 
>>>>> Matthew, thanks again for the response!
>>>>> 
>>>>> That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :)
>>>>> 
>>>>> Best regards
>>>>> 
>>>>> 
>>>>> On 6 April 2014 15:02, Matthew Von-Maszewski <[email protected]> wrote:
>>>>> Edgar,
>>>>> 
>>>>> In Riak 1.4, there is no advantage to using empty values versus deleting.
>>>>> 
>>>>> leveldb is a "write once" data store.  New data for a given key never 
>>>>> physically overwrites old data for the same key.  New data "hides" the 
>>>>> old data by being in a lower level, and is therefore picked first.
>>>>> 
>>>>> leveldb's compaction operation will remove older key/value pairs only 
>>>>> when the newer key/value pair is part of a compaction involving both 
>>>>> new and old.  The new and the old key/value pairs must have migrated to 
>>>>> adjacent levels through normal compaction operations before leveldb will 
>>>>> see them in the same compaction.  The migration could take days, weeks, 
>>>>> or even months depending upon the size of your entire dataset and the 
>>>>> rate of incoming write operations.
>>>>> 
>>>>> leveldb's "delete" object is exactly the same as your empty JSON object.  
>>>>> The delete object simply has one more flag set that allows it to also be 
>>>>> removed if and only if there is no chance for an identical key to exist 
>>>>> on a higher level.
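
To make the behaviour described above concrete, here is a toy Python model. It is only a sketch under heavy assumptions: the class name ToyLSM, its methods, and the three-level layout are hypothetical and are not leveldb's real structures or API. Each level is a plain dict, so duplicates within one level collapse immediately, which real leveldb does not do. What it does show is that an overwrite or a delete adds an entry, and that space comes back only when a compaction brings the new and old entries together.

```python
TOMBSTONE = object()  # stand-in for a delete record (hypothetical marker)


class ToyLSM:
    """Toy write-once store: levels[0] is newest, higher indexes are older."""

    def __init__(self, num_levels=3):
        self.levels = [dict() for _ in range(num_levels)]

    def put(self, key, value):
        # Simplification: every write lands directly in level 0.
        self.levels[0][key] = value

    def delete(self, key):
        # A delete is just another write; nothing is freed yet.
        self.levels[0][key] = TOMBSTONE

    def get(self, key):
        # Lower levels are searched first, so newer data hides older data.
        for level in self.levels:
            if key in level:
                value = level[key]
                return None if value is TOMBSTONE else value
        return None

    def entries_on_disk(self):
        return sum(len(level) for level in self.levels)

    def compact(self, n):
        """Merge level n into level n+1 (n must not be the last level)."""
        upper, lower = self.levels[n], self.levels[n + 1]
        deeper = self.levels[n + 2:]
        for key, value in upper.items():
            if value is TOMBSTONE and not any(key in lvl for lvl in deeper):
                # No older copy can exist below, so the tombstone and the
                # value it hides can both be dropped.
                lower.pop(key, None)
            else:
                # The newer entry replaces the older one in the same level.
                lower[key] = value
        upper.clear()


db = ToyLSM()
db.put("k", "v1")
db.compact(0)
db.compact(1)                      # "v1" has migrated to the oldest level
db.delete("k")                     # tombstone sits in level 0
print(db.entries_on_disk())        # old value plus tombstone
db.compact(0)                      # tombstone moves down, old value still below it
print(db.entries_on_disk())
db.compact(1)                      # tombstone finally meets the old value
print(db.entries_on_disk())
```

In this model, replacing the delete with db.put("k", "{}") leaves the same number of entries on disk until the final compaction, which matches the point that an empty value has no advantage over a delete in 1.4.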
>>>>> 
>>>>> I apologize that I cannot give you a more useful answer.  2.0 is on the 
>>>>> horizon.
>>>>> 
>>>>> Matthew
>>>>> 
>>>>> 
>>>>> On Apr 6, 2014, at 7:04 AM, Edgar Veiga <[email protected]> wrote:
>>>>> 
>>>>>> Hi again!
>>>>>> 
>>>>>> Sorry to reopen this discussion, but I have another question regarding 
>>>>>> the former post.
>>>>>> 
>>>>>> What if, instead of doing a mass deletion (we've already seen that it 
>>>>>> would not pay off in terms of disk space), I update all the values 
>>>>>> with an empty JSON object "{}"? Do you see any problem with this? I no 
>>>>>> longer need those millions of values that are living in the cluster... 
>>>>>> 
>>>>>> Once version 2.0 of riak is stable I'll do the upgrade and only 
>>>>>> then delete those keys!
>>>>>> 
>>>>>> Best regards
>>>>>> 
>>>>>> 
>>>>>> On 18 February 2014 16:32, Edgar Veiga <[email protected]> wrote:
>>>>>> Ok, thanks a lot Matthew.
>>>>>> 
>>>>>> 
>>>>>> On 18 February 2014 16:18, Matthew Von-Maszewski <[email protected]> 
>>>>>> wrote:
>>>>>> Riak 2.0 is coming.  Hold your mass delete until then.  The "bug" is 
>>>>>> within Google's original leveldb architecture.  Riak 2.0 sneaks around 
>>>>>> to get the disk space freed.
>>>>>> 
>>>>>> Matthew
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Feb 18, 2014, at 11:10 AM, Edgar Veiga <[email protected]> wrote:
>>>>>> 
>>>>>>> The only/main purpose is to free disk space..
>>>>>>> 
>>>>>>> I was a little bit concerned regarding this operation, but now with 
>>>>>>> your feedback I'm tending to do nothing; I can't risk the disk usage 
>>>>>>> growing... 
>>>>>>> Regarding the overhead, I think that with a tight throttling system I 
>>>>>>> could control it and avoid overloading the cluster.
>>>>>>> 
>>>>>>> Mixed feelings :S
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 18 February 2014 15:45, Matthew Von-Maszewski <[email protected]> 
>>>>>>> wrote:
>>>>>>> Edgar,
>>>>>>> 
>>>>>>> The first "concern" I have is that leveldb's delete does not free disk 
>>>>>>> space.  Others have executed mass delete operations only to discover 
>>>>>>> they are now using more disk space instead of less.  Here is a 
>>>>>>> discussion of the problem:
>>>>>>> 
>>>>>>> https://github.com/basho/leveldb/wiki/mv-aggressive-delete
>>>>>>> 
>>>>>>> The link also describes Riak's database operation overhead.  This is a 
>>>>>>> second "concern".  You will need to carefully throttle your delete rate 
>>>>>>> or the overhead will likely impact your production throughput.
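
For reference, a throttled delete loop could look roughly like the Python sketch below. This is an illustration, not a recommendation to run the mass delete on 1.4. The function names, the rate limit, and the injectable send/sleep/clock parameters are all hypothetical. The URL layout /buckets/<bucket>/keys/<key> is Riak's HTTP object path; the host, port, and bucket used in the usage note are assumptions.

```python
import time
import urllib.parse
import urllib.request


def delete_url(base_url, bucket, key):
    # Build the HTTP path for one object, percent-encoding bucket and key.
    return "{}/buckets/{}/keys/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(bucket, safe=""),
        urllib.parse.quote(key, safe=""),
    )


def http_delete(url):
    # Issue one HTTP DELETE. urlopen raises urllib.error.HTTPError on a
    # 4xx/5xx reply (including "not found"), so callers may want to catch it.
    req = urllib.request.Request(url, method="DELETE")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


def throttled_delete(keys, base_url, bucket, max_per_sec,
                     send=http_delete, sleep=time.sleep, clock=time.monotonic):
    """Delete keys one at a time, never faster than max_per_sec."""
    interval = 1.0 / max_per_sec
    next_slot = clock()
    count = 0
    for key in keys:
        now = clock()
        if now < next_slot:
            # Too early for the next request: wait out the rest of the slot.
            sleep(next_slot - now)
            now = next_slot
        send(delete_url(base_url, bucket, key))
        next_slot = now + interval
        count += 1
    return count
```

With a key file (one key per line, file name hypothetical), a call might be throttled_delete((line.strip() for line in open("keys.txt") if line.strip()), "http://127.0.0.1:8098", "mybucket", max_per_sec=50). A fixed rate is the simplest possible throttle; watching cluster latency and lowering max_per_sec when it rises would be a safer design.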
>>>>>>> 
>>>>>>> We have new code to help quicken the actual purge of deleted data in 
>>>>>>> Riak 2.0.  But that release is not quite ready for production usage.
>>>>>>> 
>>>>>>> 
>>>>>>> What do you hope to achieve by the mass delete?
>>>>>>> 
>>>>>>> Matthew
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Feb 18, 2014, at 10:29 AM, Edgar Veiga <[email protected]> wrote:
>>>>>>> 
>>>>>>>> Sorry, forgot that info!
>>>>>>>> 
>>>>>>>> It's leveldb.
>>>>>>>> 
>>>>>>>> Best regards
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 18 February 2014 15:27, Matthew Von-Maszewski <[email protected]> 
>>>>>>>> wrote:
>>>>>>>> Which Riak backend are you using:  bitcask, leveldb, multi?
>>>>>>>> 
>>>>>>>> Matthew
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Feb 18, 2014, at 10:17 AM, Edgar Veiga <[email protected]> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> > Hi all!
>>>>>>>> >
>>>>>>>> > I have a fairly trivial question regarding mass deletion on a riak 
>>>>>>>> > cluster, but first let me give you some context. My cluster 
>>>>>>>> > is running riak 1.4.6 on 6 machines with a ring of 256 nodes 
>>>>>>>> > and 1TB SSD disks.
>>>>>>>> >
>>>>>>>> > I need to execute a massive object deletion on a bucket; I'm talking 
>>>>>>>> > about ~1 billion keys (the average object size is ~1KB). I will not 
>>>>>>>> > retrieve the keys from riak because I have a file with all of them. 
>>>>>>>> > I'll just start a script that reads them from the file and triggers 
>>>>>>>> > an HTTP DELETE for each one.
>>>>>>>> > The cluster will continue running in production with a quite high 
>>>>>>>> > load, serving all other applications, while this deletion runs.
>>>>>>>> >
>>>>>>>> > My question is simple: do I need to have any extra concerns 
>>>>>>>> > regarding this action? Do you advise paying special attention 
>>>>>>>> > to any metrics regarding riak or even the servers where 
>>>>>>>> > it's running?
>>>>>>>> >
>>>>>>>> > Best regards!
>>>>>>>> > _______________________________________________
>>>>>>>> > riak-users mailing list
>>>>>>>> > [email protected]
>>>>>>>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
> 

