Re: RIAK 1.4.6 - Mass key deletion

2014-07-20 Thread Matthew Von-Maszewski
Simon, The aggressive delete code is only in the 2.0 release. There are currently no plans to backport that feature to 1.4. A couple of generic performance improvements for compactions are now in the 1.4.10 release. These improvements relate to general compactions. They do not speed the rem

Re: RIAK 1.4.6 - Mass key deletion

2014-07-20 Thread Simon Effenberg
Hi Matthew, so is there a way to improve the compaction rate in Riak < 2.0 or do I have to upgrade to 2.0 to get this? Cheers Simon On Sun, Apr 06, 2014 at 06:30:30PM -0400, Matthew Von-Maszewski wrote: >Edgar, >This is indirectly related to your key deletion discussion. I made changes >

Re: RIAK 1.4.6 - Mass key deletion

2014-04-10 Thread Edgar Veiga
Thanks, I'll start the process and give you guys some feedback in the meanwhile. The plan is: 1 - Disable AAE in the cluster via riak attach: a. rpc:multicall(riak_kv_entropy_manager, disable, []). rpc:multicall(riak_kv_entropy_manager, cancel_exchanges, []). z. 2 - Update the app.config changi

Re: RIAK 1.4.6 - Mass key deletion

2014-04-10 Thread Matthew Von-Maszewski
Yes, you can send the AAE (active anti-entropy) data to a different disk. AAE calculates a hash each time you PUT new data to the regular database. AAE then buffers around 1,000 hashes (I forget the exact value) to write as a block to the AAE database. The AAE write is NOT in series with the
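The buffering described above can be pictured with a small sketch. This is only an illustration of batching hashes before a block write, not Riak's AAE code: the class name `HashBuffer`, the use of SHA-1, and the in-memory list standing in for the AAE database are all assumptions made for the example. The block size of 1,000 is the approximate figure quoted in the message.

```python
import hashlib


class HashBuffer:
    """Toy model: hash each PUT, write hashes out in blocks (hypothetical)."""

    def __init__(self, flush_size=1000):
        self.flush_size = flush_size   # ~1,000 per the message; exact value unknown
        self.pending = []              # hashes not yet written
        self.flushed_blocks = []       # stand-in for the separate AAE database

    def on_put(self, key, value):
        # SHA-1 is an assumption; the thread does not name the hash function.
        digest = hashlib.sha1(value).hexdigest()
        self.pending.append((key, digest))
        if len(self.pending) >= self.flush_size:
            self.flush()

    def flush(self):
        # One block write covers many PUTs, so the AAE store sees far
        # fewer writes than the regular database does.
        if self.pending:
            self.flushed_blocks.append(self.pending)
            self.pending = []


buf = HashBuffer(flush_size=3)
for i in range(7):
    buf.on_put(f"key{i}".encode(), b"value")
print(len(buf.flushed_blocks), len(buf.pending))
```

With a block size of 3 and 7 PUTs, two blocks are written and one hash is still waiting in the buffer.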

Re: RIAK 1.4.6 - Mass key deletion

2014-04-10 Thread Edgar Veiga
Hi Matthew! I have the possibility of moving the anti-entropy directory data to a mechanical 7200 RPM disk that exists on each of the machines. I was thinking of changing the anti_entropy data dir config in the app.config file and restarting the riak process. Is there any problem using a mechanical disk to st

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
I'll wait a few more days, see if the AAE maybe "stabilises" and only after that make a decision regarding this. Expanding the cluster was on the roadmap, but not right now :) I've attached a few screenshots; you can clearly observe the evolution of one of the machines after the anti-entropy data

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
No. I do not see a problem with your plan. But ... I would prefer to see you add servers to your cluster. Scalability is one of Riak's fundamental characteristics. As your database needs grow, we grow with you … just add another server and migrate some of the vnodes there. I obviously cannot

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
Thanks Matthew! Today this situation has become unsustainable. In two of the machines I have an anti-entropy dir of 250G... It just keeps growing and growing and I'm almost reaching the max size of the disks. Maybe I'll just turn off AAE in the cluster, remove all the data in the anti-entropy directo

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
Edgar, Today we disclosed a new feature for Riak's leveldb, Tiered Storage. The details are here: https://github.com/basho/leveldb/wiki/mv-tiered-options This feature might give you another option in managing your storage volume. Matthew > On Apr 8, 2014, at 11:07 AM, Edgar Veiga wrote: >

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
Edgar, The test I have running currently has reached 1 Billion keys. It is running against a single node with N=1. It has 42G of AAE data. Here is my extrapolation to compare your numbers: You have ~2.5 Billion keys. I assume you are running N=3 (the default). AAE therefore is actually trac
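A rough sketch of this kind of extrapolation, using only figures quoted in the thread (1 billion keys at N=1 producing 42G of AAE data, ~2.5 billion keys, N=3, 6 machines). It assumes AAE size scales linearly with the number of key replicas and that replicas are spread evenly across the nodes; both are simplifying assumptions of this example, and the function name is hypothetical.

```python
def estimate_aae_gb_per_node(keys, n_val, nodes,
                             ref_keys=1_000_000_000, ref_gb=42.0):
    """Linear extrapolation from a reference measurement (assumption)."""
    replicas = keys * n_val                  # every key is stored n_val times
    total_gb = replicas / ref_keys * ref_gb  # scale the 42G-per-billion figure
    return total_gb / nodes                  # assume an even spread over nodes


per_node = estimate_aae_gb_per_node(keys=2_500_000_000, n_val=3, nodes=6)
print(f"{per_node:.1f} GB of AAE data per node")
```

Under these assumptions the estimate is 52.5 GB per node, far below the 250G anti-entropy directories reported earlier in this thread.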

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
It makes sense, I do a lot, and I really mean a LOT of updates per key, maybe thousands a day! The cluster is experiencing a lot more updates per key than new keys being inserted. The hash trees will rebuild during the next weekend (normally it takes about two days to complete the operation)

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
Thanks a lot Matthew! A little more info: I've gathered a sample of the contents of the anti-entropy data of one of my machines: - 44 folders with the name equal to the name of the folders in the level-db dir (e.g. 393920363186844927172086927568060657641638068224/) - each folder has 5 files (log,

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
Argh. Missed where you said you had upgraded. Ok, I will proceed with getting you comparison numbers. Sent from my iPhone > On Apr 8, 2014, at 6:51 AM, Edgar Veiga wrote: > > Thanks again Matthew, you've been very helpful! > > Maybe you can give me some kind of advice on this issue I'm havin

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Matthew Von-Maszewski
AAE is broken and brain dead in releases 1.4.3 through 1.4.7. That might be your problem. I have a two billion key data set building now. I will forward node disk usage when available. Matthew Sent from my iPhone > On Apr 8, 2014, at 6:51 AM, Edgar Veiga wrote: > > Thanks again Matthew,

Re: RIAK 1.4.6 - Mass key deletion

2014-04-08 Thread Edgar Veiga
Thanks again Matthew, you've been very helpful! Maybe you can give me some kind of advice on this issue I'm having since I've upgraded to 1.4.8. Since I've upgraded my anti-entropy data has been growing a lot and has only stabilised at very high values... Right now my cluster has 6 machines each

Re: RIAK 1.4.6 - Mass key deletion

2014-04-06 Thread Matthew Von-Maszewski
Edgar, This is indirectly related to your key deletion discussion. I made changes recently to the aggressive delete code. The second section of the following (updated) web page discusses the adjustments: https://github.com/basho/leveldb/wiki/Mv-aggressive-delete Matthew On Apr 6, 2014,

Re: RIAK 1.4.6 - Mass key deletion

2014-04-06 Thread Edgar Veiga
Matthew, thanks again for the response! That said, I'll wait again for the 2.0 (and maybe buy some bigger disks :) Best regards On 6 April 2014 15:02, Matthew Von-Maszewski wrote: > Edgar, > > In Riak 1.4, there is no advantage to using empty values versus deleting. > > leveldb is a "write on

Re: RIAK 1.4.6 - Mass key deletion

2014-04-06 Thread Matthew Von-Maszewski
Edgar, In Riak 1.4, there is no advantage to using empty values versus deleting. leveldb is a "write once" data store. New data for a given key never physically overwrites old data for the same key. New data "hides" the old data by being in a lower level, and therefore picked first. leveldb'
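The "write once" behaviour can be illustrated with a toy model. This is a deliberately simplified sketch, not leveldb's implementation: the class `WriteOnceStore`, the two-level layout, and the single `compact` step are all assumptions chosen to show why an update or a delete adds a record instead of reclaiming space until compaction merges the levels.

```python
TOMBSTONE = object()  # marker record meaning "this key was deleted"


class WriteOnceStore:
    """Toy append-only store: level 0 is newest, level 1 is older data."""

    def __init__(self):
        self.levels = [[], []]

    def put(self, key, value):
        # Never overwrites in place; the new record just hides older ones.
        self.levels[0].append((key, value))

    def delete(self, key):
        # A delete is itself a new record, so it initially uses MORE space.
        self.levels[0].append((key, TOMBSTONE))

    def get(self, key):
        # Search the lowest level first, newest record first.
        for level in self.levels:
            for k, v in reversed(level):
                if k == key:
                    return None if v is TOMBSTONE else v
        return None

    def records_stored(self):
        return sum(len(level) for level in self.levels)

    def compact(self):
        # Merge oldest to newest so the latest record per key wins,
        # then drop tombstones and hidden values.
        newest = {}
        for level in reversed(self.levels):
            for k, v in level:
                newest[k] = v
        live = [(k, v) for k, v in newest.items() if v is not TOMBSTONE]
        self.levels = [[], live]


store = WriteOnceStore()
store.put("a", "v1")
store.put("a", "{}")          # an "empty" update still adds a record
store.delete("a")             # and so does the delete
before = store.records_stored()
store.compact()
after = store.records_stored()
print(before, after)
```

In this model an empty-value update and a delete both grow the store by one record, and space only comes back once compaction reaches the hidden records.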

Re: RIAK 1.4.6 - Mass key deletion

2014-04-06 Thread Edgar Veiga
Hi again! Sorry to reopen this discussion, but I have another question regarding the former post. What if, instead of doing a mass deletion (we've already seen that it won't pay off in terms of disk space), I update all the values with an empty JSON object "{}" ? Do you see any problem wit

Re: RIAK 1.4.6 - Mass key deletion

2014-02-21 Thread Matthew Von-Maszewski
Toby, At some point, yes the "missing" space is recovered. But that does not mean the deletes will be found immediately upon 2.0 upgrade. 2.0 adds statistical data to each newly generated .sst table file so that it "knows" where delete entries are hidden. The statistical data drives the aggr

Re: RIAK 1.4.6 - Mass key deletion

2014-02-20 Thread Toby Corkindale
On 19 February 2014 03:18, Matthew Von-Maszewski wrote: > Riak 2.0 is coming. Hold your mass delete until then. The "bug" is within > Google's original leveldb architecture. Riak 2.0 sneaks around to get the > disk space freed. I'm interested to know what happens if someone deletes a lot of da

Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Edgar Veiga
Ok, thanks a lot Matthew. On 18 February 2014 16:18, Matthew Von-Maszewski wrote: > Riak 2.0 is coming. Hold your mass delete until then. The "bug" is > within Google's original leveldb architecture. Riak 2.0 sneaks around to > get the disk space freed. > > Matthew > > > > On Feb 18, 2014, a

Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Matthew Von-Maszewski
Riak 2.0 is coming. Hold your mass delete until then. The "bug" is within Google's original leveldb architecture. Riak 2.0 sneaks around to get the disk space freed. Matthew On Feb 18, 2014, at 11:10 AM, Edgar Veiga wrote: > The only/main purpose is to free disk space.. > > I was a litt

Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Edgar Veiga
The only/main purpose is to free disk space. I was a little bit concerned regarding this operation, but now with your feedback I'm inclined to do nothing; I can't risk disk usage growing... Regarding the overhead I think that with a tight throttling system I could control and avoid overloa

Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Matthew Von-Maszewski
Edgar, The first "concern" I have is that leveldb's delete does not free disk space. Others have executed mass delete operations only to discover they are now using more disk space instead of less. Here is a discussion of the problem: https://github.com/basho/leveldb/wiki/mv-aggressive-delete

Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Edgar Veiga
Sorry, forgot that info! It's leveldb. Best regards On 18 February 2014 15:27, Matthew Von-Maszewski wrote: > Which Riak backend are you using: bitcask, leveldb, multi? > > Matthew > > > On Feb 18, 2014, at 10:17 AM, Edgar Veiga wrote: > > > Hi all! > > > > I have a fairly trivial question

Re: RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Matthew Von-Maszewski
Which Riak backend are you using: bitcask, leveldb, multi? Matthew On Feb 18, 2014, at 10:17 AM, Edgar Veiga wrote: > Hi all! > > I have a fairly trivial question regarding mass deletion on a riak cluster, > but firstly let me give you just some context. My cluster is running with > riak 1

RIAK 1.4.6 - Mass key deletion

2014-02-18 Thread Edgar Veiga
Hi all! I have a fairly trivial question regarding mass deletion on a riak cluster, but firstly let me give you just some context. My cluster is running with riak 1.4.6 on 6 machines with a ring of 256 nodes and 1 TB SSD disks. I need to execute a massive object deletion on a bucket, I'm talking o