On Sat, Feb 16, 2013 at 3:49 PM, Emmanuel Lécharny <[email protected]> wrote:
> Hi guys,
>
> this week I worked on some improvements for the insertion operations (add, delete, modify, etc.), because since the last version, the performances are really bad (well, way slower than they were 6 months earlier).
>
> The reason we have worse performance is simple: in order to guarantee that the database is stable even if the server crashes, I have activated the transaction support. It's great from the data access POV (even after a crash, the base is restored at startup and we can access the data, which was not the case before). The drawback is that inserting data is now 10 times slower: with M8, I was able to inject 600 entries per second, I'm now leveling out at 50 per second (and it depends on the number of indexes).
>
> So I modified the code yesterday to improve the performances. There are two things to consider when injecting data into a BTree with transactions enabled:
> - the number of entries stored in memory before being committed into a journal
> - the number of entries stored in the journal before being flushed to the database
>
> Obviously, the second number must be higher than the first (it makes no sense to flush a journal which has not been updated yet).
>
> I conducted some basic tests on JDBM, with various values. Here are some of the results:
>
> Old JDBM, no TM:
> -----------------
>
> 13130ms transaction disabled
> 3770ms transaction disabled
> 3768ms transaction disabled
> 3756ms transaction disabled
> 3678ms transaction disabled
> 3678ms transaction disabled
> 3667ms transaction disabled
> 3687ms transaction disabled
> 3732ms transaction disabled
> 3721ms transaction disabled
>
> average 3731ms
>
> JDBM2, no TM:
> --------------
>
> 12339ms transaction disabled
> 4087ms transaction disabled
> 4081ms transaction disabled
> 4050ms transaction disabled
> 4015ms transaction disabled
> 4003ms transaction disabled
> 4003ms transaction disabled
> 3973ms transaction disabled
> 4003ms transaction disabled
> 3986ms transaction disabled
>
> average 4022ms (8% slower)
>
> old JDBM, with a transaction manager (TM):
> -------------------------------------------
>
> Here, CI = commits, and Sync = flush on disk.
>
> 18130ms with CI, Sync [1,1]
> 14448ms with CI, Sync [1,2]
> 12934ms with CI, Sync [1,5]
> 12264ms with CI, Sync [1,10]
> 11845ms with CI, Sync [1,20]
> 11481ms with CI, Sync [1,50]
> 11499ms with CI, Sync [1,100]
> 11423ms with CI, Sync [1,200]
> 11487ms with CI, Sync [1,500]
> 11481ms with CI, Sync [1,1000]
> 11534ms with CI, Sync [1,2000]
> 11471ms with CI, Sync [1,5000]
> 11550ms with CI, Sync [1,10000]
>
> 11300ms with CI, Sync [2,2]
> 9923ms with CI, Sync [2,5]
> 9177ms with CI, Sync [2,10]
> 8672ms with CI, Sync [2,20]
> 8454ms with CI, Sync [2,50]
> 1922030ms with CI, Sync [2,100]
> 8530ms with CI, Sync [2,200]
> 8324ms with CI, Sync [2,500]
> 8363ms with CI, Sync [2,1000]
> 8381ms with CI, Sync [2,2000]
> 8417ms with CI, Sync [2,5000]
> 8361ms with CI, Sync [2,10000]
>
> 7772ms with CI, Sync [5,5]
> 7191ms with CI, Sync [5,10]
> 6737ms with CI, Sync [5,20]
> 6503ms with CI, Sync [5,50]
> 6398ms with CI, Sync [5,100]
> 6398ms with CI, Sync [5,200]
> 6448ms with CI, Sync [5,500]
> 6428ms with CI, Sync [5,1000]
> 6371ms with CI, Sync [5,2000]
> 6385ms with CI, Sync [5,5000]
> 6422ms with CI, Sync [5,10000]
>
> 6348ms with CI, Sync [10,10]
> 5844ms with CI, Sync [10,20]
> 5565ms with CI, Sync [10,50]
> 5473ms with CI, Sync [10,100]
> 5471ms with CI, Sync [10,200]
> 5453ms with CI, Sync [10,500]
> 5491ms with CI, Sync [10,1000]
> 5517ms with CI, Sync [10,2000]
> 5950ms with CI, Sync [10,5000]
> 5806ms with CI, Sync [10,10000]
>
> 5054ms with CI, Sync [20,20]
> 4769ms with CI, Sync [20,50]
> 4657ms with CI, Sync [20,100]
> 4663ms with CI, Sync [20,200]
> 4650ms with CI, Sync [20,500]
> 4622ms with CI, Sync [20,1000]
> 4664ms with CI, Sync [20,2000]
> 4634ms with CI, Sync [20,5000]
> 4662ms with CI, Sync [20,10000]
>
> 4243ms with CI, Sync [50,50]
> 4134ms with CI, Sync [50,100]
> 4123ms with CI, Sync [50,200]
> 4107ms with CI, Sync [50,500]
> 4162ms with CI, Sync [50,1000]
> 4115ms with CI, Sync [50,2000]
> 4105ms with CI, Sync [50,5000]
> 4143ms with CI, Sync [50,10000]
>
> 3982ms with CI, Sync [100,100]
> 3926ms with CI, Sync [100,200]
> 3913ms with CI, Sync [100,500]
> 3948ms with CI, Sync [100,1000]
> 3942ms with CI, Sync [100,2000]
> 3934ms with CI, Sync [100,5000]
> 3936ms with CI, Sync [100,10000]
>
> 3928ms with CI, Sync [200,200]
> 3909ms with CI, Sync [200,500]
> 4116ms with CI, Sync [200,1000]
> 4016ms with CI, Sync [200,2000]
> 3919ms with CI, Sync [200,5000]
> 3892ms with CI, Sync [200,10000]
>
> 3797ms with CI, Sync [500,500]
> 3798ms with CI, Sync [500,1000]
> 3814ms with CI, Sync [500,2000]
> 3790ms with CI, Sync [500,5000]
> 3818ms with CI, Sync [500,10000]
>
> 3725ms with CI, Sync [1000,1000]
> 3786ms with CI, Sync [1000,2000]
> 3736ms with CI, Sync [1000,5000]
> 3769ms with CI, Sync [1000,10000]
>
> 3682ms with CI, Sync [2000,2000]
> 3678ms with CI, Sync [2000,5000]
> 3691ms with CI, Sync [2000,10000]
>
> 3459ms with CI, Sync [5000,5000]
> 3493ms with CI, Sync [5000,10000]
> 3277ms with CI, Sync [10000,10000]
>
> As we can see, the more we defer the writes, the better the performances.
>
> So I adjusted the parameters on the server, and I currently do a commit every 2000 changes, and a flush every 4000 changes. It's hard coded atm, but I will make it configurable.
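
Just to check I read the two-level batching correctly, here is a minimal sketch of it, assuming we count the changes ourselves around JDBM's RecordManager.commit(); the class and counter names are made up, and the journal flush step is only a placeholder:

import java.io.IOException;

import jdbm.RecordManager;
import jdbm.btree.BTree;

// Hypothetical helper, not JDBM API: commits every 'commitEvery' changes and
// flushes the journal every 'syncEvery' changes (syncEvery >= commitEvery).
public class BatchingInserter
{
    private final RecordManager recMan;
    private final int commitEvery;   // e.g. 2000
    private final int syncEvery;     // e.g. 4000
    private int sinceCommit = 0;
    private int sinceSync = 0;

    public BatchingInserter( RecordManager recMan, int commitEvery, int syncEvery )
    {
        this.recMan = recMan;
        this.commitEvery = commitEvery;
        this.syncEvery = syncEvery;
    }

    public void insert( BTree btree, Object key, Object value ) throws IOException
    {
        btree.insert( key, value, true );

        if ( ++sinceCommit >= commitEvery )
        {
            recMan.commit();   // push the pending changes into the journal
            sinceCommit = 0;
        }

        if ( ++sinceSync >= syncEvery )
        {
            // placeholder: this is where the journal would be flushed to the
            // database files, through the (patched) transaction manager
            sinceSync = 0;
        }
    }
}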
>
> The performances went up from 50 adds per second to 185 adds per second. Good, but nothing to get excited about, compared to the 600 adds/s we had :/ With such numbers, it will take roughly 1h30 to inject 1M entries...
>
> What can we do to improve those performances then?
>
> Atm, with JDBM, I see no bright future. There are ways to squeeze out a bit of performance, but not a lot: we can bypass the full chain and inject the data directly into the btrees, but the gain will be marginal.
>
> So here are a few possibilities:
>
> 1) Add a raw data injection feature to JDBM
>
> The idea is to order the data before injecting them, and apply an algorithm that builds the pages without having to go through the tree reordering. If we can have N elements in each page, we order the data, store N elements in each page, and keep the largest element of each page in a separate list. This is for the leaves. Then reproduce the same processing on the gathered list, but for the parent pages, and link them to the leaves, and so on up to the root.
>
> This is the fastest way to store data in a btree.
>
> This requires a few operations:
> - order the data on disk: an O(N x log(N)) operation
> - modify JDBM to allow us to inject data into a page directly
>
> I evaluate this at two or three weeks of work. The gain will be quite big, the slowest operation being the ordering of the data.
>
> We are still stuck, though, with JDBM's lack of a global transaction system, which forces us to lock every operation (including searches) while we add some entries.
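
For the raw injection, a small sketch of the bottom-up build as I understand it; plain Java, nothing JDBM specific, and the actual page writing is only marked with a comment:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: packs sorted keys into pages of N elements, keeps the
// largest key of each page as the separator for the parent level, and repeats
// the same processing level after level, up to the root.
public class BulkLoadSketch
{
    /** Builds one level of pages and returns the separators for the level above. */
    static List<String> buildLevel( List<String> sortedKeys, int n )
    {
        List<String> separators = new ArrayList<String>();

        for ( int i = 0; i < sortedKeys.size(); i += n )
        {
            List<String> page = sortedKeys.subList( i, Math.min( i + n, sortedKeys.size() ) );
            // here the page would be written to disk, and its offset recorded
            separators.add( page.get( page.size() - 1 ) );   // largest key of the page
        }

        return separators;
    }

    public static void main( String[] args )
    {
        List<String> keys = new ArrayList<String>( Arrays.asList( "e", "a", "d", "b", "c", "f", "g" ) );
        Collections.sort( keys );                     // the O(N x log(N)) ordering step

        List<String> level = buildLevel( keys, 2 );   // leaves
        while ( level.size() > 1 )                    // parent levels, up to the root
        {
            level = buildLevel( level, 2 );
        }
    }
}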
>
> 2) Switch to JDBM 4
>
> We have to evaluate what JDBM 4 brings us. It's obviously a better base, but we don't know to what extent, and we don't know if:
> - we can have global cross-btree transactions
> - we can inject raw data into the files without going through the btree operations
>
> Evaluating JDBM4 is probably a week of work

I have already started evaluating this today, will update this thread with my findings.

> 3) Switch to Mavibot
>
> Mavibot is a labs project I started last year, which implements an MVCC BTree in memory. It's working, it's fast, but it's in memory only atm (although the data are regularly flushed to disk).
>
> We need to improve it by adding those features:
> - Make it backed by a physical layer that accesses data on disk, to avoid storing everything in memory
> - Add a raw load feature
> - Add a global transaction system, across trees
>
> The last feature is probably easier to implement, as we do have a version associated with each new modification: if we retain the version we started with, we can gather all the btrees' revisions, and validate them all in one shot, or roll them all back in one shot (a revision is visible only if it has been validated).
>
> One very interesting aspect of Mavibot is that you can update the btrees concurrently with searches: searches are done on older revisions, which don't interact with the pages being added.
>
> I think there are one to two more months of work on Mavibot to get it to the level we need.
>
> Proposal
> --------
>
> IMO, we should investigate JDBM4 asap. It might solve many of the issues we have, at a low price. Improving the existing JDBM is just a waste of our precious time.
>
> In the long term, I would favor Mavibot for a couple of reasons:
> - first, it's an MVCC base, so we can conduct concurrent searches and modifications (although we can't do many modifications at the same time).
> - second, we have a complete understanding of the code: we wrote it. It's easier to understand the impact of the modifications we have made to it.
> - third, we can have a dedicated backend based on the current Mavibot code, where all the data are in memory (that would work with a few tens of thousands of entries, and the performances would be blazing fast).
>
> I'd like to know your opinion.
>
> Thanks !
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
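
On the Mavibot side, the cross-btree transaction could look roughly like the sketch below; the types are purely illustrative (none of them exist in Mavibot today), they just show the idea of validating or rolling back all the touched btrees' revisions in one shot, while searches keep using the last validated revision:

import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: a revision-based, cross-btree commit/rollback.
// Pending pages stay invisible until the whole transaction is validated.
public class CrossBTreeTxnSketch
{
    /** What a versioned btree would have to expose for this to work. */
    interface VersionedBTree
    {
        void publish( long revision );   // make the pending pages visible under this revision
        void discard();                  // drop the pending pages
    }

    private static final AtomicLong globalRevision = new AtomicLong( 0 );

    /** Validates every touched btree in one shot, under a single new revision. */
    static long commit( List<VersionedBTree> touched )
    {
        long revision = globalRevision.incrementAndGet();

        for ( VersionedBTree btree : touched )
        {
            btree.publish( revision );
        }

        return revision;
    }

    /** Rolls every touched btree back in one shot. */
    static void rollback( List<VersionedBTree> touched )
    {
        for ( VersionedBTree btree : touched )
        {
            btree.discard();
        }
    }
}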
--
Kiran Ayyagari
http://keydap.com