On Wed, Nov 12, 2014 at 6:40 AM, Emmanuel Lécharny <[email protected]> wrote:
> Some update...
>
> Multi-value support is not that easy. Thanks to Shawn, who pointed
> me to the TreeMap data structure, I was able to fix the merge sort,
> which was required to be able to process the tuples and their
> values across multiple files.
>
> But that was just one of my problems...
>
> Now, I realise that there is a big flaw in the way the BTree is loaded:
> when we have multiple values, we can no longer know the number of
> elements we have to store, so all the page count computation is
> broken. The consequence is that we don't know when to stop building the
> tree, or if we stop, we may have incomplete pages.
>
> The solutions:
> - don't care about the real number of elements, and fix the boundaries
> (ie incomplete pages at the end of each level) when the last element is
> fetched.
> - as we have a list of files to read elements from, use that to create
> one single file that contains all the elements, sorted. We will then
> know the number of elements to process, and thus build the right BTree.
>
> I do think the second solution is better, even if it requires writing
> another file. All in all, if we have N elements, processing the whole
> bulk load is now a matter of reading one big file, writing into N
> smaller files, reading all of those smaller files and writing into a
> bigger file, then reading all the elements from this bigger file to
> write the data in the btree. It's simpler than the other algorithm, as
> we don't have to clean up the btree (such a cleanup would require that,
> for each created level, we check the last page to see if it's
> incomplete (ie < PageSize/2) and, if so, merge it with the previous
> page and spread the result so that no page has fewer than PageSize/2
> elements).
>
+1, because there seem to be too many boundary cases, and IMO handling
them is not an optimal use of effort in this scenario

> This second algorithm will most of the time be faster, as we read and
> write the big file one less time, but OTOH, it will move some pages
> around in the resulting BTree, leaving some holes we will have to track.
> That might be an optimization for later on...
>
> Otherwise, the values themselves are correctly created, even when a
> sub-tree is necessary. However, the creation of sub-trees is done the
> standard way (ie through the creation of a btree and injection of the
> values one by one). Here, we could benefit from an in-memory bulkload
> instead of the current mechanism. Some more food for thought...
>
the old code was doing bulk inserts for sub-BTrees, but some more
improvements I made are currently in the free page management branch [1]

> That is my current status on this piece of code. I may set it aside for
> a few days, as I need to cut an ApacheDS release, which requires a
> Mavibot release, but I'll discuss that in another thread.
>
thanks for the heads up

[1] https://svn.apache.org/repos/asf/directory/mavibot/branches/free-page-mgmt

> thanks!
>
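For illustration, here is a minimal sketch (not the actual Mavibot code) of the
second solution described above: a TreeMap-driven merge of the pre-sorted
temporary files into one big sorted file, so that the exact number of tuples is
known before the BTree page layout is computed. The SortedFileMerger class name
and the "key<TAB>value" text line format are assumptions made for the example
only; ordering of values within one key is not addressed here.

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedFileMerger
{
    /** Merges sorted "key\tvalue" chunk files into one sorted file, returning the tuple count. */
    public static long merge( List<Path> sortedChunks, Path bigFile ) throws IOException
    {
        // Maps the current head key of each chunk reader to the readers positioned on it.
        // The TreeMap keeps the smallest pending key first; the List copes with the same
        // key showing up in several files (multi-valued keys).
        TreeMap<String, List<BufferedReader>> heads = new TreeMap<>();
        Map<BufferedReader, String> pendingLine = new HashMap<>();

        for ( Path chunk : sortedChunks )
        {
            advance( Files.newBufferedReader( chunk ), heads, pendingLine );
        }

        long count = 0;

        try ( BufferedWriter writer = Files.newBufferedWriter( bigFile ) )
        {
            while ( !heads.isEmpty() )
            {
                // Pull every reader currently sitting on the smallest key
                Map.Entry<String, List<BufferedReader>> smallest = heads.pollFirstEntry();

                for ( BufferedReader reader : smallest.getValue() )
                {
                    writer.write( pendingLine.get( reader ) );
                    writer.newLine();
                    count++;
                    advance( reader, heads, pendingLine );
                }
            }
        }

        return count; // we now know exactly how many elements the BTree will hold
    }

    private static void advance( BufferedReader reader, TreeMap<String, List<BufferedReader>> heads,
        Map<BufferedReader, String> pendingLine ) throws IOException
    {
        String line = reader.readLine();

        if ( line == null )
        {
            reader.close();                 // this chunk is exhausted
            pendingLine.remove( reader );
            return;
        }

        pendingLine.put( reader, line );
        String key = line.substring( 0, line.indexOf( '\t' ) );
        heads.computeIfAbsent( key, k -> new ArrayList<>() ).add( reader );
    }
}

Once the count returned by merge() is known, the number of pages at each BTree
level can be computed up front, which is exactly what the single big sorted
file buys us.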
> On 11/11/14 02:28, Emmanuel Lécharny wrote:
> > Hi guys,
> >
> > I have been pretty busy these last 2 weeks working on Fortress
> > integration and on the LDAP API release (besides other things).
> >
> > As of today, the Mavibot bulkloader, which had been put aside for days
> > with a few pieces of the algorithm left to implement, is working. I
> > have tested it with 1 000 000 elements, and it's all good. (It has
> > also been tested with many btrees of increasing sizes, to be sure we
> > don't forget a few corner cases.)
> >
> > The goals were:
> > - have the bulkloader be part of Mavibot, instead of having it in
> > Mavibot-Partition only.
> > - have it accept as many entries as possible. The previous
> > implementation could not handle millions of elements.
> > - have it use a limited amount of memory: we now keep only one page
> > per btree level.
> >
> > In order to cope with the potentially huge number of elements we have
> > to sort before loading them in the btree, we use temporary files, each
> > of which can hold N sorted elements. Then we do a kind of merge sort,
> > pulling the right element from one of the files. One can configure the
> > number of elements in each file, assuming we sort them in memory. In
> > my tests, above 16 384 elements per file, I don't see a huge
> > improvement.
> >
> > The perf tests I have done on my laptop show that I can load up to
> > 56 600 tuples per second. Don't expect the same performance when it
> > comes to loading LDAP entries in the server! I suspect it will be 10
> > times slower (still, 5 000 entries added per second would be a great
> > improvement over what we have now).
> >
> > There are still some steps that need to be completed:
> > - multi-value support: it's all about bulkloading the values when we
> > have many. Should be easy to implement.
> > - use the bulkloader in the ApacheDS mavibot-partition.
> > - add a CLI for the mavibot bulkloader and the mavibot-partition
> > bulkloader.
> > - add an in-memory bulkloader.
> > - clean up the code, which has many redundancies atm.
> >
> > Anyway, it's making progress. I'll probably cut a Mavibot release
> > tomorrow, which will allow me to cut an ApacheDS release too.
> >
> > thanks!
> >
>

-- 
Kiran Ayyagari
http://keydap.com
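For completeness, a companion sketch (again, not the Mavibot code) of the first
phase described in the quoted mail: sort a configurable number of tuples in
memory (e.g. 16 384, the sweet spot mentioned above) and spill each sorted
batch into a temporary file. The SortedChunkWriter name and the text line
format are assumptions; the files it produces are what a merge like the
SortedFileMerger sketch earlier would consume.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedChunkWriter
{
    private final int chunkSize;            // e.g. 16 384 tuples per temporary file
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> chunks = new ArrayList<>();

    public SortedChunkWriter( int chunkSize )
    {
        this.chunkSize = chunkSize;
    }

    /** Adds one "key\tvalue" tuple; spills a sorted temporary file when the buffer is full. */
    public void add( String tuple ) throws IOException
    {
        buffer.add( tuple );

        if ( buffer.size() >= chunkSize )
        {
            spill();
        }
    }

    /** Flushes the remaining tuples and returns the list of sorted temporary files. */
    public List<Path> finish() throws IOException
    {
        if ( !buffer.isEmpty() )
        {
            spill();
        }

        return chunks;
    }

    private void spill() throws IOException
    {
        Collections.sort( buffer );         // in-memory sort of the current batch

        Path chunk = Files.createTempFile( "bulkload-chunk-", ".txt" );
        Files.write( chunk, buffer );       // one sorted tuple per line
        chunks.add( chunk );
        buffer.clear();
    }
}

Feeding tuples through add(), then calling finish(), yields the sorted chunk
files to hand to the merge step sketched above.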
