The first part is relatively trivial for a person of my level of genius :)
The second part is so utterly Byzantine and unfathomable (and apparently redundant) that I would never even make the attempt. What Rocket adopts and what they discard has never been comprehensible.

<<So there you go, Will. Your job for the weekend. Redesign the UniVerse indexing so it works for large lists, and get Rocket to adopt it.>>

-----Original Message-----
From: Brian Leach <br...@brianleach.co.uk>
To: 'U2 Users List' <u2-users@listserver.u2ug.org>
Sent: Fri, Oct 5, 2012 5:59 am
Subject: Re: [U2] Consuming Web Services

Will

> I don't understand what's wrong with indexing, can you clarify this point, and I'll whip up a fix in three days :)

Well, for a start I didn't say there's anything wrong, I said it could be improved - not the same thing!

But as to specifics, take the following scenario (UniVerse specific):

- Grab a transaction file of, say, 10 million records.
- Assume a reasonable key length, say 10 characters.
- Add a field with two states (open/closed, male/female, that kind of thing).
- Index it, and watch what happens to the performance on that file.
- Better still, don't use an existing file: create a new file and a program to copy or build the content in BASIC, showing a counter every 1000 records (a sketch of such a program follows below). At the start it will be quick. After about 500,000 records you can grab a beer in between the counters.

The problem is that a UniVerse index is very good at establishing the index key: it has a nice B+tree format with a decent level of fan-out. But when it comes to the list of primary keys held against each index key, that is really just treated as a block of data.

If you have a good ratio, with a lot of index keys (date*type*something_else) each of which gives a relatively short list of primary keys, you can get very good indexing performance. But it isn't very clever when you have a small number of index keys, each with a large list of primary keys.

So every time you changed the flag value in the file above, it would have to load up the two lists (one for the old value, one for the new), locate and delete from the old, and locate/fail/append to the new, each list averaging 11 bytes * 5 million entries. And then write it back to a succession of oversize blocks in the index file. (The shape of that per-update work is illustrated at the end of this post.)

Now you might say: well, you wouldn't index a transaction file like that. And you would be right - because of the design of the index. But it's a perfectly legitimate and reasonable thing to want to do.

How to better manage a large index list is, of course, the question. Since it is a large list into which elements are potentially inserted and deleted in order, the list itself could be made into a set of B+tree pages over a certain threshold, reducing the cost of location/insertion and location/deletion. Other databases use techniques such as key deltas and compression to alleviate this. And I'm sure there are better options if I could be bothered to research them.

So there you go, Will. Your job for the weekend. Redesign the UniVerse indexing so it works for large lists, and get Rocket to adopt it.
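A minimal sketch, purely illustrative, of the kind of build program described above. The file and field names are made up; the file is assumed to exist already, and the index on the flag field would be created and built separately from TCL (e.g. CREATE.INDEX / BUILD.INDEX against a dictionary item for field 2):

    * Illustrative sketch only: build a large test file with a two-state flag.
    * Assumes TEST.TRANS already exists (e.g. CREATE.FILE TEST.TRANS DYNAMIC) and
    * that the index on the flag field is added from TCL.
    OPEN 'TEST.TRANS' TO F.TRANS ELSE STOP 'Cannot open TEST.TRANS'
    REC.COUNT = 10000000                  ;* ten million records
    FOR I = 1 TO REC.COUNT
       ID = FMT(I, 'R%10')                ;* fixed 10-character key, zero filled
       REC = ''
       REC<1> = DATE()
       IF MOD(I, 2) THEN REC<2> = 'OPEN' ELSE REC<2> = 'CLOSED'
       WRITE REC ON F.TRANS, ID
       IF MOD(I, 1000) = 0 THEN CRT I     ;* counter every 1000 records
    NEXT I
    CRT 'Done: ':REC.COUNT:' records'
    END

With the index in place, each WRITE also maintains the index, which is where the slowdown shows up; without the index the build stays quick throughout.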
:)

Brian
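And purely as an illustration of the shape of the per-update work described in Brian's message above - not the actual index internals - a single flag change amounts to something like this against the two value lists. OLD.LIST and NEW.LIST are hypothetical stand-ins for the lists of primary keys held against the old and new index values, here only a few entries long where the real ones could run to around 5 million:

    * Illustrative only: what maintaining the index implies for one flag change.
    OLD.LIST = '0000000001':@FM:'0000000003':@FM:'0000000007'
    NEW.LIST = '0000000002':@FM:'0000000004'
    ID = '0000000003'                     ;* the record whose flag just changed
    LOCATE ID IN OLD.LIST<1> BY 'AL' SETTING POS THEN
       DEL OLD.LIST<POS>                  ;* remove the key from the old value's list
    END
    LOCATE ID IN NEW.LIST<1> BY 'AL' SETTING POS ELSE
       INS ID BEFORE NEW.LIST<POS>        ;* insert into the new value's list, kept in order
    END
    CRT OLD.LIST
    CRT NEW.LIST
    END

Locating within and rewriting lists of that size on every update is where the time goes, which is what turning the list itself into B+tree pages above a threshold (or the key-delta and compression techniques mentioned above) would aim to reduce.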