On 01/28/2013 04:05 AM, Alexey Tourbin wrote: > In the existing implementation, when a new macro is added, the whole > table has to be sorted again. Hence the cost of adding n macros is > worse than O(n^2), due to arithmetic progression. > > This change drops all qsort(3) stuff altogether, by carefully preserving > table in sorted order. In findEntry routine, bsearch(3) is replaced > with customized binary search which tracks position for insertion. > In the addMacro routine, if a matching entry is not found, this > position is used for direct insertion, after the rest of the elements > are "shifted to the right" with memmove(3). Likewise, in delMacro > routine, the elements are shifted back to the left when the last macro > definition is popped. Technically, shifting half of the array with > memmove(3) is still O(n^2); however, modern CPUs process contiguous > memory in a very efficient manner, and glibc provides a fine-tuned > memmove(3) implementation. > > Also, macro table entries are now allocated in a single chunk. > > This change reduces rpm startup costs by factor of 6 (see callgrind > annotations below). Also, this change improves specfile parser > performance by a factor of 2 (e.g. the parse time of texlive.spec > is reduced from 67s to 35s).
Impressive numbers. I wonder if thing could even improve further by switching to a hash table. RPM has it's own generic hash data type found in lib/rpmhash.H and lib/rpmhash.C These work with some kind of include magic. You #define HASHTYPE as a prefix used for the datatype and all related functions. HTKEYTYPE for the key data type and HTDATATYPE for value data type. This can also be a struct for keeping the value inside the hash. There can be more than one value per key. With HTDATATYPE undefined the hash acts as a set. After setting these macros the rpmhash.[HC] fiels are include to build the header or code for the given types. See lib/depends.c for two examples of how it is used. Florian _______________________________________________ Rpm-maint mailing list Rpm-maint@lists.rpm.org http://lists.rpm.org/mailman/listinfo/rpm-maint