Hi all, especially Alex and David,

tl;dr: I've done a proof-of-concept implementation of a metadata index for KPluginTrader::query(), the main entry point for finding binary plugins. This index considerably speeds up all current use cases, but comes at the cost of having to maintain the index. The code is in kservice[sebas/kpluginindex] and makes plugin queries several times faster.

The Slightly Longer Story...

During Akademy's Frameworks and Plasma BoFs, we talked about indexing plugins for faster lookups. One of the things we wanted to try in Plasma is to index packages, thereby speeding up package metadata lookups and plugin queries. I have done a naive implementation of such an indexing mechanism as a proof of concept in KService, specifically in KPluginTrader::query(). It uses Alex Richardson's recent work on KPluginMetaData, which I found very useful (https://git.reviewboard.kde.org/r/120198/ and https://git.reviewboard.kde.org/r/120199/). I've put these patches in my branch kservice[sebas/kpluginindex].

Basic Mechanism

- a small tool called kplugin-update-index collects the JSON metadata from the plugins, puts the list of plugins in a given plugin directory into a QJsonArray, and dumps that to disk in Qt's binary JSON format
- KPluginTrader::query() checks whether an index file exists in a given plugin directory
-- if the index file exists, it reads it and creates a list of KPluginMetaData objects from it
-- if the index file doesn't exist, it walks over each plugin to read its metadata, i.e. it falls back to the old code path

Performance Measurement Method

I've created a new autotest, kpluginmetadatatest, which runs two subsequent queries and measures the time it takes to return the results. I've instrumented the code in kplugintrader.cpp with QElapsedTimers. The autotest ran on a rotational disk and on an SSD in separate test runs. Before the cold-cache tests, I dropped the page cache, dentries and inodes from memory using

    echo 3 > /proc/sys/vm/drop_caches

The tests ran on Qt's 5.4 branch; the results are fairly consistent with what I've seen on Qt 5.3.

Performance Improvements

Performance tests are promising:

http://vizzzion.org/blog/wp-content/uploads/2014/11/performance-comparison-charts.png

(Note that the left-most bar for the rotational disk is shown at 1/10 of its actual height in the picture.)
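
As an illustration only (this is not the code from the branch), the mechanism and the measurement described above can be sketched in a few lines of Python. Everything named here is an assumption made for the sketch: the index file name `plugins.index.json`, the `*.plugin` files standing in for binary plugins, the `Id` and `ServiceTypes` fields, and plain JSON text standing in for Qt's binary JSON format.

```python
import json
import tempfile
import time
from pathlib import Path

# Hypothetical index file name; the real tool may use a different one.
INDEX_NAME = "plugins.index.json"


def read_plugin_metadata(plugin_file: Path) -> dict:
    """Stand-in for extracting the JSON metadata embedded in one plugin."""
    return json.loads(plugin_file.read_text())


def walk_plugins(plugin_dir: Path) -> list:
    """Old code path: open every plugin file and read its metadata."""
    return [read_plugin_metadata(p) for p in sorted(plugin_dir.glob("*.plugin"))]


def update_index(plugin_dir: Path) -> Path:
    """Collect all metadata in plugin_dir and dump it into one index file."""
    index_file = plugin_dir / INDEX_NAME
    index_file.write_text(json.dumps(walk_plugins(plugin_dir)))
    return index_file


def query(plugin_dir: Path, service_type: str) -> list:
    """Return metadata of matching plugins, preferring the index if present."""
    index_file = plugin_dir / INDEX_NAME
    if index_file.exists():
        entries = json.loads(index_file.read_text())  # one file read
    else:
        entries = walk_plugins(plugin_dir)  # fallback: one read per plugin
    return [e for e in entries if service_type in e.get("ServiceTypes", [])]


def timed_query(plugin_dir: Path, service_type: str):
    """Run one query and return (results, elapsed seconds)."""
    start = time.perf_counter()
    result = query(plugin_dir, service_type)
    return result, time.perf_counter() - start


def demo():
    """Query 50 fake plugins once without and once with an index."""
    with tempfile.TemporaryDirectory() as tmp:
        plugin_dir = Path(tmp)
        for i in range(50):
            kind = "Demo/Applet" if i % 2 == 0 else "Demo/Engine"
            meta = {"Id": f"plugin{i}", "ServiceTypes": [kind]}
            (plugin_dir / f"plugin{i}.plugin").write_text(json.dumps(meta))
        unindexed, t_walk = timed_query(plugin_dir, "Demo/Applet")
        update_index(plugin_dir)
        indexed, t_index = timed_query(plugin_dir, "Demo/Applet")
        return unindexed, indexed, t_walk, t_index


if __name__ == "__main__":
    unindexed, indexed, t_walk, t_index = demo()
    print(f"directory walk: {t_walk:.6f}s, index: {t_index:.6f}s")
```

Both code paths return the same result list; only the number of files opened differs. Timings from such a toy run say nothing about the real numbers, in particular because the files are already in the page cache right after they are written.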

In short, the indexed queries are roughly:

* 60 times faster on a rotational medium with cold caches
* 3 times faster on an SSD with cold caches
* 7 times faster on a rotational disk with warm caches
* 5 times faster on an SSD with warm caches

More Observations

- on SSDs, we save most of the time in directory traversal and (de)serializing the JSON metadata
- the index lookup spends almost all of its time in disk reads; deserializing the binary metadata is almost free (Qt's binary JSON representation is really fast to read)
- I haven't seen any tests in which the indexed queries were slower

These results can be explained as follows:

- the bottleneck is reading the files from disk
- on rotational media, as expected, we pay a huge performance penalty for every seek we cause; the more files we read, the more disastrous lookup times get
- as expected, warm page caches help a lot in all cases

Cost: Maintaining the Cache

These speedups do come at a cost, of course: the added complexity of maintaining the caches. The idea from the BoF sessions was to update the caches at install time, which is essentially what kplugin-update-index can do (it needs some added logic to give the index files sensible permissions when run as root). That means packagers will have to run the index updater in their postinstall routine. Not doing this at all means slower queries (or rather, no speedier queries). Worse is if they forget to update once in a while, in which case newly installed plugins might be missing from the index files and removed plugins might be left dangling in them. This will need at least some packaging discipline.

Index File Location

The indexer creates the index files in the plugin directories themselves, not in $CACHE or $TMP. This seems the most straightforward way to do it: if a plugin is installed into a specific directory, the "installer" will have write permission there to update the index as well.
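
To make the "missing or dangling" failure mode concrete, here is a minimal Python sketch of what a stale index looks like when the index file sits next to the plugins. It is an illustration under assumptions, not part of the branch: the index file name `plugins.index.json` and the `*.plugin` files are hypothetical, and the index is simplified to a plain list of file names.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical index file name, used only for this sketch.
INDEX_NAME = "plugins.index.json"


def index_drift(plugin_dir: Path):
    """Compare the index with the directory contents.

    Returns (missing, dangling): plugins on disk that the index does not
    list, and index entries whose plugin file no longer exists.
    """
    index_file = plugin_dir / INDEX_NAME
    indexed = set(json.loads(index_file.read_text())) if index_file.exists() else set()
    on_disk = {p.name for p in plugin_dir.glob("*.plugin")}
    return sorted(on_disk - indexed), sorted(indexed - on_disk)


def demo():
    """Show index drift right after install and after unindexed changes."""
    with tempfile.TemporaryDirectory() as tmp:
        plugin_dir = Path(tmp)
        for name in ("a.plugin", "b.plugin"):
            (plugin_dir / name).write_text("{}")
        # Index written at install time, as a postinstall step would do.
        (plugin_dir / INDEX_NAME).write_text(json.dumps(["a.plugin", "b.plugin"]))
        fresh = index_drift(plugin_dir)

        # Later: one plugin is removed and one installed, index not updated.
        (plugin_dir / "b.plugin").unlink()
        (plugin_dir / "c.plugin").write_text("{}")
        stale = index_drift(plugin_dir)
        return fresh, stale


if __name__ == "__main__":
    fresh, stale = demo()
    print("fresh index (missing, dangling):", fresh)
    print("stale index (missing, dangling):", stale)
```

A check like this costs a directory listing per plugin directory, so it is more of a packaging or debugging aid than something to run on every query.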

One might consider putting these index files in the cache directory, like ksycoca does, but in that case we need to be smarter about updating the index files correctly, since at that point it depends on the user's environment and the plugin paths (which means it can't sensibly be done at install time).

KServiceTypeTrader Comparison

First off, for the current situation, the comparison to KServiceTypeTrader is not of much use, since it's orthogonal to KPluginTrader. That aside, I've run the same queries through KServiceTypeTrader (with different results, of course, and just on an SSD). With cold caches, KServiceTypeTrader is 40 times faster than unindexed queries (the current status quo), and still faster than indexed queries. Successive queries are about 100 times faster than indexed queries. KServiceTypeTrader is still a lot faster, presumably because we're reading one larger file instead of multiple ones. It may make sense to cache the index files read from disk, which should get us into the ballpark of KServiceTypeTrader again.

Feedback, please!

This code is at a draft stage, so I'd very much welcome feedback on the approach and, of course, on the code itself. It can be found in kservice[sebas/kpluginindex]; the kpluginmetadata autotest gives a useful testing target. I haven't submitted it to reviewboard yet, because I want to nail down the further direction first and provide a base for discussion.

Cheers,
-- 
sebas

http://www.kde.org | http://vizZzion.org | GPG Key ID: 9119 0EF9