https://bugs.kde.org/show_bug.cgi?id=404057
--- Comment #8 from Kai Krakow <k...@kaishome.de> ---
Created attachment 122919
  --> https://bugs.kde.org/attachment.cgi?id=122919&action=edit
Experimental: Reduce mmap by one magnitude

This patch reduces the memory map size for LMDB by roughly one order of magnitude (16 GB instead of 256 GB).

After applying the patch, I purged the DB and restarted baloo. It churns along nicely now: I/O is down to less than 10 MB/s, compared with a constant 50-100 MB/s before. Also, actions in Plasma that obviously do a bunch of memory allocations (like opening the app drawer) run much smoother again (instantly, instead of with a noticeable, if subjective, delay). The whole system feels much smoother again.

I'm guessing that the constant dirty-page writebacks, page faults and VMM handling are costly and cause a lot of TLB flushes, because the mappings into the process are constantly updated. The big mmap also seems to introduce a lot of I/O overhead. I'm not sure why that is, but it does seem to have real drawbacks. A lot of random accesses into the mmap may cause unintended read-ahead and unpredictable I/O patterns, and may dominate the cache, which is what I believe causes the excessive I/O.

This patch (together with the previous patch) makes my system run much nicer again. I can actually use krunner again without causing a lot of additional I/O and lag. My system has 32 GB of RAM.

Looking at all this, I wonder if LMDB is really the right tool. It is tempting to use it, but from the project documentation it seems to be intended as a read-mostly database. This is clearly not how baloo uses it, especially during first indexing, re-indexing or updating. The mmap size seems to be tightly bound to the maximum DB size, which, looking at my test results above, limits the scaling of baloo a lot.

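To put rough numbers on the two map sizes, here is a minimal back-of-the-envelope sketch. It is not part of the patch. It assumes 4 KiB pages and 8-byte leaf page-table entries (typical for x86-64 Linux, but not stated in this report), reads "GB" as GiB, and treats the whole mapping as touched, which is a worst case since the map size is only an upper bound on the DB size. The helper name map_stats is made up for illustration.

```python
# Back-of-the-envelope figures for the two LMDB map sizes discussed above.
# Assumptions (not from the bug report): 4 KiB pages, 8-byte leaf
# page-table entries, "GB" read as GiB, and a fully touched mapping.

GIB = 1 << 30
PAGE_SIZE = 4096  # assumed page size in bytes
PTE_SIZE = 8      # assumed size of one leaf page-table entry in bytes


def map_stats(map_size_bytes, ram_bytes):
    """Return worst-case figures for a fully touched mapping of this size."""
    pages = map_size_bytes // PAGE_SIZE
    return {
        "pages": pages,
        "leaf_page_table_bytes": pages * PTE_SIZE,
        "map_to_ram_ratio": map_size_bytes / ram_bytes,
    }


RAM = 32 * GIB  # the 32 GB machine mentioned in the comment

old = map_stats(256 * GIB, RAM)  # map size before the patch
new = map_stats(16 * GIB, RAM)   # map size after the patch

for label, s in (("256 GiB", old), (" 16 GiB", new)):
    print(
        f"{label}: {s['pages']:>10,} pages, "
        f"{s['leaf_page_table_bytes'] // (1 << 20):>4} MiB leaf page tables, "
        f"{s['map_to_ram_ratio']:.1f}x RAM"
    )
```

Under these assumptions the old limit is 8 times the installed RAM, while the new one is half of it, so a full mapping could at least in principle stay resident. Whether that explains the observed I/O difference is not something this arithmetic can show.
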
It should probably not be too difficult to swap LMDB for another key/value database that better fits the usage pattern (bursts of many write transactions, with only occasional read transactions when a user actually searches for something). LMDB (as the DBE backing the OpenLDAP project) seems to be designed for exactly the opposite usage pattern.

Are there any more thoughts on this? Any idea which key/value DBE could fit better? What about multi-threading? The current code seems to run only one thread at a time anyway, despite using the thread pool classes of Qt.

I'd volunteer to invest some spare time into swapping out LMDB for something different.