https://bugs.kde.org/show_bug.cgi?id=404057

--- Comment #8 from Kai Krakow <k...@kaishome.de> ---
Created attachment 122919
  --> https://bugs.kde.org/attachment.cgi?id=122919&action=edit
Experimental: Reduce mmap by one magnitude

This patch reduces the memory map size for LMDB by roughly one order of
magnitude (16 GB instead of 256 GB). After applying the patch, I purged the DB
and restarted baloo.
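
To put the two map sizes in perspective, here is a rough back-of-the-envelope
sketch. It assumes "GB" means GiB, 4 KiB base pages and 8-byte leaf page-table
entries as on x86-64, and it counts only the worst case where every page of the
map has been touched. `map_stats` is a hypothetical helper, not something from
baloo or LMDB.

```python
GIB = 1024 ** 3
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, no huge pages
PTE_SIZE = 8      # assumption: 8-byte leaf page-table entries (x86-64)


def map_stats(map_bytes: int) -> dict:
    """Pages covered by a mapping and the leaf page-table size if all are touched."""
    pages = map_bytes // PAGE_SIZE
    return {"pages": pages, "pte_mib": pages * PTE_SIZE / 1024 ** 2}


old = map_stats(256 * GIB)  # map size before the patch
new = map_stats(16 * GIB)   # map size after the patch
ratio = (256 * GIB) // (16 * GIB)

print(f"256 GiB map: {old['pages']} pages, up to {old['pte_mib']:.0f} MiB of PTEs")
print(f" 16 GiB map: {new['pages']} pages, up to {new['pte_mib']:.0f} MiB of PTEs")
print(f"reduction factor: {ratio}x")
```

These are upper bounds only; an index that touches just a small part of the map
needs far fewer page-table entries than this.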

It churns along nicely now: I/O is down to less than 10 MB/s instead of a
constant 50-100 MB/s before. Also, actions that obviously do a bunch of memory
allocations in Plasma (like opening the app drawer) now run much smoother again
(instantly instead of with a noticeable, if subjective, delay). The whole
system feels much smoother again. I'm guessing that the large amount of ongoing
dirty-page writeback, page faults and VMM handling has a lot of drawbacks and
introduces a lot of TLB flushes because mappings into the process are
constantly updated. It also seems to introduce a lot of I/O overhead. I'm not
sure why that is, but it seems this big mmap does indeed have drawbacks. A lot
of random accesses into the mmap may cause unintended read-ahead and
unpredictable I/O patterns, and may dominate the cache, which is what I believe
causes the excessive I/O.

This patch (together with the previous patch) makes my system run much nicer
again. I can actually use krunner again without causing a lot of additional I/O
and lag. My system has 32 GB of RAM.

Looking at all this, I wonder if LMDB is really the right tool. It is tempting
to use it, but from the project documentation it seems to be intended as a
read-mostly database. This is clearly not what baloo does with it, especially
during re-indexing/updating or first indexing. The mmap size seems to be
tightly bound to the maximum DB size which, looking at my above test results,
limits the scaling of baloo a lot.
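
The coupling between map size and maximum DB size can be illustrated with a
plain memory-mapped file. This is only an analogy using Python's standard
`mmap` module, not LMDB itself (LMDB reports the equivalent condition as
MDB_MAP_FULL). The 1 MiB `MAP_SIZE` and the `fill_until_full` helper are
made-up stand-ins for illustration.

```python
import mmap
import os
import tempfile

MAP_SIZE = 1 << 20  # assumption: 1 MiB stand-in for the real multi-GB map size


def fill_until_full(record: bytes, map_size: int = MAP_SIZE) -> int:
    """Append records to a fixed-size mapping and return how many fit."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, map_size)       # backing file is as large as the map
        with mmap.mmap(fd, map_size) as m:
            count = 0
            while True:
                try:
                    m.write(record)      # advances the mapping's position
                except ValueError:       # raised once the record no longer fits
                    break
                count += 1
            return count
    finally:
        os.close(fd)
        os.unlink(path)


stored = fill_until_full(b"x" * 4096)
print(f"{stored} records of 4096 bytes fit into a {MAP_SIZE}-byte map")
```

Once the map is full, the only way to store more is a larger mapping, which is
why a smaller map trades smoother behavior for a lower ceiling on index size.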

It should probably not be too difficult to swap LMDB for another key/value
database better fitting the usage pattern (bursts of lots of write transactions
with only occasional read transactions when a user actually searches for
something). LMDB (as the DBE backing the OpenLDAP project) seems to be designed
for exactly the opposite usage pattern.

Are there any more thoughts on this? Any idea which key/value DBE could fit
better? What about multi-threading? The current code seems to run with only one
thread anyway, despite using the thread pool classes of Qt. I'd volunteer to
invest some spare time into swapping out LMDB for something different.
