https://bugs.kde.org/show_bug.cgi?id=428416

--- Comment #7 from [email protected] ---
(In reply to postix from comment #6)
> Apparently most or almost all cycles are spent in
> (libKF5BalooEngine.so.5.85.0)
> 
> - Baloo:WriteTransaction:commit() 
> 
> --->
> 
> (69.8%) aggregated sample costs in 
> - void Baloo::sortedIdInsert<...>(QVector<...>&, Baloo::PositionInfo const&)
> 
> (29.4%) aggregated sample costs in 
> - Baloo::sortedIdInsert<...>(QVector<...>&, unsigned long long const&) 
Don't mind saying this is somewhat over my head :-)

However, my assumptions (and recollections) are...

    The LMDB index requires that the records written are sorted (or
    perhaps just works better if they are).

    baloo_file_extractor processes batches of 40 files when doing content
    indexing so there are many, relatively frequent, small transactions.
    I don't think the initial indexing by baloo_file is the same; if
    there are a load of changes then there's one pretty large transaction.
    In the latter case, you would gain by having a "better" algorithm - or
    also by splitting up this indexing into batches.
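
To illustrate why one large transaction could hurt, here is a minimal sketch
in Python (not Baloo's C++ code). It assumes that sortedIdInsert keeps a sorted
vector by finding the position with a binary search and inserting there, and
that it skips IDs already present; both are guesses on my part. The names
sorted_id_insert and merge_batch are made up for the example.

```python
import bisect


def sorted_id_insert(ids, new_id):
    """Insert new_id into the already sorted list ids, skipping duplicates."""
    # The binary search itself is cheap: O(log n).
    pos = bisect.bisect_left(ids, new_id)
    if pos == len(ids) or ids[pos] != new_id:
        # The insert shifts every later element: O(n) per call, so
        # n single inserts into one big list cost roughly O(n^2).
        ids.insert(pos, new_id)


def merge_batch(ids, batch):
    """Alternative: de-duplicate and sort everything once per batch."""
    return sorted(set(ids) | set(batch))
```

With many small transactions the lists stay short and the per-insert shifting
is harmless. With one very large transaction the same code shifts more and
more elements on every insert, which would fit the profile quoted above.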

I'm not sure whether there are any devs following this thread but, if there
are, it would be nice to know if my guesswork is OK.

(In reply to postix from comment #4)
> I have experienced the same today, whereby Baloo has used 100% of one core
> for twenty minutes or so right after the login.
The question to ask is what has happened that means baloo has to catch up
with loads of changes.

Interesting that you got the issue today and not every time you log on and also
that it lasted for 20 minutes and stopped.

Find one of the files indexed and try the following...

    stat testfile
    balooshow -x testfile

and

    baloosearch -i filename:testfile

The "stat" would give you the device and inode number of the file. You should
see the same numbers listed in the "balooshow -x" results. See:

    https://bugs.kde.org/show_bug.cgi?id=402154#c12

If the device/inode numbers change for a file, baloo will think it is a
different file and index it again. You can see this evidenced in the
"baloosearch -i" results, where you could get multiple results (different IDs,
same file).

My guess, as you both say you are using Tumbleweed and openSUSE uses BTRFS
(with multiple subvols), is that you are "suddenly" getting a different minor
device number and "suddenly" there's a load of "new files" to index :-/
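
If you want to check this across logins, a small Python sketch like the one
below prints the same numbers "stat" shows, split into major and minor device
number plus the inode. The helper name device_and_inode is made up for the
example, and it only reads what the kernel reports; it does not look at the
baloo index at all.

```python
import os


def device_and_inode(path):
    """Return (major, minor, inode) for path, as reported by the kernel."""
    st = os.stat(path)
    # st_dev packs the major and minor device numbers into one value.
    return os.major(st.st_dev), os.minor(st.st_dev), st.st_ino


if __name__ == "__main__":
    major, minor, inode = device_and_inode(".")
    print(f"device {major}:{minor} inode {inode}")
```

Run it on the same file before and after a reboot. If the minor number differs
while the inode stays the same, that would support the guess above.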
