https://bugs.kde.org/show_bug.cgi?id=515918

            Bug ID: 515918
           Summary: File extractor keeps asserting txn != nullptr
    Classification: Frameworks and Libraries
           Product: frameworks-baloo
Version First Reported In: unspecified
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: crash
          Priority: NOR
         Component: Baloo File Daemon
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

SUMMARY
The Baloo file extractor frequently hits the assertion txn != nullptr on
larger files, e.g. PDFs and large text files.

STEPS TO REPRODUCE
1. Have Baloo enabled and log into your system. If the index is already
complete, clear it so that indexing runs again.

OBSERVED RESULT
Baloo keeps crashing all the time, spawning countless DrKonqi instances.
Without asserts enabled, it prints "m_writeTrans is null" in the log instead.

EXPECTED RESULT
Baloo works as it used to

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: git master as of 2026-02-12

Git bisect suggests
e75cdd6016ba5433c05fbb04f4424b630c40dfbf is the first bad commit
commit e75cdd6016ba5433c05fbb04f4424b630c40dfbf
Author: Stefan Brüns <[email protected]>
Date:   Thu Jan 8 20:18:28 2026 +0100

    [Extractor] Release DB write lock while content is extracted

    The extractor process held the DB write lock during the complete index
    batch, which may last for several seconds, or on rare occasions even
    minutes or hours.

    This had several negative side effects:
    - Any filesystem changes had to be queued in the scheduler, as these
      can not be commited to the DB.
    - Even deleted files may be commited to the DB, to be immediately deleted
      when the pending event queue is processed.
    - Any search may return fairly obsolete results, including deleted files.
      (Searching may still return incorrect results for files still
      pending, but this is out of scope.)
    - When an extractor crashes, the write transaction was still open. Although
      this is detected and handled, but may still cause further problems.

    Create a preliminary workload which is processed without holding any
    transactions, and only create the write transaction when the content
    extraction has completed. The completed workload is then checked if it
    matches the original state (url/id), and commited. For the unlikely case
    the state has changed the mismatching document(s) is discarded.

 src/file/extractor/app.cpp | 145 ++++++++++++++++++++++++++++-----------------
 src/file/extractor/app.h   |  27 ++++++---

Qt Version: 6.10.2

ADDITIONAL INFORMATION
I suspect it is related to the early January changes that split the
extractor's DB work into multiple transactions, which matches the bisect
result above.
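
For reference, here is a minimal sketch of the flow the commit message
describes: prepare a workload, extract content without holding a transaction,
then open the write transaction and commit only the documents whose url/id
still match. All names (FakeDb, WriteTxn, WorkItem, Extractor and its methods)
are hypothetical stand-ins and are not taken from src/file/extractor/app.cpp.
The sketch only illustrates the pattern. It does not claim to show where the
null transaction actually comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// All names below are hypothetical stand-ins, not Baloo's classes.
struct FakeDb {
    std::map<std::uint64_t, std::string> urls;     // document id -> current url
    std::map<std::uint64_t, std::string> indexed;  // document id -> committed content
};

// Stand-in for a DB write transaction; it exists only while a commit runs.
struct WriteTxn {
    FakeDb &db;
    void put(std::uint64_t id, const std::string &content) { db.indexed[id] = content; }
};

struct WorkItem {
    std::uint64_t id;
    std::string url;      // url recorded when the batch was prepared
    std::string content;  // filled in by the extraction phase
};

class Extractor {
public:
    explicit Extractor(FakeDb &db) : m_db(db) {}

    // Phase 1: record id/url of each document; no write transaction is held.
    std::vector<WorkItem> prepare(const std::vector<std::uint64_t> &ids) const {
        std::vector<WorkItem> work;
        for (auto id : ids) {
            auto it = m_db.urls.find(id);
            if (it != m_db.urls.end()) {
                work.push_back(WorkItem{id, it->second, std::string()});
            }
        }
        return work;
    }

    // Phase 2: slow content extraction, still without a transaction.
    void extract(std::vector<WorkItem> &work) const {
        assert(!m_writeTrans);  // nothing in this phase may rely on the transaction
        for (auto &item : work) {
            item.content = "text of " + item.url;
        }
    }

    // Phase 3: open the transaction and commit only items whose url is unchanged.
    // Returns the number of committed documents.
    std::size_t commit(const std::vector<WorkItem> &work) {
        m_writeTrans.reset(new WriteTxn{m_db});
        std::size_t committed = 0;
        for (const auto &item : work) {
            auto it = m_db.urls.find(item.id);
            if (it == m_db.urls.end() || it->second != item.url) {
                continue;  // state changed during extraction: discard this document
            }
            m_writeTrans->put(item.id, item.content);
            ++committed;
        }
        m_writeTrans.reset();  // the transaction ends here
        return committed;
    }

private:
    FakeDb &m_db;
    std::unique_ptr<WriteTxn> m_writeTrans;
};
```

In this sketch m_writeTrans is non-null only inside commit(). Any code path
that dereferences it during prepare() or extract() would see a null pointer,
which would be consistent with the assertion and the "m_writeTrans is null"
message, but I have not verified that this is what happens in the real code.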
