[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2024-08-09 Thread Bug Janitor Service
https://bugs.kde.org/show_bug.cgi?id=354636

Bug Janitor Service  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDSINFO                   |RESOLVED
         Resolution|WAITINGFORINFO              |WORKSFORME

--- Comment #20 from Bug Janitor Service  ---
๐Ÿ›๐Ÿงน This bug has been in NEEDSINFO status with no change for at least 30 days.
Closing as RESOLVED WORKSFORME.

-- 
You are receiving this mail because:
You are watching all bug changes.

[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2024-07-25 Thread Bug Janitor Service
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #19 from Bug Janitor Service  ---
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2024-07-10 Thread Michael
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #18 from Michael  ---
From this thread, I checked my Baloo index size, which was 19GB(!) and decided
that I didn't need full text search, just file name search capabilities. So I
nuked it with:

balooctl6 purge

And now my index size is a comfortable 95MB and I don't have mysterious CPU
bursts and my laptop fan isn't kicking in after I download a pdf.

Going forward, I am not enabling full text search on new Kubuntu installations
that I set up for friends and clients until Baloo's index is tamed.


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2024-07-04 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #17 from tagwer...@innerjoin.org ---
(In reply to Oded Arbel from comment #16)
>  Memory: 395.0M (high: 512.0M available: 116.9M)
> 
> I still think that's a lot for an idling indexer, but in my day to day
> (especially as I have a new beefy machine that laughs at applications taking
> a mere 0.5GB of RAM 😜)
Maybe on your beefy machine, there's no memory pressure so Baloo's not having
to release pages :-)


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2024-07-04 Thread Oded Arbel
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #16 from Oded Arbel  ---
(In reply to tagwerk19 from comment #15)

On my system (KDE Neon testing, Plasma 6.1.3) system monitor reports baloo_file
at 505MB, while systemd has this to say:

 Memory: 395.0M (high: 512.0M available: 116.9M)

I still think that's a lot for an idling indexer, but in my day to day
(especially as I have a new beefy machine that laughs at applications taking a
mere 0.5GB of RAM 😜) I am no longer troubled by baloo_file behavior. I'm not
closing this report as it isn't mine, but you can chalk me up in the "satisfied
enough" column.
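
A side note on the systemd figures quoted above: the three numbers look related by simple subtraction, with "available" being roughly the "high" limit minus current usage. The Python sketch below only parses the one status line from this comment. The regular expression and variable names are illustrative, and the reading of "available" is an assumption, not something confirmed in this thread.

```python
import re

# The line reported by systemd in the comment above.
line = "Memory: 395.0M (high: 512.0M available: 116.9M)"

# Illustrative pattern; it only handles values given in megabytes ("M").
pattern = r"Memory: ([\d.]+)M \(high: ([\d.]+)M available: ([\d.]+)M\)"
match = re.search(pattern, line)
current, high, available = (float(g) for g in match.groups())

# Assumed meaning of "available": headroom left below the "high" limit.
headroom = high - current
print(f"high - current = {headroom:.1f}M, reported available = {available:.1f}M")
```

If that reading is right, baloo_file was sitting about 117M below its limit when the line was captured.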


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2024-07-03 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=354636

tagwer...@innerjoin.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |NEEDSINFO
         Resolution|---                         |WAITINGFORINFO

--- Comment #15 from tagwer...@innerjoin.org ---
Revisiting after a fairly major set of patches, including using systemd/cgroups
to limit memory use:
 https://invent.kde.org/frameworks/baloo/-/merge_requests/121
Together with a fix for the initial scan, when run within constrained memory
 https://invent.kde.org/frameworks/baloo/-/merge_requests/148

There are also fixes for the Btrfs issues (which probably didn't exist when this
report was opened but were affecting openSUSE by 2021...)
https://invent.kde.org/frameworks/baloo/-/merge_requests/131
and cherrypicked for KF5
https://invent.kde.org/frameworks/baloo/-/merge_requests/169

I'll set this to "Waiting for Info" in case anyone wants to keep the issue
open


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2021-03-25 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #14 from tagwer...@innerjoin.org ---
(In reply to Oded Arbel from comment #9)
> The weird excluded folder behavior may have something to do with the fact I
> have a trailing slash on my $HOME:
> 
> 8<
> $ balooctl config show excludeFolders
> kf.baloo: Folder cache: std::vector("/home/odeda//.cache/": excluded,
> "/home/odeda//snap/": excluded, "/home/odeda//mnt/": excluded,
> "/home/odeda/": included)
> /home/odeda//.cache/
> /home/odeda//snap/
> /home/odeda//mnt/
> 8<

Oooh. Indeed.

If I "bend things" so I have a trailing slash in my $HOME, the include/exclude
folders lines (for subfolders) in baloofilerc stop working.

If I include
folders[$e]=$HOME
then a
exclude folders[$e]=$HOME/.cache/
doesn't work

If I want to index a set of subfolders,
folders[$e]=$HOME/Documents/,$HOME/Music/,$HOME/Pictures/,$HOME/Videos/
doesn't work.

It's not going to catch many people but it's probably worth reporting as a
separate bug.
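
To illustrate why the doubled slash matters, here is a minimal Python sketch. It assumes folder matching behaves like a plain string-prefix comparison, which is a simplification and not taken from Baloo's source. The helper names `is_excluded` and `normalise` and the example file path are hypothetical.

```python
import posixpath

home = "/home/odeda/"          # $HOME with a trailing slash, as in the report
exclude = home + "/.cache/"    # what "$HOME/.cache/" expands to
indexed_file = "/home/odeda/.cache/mozilla/firefox/entry"  # hypothetical file

def is_excluded(path, folder):
    """Plain string-prefix test (an assumed, simplified matching rule)."""
    return path.startswith(folder)

def normalise(folder):
    """Collapse repeated slashes and restore the trailing slash."""
    return posixpath.normpath(folder) + "/"

raw_match = is_excluded(indexed_file, exclude)
clean_match = is_excluded(indexed_file, normalise(exclude))
print(exclude, "->", raw_match)
print(normalise(exclude), "->", clean_match)
```

Under that assumption the entry with "//" never matches a real file path, while the normalised entry does.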


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2021-03-23 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #13 from tagwer...@innerjoin.org ---
(In reply to Oded Arbel from comment #12)
> Shouldn't baloo "auto trim" the index by itself? This is not something a
> user would know to do. Also - doesn't explain the weird percentages.
I'm reading
http://www.lmdb.tech/doc/
Looks like if the database has 'grown' it does not shrink. Free pages are
however reused. Question is whether this has an impact on performance...
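
The "grows but never shrinks, free pages get reused" behaviour can be pictured with a toy model. The Python class below is only a sketch of a page file with a free list. It is not LMDB's implementation, and `ToyPagedFile` and its methods are invented for illustration.

```python
class ToyPagedFile:
    """Toy model of a file made of fixed-size pages with a free list."""

    def __init__(self):
        self.file_pages = 0   # pages the file occupies on disk
        self.free = []        # pages inside the file that are currently unused

    def alloc(self):
        # Reuse a free page if there is one; otherwise grow the file.
        if self.free:
            return self.free.pop()
        page = self.file_pages
        self.file_pages += 1
        return page

    def release(self, page):
        # Releasing a page never shrinks the file; it only feeds the free list.
        self.free.append(page)


f = ToyPagedFile()
pages = [f.alloc() for _ in range(1000)]   # index grows to 1000 pages
for p in pages[:900]:
    f.release(p)                           # most of the data is deleted
size_after_delete = f.file_pages
for _ in range(500):
    f.alloc()                              # new data reuses free pages first
size_after_reuse = f.file_pages
print(size_after_delete, size_after_reuse, len(f.free))
```

In this model the file stays at its high-water mark after the deletes, and later writes consume the free list before the file grows again.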


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2021-03-22 Thread Oded Arbel
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #12 from Oded Arbel  ---
(In reply to tagwerk19 from comment #11)
> (In reply to Oded Arbel from comment #10)
> > $ mdb_stat -af 
> > Freelist Status
> >   ...
> >   Free pages: 2566315
> If it says 2566315 free pages (and a page is 4K?), that's a lot of space in
> the file not being used.
> 
> Have you tried copying the index with mdb_copy?
> 
> I've just tried
> mdb_copy -n -c index index.copy
> It certainly seems to think for a while but the index.copy was smaller by
> 'more or less' the count of the free pages.

Shouldn't baloo "auto trim" the index by itself? This is not something a user
would know to do. Also - doesn't explain the weird percentages.


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2021-03-22 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #11 from tagwer...@innerjoin.org ---
(In reply to Oded Arbel from comment #10)
> $ mdb_stat -af 
> Freelist Status
>   ...
>   Free pages: 2566315
If it says 2566315 free pages (and a page is 4K?), that's a lot of space in the
file not being used.

Have you tried copying the index with mdb_copy?

I've just tried
mdb_copy -n -c index index.copy
It certainly seems to think for a while but the index.copy was smaller by 'more
or less' the count of the free pages.
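
As a rough worked estimate of what that means here, the Python sketch below converts the free page count into GiB and subtracts it from the file size reported earlier in this thread. The 4 KiB page size is the same guess as above, not a confirmed value, so the results are approximate.

```python
PAGE_SIZE = 4096                 # assumption: 4 KiB pages, as guessed above
GIB = 1024 ** 3

free_pages = 2566315             # "Free pages" from mdb_stat -af
file_size_gib = 18.75            # "File Size" from balooctl indexSize

free_gib = free_pages * PAGE_SIZE / GIB
estimated_copy_gib = file_size_gib - free_gib

print(f"free space inside the file: {free_gib:.2f} GiB")
print(f"estimated size after a compacting copy: {estimated_copy_gib:.2f} GiB")
```

With these inputs a little over half of the file is free space, so a compacting copy would be expected to come out at roughly 9 GiB.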


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2021-03-22 Thread Oded Arbel
https://bugs.kde.org/show_bug.cgi?id=354636

--- Comment #10 from Oded Arbel  ---
> 3. The index file is huge - about 19GB, which doesn't make a lot of sense to
> me. `balooctl indexSize` has this to say:
> 
> 8<
> File Size: 18.75 GiB
> Used:      948.13 MiB
> 
>            PostingDB:     2.93 GiB   316.627 %
>           PositionDB:    85.44 MiB     9.011 %
>             DocTerms:     1.39 GiB   149.920 %
>     DocFilenameTerms:   152.72 MiB    16.107 %
>        DocXattrTerms:     8.39 MiB     0.885 %
>               IdTree:    35.69 MiB     3.764 %
>           IdFileName:   175.18 MiB    18.476 %
>              DocTime:    92.85 MiB     9.793 %
>              DocData:    43.49 MiB     4.587 %
>    ContentIndexingDB:   448.00 KiB     0.046 %
>          FailedIdsDB:          0 B     0.000 %
>              MTimeDB:    26.48 MiB     2.793 %
> 8<
> 
> and to that I can only say "wahhh?!?!?"

After reviewing the code at https://github.com/KDE/baloo/blob/master , I'm more
befuddled by the above numbers:

1. "Used" is `DatabaseSize.expectedSize`
2. The percentages are computed by 100 * "entry size" / "Used", so the 316%
makes sense as it is larger than "Used".
3. `DatabaseSize.expectedSize` is calculated (src/engine/transaction.cpp:474)
by adding up the sizes of all of the entries listed!! so it cannot be smaller
than the sum of its parts, unless one of the parts is negative - which it can't
be as the sizes are of type `size_t`, which - unless something really weird is
going on in the build server - should be unsigned long int.

There's something about page sizes, but that isn't relevant to the above
calculation, which seems to suggest that a/(a+b) > 1 where both a and b are
non-negative integers.
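
The mismatch can be checked directly from the printed numbers. The Python sketch below adds up the per-database sizes exactly as printed (so rounding applies) and compares the total with "Used". It also prints the total modulo 4 GiB, which is only an observation about the arithmetic: a value close to "Used" would be consistent with a 32-bit wraparound somewhere in the calculation, but nothing in this thread confirms that as the cause.

```python
MIB_PER_GIB = 1024.0
MIB_PER_KIB = 1.0 / 1024.0

# Sizes as printed by `balooctl indexSize` above, converted to MiB.
reported_mib = {
    "PostingDB": 2.93 * MIB_PER_GIB,
    "PositionDB": 85.44,
    "DocTerms": 1.39 * MIB_PER_GIB,
    "DocFilenameTerms": 152.72,
    "DocXattrTerms": 8.39,
    "IdTree": 35.69,
    "IdFileName": 175.18,
    "DocTime": 92.85,
    "DocData": 43.49,
    "ContentIndexingDB": 448.00 * MIB_PER_KIB,
    "FailedIdsDB": 0.0,
    "MTimeDB": 26.48,
}
used_mib = 948.13  # "Used" as printed

total_mib = sum(reported_mib.values())
wrapped_mib = total_mib % 4096.0  # 4096 MiB == 2**32 bytes

print(f"sum of parts:     {total_mib:.2f} MiB")
print(f"reported Used:    {used_mib:.2f} MiB")
print(f"sum modulo 4 GiB: {wrapped_mib:.2f} MiB")
```

The parts add up to roughly 5 GiB, far more than the 948.13 MiB shown as "Used", and the remainder after removing 4 GiB lands within a fraction of a MiB of that figure.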

BTW - here's the result of running the `mdb_stat` tool from lmdb-utils on the
baloo index:

8<
$ mdb_stat -af 
Freelist Status
  Tree depth: 2
  Branch pages: 1
  Leaf pages: 41
  Overflow pages: 5046
  Entries: 3253
  Free pages: 2566315
Status of Main DB
  Tree depth: 1
  Branch pages: 0
  Leaf pages: 1
  Overflow pages: 0
  Entries: 12
Status of docfilenameterms
  Tree depth: 4
  Branch pages: 315
  Leaf pages: 38726
  Overflow pages: 0
  Entries: 2104603
Status of docterms
  Tree depth: 4
  Branch pages: 633
  Leaf pages: 79407
  Overflow pages: 284028
  Entries: 2103699
Status of documentdatadb
  Tree depth: 3
  Branch pages: 90
  Leaf pages: 11012
  Overflow pages: 38
  Entries: 664790
Status of documenttimedb
  Tree depth: 3
  Branch pages: 187
  Leaf pages: 23555
  Overflow pages: 0
  Entries: 224
Status of docxatrrterms
  Tree depth: 3
  Branch pages: 21
  Leaf pages: 2040
  Overflow pages: 86
  Entries: 31253
Status of failediddb
  Tree depth: 0
  Branch pages: 0
  Leaf pages: 0
  Overflow pages: 0
  Entries: 0
Status of idfilename
  Tree depth: 4
  Branch pages: 363
  Leaf pages: 44411
  Overflow pages: 0
  Entries: 2120309
Status of idtree
  Tree depth: 3
  Branch pages: 52
  Leaf pages: 6960
  Overflow pages: 2118
  Entries: 223613
Status of indexingleveldb
  Tree depth: 3
  Branch pages: 3
  Leaf pages: 49
  Overflow pages: 0
  Entries: 5471
Status of mtimedb
  Tree depth: 3
  Branch pages: 42
  Leaf pages: 6719
  Overflow pages: 0
  Entries: 224
Status of positiondb
  Tree depth: 4
  Branch pages: 6657
  Leaf pages: 735531
  Overflow pages: 328761
  Entries: 42876611
Status of postingdb
  Tree depth: 4
  Branch pages: 6181
  Leaf pages: 657348
  Overflow pages: 105167
  Entries: 45851508
8<
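
For reference, the page counts above can be totalled to see how they relate to the file size. The Python sketch below sums branch, leaf and overflow pages for every sub-database, adds the free pages, and converts to GiB. The 4 KiB page size is an assumption, and the page counts are simply copied from the output above.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
GIB = 1024 ** 3

# (branch, leaf, overflow) page counts copied from the mdb_stat output above.
pages = {
    "Freelist": (1, 41, 5046),
    "Main DB": (0, 1, 0),
    "docfilenameterms": (315, 38726, 0),
    "docterms": (633, 79407, 284028),
    "documentdatadb": (90, 11012, 38),
    "documenttimedb": (187, 23555, 0),
    "docxatrrterms": (21, 2040, 86),
    "failediddb": (0, 0, 0),
    "idfilename": (363, 44411, 0),
    "idtree": (52, 6960, 2118),
    "indexingleveldb": (3, 49, 0),
    "mtimedb": (42, 6719, 0),
    "positiondb": (6657, 735531, 328761),
    "postingdb": (6181, 657348, 105167),
}
free_pages = 2566315  # "Free pages" from the Freelist Status section

used_pages = sum(sum(counts) for counts in pages.values())
total_pages = used_pages + free_pages
total_gib = total_pages * PAGE_SIZE / GIB
free_share = free_pages / total_pages

print(f"pages in use: {used_pages}")
print(f"pages total:  {total_pages} (~{total_gib:.2f} GiB)")
print(f"free share:   {free_share:.1%}")
```

Under the 4 KiB assumption, used plus free pages come to about 18.74 GiB, which is in line with the 18.75 GiB file size, and a bit more than half of those pages are free.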


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2021-03-22 Thread Oded Arbel
https://bugs.kde.org/show_bug.cgi?id=354636

Oded Arbel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |o...@geek.co.il

--- Comment #9 from Oded Arbel  ---
For me baloo has been acting up every now and then, and I'd like to finally get
to the bottom of this. The behavior is currently catatonic and after posting
this issue I will kill baloo but not change any configuration or data, so the
behavior can be reproduced if someone wants to continue research on this issue.

1. baloo_file_extractor takes a lot of CPU and memory. Here's its line in htop:
8<
PID   USER  PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
74566 odeda 39   19  257G 10.1G 9006M S 102. 32.2  8h02:42 /usr/bin/baloo_file_extractor
8<

(it's a 4-core system; CPU usage looks like a single thread that tries to take
up an entire CPU but is slowed down a bit by IO on my fast NVMe, and the "over
100%" is a sampling error on the part of htop)

2. `balooctl monitor` shows almost no activity, and from time to time bursts of
a couple dozen entries that look like this:

8<
Indexing: /home/odeda/.cache/mozilla/firefox/i1m74zv1.default/cache2/entries/CE2BB927E036CFCEE27E7795DFB198E7C41A14B6: Ok
8<

It should not be indexing `~/.cache` as `~/.config/baloofilerc` has this:

exclude folders[$e]=$HOME/.cache/,$HOME/mnt/,$HOME/snap/,[and a few other
things]

The weird excluded folder behavior may have something to do with the fact I have
a trailing slash on my $HOME:

8<
$ balooctl config show excludeFolders
kf.baloo: Folder cache: std::vector("/home/odeda//.cache/": excluded,
"/home/odeda//snap/": excluded, "/home/odeda//mnt/": excluded, "/home/odeda/":
included)
/home/odeda//.cache/
/home/odeda//snap/
/home/odeda//mnt/
8<

3. The index file is huge - about 19GB, which doesn't make a lot of sense to
me. `balooctl indexSize` has this to say:

8<
File Size: 18.75 GiB
Used:      948.13 MiB

           PostingDB:     2.93 GiB   316.627 %
          PositionDB:    85.44 MiB     9.011 %
            DocTerms:     1.39 GiB   149.920 %
    DocFilenameTerms:   152.72 MiB    16.107 %
       DocXattrTerms:     8.39 MiB     0.885 %
              IdTree:    35.69 MiB     3.764 %
          IdFileName:   175.18 MiB    18.476 %
             DocTime:    92.85 MiB     9.793 %
             DocData:    43.49 MiB     4.587 %
   ContentIndexingDB:   448.00 KiB     0.046 %
         FailedIdsDB:          0 B     0.000 %
             MTimeDB:    26.48 MiB     2.793 %
8<

and to that I can only say "wahhh?!?!?"

Here's also `balooctl status`:

8<
Baloo File Indexer is running
Indexer state: Indexing file content
Total files indexed: 2,103,903
Files waiting for content indexing: 6,832
Files failed to index: 0
Current size of index is 18.75 GiB
8<


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2021-03-17 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=354636

tagwer...@innerjoin.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tagwer...@innerjoin.org


[frameworks-baloo] [Bug 354636] baloo_file_extractor consumes an ever-increasing amount memory after upgrade to frameworks 5.80.0

2021-03-16 Thread Michael
https://bugs.kde.org/show_bug.cgi?id=354636

Michael  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unspecified                 |5.80.0
            Product|Baloo                       |frameworks-baloo
                 CC|                            |n...@kde.org
          Component|Baloo File Daemon           |Baloo File Daemon
