[frameworks-baloo] [Bug 358098] Baloo fails to index plain text files unless extension is .txt

2016-05-31 Thread Ben Cooksley via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=358098

Bhushan Shah changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Latest Commit|                            |https://quickgit.kde.org/?p=baloo.git=commit=06efd6c05c15a64b53daac9e598666af584488ec
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |bhus...@gmail.com
         Resolution|---                         |FIXED

--- Comment #7 from Bhushan Shah  ---
Marking as fixed.

--- Comment #8 from John Andersen  ---
The fix has finally filtered down to both Manjaro and openSUSE, and it is
working very well. (I use baloo search to manage a large software code base,
and it was sorely missed when it stopped indexing source code due to the txt
issue.)

Thanks for your fine work.

For those arriving here after searching for this problem, I have one minor thing
to add: previously excluded text files with an extension other than "txt" were
not indexed automatically after the update.

I had to run "balooctl disable" followed by "balooctl enable", and now they are
all indexed.

Thanks again.

-- 
You are receiving this mail because:
You are watching all bug changes.


[frameworks-baloo] [Bug 358098] Baloo fails to index plain text files unless extension is .txt

2016-03-14 Thread Boudhayan Gupta via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=358098

Boudhayan Gupta changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m...@baloneygeek.com

--- Comment #6 from Boudhayan Gupta  ---
Fixed in commit
https://quickgit.kde.org/?p=baloo.git=commit=06efd6c05c15a64b53daac9e598666af584488ec.
Not sure why the bug wasn't autoclosed.

I'll inform someone from the bugsquad to close this manually.



[frameworks-baloo] [Bug 358098] Baloo fails to index plain text files unless extension is .txt

2016-03-13 Thread Pinak Ahuja via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=358098

--- Comment #5 from Pinak Ahuja  ---
John, I am familiar with the code. The blacklist/whitelist filters are still
there; just have a look at ~/.config/baloofilerc.

Maybe I wasn't clear enough, but the misinterpretation part I was talking about
is a separate thing which is somewhat related to this.

I know it was a temporary workaround, and maybe it's time for it to go. I've
been testing locally with it removed, and it seems to work fine on my system,
but people have different configs and files on their systems. Let's try
removing it and see how it goes for the next version.



[frameworks-baloo] [Bug 358098] Baloo fails to index plain text files unless extension is .txt

2016-03-12 Thread John Andersen via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=358098

--- Comment #4 from John Andersen  ---
(In reply to Pinak Ahuja from comment #3)
> This is the intended behavior, for files having text/plain mimetype. This
> was done to avoid the mess caused by applications which have log files in
> directories that are indexed by baloo.
> 
> Though text files with a valid extension like .md .markdown should still be
> indexed because they have the mimetype: text/markdown but right now they are
> also not being indexed because baloo is somehow misinterpreting the
> mimetype. I'm looking into it.

But this is fundamentally the wrong approach, as extensions have never been a
significant part of Linux, and are (by your own admission) an unreliable
indicator of file content.

This isn't a case of Baloo "misinterpreting" anything. The link I posted
indicates that the plaintext mimetype is arbitrarily rejected for indexing
unless the extension is "txt" (and the size is less than 50K).
When this was put in place (2 years ago) it was described as a temporary hack,
yet it still exists. There is no indication that this was the intended
behavior when the comments in the code clearly label it as some sort of
short-term hack.
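
As a rough illustration of the check described above (this is not the actual
app.cpp code, which is C++), the logic amounts to something like the following
Python sketch. The function name, the constant, and the reading of "50K" as
50 KiB are assumptions made for this example.

```python
# Assumption: "50K" is read as 50 KiB; the real constant is not quoted here.
MAX_PLAIN_TEXT_SIZE = 50 * 1024


def should_index_content(path: str, mimetype: str, size: int) -> bool:
    """Hypothetical helper mirroring the workaround as described in this comment."""
    if mimetype != "text/plain":
        # Other mimetypes are not touched by the workaround.
        return True
    # Plain text is accepted only with a .txt extension and a small size.
    return path.endswith(".txt") and size < MAX_PLAIN_TEXT_SIZE
```

With this rule, a source file or README detected as text/plain is never
content-indexed, whatever the user's filter settings say.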

Someone chose to keep all plaintext out of baloo (a questionable decision at
best). Rather than using the blacklist/whitelist (exclude filters) to
address problematic file types, all plaintext was summarily rejected unless
the extension was txt.

If all plaintext is to be rejected then the rational thing to do is to honor a
whitelist (include filters) to override this rejection.  (I believe that USED
TO EXIST, but was removed in the rush to simplify the control set).

If, on the other hand only SOME plaintext files are problematic, those should
be handled by the exclude filters.

Right now, logs could be handled by exclude filters.
There is no longer a whitelist capability.
But even the exclude filters are totally ignored for plaintext documents.

So significant functionality has been lost ostensibly just to avoid logs (which
could have been avoided by the exclude filters).  
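
A minimal sketch of the filter-based alternative argued for above, in Python
for brevity. The pattern list and function names are illustrative assumptions,
not Baloo's actual defaults or API: plain text is indexed regardless of
extension unless its name matches an exclude pattern.

```python
from fnmatch import fnmatchcase

# Illustrative patterns only; not Baloo's actual default filter list.
EXCLUDE_FILTERS = ["*.log", "*.log.*", "*~"]


def is_excluded(filename: str, filters=EXCLUDE_FILTERS) -> bool:
    """Return True if the file name matches any exclude pattern."""
    return any(fnmatchcase(filename, pattern) for pattern in filters)


def should_index_plain_text(filename: str) -> bool:
    # Index plain text unless an exclude filter explicitly rejects it.
    return not is_excluded(filename)
```

Under this scheme log files are still skipped, while extensionless files and
source code detected as plain text remain searchable.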

Look in app.cpp:
https://code.woboq.org/qt5/kf5/baloo/src/file/extractor/app.cpp.html
Look for the word HACK.



[frameworks-baloo] [Bug 358098] Baloo fails to index plain text files unless extension is .txt

2016-03-12 Thread Pinak Ahuja via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=358098

--- Comment #3 from Pinak Ahuja  ---
This is the intended behavior, for files having text/plain mimetype. This was
done to avoid the mess caused by applications which have log files in
directories that are indexed by baloo.

Though text files with a valid extension like .md .markdown should still be
indexed because they have the mimetype: text/markdown but right now they are
also not being indexed because baloo is somehow misinterpreting the mimetype.
I'm looking into it.



[frameworks-baloo] [Bug 358098] Baloo fails to index plain text files unless extension is .txt

2016-03-11 Thread Alexander Potashev via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=358098

Alexander Potashev changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unspecified                 |5.19.0
            Product|Baloo                       |frameworks-baloo
          Component|General                     |general
                 CC|                            |aspotas...@gmail.com



[Baloo] [Bug 358098] Baloo fails to index plain text files unless extension is .txt

2016-02-21 Thread John Andersen via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=358098

--- Comment #2 from John Andersen  ---
Persists in Baloo 5.19.0 as well.

There should be a method to white-list extensions one purposely wants to
content-index, perhaps stored in baloofilerc.



[Baloo] [Bug 358098] Baloo fails to index plain text files unless extension is .txt

2016-01-19 Thread John Andersen via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=358098

--- Comment #1 from John Andersen  ---
Error persists in Baloo 5.18 as well.
