[frameworks-baloo] [Bug 405094] Symbols in name break search

2021-10-24 Thread Kishore Gopalakrishnan
https://bugs.kde.org/show_bug.cgi?id=405094

Kishore Gopalakrishnan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kishor...@gmail.com

-- 
You are receiving this mail because:
You are watching all bug changes.

[frameworks-baloo] [Bug 405094] Symbols in name break search

2021-03-30 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=405094

tagwer...@innerjoin.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|REPORTED                    |CONFIRMED

--- Comment #5 from tagwer...@innerjoin.org ---
I can confirm that this is still the case. Repeating the test:

cd ~/Documents
echo "Hello Penguin" > 'AB - ABC - 1st'
echo "Hello Penguin" > 'AB - ABC - 1 filler' 
echo "Hello Penguin" > 'AB - ABC - 1st filler'

balooshow -x 'AB - ABC - 1st'

1010f9fc01 64513 1052921 AB - ABC - 1st [/home/test/Documents/AB - ABC - 1st]
Mtime: 1617093004 2021-03-30T10:30:04
Ctime: 1617093004 2021-03-30T10:30:04
Cached properties:
Line Count: 1

Internal Info
Terms: Mplain Mtext T5 T8 X20-1 hello penguin 
File Name Terms: F1st Fab Fabc 
XAttr Terms: 
lineCount: 1

balooshow -x 'AB - ABC - 1 filler' 

1019d7fc01 64513 1055191 AB - ABC - 1 filler [/home/test/Documents/AB - ABC - 1 filler]
Mtime: 1617093011 2021-03-30T10:30:11
Ctime: 1617093011 2021-03-30T10:30:11
Cached properties:
Line Count: 1

Internal Info
Terms: Mplain Mtext T5 T8 X20-1 hello penguin 
File Name Terms: F1 Fab Fabc Ffiller 
XAttr Terms: 
lineCount: 1

balooshow -x 'AB - ABC - 1st filler' 

102bb6fc01 64513 1059766 AB - ABC - 1st filler [/home/test/Documents/AB - ABC - 1st filler]
Mtime: 1617093015 2021-03-30T10:30:15
Ctime: 1617093015 2021-03-30T10:30:15
Cached properties:
Line Count: 1

Internal Info
Terms: Mplain Mtext T5 T8 X20-1 hello penguin 
File Name Terms: F1st Fab Fabc Ffiller 
XAttr Terms: 
lineCount: 1

baloosearch AB
/home/test/Documents/AB - ABC - 1st filler
/home/test/Documents/AB - ABC - 1 filler
/home/test/Documents/AB - ABC - 1st
Elapsed: 1,9261 msecs

baloosearch 'AB 1'
/home/test/Documents/AB - ABC - 1 filler
Elapsed: 0,269976 msecs
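
As an illustration only, the "File Name Terms" that balooshow prints above can
be approximated in Python by splitting the file name on anything that is not a
letter or digit, lowercasing the pieces, and adding the "F" prefix. The
function name filename_terms and the splitting rule are assumptions made for
this sketch; Baloo's real tokenizer may well behave differently.

```python
import re


def filename_terms(name: str) -> list[str]:
    """Approximate the 'File Name Terms' line of `balooshow -x`.

    Assumption (not taken from Baloo's source): every run of letters or
    digits becomes one lowercase term; spaces, '-', '_' and other symbols
    only separate terms and are never indexed themselves.
    """
    words = re.findall(r"[^\W_]+", name.lower())
    # Deduplicate and sort so the output is stable.
    return sorted({"F" + word for word in words})


for name in ("AB - ABC - 1st", "AB - ABC - 1 filler", "AB - ABC - 1st filler"):
    print(name, "->", " ".join(filename_terms(name)))
```

Under that assumption the '-' characters never reach the index, which would
explain why a search for '-' on its own cannot return anything.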

See also
Bug 434589

With
Neon Testing
Plasma: 5.21.3
Frameworks : 5.81.0
Qt : 5.15.2


[frameworks-baloo] [Bug 405094] Symbols in name break search

2021-03-30 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=405094

tagwer...@innerjoin.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tagwer...@innerjoin.org


[frameworks-baloo] [Bug 405094] Symbols in name break search

2020-05-05 Thread Ceaus
https://bugs.kde.org/show_bug.cgi?id=405094

--- Comment #4 from Ceaus  ---
My apologies to Igor if I sounded as if I meant to assign blame. That is
certainly not my intention.

Although I understand your logic, it is not a real defense against an
improvement in this area:
1. Looking at the home page of Baloo, at the top of the architecture page it
says: "Baloo is a metadata and search framework by KDE". The fact that metadata
is mentioned is a giveaway that filenames should be supported to their maximum
extent, special characters notwithstanding.

2. The 137K was a purely arbitrary number. I could also have said 5. The
special characters should not be held hostage to support the argument of file
content searches, or the problem of a list of results which is too long.

3. There is no mention at all, in any form or in any MMI, of the restrictions
on the possible search parameters. If you cannot use certain characters, or the
search string must be of a certain minimum size, then it should say so. You
cannot confront the end user with incorrect search results for which no
explanation is given. In my case I was finally able to understand the incorrect
search results (which is what got me to this bug report). But it could be much
worse: the end user is confronted with incorrect search results but is unaware
of it. That can lead to detrimental consequences on her/his part, such as
taking action because s/he thinks the document(s) do not exist.

4. It is extremely silly to integrate Baloo into the Dolphin file manager,
making it indistinguishable from Dolphin, and then only partly support a
typical task for a file manager: searching for files! Referring users to a
third-party app (KFind) to search for files is inexplicable to me.

5. The chosen technical solution (preferring an index over in-situ search)
should not exceed the importance of a normal use case. If the size of the index
becomes too big, then "we" have done something wrong on a technical level. That
burden should not be put on the end user. 

6. Now that I know that Baloo gives incorrect search results in certain
circumstances, I question whether and how I can "trust" Baloo in the future.
That undermines the whole purpose of its existence. How do I know I can trust
Baloo?

My real life use case:
Last week I was called by my friend. She had to copy her home directory (XFS)
to an external hard drive (VFAT) for backup. That failed, as VFAT does not
support filenames containing ':' and '?'. My friend had about 25 such files. As
she does not have root access to the laptop, installing KFind was not an option
for her. Over the phone we had to resort to a very complicated session in which
I tried to explain to her how to use 'find' (command line) and how to rename
the listed files. I would rather never, ever do that again.
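
For readers in the same situation, the search-and-rename step described above
can be sketched in Python without Baloo or KFind. This is only a rough outline:
the function names find_unsafe_names and safe_name are made up for this
example, and the default character set contains just the ':' and '?' mentioned
above, although VFAT rejects a few more characters.

```python
import os


def find_unsafe_names(root, bad_chars=":?"):
    """Return paths under `root` whose last component contains a bad character.

    `bad_chars` defaults to the two characters named in the comment above;
    extend it if the target file system rejects more.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if any(ch in bad_chars for ch in name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


def safe_name(name, bad_chars=":?", replacement="_"):
    """Suggest a new name with every bad character replaced."""
    return "".join(replacement if ch in bad_chars else ch for ch in name)
```

Calling find_unsafe_names(os.path.expanduser("~")) lists the offending paths,
and safe_name(os.path.basename(path)) proposes a replacement name for each.
The sketch deliberately renames nothing, so the list can be reviewed first.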


[frameworks-baloo] [Bug 405094] Symbols in name break search

2020-05-04 Thread Igor Poboiko
https://bugs.kde.org/show_bug.cgi?id=405094

--- Comment #3 from Igor Poboiko  ---
It's not my logic, it's the logic of Baloo and its original developer :)

The logic is quite straightforward though. Most likely, the user is searching
for some particular document. If the search term is contained in "137K files",
it wouldn't help at all; such a term might as well be dropped. If those are the
only terms the user is looking for, they won't be able to find anything; if the
query contains other terms, those are more likely to help Baloo identify the
document the user is looking for.
I believe short terms are mostly there just to make it possible to search for
filename extensions (like "filename.jpg") and e-mail addresses/domains (like
"john...@example.org"). In both cases, the "exact match" logic would suffice.

> [...] Apparently A-Z characters are first class citizens, whereas the other 
> characters are estranged cousins.  

That's intentional. Remember that Baloo provides search over file contents too.
With that in mind, it doesn't sound that arbitrary: letters and words
(not necessarily A-Z: also numbers and other languages) contain the most
information to build an index upon. What are the chances that the user is going
to search for a document that has "." or " " or "_" somewhere inside? And what
are the chances it will help to identify the document uniquely?
Not to mention that restricting the index to letters and digits reduces its
size by a large factor.

If you're looking for a file with a name you know precisely, and which mostly
contains non-alphanumeric characters, then "find" / KFind or any other
filesystem crawler will most likely do better.

> Baloo = 5.55.0
I also couldn't help but notice that the version your distribution ships is a
bit outdated. There were a large number of improvements to Baloo around
5.60+ (unrelated to this particular issue, though).


[frameworks-baloo] [Bug 405094] Symbols in name break search

2020-05-02 Thread Ceaus
https://bugs.kde.org/show_bug.cgi?id=405094

Ceaus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |k...@pohw.nl

--- Comment #2 from Ceaus  ---
I'm facing the same problem. Files are not found whilst they are there.

The logic of Igor Poboiko (comment #1) just complicated things:
1. If I search for files with very common letters in their name, such as 'a' or
'e', baloo just reports a handful, while I have hundreds in my home dir. At
this point I have zero clue how to interpret the search results. I see some
files, but many others are missing. How do I now interpret this list? I know
the list is incorrect (because I know the file exists). But to what extent? Are
there other files not in the search results? And if so, why?

2. If I search for single non-alphabet characters, such as filenames with a
'-', '_' or ' ', baloo returns zero results. That is inconsistent with the
results of case (1). So now the question becomes even more difficult to answer:
what is this list I am looking at?

Apparently A-Z characters are first class citizens, whereas the other
characters are estranged cousins.  To me this sounds rather arbitrary. baloo
should simply return them all. If there is a genuine concern for the list being
too long, then why not raise a warning: "Hey, are you sure you want a list
containing 137K filenames?" 

BTW: I'm on openSUSE Leap 15.1
Baloo = 5.55.0


[frameworks-baloo] [Bug 405094] Symbols in name break search

2019-06-30 Thread Igor Poboiko
https://bugs.kde.org/show_bug.cgi?id=405094

Igor Poboiko  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |igor.pobo...@gmail.com

--- Comment #1 from Igor Poboiko  ---
The problem here is that short terms (of length < 3) are only matched
*exactly*. This is intentional: if we matched them by prefix (like we do
with longer terms) there would most likely be just too many matches.

So in your case 'AB 1' doesn't match files #1 and #3 because they don't contain
"1" exactly. But all of them do contain "AB" exactly.
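
The rule described above can be sketched in a few lines of Python. This is a
simplified model for illustration, not Baloo's code: the names
SHORT_TERM_LIMIT, term_matches and search are invented here, and the index is
hard-coded from the file name terms that balooshow reports in comment #5 of
this thread.

```python
SHORT_TERM_LIMIT = 3  # terms shorter than this are matched exactly

# File name terms as reported by `balooshow -x` in comment #5,
# without the "F" prefix.
INDEX = {
    "AB - ABC - 1st": {"1st", "ab", "abc"},
    "AB - ABC - 1 filler": {"1", "ab", "abc", "filler"},
    "AB - ABC - 1st filler": {"1st", "ab", "abc", "filler"},
}


def term_matches(query_term, file_terms):
    """Exact match for short terms, prefix match for longer ones."""
    q = query_term.lower()
    if len(q) < SHORT_TERM_LIMIT:
        return q in file_terms
    return any(term.startswith(q) for term in file_terms)


def search(query, index):
    """Return the files for which every query term matches."""
    terms = query.split()
    return [
        name
        for name, file_terms in index.items()
        if all(term_matches(t, file_terms) for t in terms)
    ]


print(search("AB", INDEX))    # all three files contain the exact term "ab"
print(search("AB 1", INDEX))  # only 'AB - ABC - 1 filler' has the exact term "1"
```

In this model 'AB 1' skips the two "1st" files because "1" is too short for
prefix matching, while 'AB 1st' or 'AB fil' would match them by prefix.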


[frameworks-baloo] [Bug 405094] Symbols in name break search

2019-03-04 Thread bugzilla_noreply
https://bugs.kde.org/show_bug.cgi?id=405094

nathan.figue...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Symbols in long name break  |Symbols in name break
                   |search                      |search
