[frameworks-baloo] [Bug 405094] Symbols in name break search
https://bugs.kde.org/show_bug.cgi?id=405094

Kishore Gopalakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kishor...@gmail.com

--
You are receiving this mail because:
You are watching all bug changes.
[frameworks-baloo] [Bug 405094] Symbols in name break search
https://bugs.kde.org/show_bug.cgi?id=405094

tagwer...@innerjoin.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|REPORTED                    |CONFIRMED

--- Comment #5 from tagwer...@innerjoin.org ---
Can confirm that this is still the case, repeating the test:

    cd ~/Documents
    echo "Hello Penguin" > 'AB - ABC - 1st'
    echo "Hello Penguin" > 'AB - ABC - 1 filler'
    echo "Hello Penguin" > 'AB - ABC - 1st filler'

    balooshow -x 'AB - ABC - 1st'
    1010f9fc01 64513 1052921 AB - ABC - 1st [/home/test/Documents/AB - ABC - 1st]
            Mtime: 1617093004 2021-03-30T10:30:04
            Ctime: 1617093004 2021-03-30T10:30:04
            Cached properties:
                    Line Count: 1

            Internal Info
            Terms: Mplain Mtext T5 T8 X20-1 hello penguin
            File Name Terms: F1st Fab Fabc
            XAttr Terms:
            lineCount: 1

    balooshow -x 'AB - ABC - 1 filler'
    1019d7fc01 64513 1055191 AB - ABC - 1 filler [/home/test/Documents/AB - ABC - 1 filler]
            Mtime: 1617093011 2021-03-30T10:30:11
            Ctime: 1617093011 2021-03-30T10:30:11
            Cached properties:
                    Line Count: 1

            Internal Info
            Terms: Mplain Mtext T5 T8 X20-1 hello penguin
            File Name Terms: F1 Fab Fabc Ffiller
            XAttr Terms:
            lineCount: 1

    balooshow -x 'AB - ABC - 1st filler'
    102bb6fc01 64513 1059766 AB - ABC - 1st filler [/home/test/Documents/AB - ABC - 1st filler]
            Mtime: 1617093015 2021-03-30T10:30:15
            Ctime: 1617093015 2021-03-30T10:30:15
            Cached properties:
                    Line Count: 1

            Internal Info
            Terms: Mplain Mtext T5 T8 X20-1 hello penguin
            File Name Terms: F1st Fab Fabc Ffiller
            XAttr Terms:
            lineCount: 1

    baloosearch AB
    /home/test/Documents/AB - ABC - 1st filler
    /home/test/Documents/AB - ABC - 1 filler
    /home/test/Documents/AB - ABC - 1st
    Elapsed: 1,9261 msecs

    baloosearch 'AB 1'
    /home/test/Documents/AB - ABC - 1 filler
    Elapsed: 0,269976 msecs

See also Bug 434589

With Neon Testing
    Plasma: 5.21.3
    Frameworks : 5.81.0
    Qt : 5.15.2
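The "File Name Terms" lines in the balooshow output above suggest how the names end up in the index. The following is only a rough sketch that reproduces those three lines, not Baloo's actual term generator: it assumes names are lowercased, split on every non-alphanumeric character, and stored with an "F" prefix. The function name filename_terms is made up for this illustration.

```python
import re

def filename_terms(name: str) -> list[str]:
    """Approximate the 'File Name Terms' shown by balooshow.

    Assumption: terms are the lowercase alphanumeric runs of the name,
    each prefixed with 'F'. Symbols such as '-' or ' ' produce no term.
    """
    words = re.findall(r"[^\W_]+", name.lower())
    return sorted({"F" + word for word in words})

for name in ("AB - ABC - 1st", "AB - ABC - 1 filler", "AB - ABC - 1st filler"):
    print(name, "->", " ".join(filename_terms(name)))
```

Under these assumptions a name made up only of symbols yields no terms at all, which would be consistent with the symbol searches reported in this bug returning nothing.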
[frameworks-baloo] [Bug 405094] Symbols in name break search
https://bugs.kde.org/show_bug.cgi?id=405094

tagwer...@innerjoin.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tagwer...@innerjoin.org
[frameworks-baloo] [Bug 405094] Symbols in name break search
https://bugs.kde.org/show_bug.cgi?id=405094

--- Comment #4 from Ceaus ---
My apologies to Igor if I sounded as if I meant to assign blame. That is certainly not my intention.

Although I understand your logic, it is not a real defense against an improvement in this area:

1. Looking at the home page of Baloo, at the top of the architecture page it says: "Baloo is a metadata and search framework by KDE". The fact that metadata is mentioned is a giveaway that filenames should be supported to their maximum extent, special characters notwithstanding.

2. The 137K was a purely arbitrary number. I could also have said 5. The special characters should not be held hostage to the argument about file content searches, or to the problem of a result list that is too long.

3. There is no mention at all, in any form or in any MMI, of the restrictions on the possible search parameters. If you cannot use certain characters, or the search string must be of a certain minimum size, then it should say so. You cannot confront the end user with incorrect search results for which no explanation is given. In my case I was finally able to understand the incorrect search results (that is what got me to this bug report). But it could be much worse: the end user is confronted with incorrect search results but is unaware of it. That can lead to detrimental consequences, such as taking action because s/he thinks the document(s) do not exist.

4. It is extremely silly to integrate Baloo into the Dolphin file manager, making it indistinguishable from Dolphin, and then only partly support a typical task for a file manager: searching for files! Referring users to a third-party app (KFind) to search for files is inexplicable to me.

5. The chosen technical solution (preferring an index over in-situ search) should not take precedence over a normal use case. If the size of the index becomes too big, then "we" have done something wrong on a technical level. That burden should not be put on the end user.

6. Now that I know that Baloo gives incorrect search results in certain circumstances, I question if and how I can "trust" Baloo in the future, which undermines the whole purpose of its existence. How do I know I can trust Baloo?

My real-life use case: Last week I was called by a friend. She had to copy her home directory (XFS) to an external hard drive (VFAT) for backup. That failed, as VFAT does not support filenames containing ':' and '?'. My friend had about 25 such files. As she does not have root access to the laptop, installing KFind was not an option for her. Over the phone we had to resort to a very complicated session in which I tried to explain to her how to use 'find' (command line) and how to rename the listed files. I would rather never, ever do that again.
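For a case like the backup described in comment #4, the file listing can be done without the index at all. The sketch below is only an illustration: it assumes Python 3 is already present on the machine and checks only for ':' and '?', the two characters named in the comment. The names PROBLEM_CHARS, needs_rename and find_problem_names are made up for this example.

```python
import os

# Assumption: only the two characters named in the comment are checked.
PROBLEM_CHARS = set(":?")

def needs_rename(name: str) -> bool:
    """True if the file or directory name contains a problem character."""
    return bool(PROBLEM_CHARS & set(name))

def find_problem_names(root: str):
    """Walk the tree below root and yield paths whose last component needs renaming."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if needs_rename(name):
                yield os.path.join(dirpath, name)

# Example use (prints one path per line):
#   for path in find_problem_names(os.path.expanduser("~")):
#       print(path)
```

This walks the file system directly, so it does not depend on what Baloo has or has not indexed.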
[frameworks-baloo] [Bug 405094] Symbols in name break search
https://bugs.kde.org/show_bug.cgi?id=405094

--- Comment #3 from Igor Poboiko ---
It's not my logic, it's the logic of Baloo and its original developer :)

The logic is quite straightforward though. Most likely, the user is searching for some particular document. If his search term is contained in "137K files", it wouldn't help at all; such a term might as well be dropped. If those are the only terms the user is looking for, he won't be able to find anything; if his query contains other terms, those are more likely to help Baloo identify the document the user is looking for.

I believe short terms are mostly there just to be able to search over filename extensions (like "filename.jpg") and e-mail addresses/domains (like "john...@example.org"). In both cases, the "exact match" logic would suffice.

> [...] Apparently A-Z characters are first class citizens, whereas the other
> characters are estranged cousins.

That's intentional. Remember that Baloo provides search over file contents too. With that in mind, it doesn't sound that arbitrary: letters and words (not necessarily A-Z: also numbers and other languages) contain the most information to build an index upon. What are the chances that a user is going to search for a document that has "." or " " or "_" somewhere inside? And what are the chances it will help to identify the document uniquely? Not to mention that restricting the index to alphanumeric characters reduces its size by a large factor.

If you're looking for a file with a name you know precisely, and which mostly contains non-alphanumeric characters, then "find" / KFind or any other filesystem crawler will most likely do better.

> Baloo = 5.55.0

I also couldn't help but notice that the version your distribution ships is a bit outdated. There were a large number of improvements to Baloo around 5.60+ (unrelated to this particular issue, though).
[frameworks-baloo] [Bug 405094] Symbols in name break search
https://bugs.kde.org/show_bug.cgi?id=405094

Ceaus changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |k...@pohw.nl

--- Comment #2 from Ceaus ---
I'm facing the same problem. Files are not found even though they are there. The logic of Igor Poboiko (comment #1) just complicates things:

1. If I search for files with very common letters in their name, such as 'a' or 'e', baloo reports just a handful, while I have hundreds in my home dir. At this point I have zero clue how to interpret the search results. I see some files, but many others are missing. How do I interpret this list? I know the list is incorrect (because I know the file exists). But to what extent? Are there other files missing from the search results? And if so, why?

2. If I search for single non-alphabet characters, such as filenames with a '-', '_' or ' ', baloo returns zero results. That is inconsistent with the results of (1). So now the question becomes even more difficult to answer: what is this list I am looking at?

Apparently A-Z characters are first class citizens, whereas the other characters are estranged cousins. To me this sounds rather arbitrary. baloo should simply return them all. If there is a genuine concern about the list being too long, then why not raise a warning: "Hey, are you sure you want a list containing 137K filenames?"

BTW: I'm on openSUSE Leap 15.1
Baloo = 5.55.0
[frameworks-baloo] [Bug 405094] Symbols in name break search
https://bugs.kde.org/show_bug.cgi?id=405094

Igor Poboiko changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |igor.pobo...@gmail.com

--- Comment #1 from Igor Poboiko ---
The problem here is that short terms (of length < 3) are only matched *exactly*. This is intentional: if we matched them by prefix (like we do with longer terms), there would most likely be just too many matches.

So in your case 'AB 1' doesn't match files #1 and #3 because they don't contain "1" exactly. But all of them do contain "AB" exactly.
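The rule from comment #1 can be tried out on the three test files. This is a simplified sketch, not Baloo's query code: it assumes a query is split on whitespace, every query term must match, terms shorter than 3 characters need an exact term, and longer terms match by prefix. The index contents are the lowercase file name terms of the three test files; SHORT_TERM_LIMIT, term_matches and search are names made up for this example.

```python
SHORT_TERM_LIMIT = 3  # per comment #1: terms of length < 3 are matched exactly

def term_matches(query_term: str, file_terms: set) -> bool:
    """Exact match for short terms, prefix match for longer ones."""
    q = query_term.lower()
    if len(q) < SHORT_TERM_LIMIT:
        return q in file_terms
    return any(term.startswith(q) for term in file_terms)

def search(query: str, index: dict) -> list:
    """Return the names whose terms satisfy every term of the query."""
    query_terms = query.split()
    return [name for name, file_terms in index.items()
            if all(term_matches(t, file_terms) for t in query_terms)]

# File name terms of the three test files (lowercased, without the 'F' prefix).
INDEX = {
    "AB - ABC - 1st": {"1st", "ab", "abc"},
    "AB - ABC - 1 filler": {"1", "ab", "abc", "filler"},
    "AB - ABC - 1st filler": {"1st", "ab", "abc", "filler"},
}

print(search("AB", INDEX))    # all three files
print(search("AB 1", INDEX))  # only 'AB - ABC - 1 filler'
```

With this rule, "1" never matches the term "1st", while a three-character query such as "1st" would match by prefix.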
[frameworks-baloo] [Bug 405094] Symbols in name break search
https://bugs.kde.org/show_bug.cgi?id=405094

nathan.figue...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Symbols in long name break  |Symbols in name break
                   |search                      |search