https://bugs.kde.org/show_bug.cgi?id=420939
--- Comment #47 from [email protected] ---
(In reply to Scott from comment #46)

No problem, we carry on troubleshooting.

> I think the problem is more than just misidentifying mime types.

Finding out about the mimetypes, and that baloo would never attempt to index
some files, was one step along the way. Good to find out, but there's more to
do.

> 3/ Further it reports files waiting to be indexed and files failed to index
> both being zero when in fact approximately 1,000 of the 6,000 files in the
> dataset have not been indexed. I have restarted baloo repeatedly and they
> never get indexed, it re-indexes what it had before.

It's possible that we've got another mimetype issue with these files, or they
are your 1000 biggest files, or something else. I'd suggest copying one of
them to your home directory and checking it with

    xdg-mime query filetype ...newstrangefile...

Check that the mimetype is sensible, then see what

    balooshow -x ...newstrangefile...

says.

> 1/ baloo terminates during indexing for unknown reasons (not
> hanging/freezing as I erroneously stated previously) without providing a
> reason code.

I'll ask a bit more about this. Your "balooctl status" output says

> Baloo File Indexer is running
> Indexer state: Idle

That's what baloo says when it's alive and thinks it has nothing more to do.
There is the content indexer process "baloo_file_extractor" that is run when
indexing is necessary; it does its job, saves the results, stops, and is run
again when there is more to do. This would/should happen in the background
and you wouldn't see exit codes.

> 2/ On restarting the indexing baloo re-indexes the same files with an
> erroneous message that the files have changed (see my last email) or added
> with baloo being turned off. Baloo is not checking that these index entries
> already exist or there is some problem with the index file itself and so
> just duplicates them which is why baloo reports over 21,000 files indexed
> from a dataset only containing 6,000 entries.
The error message is:

> ... id seems to have changed. Perhaps baloo was not running, and this file
> was deleted + re-created

We need to check the ID and see if it is really changing. Ask with "stat";
you'll get something like:

    $ stat 1.ts
      File: 1.ts
      Size: 41416704   Blocks: 80896   IO Block: 4096   regular file
    Device: fc01h/64513d   Inode: 794964   Links: 1
    Access: (0664/-rw-rw-r--)  Uid: ( 1000/ test)   Gid: ( 1000/ test)
    Access: 2021-07-24 22:50:57.838161084 +0200
    Modify: 2021-07-24 22:50:57.838161084 +0200
    Change: 2021-07-24 22:51:42.686181710 +0200
     Birth: -

It's the "Device" and "Inode" numbers that you need to keep your eye on:

    Device: fc01h/64513d   Inode: 794964

If you reboot and these change, baloo will think it's got a new file and try
to index it again. Keep a note of the numbers, check again after a reboot and
compare.

You could also try a baloosearch for one of the files that always seems to be
reindexed:

    $ baloosearch -i ...oneofyoursavedfiles...

If you are OK, baloosearch will give a single result; if the ID has been
changing, "baloosearch -i" would show several lines, with different ID
numbers and the same file/pathname. Something like:

    $ baloosearch -i testfile
    9ca00000028 /home/test/testfile
    9ca0000002a /home/test/testfile
    9ca0000002c /home/test/testfile

That would be a red flag...

> I had to disable baloo because it somehow seriously interferes with my
> ability to move files from the admin PC to the server. With baloo running
> on the server any attempt to transfer files to it results in very slow
> transfer speeds and on occasion failure to complete the move and this is
> occuring while the indexer is reporting idle.

I can only guess where - but you are indexing *really* large files, and there
were a couple of fixes two months ago to stop a mime lookup from reading the
whole file into memory. Bug 398908, fixed according to

    https://bugs.kde.org/show_bug.cgi?id=398908#c97

with 5.83.
If you don't have this version, maybe the best thing to do is wait until it
gets to you with an update.
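
If it helps, the two checks above (comparing the stat numbers across a reboot,
and looking for one pathname listed under several IDs) could be scripted
roughly like this. It is only a sketch: the function names snapshot_ids and
duplicate_paths and the file name ids.before are made up for illustration, it
assumes GNU stat (for the -c format option), and it assumes "baloosearch -i"
prints one "ID path" pair per line as in the example output above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of baloo.

# Print "device inode path" for each file given.
# GNU stat format codes: %d = device number (decimal), %i = inode, %n = name.
snapshot_ids() {
    stat -c '%d %i %n' -- "$@"
}

# Read "ID path" lines on stdin (assumed shape of "baloosearch -i" output)
# and print "count path" for every path that shows up under more than one ID.
duplicate_paths() {
    awk '{
        id = $1
        path = $0
        sub(/^[^ ]+ +/, "", path)          # drop the ID column, keep the path
        if (!((path, id) in seen)) {       # count each (path, ID) pair once
            seen[path, id] = 1
            count[path]++
        }
    }
    END {
        for (p in count)
            if (count[p] > 1)
                print count[p], p
    }'
}

# Possible use (commented out so that sourcing this file runs nothing):
#   snapshot_ids ...oneofyoursavedfiles... > ids.before
#   ... reboot ...
#   snapshot_ids ...oneofyoursavedfiles... | diff ids.before -
#   baloosearch -i ...oneofyoursavedfiles... | duplicate_paths
```

No output from the diff would mean the device and inode numbers survived the
reboot; any line printed by duplicate_paths would be the red flag described
above.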
