D12787: Ignore more types of source files
This revision was automatically updated to reflect the committed changes. Closed by commit R293:7529727e4624: Ignore more types of source files (authored by ngraham). REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=34155=34241 REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
bruns accepted this revision.
bruns added a comment.
This revision is now accepted and ready to land.

Not tested by me, but looks good in general.

REPOSITORY R293 Baloo
BRANCH more-excluded-source-files (branched from master)
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 34155. ngraham added a comment. Omit all .map files, and also .ini files REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=34151=34155 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham marked an inline comment as done. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
bruns added a comment.

If you want to read more about text in SVG: http://tavmjong.free.fr/blog/

To show a generalized XML extractor is sufficient for SVG:
- Path data: ``
- Single line: `Single line`
- Multiline: `This is some multiline Text`

Non-text tags are empty (i.e., are defined by attributes only).

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
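The point about a generalized XML extractor can be illustrated with a minimal sketch. It assumes the elements meant above are SVG's `<text>` and `<tspan>`, and it uses a deliberately naive hand-written scanner rather than a real XML parser. The function name `extractSvgText` is hypothetical and is not part of Baloo or KFileMetaData.

```cpp
#include <string>

// Hypothetical helper: collects character data found inside <text>...</text>.
// Naive on purpose: no entity decoding, comments, CDATA, or namespace support.
std::string extractSvgText(const std::string &svg)
{
    std::string out;
    int textDepth = 0; // > 0 while the scanner is inside a <text> element
    std::size_t i = 0;

    while (i < svg.size()) {
        if (svg[i] != '<') {
            if (textDepth > 0) {
                out += svg[i]; // character data inside <text> is content
            }
            ++i;
            continue;
        }

        const std::size_t close = svg.find('>', i);
        if (close == std::string::npos) {
            break; // malformed input: stop scanning
        }
        const std::string tag = svg.substr(i + 1, close - i - 1);
        const bool selfClosing = !tag.empty() && tag.back() == '/';

        // Any tag inside <text> (e.g. <tspan>) separates runs of words.
        if (textDepth > 0 && !out.empty() && out.back() != ' ') {
            out += ' ';
        }

        if (tag == "/text") {
            if (textDepth > 0) {
                --textDepth;
            }
        } else if (!selfClosing && (tag == "text" || tag.rfind("text ", 0) == 0)) {
            ++textDepth;
        }
        i = close + 1;
    }

    if (!out.empty() && out.back() == ' ') {
        out.pop_back(); // drop the separator left by the last closing tag
    }
    return out;
}
```

Elements such as `<path>` carry their data in attributes only, so they contribute nothing to the result. A production extractor would use a proper XML parser instead of this scanner.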
D12787: Ignore more types of source files
ngraham marked 3 inline comments as done. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 34151. ngraham added a comment. Revert change to omit SVG files REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=34146=34151 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
bruns added inline comments.

INLINE COMMENTS

> ngraham wrote in fileexcludefilters.cpp:154
> My impression is that Baloo is really intended for user files; SVGs only get
> their content indexed by accident, because they happen to be textual. I don't
> think there's any textual content inside an SVG file that you'd actually want
> to have indexed.

SVGs are user files, and anything inside `` is textual content. You can have several paragraphs of text inside SVGs. We index the RDF metadata (author, title, ...) for PDFs, EPUB, ... so we should for SVG as well.

Of course it is pointless to index e.g. the tags themselves, or the content of any non-textual tag; that's the reason I asked for an XML extractor.

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham added inline comments.

INLINE COMMENTS

> bruns wrote in fileexcludefilters.cpp:154
> Hm, not too sure about this one - SVG typically has RDF metadata, and also
> everything in `` tags qualifies as "content".
> Do we have a generalized XML extractor?

My impression is that Baloo is really intended for user files; SVGs only get their content indexed by accident, because they happen to be textual. I don't think there's any textual content inside an SVG file that you'd actually want to have indexed.

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
bruns added inline comments.

INLINE COMMENTS

> fileexcludefilters.cpp:154
> +"image/svg+xml",
> +"image/svg+xml-compressed",
> "application/x-awk",

Hm, not too sure about this one - SVG typically has RDF metadata, and also everything in `` tags qualifies as "content". Do we have a generalized XML extractor?

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
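For readers unfamiliar with what the hunk above changes, here is a rough sketch of a mimetype exclusion check. It assumes the list is a flat set of mimetype strings that is consulted before a file's content is handed to an extractor. The names `kExcludedMimetypes` and `isExcludedMimetype` are hypothetical, and the entries are only the ones visible in this thread, not the full list from `fileexcludefilters.cpp`.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical subset of excluded mimetypes, taken from the diff hunks
// quoted in this thread. The SVG entries are left out on purpose: the
// change adding them was reverted during review.
static const std::unordered_set<std::string> kExcludedMimetypes = {
    "application/x-awk",
    "text/csx",
    "text/vnd.trolltech.linguist",
};

// Returns true when a file of this mimetype would be skipped.
bool isExcludedMimetype(const std::string &mimetype)
{
    return kExcludedMimetypes.count(mimetype) > 0;
}
```

Under this model, adding `image/svg+xml` to the set would hide any text inside SVG files from the index, which is the concern raised in the comment.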
D12787: Ignore more types of source files
ngraham updated this revision to Diff 34146. ngraham added a comment. Add some more REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=34145=34146 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham marked an inline comment as done. ngraham added a comment. How do people feel about adding `*.ini` to the exclusions list? REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham marked 5 inline comments as done.
ngraham added inline comments.

INLINE COMMENTS

> bruns wrote in fileexcludefilters.cpp:82
> That's not what I meant (I am not aware of anything generating a `Bytecode`
> file literally).
> I meant changing the `// Compiled files` comment to `// Bytecode files`,
> which all the ones below are.

Heh, oops.

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 34145. ngraham added a comment. Fix misinterpretation REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=34143=34145 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
bruns added inline comments.

INLINE COMMENTS

> fileexcludefilters.cpp:82
> +"*.jsc", // Javascript
> +"Bytecode",
>

That's not what I meant (I am not aware of anything generating a `Bytecode` file literally). I meant changing the `// Compiled files` comment to `// Bytecode files`, which all the ones below are.

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 34143. ngraham added a comment. More buildy files REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=34141=34143 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
bruns added a comment.

Does anyone know if there are any artifacts generated by the meson build system?

INLINE COMMENTS

> fileexcludefilters.cpp:74
>
> // Compiled files
> "*.class", // Java

Probably `Bytecode` - we have `.o` above, which is also compiled.

> fileexcludefilters.cpp:76
> "*.class", // Java
> "*.pyc", // Python
> "*.elc", // Emacs Lisp

For Python 2, there is also `.pyo` (Python 3 is covered by the `__pycache__` directory filter).

> ngraham wrote in fileexcludefilters.cpp:69
> As far as I can tell, we do not, and they have to be manually listed. I've
> added `qmlc` and `jsc`. Any more you can think of?

Static library - `.a`

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
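As a rough illustration of how glob-style exclude patterns like the ones discussed above behave, the sketch below matches a file or directory name against a small pattern list using POSIX `fnmatch`. This is an assumption for illustration only: the thread does not show how Baloo evaluates its filters. The names `kExcludePatterns` and `isExcludedName` are hypothetical, and the list contains only patterns mentioned in this thread.

```cpp
#include <fnmatch.h> // POSIX glob-style matching

#include <string>
#include <vector>

// Hypothetical subset of exclude patterns mentioned in this review.
static const std::vector<std::string> kExcludePatterns = {
    "*.o",     "*.a",    "*.so",   // object files and libraries
    "*.class", "*.pyc",  "*.pyo",  // Java and Python 2 bytecode
    "*.elc",   "*.qmlc", "*.jsc",  // Emacs Lisp, QML and JavaScript caches
    "__pycache__",                 // Python 3 bytecode directory
};

// Returns true when the bare name (no directory part) matches any pattern.
bool isExcludedName(const std::string &name)
{
    for (const std::string &pattern : kExcludePatterns) {
        if (fnmatch(pattern.c_str(), name.c_str(), 0) == 0) {
            return true;
        }
    }
    return false;
}
```

With this list, `module.pyo` and `libfoo.a` would be skipped while `main.py` would not, which is why each extension has to be listed by hand.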
D12787: Ignore more types of source files
ngraham marked an inline comment as done. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham marked 2 inline comments as done.
ngraham added inline comments.

INLINE COMMENTS

> broulik wrote in fileexcludefilters.cpp:69
> Don't we ignore blobs already? If not, we should also add stuff like `qmlc`
> and `jsc`

As far as I can tell, we do not, and they have to be manually listed. I've added `qmlc` and `jsc`. Any more you can think of?

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 34141. ngraham added a comment. Add more blobs REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=33920=34141 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham, michaelh, bruns Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
broulik added inline comments.

INLINE COMMENTS

> fileexcludefilters.cpp:69
> +"*.css.map,"
> +"*.so",
> +"*.db",

Don't we ignore blobs already? If not, we should also add stuff like `qmlc` and `jsc`

> fileexcludefilters.cpp:77
> "*.elc", // Emacs Lisp
> +"*.qrc", // QML
>

`qrc` is a Qt resource file, not a QML file

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham, michaelh, bruns
Cc: broulik, cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham added reviewers: michaelh, bruns. ngraham added a comment. Friendly ping! REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham, michaelh, bruns Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 33920. ngraham added a comment. Also omit node_packages folders REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=33919=33920 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham edited the summary of this revision. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham edited the summary of this revision. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 33919. ngraham added a comment. Add more to also fix 39093 REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=33915=33919 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 33915. ngraham added a comment. Revert unintentional change REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=33912=33915 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES src/file/fileexcludefilters.cpp To: ngraham Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham marked an inline comment as done. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham updated this revision to Diff 33912. ngraham added a comment. add missing comma REPOSITORY R293 Baloo CHANGES SINCE LAST UPDATE https://phabricator.kde.org/D12787?vs=33903=33912 BRANCH more-excluded-source-files (branched from master) REVISION DETAIL https://phabricator.kde.org/D12787 AFFECTED FILES CMakeLists.txt src/file/fileexcludefilters.cpp To: ngraham Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
cfeck added inline comments.

INLINE COMMENTS

> fileexcludefilters.cpp:142
> +"text/csx",
> +"text/vnd.trolltech.linguist"
> "application/x-awk",

,

REPOSITORY R293 Baloo
REVISION DETAIL https://phabricator.kde.org/D12787
To: ngraham
Cc: cfeck, kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham edited the summary of this revision. REPOSITORY R293 Baloo REVISION DETAIL https://phabricator.kde.org/D12787 To: ngraham Cc: kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns
D12787: Ignore more types of source files
ngraham created this revision.
Restricted Application added projects: Frameworks, Baloo.
Restricted Application added subscribers: Baloo, kde-frameworks-devel.
ngraham requested review of this revision.

REVISION SUMMARY
Add more types of development-related files to the exclusion lists. These files aren't useful to index, and having them there can bog down Baloo.

BUG: 394002
FIXED-IN 5.47

TEST PLAN
Created a bunch of files of the newly excluded types. Baloo didn't index them.

REPOSITORY R293 Baloo
BRANCH more-excluded-source-files (branched from master)
REVISION DETAIL https://phabricator.kde.org/D12787
AFFECTED FILES CMakeLists.txt src/file/fileexcludefilters.cpp
To: ngraham
Cc: kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns