> at low I/O priority, without unpleasantly degrading system performance.
> I imagine the sheer seek cost of pulling all those dentries, inodes into
> memory, and evicting all the other useful data you had around - is a big
> part of the plague. Hopefully btrfs will improve the situation somewhat
> here, but wrt. inode / dentry management I suspect there is no really
> good solution.

On rotating media it's seek and access times. This is amplified on most
older systems by the fact that ATA devices had no queueing interface, so
the drive couldn't do any smart re-ordering to extract further
parallelism.

SSD is more important here than btrfs. Filesystems can try to be clever
and hide the fact that rotating media sucks for latency versus processing
power, but only SSD actually fixes the problem properly.

> Unfortunately, as soon as we have this, it is only a small
> feature-creep step to "lets index all .c/.h files to extract comments in
> the API documentation" - which (I suspect) then commits you to the
> disaster of irritating a lot of developers - so they turn it off, and
> getting bogged down indexing things no-one is ever going to want indexed
> by tracker (?).

I think there lies a mistaken assumption. The actual indexing has a
fairly high cost. The cost of extracting metadata while indexing ought to
be relatively low in comparison. That argues that allowing stuff to plug
into the indexing based on file type is useful. It's not really function
creep either, given the only interface the indexer needs is

- who is associated with this file type (which exists)
- give me your metadata for this file content

and if there is nobody wanting to do so then who cares.

If apps provide the interface for metadata extraction (into a tag soup or
something) then if you don't have the app installed you won't index for
it. Document to tag ought to be fast.

> Personally, I'd start by ignoring any directory tree with a configure*
> script in the top-level, or perhaps a .git / .svn directory - that
> should reduce the inotify pain :-)
>
> So - my point is: are the devs fetching source code at the console -
> that you are concerned about above, really in the target audience for
> tracker ? and if so why ?

How about "who sent that patch, what are the related emails and when were
they last on irc" - a classic developer query.
Possibly bundled in with "do I have a picture of them" (conferences) and
"who are their close friends" (other ways to get hold of and see
connections), "where are they right now" (irc connecting address, email
headers and geodata for IP addresses).

Or in short - developers are not different. A lawyer wants to do the same
thing within a firm for a case note, a CAD designer for a design change,
a secretary for letters, etc.

Physical indexing (the file walking side), extracting meaning and query
processing are three unrelated tasks. In your developer case, if I've got
various git helpers installed it would be nice if the indexer bothered to
talk to the git plugins about source code and git trees. If I don't have
them installed it doesn't need to - it's a modular problem.

Maybe you also need to learn what types of metadata people use the most
for presentation (eg by what links they follow) but that's another story
in the UI anyway.

Alan
_______________________________________________
desktop-devel-list mailing list
desktop-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/desktop-devel-list