Hi!,

On Fri, 2008-02-29 at 09:10 -0500, Jamie McCracken wrote:
> On Fri, 2008-02-29 at 11:34 +0100, Carlos Garnacho wrote:
> > Hi!,
> >
> > I've attached a patch in bug #519337 to keep the extractor alive between
> > operations. This greatly improves performance, as it avoids having to
> > spawn/initialize the extractor constantly for each new file. With the
> > patch, the extractor shuts down by itself after 30 seconds of
> > inactivity, any testing is appreciated.
> >
> > Besides, I've been thinking a bit in this subject. Right now trackerd
> > waits synchronously for the metadata extractor output (and the same
> > happens for thumbnailing, even when such data isn't immediately
> > necessary), so only 1 file is processed at the same time.
> >
> > Has there been any thinking/work on making that parallelizable? I'm sure
> > there'd be performance improvements if there was a pool of extractors
> > which asynchronously processed a queue of filenames.
> >
>
> yeah although its tricky with threads (synchronisation and deadlock
> issues)

I didn't plan to use threads here. I've developed a small test extractor [1]
that spawns several extractors and manages them asynchronously through
watches. It requires the patched tracker-extractor from bug #519337. You can
run it with:

  ./test-extract [num-extractors] [path-to-extract]

Being a test, it just gets metadata from mp3 files, but the
tracker-extractor-pool.[ch] files can be easily adapted to tracker needs.

<snip>
>
> anyway to cut a long story short, daemonizing tracker-extract is not the
> way to go but rather to embed common and reliable (Eg not crash prone)
> formats in a tracker-file-indexer daemon. It should use dbus of course
> for flexibility. It could be threaded as it would be less complex than
> trackerd is at the moment

What would be the criteria for marking an extractor as reliable? I'd be
extra-careful there, since extractors deal with unknown data. Also, threading
brings other complexities, like the underlying libraries not being
thread-safe, or extractors that resort to command-line calls that are not
thread-aware at all, etc.

It's nice to know about your plans, they sound really great overall.

Regards,
  Carlos

[1] http://people.imendio.com/carlos/test-extractor-pool.tar.gz
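
For readers who want a feel for the pool idea without fetching the tarball,
here is a rough sketch in Python. It is not the code from
tracker-extractor-pool.[ch]: it only illustrates the same shape (several
long-lived extractor processes fed from one queue of filenames, driven by an
event loop and no threads). The worker script, the "title=" output format and
the names run_pool/extract_all are all made up for the example, and the
30-second idle shutdown is left out.

```python
import asyncio
import sys

# Hypothetical stand-in for a long-lived extractor process: it reads one
# path per line on stdin and answers with one line of "metadata" per path.
WORKER_SRC = r"""
import sys
for line in sys.stdin:
    path = line.strip()
    print("title=" + path.rsplit("/", 1)[-1], flush=True)
"""


async def worker(queue, results):
    # Spawn the extractor once and reuse it for every file taken from the
    # queue, instead of spawning a new process per file.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", WORKER_SRC,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    try:
        while True:
            try:
                path = queue.get_nowait()
            except asyncio.QueueEmpty:
                break  # nothing left to process
            proc.stdin.write((path + "\n").encode())
            await proc.stdin.drain()
            # Waiting here yields to the event loop, so the other
            # extractors keep working in the meantime.
            reply = await proc.stdout.readline()
            results[path] = reply.decode().strip()
    finally:
        proc.stdin.close()  # EOF ends the worker's read loop
        await proc.wait()


async def extract_all(paths, num_extractors=3):
    queue = asyncio.Queue()
    for p in paths:
        queue.put_nowait(p)
    results = {}
    await asyncio.gather(
        *(worker(queue, results) for _ in range(num_extractors))
    )
    return results


def run_pool(paths, num_extractors=3):
    # Synchronous entry point for the sketch.
    return asyncio.run(extract_all(paths, num_extractors))
```

Calling run_pool(["/music/a.mp3", "/music/b.mp3"], 2) would start two worker
processes and return a dict mapping each path to the line its worker printed.
Because each extractor is a separate process, a crash on bad input only takes
down that one worker, which is the property I'd want to keep.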
