Re: [Tracker] [PATCH] "Daemonize" metadata extractor

Mikkel Kamstrup Erlandsen Sat, 01 Mar 2008 06:10:31 -0800

On 29/02/2008, Jamie McCracken <[EMAIL PROTECTED]> wrote:
>
>
> On Fri, 2008-02-29 at 11:34 +0100, Carlos Garnacho wrote:
> > Hi!,
> >
> > I've attached a patch in bug #519337 to keep the extractor alive between
> > operations. This greatly improves performance, as it avoids having to
> > spawn/initialize the extractor constantly for each new file. With the
> > patch, the extractor shuts down by itself after 30 seconds of
> > inactivity. Any testing is appreciated.
> >
> > Besides, I've been thinking a bit about this subject. Right now trackerd
> > waits synchronously for the metadata extractor output (and the same
> > happens for thumbnailing, even when such data isn't immediately
> > necessary), so only one file is processed at a time.
> >
> > Has there been any thinking/work on making that parallelizable? I'm sure
> > there'd be performance improvements if there was a pool of extractors
> > which asynchronously processed a queue of filenames.
> >
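
To make the pool idea above concrete, here is a minimal sketch in Python (not Tracker code, and not what the patch does). The names extract_metadata and extract_all are hypothetical, and the "extractor" only inspects the file name; a real one would parse the file or talk to an out-of-process extractor. It only shows the shape: a fixed set of workers pulls file names from a shared queue, and the caller collects results in order.

```python
from concurrent.futures import ThreadPoolExecutor
import os


def extract_metadata(path):
    """Hypothetical stand-in for a real extractor: looks at the name only."""
    name = os.path.basename(path)
    ext = os.path.splitext(name)[1].lower()
    return {"path": path, "name": name, "ext": ext}


def extract_all(paths, workers=4):
    """Run extract_metadata over paths with a fixed pool of workers.

    The executor keeps an internal queue of pending file names and hands
    them to idle workers. pool.map returns results in input order, so the
    caller can store them one after another without extra locking.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(extract_metadata, paths))
```

Since only the calling thread touches the collected results, the workers share no mutable state, which sidesteps most of the synchronisation worries with threads.
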
>
>
> Yeah, although it's tricky with threads (synchronisation and deadlock
> issues).
>
> The plan for 0.7 is to split trackerd into :
>
> 1) Always active main daemon that does watching and processes search
> requests
>
> 2) tracker-file-indexer - called by (1) via dbus to index files. Nice
> +19 and ioniced. Exits when indexing is complete. Dbus activated after a
> crash or when new stuff to index comes about
>
> 3) tracker-email-indexer - called by (1) to index emails. Same as (2).
> File attachments would need to be handled by similar code to (1), which
> is disadvantageous though
>
> 4) xesam extractors - some extractors can be built into (1) and (2) so
> as to become a daemonised extractor; others will be specified by xesam
> and called out of process by (1)
>
> 5) xesam crawlers - as (4) but for containerised objects like news feeds
>
>
> The above would be faster and much leaner on memory, as memory
> consumed by indexing would be released when indexing has finished. It
> should be more maintainable and less complex than a monolithic trackerd.
>
> there would also need to be private shared libs for the above components
> to enhance code reuse
>
> the xesam stuff would easily allow 3rd party extractors and crawlers to
> be implemented
>
> Anyway, to cut a long story short, daemonizing tracker-extract is not the
> way to go; it is better to embed common and reliable (e.g. not crash-prone)
> formats in a tracker-file-indexer daemon. It should use dbus of course
> for flexibility. It could be threaded as it would be less complex than
> trackerd is at the moment.
>
> Designing the above will be tricky but should go hand in hand with
> refactoring. If that's something you or others want to work on then we
> should discuss on IRC
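
For discussion, the life cycle described for the file indexer in (2) could look roughly like the Python sketch below: lower the priority, work through the queue, and exit once nothing new has arrived for a while so the memory is given back. Everything here is an assumption for illustration. run_indexer, index_file and the in-process queue are hypothetical names; the real design would receive work over dbus and run as its own process, and ionice is not covered.

```python
import os
import queue


def run_indexer(work, index_file, idle_timeout=30.0, niceness=19):
    """Drain `work` at low CPU priority; return once it stays empty.

    work         -- a queue.Queue of file paths (stand-in for dbus requests)
    index_file   -- callable invoked once per path (hypothetical indexer)
    idle_timeout -- seconds to wait for new work before giving up
    niceness     -- increment passed to os.nice(); 19 asks for lowest priority

    Returns the number of files indexed.
    """
    try:
        os.nice(niceness)  # Unix only; affects the whole process
    except (AttributeError, OSError):
        pass  # not available or not permitted: keep the current priority

    indexed = 0
    while True:
        try:
            path = work.get(timeout=idle_timeout)
        except queue.Empty:
            # Nothing arrived within idle_timeout: stop, so the process
            # can exit and release whatever memory indexing consumed.
            return indexed
        index_file(path)
        indexed += 1
```

A caller would fill the queue, call run_indexer(work, some_callable), and let the process end when it returns. Starting it again on demand is the part dbus activation would handle in the plan above.
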



Jamie, if you have more in-depth design ideas, it would be a good idea to post
them on the Xesam ml, specifically about the shared Xesam metadata
extractors and crawlers. There has not been much concrete discussion on
these topics.

Cheers,
Mikkel

_______________________________________________
tracker-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/tracker-list
