Jos van den Oever wrote:
> Hi all,

Hi Jos,

great to have you in on this discussion.

>
> Strigi has a few features that are not in Tracker or Beagle and misses
> a number of features that the other programs have. But the core
> functionality of Strigi, indexing data, is something that it shares.
> One important distinction has to be made straightaway: the difference
> between indexing metadata and storing metadata. Strigi only indexes
> metadata. If you think your disk is full, you can just throw away
> the index, because there is no data of value in there. All that's in
> there is an index that allows you to find your data quickly.
> Personally, I think _storing_ metadata in an indexer is not a good
> idea. (I do think that an index on a metadata store is a good idea,
> but that's a different matter). This is a large difference with
> Tracker which does act as a metadata store of 'first class objects'
> whatever that means. Beagle is also mainly an index. (Is any
> non-redundant data lost if I delete my Beagle index, Joe?)

First, to clarify: Tracker is not a dedicated indexer (like Beagle and
Strigi) but is first and foremost a database which has indexing as a
side feature.

Our metadata store (sqlite) is quite separate from our full text
indexer (QDBM), which can be deleted if not required - the data there
is just as expendable as in Strigi's and Beagle's case. No metadata is
"stored" in the full text indexer, although indexable metadata is of
course indexed in it.

Tracker can also be run as a standalone metadata store/server without
any indexing if desired (with the --disable-indexing command line
option).

[snip]

>
> So if this is not a sales talk, what is it? It's a call for
> standardization. This discussion between competing programs is a great
> time to start talking about common functionality.
> With regard to desktop search there are many things that can be
> standardized:
> - query language
> - metadata names and meaning
> - test suites
> - DBus APIs
> - index formats
>
> I won't discuss index formats because, even though Beagle and Strigi
> both use the Lucene index format, this is an implementation detail and
> defines performance and disk usage and should not be frozen into a
> standard.
>
> The query language as used by Beagle and Strigi is very similar (no
> coincidence) and is a good start for standardization. The largest
> drawback of the language used is the ambiguity of the field
> specifiers.
>
> Now that DBus v1 is almost upon us, the barriers between GNOME and KDE
> are diminishing. Functionality defined by a DBus API can be
> implemented in any language and as such, I think GNOME should choose a
> DBus API to use and share with KDE

and yes, this is my desire also.

>
> Test suites. I'd love there to be a common test suite that says: if
> you index this data with these parameters, you should get these
> results from this query. Strigi will develop such tests naturally.
> Being able to share them across projects would mean that programs
> would compete on merit and without the usual prejudices and license
> and library incompatibilities.
> Strigi has a DBus interface for searching, so does Tracker. We should
> compare them and find a common interface. Of course the respective
> GNOME and KDE developers should decide which DBus API should be used
> by their applications. Freedesktop.org would be a good place to define
> these interfaces.

We should have an org.freedesktop.indexer interface that we can all
share. Implementation-specific stuff can then reside in each project's
own unique interface.

>
> Metadata naming and meaning. This is something which is rather hard.
> Dublin Core is part of it. It names some types of metadata. I've
> already mailed about this with Jamie in the past.
> In my opinion, the issue should be separated into smaller definitions
> that say what metadata fields can be extracted from certain filetypes.
> Indexer plugins could then advertise that they implement this
> functionality. The metadata names should also be used when searching
> and there, for convenience, they should be abbreviated as is current
> practice.
>
> So, rather a long mail that can be summarized in: please accept an API
> for searching and not a suite of programs (indexer + guis to it) and
> start thinking about standardizing _indexable_ metadata (other
> metadata is a whole different can of worms that I won't touch). This is
> still possible since neither KDE nor GNOME have agreed on a program
> for indexing and by adopting only an API, programs will be forced to
> collaborate to adhere to the API as well as possible, meaning the user
> wins.

I agree from the indexing point of view, but GNOME requires a reference
implementation to be available - where there have been multiple
candidates, GNOME has always blessed one (e.g. Epiphany vs Galeon), but
that does not mean distros use the blessed one (e.g. Firefox is more
likely to be used as the dominant web browser even though I think
Epiphany is better in a GNOME setting).

The other (somewhat unique) features of Tracker - desktop-wide tagging,
extensible metadata etc. - are still vital ingredients for GNOME, and
that's one of the other reasons for proposing Tracker. We need it to be
more integrated if GNOME is to become more integrated in this regard.
We also have big problems with lots of #ifdef'ing in code, so
standardising would be a big win. I'm sure when I try to implement
Epiphany's next generation bookmark/history stuff in Tracker's first
class object database they would prefer it not to be #ifdef'ed?

So with Tracker being able to be used as a standalone metadata store
without any indexing, there shouldn't be a need to confine what goes
into GNOME to just pure indexing. We could leave the door open to just
Tracker, or Tracker+Beagle, or Tracker+Strigi, with the latter two cases
taking ownership of the shared indexing DBus interface and Tracker
confined to metadata storage only.

Some people might not like that, but I think it's a practical
compromise. With Tracker being the only one written in pure C, it is
therefore the only one that can *ultimately* get into the GNOME platform
and be fully integrated (at the moment I am just proposing it for
desktop, which is just a simple blessing, nothing more).

I hope having a shared interface for the pure indexing case will solve
the concerns other indexers have and allow us to integrate Tracker.
Otherwise we risk restricting innovation and integration with a pure
indexing solution, which would mean we miss out on the more exciting
features of Tracker and their usefulness to Epiphany and other apps.

-- 
Mr Jamie McCracken
http://jamiemcc.livejournal.com/

_______________________________________________
desktop-devel-list mailing list
desktop-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/desktop-devel-list