Hello all,

As you know, our main goal for the oct12 release of Koha is to introduce Solr as an alternative search engine. BibLibre has already explained the improvements this search engine will bring on this blog page: http://drupal.biblibre.com/en/blog/entry/solr-developments-for-koha
During the hackfest in Marseille, a group of four people (Claire, Henri-Damien, Juan and Zeno) worked out how to introduce this change smoothly. The first goal is that a library wanting to keep running Zebra still can. As some librarians may want to use a search engine other than Zebra or Solr, we want to follow a path that leads to better modularity. I also think most of us agree that the current search code is ugly and very hard to maintain or improve.

The hackfesters have produced a drawing explaining how we could name the different packages: https://docs.google.com/a/biblibre.com/drawings/d/1ZdsQsoThYgIVSgH3LqgRZy17xm9X7XkLT6RG3fDYCzs/edit, with a page on the wiki: http://wiki.koha-community.org/wiki/Switch_to_Solr_RFC#.23kohahack12

In this drawing (read from bottom to top), there are 2 main layers, "Search" and "Index", which are responsible for searching and indexing respectively. The "Conf" object will be responsible for retrieving the configuration (the current getIndexes), the "Query" object will be responsible for building the query in the SearchEngine grammar, and the "Plugin" object will be responsible for processing records before indexing (for example, normalizing data).

Claire (from BibLibre) made a first implementation of this organization on GitHub: https://github.com/clrh/wip-searchengine-layer/tree/master/lib/SearchEngine. Juan (from xercode) also worked on this organization, on the Zebra side. His code is also available on GitHub: https://github.com/xercode/Data-SearchEngine-Zebra. Henri-Damien is now continuing the work of implementing Zebra within this global structure.

In the meantime, 2 other directions have been followed:

* Frédéric (Demians, from Tamil) wrote a daemon for Zebra indexing (see http://git.tamil.fr/?p=Koha-Contrib-Tamil;a=summary). This resulted in bug http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=7759, which documents how to introduce this daemon for indexing. Liz (and maybe others) is using it without any problem.
This git repository introduces some other tools, but what they actually do is not completely clear to me (Frédéric, feel free to add some info...)

* Galen (Charlton, from Equinox) wrote some code that you can see in http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=7818, which is in "needs signoff" status. The bug description covers a lot of things: DOM indexing for biblios (with a tool to automatically write the DOM XSL from record.abs), a data normalizer, and an indexer (Koha::Indexer). Unless I've missed something (Galen, tell me if I'm wrong), for now only the DOM indexing has been submitted; the normalizer and indexer have not.

What we all agree on: we should have a clearer way to Normalize / Index / Search in Koha. That's great! The structure described by the hackfesters is great because it's independent of the SearchEngine you use. I think large portions (if not all) of Koha::Contrib::Tamil could be used to write the Zebra indexing layer. I also think that the DOM indexing part of what Galen has submitted can be signed off and pushed without any risk, but the normalizer and indexer parts will need coordination, to avoid having BibLibre/xercode working in one direction and Galen working in another. I really like the idea of a normalizer that is not necessarily tied to MARC; that could be useful in the future.

That's why I propose organizing an IRC meeting (date and time to be defined, but it will be European afternoon / US morning) with all volunteers to coordinate their efforts. I think this meeting should be regular (monthly?). After each meeting, a summary of the conclusions would be written up on the wiki and posted to this mailing list.

My proposal: if you're interested in participating in this effort, please reply to this mail. (I'll then start a doodle to find a proper time.
I propose 2 hours for the first meeting, then, hopefully, shorter meetings.)

Juan, Galen, Zeno: you're considered interested in this topic ;-)

--
Paul POULAIN
http://www.biblibre.com
Expert in free software for information and documentation
Tel : (33) 4 91 81 35 08
