starting a workgroup

Frédéric Demians Sat, 31 Mar 2012 08:22:09 -0700

> The hackfesters have produced a drawing explaining how we could name
> the different packages:


The classes hierarchy seems to rely on Data::SearchEngine module as
abstraction layer:

https://metacpan.org/release/Data-SearchEngine

Are you in touch with the module author? He could give us interesting
feedback on his module. What kind of implementation has he done? Is
there any other implementation done by someone other than its author? Is
it generic enough to be used in the Koha context?

There is an ElasticSearch implementation of Data::SearchEngine:

https://github.com/gphat/data-searchengine-elasticsearch

In its implementation notes, we can read:

  ElasticSearch's query DSL is large and complex. It is not well suited
  to abstraction by a library like this one. As such you will almost
  likely find this abstraction lacking. Expect it to improve as the author
  uses more of ElasticSearch's features in applications.

> * Frédéric (Demians, from Tamil) wrote a daemon for zebra indexing
> (see http://git.tamil.fr/?p=Koha-Contrib-Tamil;a=summary), that
> resulted in bug
> http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=7759, that
> document how to introduce this daemon for indexing. Liz (and maybe
> others) are using it without any problem.

I don't think there is a lot to do to have an abstraction layer for
indexing with SolR/Zebra/xx.

Koha::Indexer
 |
 +-- Koha::Indexer::SolR
 +-- Koha::Indexer::Zebra

An indexer is then able to index biblio or authority records, either
partially (queued records only) or fully. A command line indexing script
is nothing more than a wrapper around this class. It's even possible to
run the task from the web interface. There is a 'watcher' associated with
the indexer which could communicate asynchronously with a web UI via a
JavaScript callback function.
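
To make the idea concrete, here is a minimal sketch of such a hierarchy.
It is written in Python rather than Perl only to keep it short, and every
class and method name in it (Indexer, send, index, the watcher callback)
is hypothetical, not existing Koha code. The Zebra subclass just collects
records in memory where a real one would talk to the search engine.

```python
from abc import ABC, abstractmethod


class Indexer(ABC):
    """Hypothetical counterpart of Koha::Indexer: engine-independent driver."""

    def __init__(self, watcher=None):
        # Optional callback receiving (done, total) progress updates,
        # e.g. to feed a progress bar in a web UI.
        self.watcher = watcher

    @abstractmethod
    def send(self, record):
        """Push one record to the search engine (engine-specific)."""

    def index(self, records):
        """Index the given records: a full dump or a queued subset alike."""
        records = list(records)
        for done, record in enumerate(records, start=1):
            self.send(record)
            if self.watcher:
                self.watcher(done, len(records))
        return len(records)


class ZebraIndexer(Indexer):
    """Stand-in for Koha::Indexer::Zebra (assumption: no real Zebra call)."""

    def __init__(self, watcher=None):
        super().__init__(watcher)
        self.sent = []

    def send(self, record):
        # Placeholder: a real implementation would update Zebra here.
        self.sent.append(("zebra", record))


if __name__ == "__main__":
    indexer = ZebraIndexer(watcher=lambda done, total: print(done, "/", total))
    indexer.index(["biblio 1", "biblio 2"])
```

A command line script or a web page would then only build the right
subclass and call index() on it.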

As with rebuild_zebra.pl, indexing is a two-step process: (1) export
records and (2) index records. Since the record format/syntax to be sent
to the search engine may vary (XML, ISO2709, JSON for ElasticSearch), the
exporter must also be a generic class, subclassed by a specific class for
each search engine that implements the normalization processing. I'd need
to see how it works now in the SolR/BibLibre branch.
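
A rough sketch of what such an exporter could look like, again in Python
for brevity. The names (Exporter, normalize, serialize, export) and the
flat dictionary used as a record are assumptions made for illustration
only; real MARC records and the real normalization rules are far richer.

```python
import json
from abc import ABC, abstractmethod
from xml.etree import ElementTree as ET


class Exporter(ABC):
    """Hypothetical generic exporter: normalize, then serialize per engine."""

    def normalize(self, record):
        # Toy normalization: trim surrounding whitespace in string values.
        return {k: v.strip() if isinstance(v, str) else v
                for k, v in record.items()}

    @abstractmethod
    def serialize(self, record):
        """Turn one normalized record into the engine's input syntax."""

    def export(self, records):
        return [self.serialize(self.normalize(r)) for r in records]


class JsonExporter(Exporter):
    """Output for an engine that ingests JSON documents."""

    def serialize(self, record):
        return json.dumps(record, sort_keys=True)


class XmlExporter(Exporter):
    """Output for an engine that ingests XML records (made-up element names)."""

    def serialize(self, record):
        root = ET.Element("record")
        for key, value in sorted(record.items()):
            ET.SubElement(root, "field", name=key).text = str(value)
        return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    records = [{"id": 1, "title": " Koha "}]
    print(JsonExporter().export(records)[0])
    print(XmlExporter().export(records)[0])
```

The indexing step then only sees strings in the syntax its engine expects.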

For me, the most undecided/mysterious part of the whole is the query
parser. Currently, Koha supports several syntaxes thanks to the ZOOM YAZ
client: PQF, CCL and CQL. Queries in those syntaxes are passed directly
to ZOOM. I can't figure out how this can be reproduced with a search
engine other than Zebra... This isn't a small piece of engineering. See
the quotation above about Data::SearchEngine::ElasticSearch. It's one
thing to abstract a search result and its paging, and another thing to
abstract a query language. Imagine three languages...
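
To give an idea of the gap, here is a toy sketch (Python, all names
hypothetical) that renders the same tiny query tree as PQF and as a
Solr-style query string. The Bib-1 use attributes 4 (title) and 1003
(author) are standard; the Solr field names are an assumption, and
nothing here handles escaping, truncation, relations or the rest of
what CCL and CQL allow.

```python
from dataclasses import dataclass


@dataclass
class Term:
    index: str  # logical index name, e.g. "title"
    value: str


@dataclass
class And:
    left: object
    right: object


# Bib-1 use attributes for the two logical indexes used in this sketch.
BIB1_USE = {"title": 4, "author": 1003}


def to_pqf(node):
    """Render the tree in PQF, which uses prefix notation."""
    if isinstance(node, And):
        return f"@and {to_pqf(node.left)} {to_pqf(node.right)}"
    return f'@attr 1={BIB1_USE[node.index]} "{node.value}"'


def to_solr(node):
    """Render the tree as an infix field:value query (field names assumed)."""
    if isinstance(node, And):
        return f"({to_solr(node.left)} AND {to_solr(node.right)})"
    return f'{node.index}:"{node.value}"'


if __name__ == "__main__":
    query = And(Term("title", "koha"), Term("author", "demians"))
    print(to_pqf(query))
    print(to_solr(query))
```

Going from a tree to each syntax is the easy direction. Parsing PQF, CCL
and CQL into such a tree, without ZOOM doing it for us, is where the real
work would be.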

To be continued...
--
Frédéric DEMIANS
http://www.tamil.fr/u/fdemians.html

_______________________________________________
Koha-devel mailing list
[email protected]
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/

Re: [Koha-devel] Solr / zebra / search in Koha 3.10 => starting a workgroup
