Reply inline:
On Tue, October 12, 2010 16:20, LAURENT Henri-Damien wrote:

> Le 12/10/2010 14:48, Thomas Dukleth a écrit :
>> Reply inline:
>>
>> Original Subject: [Koha-devel] Search Engine Changes : let's get some
>> solr
>>
>> On Mon, October 4, 2010 08:10, LAURENT Henri-Damien wrote:

[...]

>>> I think that everyone agrees that we have to refactor C4::Search.
>>> Indeed, the query parser is not able to manage all the configuration
>>> options independently. And using USMARC as the internal format for
>>> biblios comes with a serious limitation of 99999 bytes per record,
>>> which is not enough for big biblios with many items.
>>
>> How do MARC limitations on record size relate to Solr indexing, or to
>> Zebra indexing, which lacks Solr/Lucene support in the current version?

> Koha is now using iso2709 returned from zebra in order to display result
> lists.

I recall that having Zebra return ISO 2709 (MARC communications format)
records had the supposed advantage of faster response time from Zebra.

> Problem is that if zebra is returning only part of the biblio and/or
> MARC::Record is not able to parse the whole data, then the biblio is not
> displayed. We have biblio records which contain more than 1000 items,
> and MARC::Record/MARC::File::XML fails to parse that.
>
> So this is a real issue.

Ultimately, we need a specific solution to the various problems arising
from storing holdings directly in MARC bibliographic records.

>> How does BibLibre intend to fix the limitation on the size of
>> bibliographic records as part of its work on record indexing and
>> retrieval in Koha, or in some parallel work?

> Solr/Lucene can return indexes, and those can be used to display the
> desired data, or we could do the same as we do with zebra:
> - store the data record (the format could be iso2709, marcxml, or YAML);
> - use that for display.

If using ISO 2709 (MARC communications format), how would the problem of
excess record size be addressed?
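To make the size ceiling concrete, here is a small sketch (in Python,
purely illustrative and not Koha code) of why ISO 2709 cannot hold a
biblio with roughly a thousand embedded item fields: the leader reserves
only five digits for the total record length and the directory reserves
four digits per field length. The 952 tag and field sizes below are
assumptions for illustration.

```python
# Fixed-width slots in ISO 2709: 5 digits for record length in the
# leader, 4 digits for each field length in the directory.
LEADER_LEN = 24
DIR_ENTRY_LEN = 12  # tag (3) + field length (4) + starting position (5)

def iso2709_fits(fields):
    """Check whether a list of (tag, data) fields can be serialised
    without overflowing the fixed-width length slots."""
    directory_len = DIR_ENTRY_LEN * len(fields)
    data_len = sum(len(data) + 1 for _tag, data in fields)  # +1 field terminator
    # leader + directory + field terminator + data + record terminator
    record_len = LEADER_LEN + directory_len + 1 + data_len + 1
    per_field_ok = all(len(data) + 1 <= 9999 for _tag, data in fields)
    return record_len <= 99999 and per_field_ok

# A small biblio fits; one carrying ~1000 holdings fields does not.
items = [("952", "a" * 120)] * 1000  # hypothetical embedded item fields
print(iso2709_fits([("245", "Some title")]))           # True
print(iso2709_fits([("245", "Some title")] + items))   # False
```

This is the overflow that makes a truncated or unparsable record come
back from Zebra: the serialised length simply cannot be expressed in the
leader, regardless of which MARC library does the parsing.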
> Or we could use GetBiblio in order to get the data from the database.
> The problem then would be the fact that storing XML in the database is
> not really optimal for processing.

I like the idea of using YAML for some purposes.

As you state, previous testing showed that returning every record in a
large result set from the SQL database was very inefficient compared to
returning the records as part of the response from the index server. Is
there any practical way of sufficiently improving the efficiency of
accessing a large set of records from the SQL database? How much might
retrieving and parsing YAML records from the database help?

I can imagine using XSLT to pre-process MARCXML records into an
appropriate format, such as YAML with embedded HTML, pure HTML, or
whatever is needed for a particular purpose, and storing the
pre-processed records in appropriate special-purpose columns. Real-time
parsing would be minimised. The OPAC result set display might use
biblioitems.recordOPACDisplayBrief. The standard single record view
might use biblioitems.recordOPACDisplayDetail. An ISBD card view might
use biblioitems.recordOPACDisplayISBD.

[...]

Thomas Dukleth
Agogme
109 E 9th Street, 3D
New York, NY  10003
USA

http://www.agogme.com
+1 212-674-3783

_______________________________________________
Koha-devel mailing list
[email protected]
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
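P.S. The pre-rendering idea above can be sketched as follows (Python for
brevity; the column names follow the ones suggested in the mail, while
the subfield choices and HTML shapes are my own illustrative
assumptions, not Koha code). The point is that the MARCXML is parsed
once at save time and the display views read ready-made strings.

```python
# Pre-process a MARCXML record once into display-ready columns, so that
# result lists and detail pages need no real-time MARC parsing.
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/MARC21/slim"}

def subfield(root, tag, code):
    """Return the first subfield value for a tag/code pair, or ''."""
    el = root.find(f'm:datafield[@tag="{tag}"]/m:subfield[@code="{code}"]', NS)
    return el.text if el is not None else ""

def prerender(marcxml):
    """Build the special-purpose display columns for one record."""
    root = ET.fromstring(marcxml)
    title = subfield(root, "245", "a")
    author = subfield(root, "100", "a")
    return {
        "recordOPACDisplayBrief": f"<span class='title'>{title}</span>",
        "recordOPACDisplayDetail": f"<h1>{title}</h1><p>{author}</p>",
    }

record = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100"><subfield code="a">Doe, Jane</subfield></datafield>
  <datafield tag="245"><subfield code="a">An Example Title</subfield></datafield>
</record>"""

cols = prerender(record)
print(cols["recordOPACDisplayBrief"])
```

In Koha the transform would more naturally be an XSLT stylesheet run at
record save, with the resulting strings written into the biblioitems
columns named above.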
