Update:

The site didn’t relaunch in September 2020; it relaunched this Thursday.

I needed to wrap $what (below) in ft:normalize, as Christian suggested, so that upper-case input is matched against the normalized, lower-case index.
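Roughly, the change amounts to the following. This is a minimal sketch rather than the production code: the function name local:matching-tokens and the work name 'demo' are hypothetical, and it assumes that ft:normalize, called with its default options, lower-cases the input in the same way the index was built.

```xquery
(: Sketch only: normalize the user input once, then use it as the
   prefix filter on the tokens of the full-text index. :)
declare function local:matching-tokens(
  $work as xs:string,
  $what as xs:string
) as element()* {
  (: with default full-text options, 'Ser' should become 'ser' :)
  let $prefix := ft:normalize($what)
  return ft:tokens('CHPD_' || $work || '_hobots_FT')[starts-with(., $prefix)]
};

(: hypothetical call; requires a database named CHPD_demo_hobots_FT :)
local:matching-tokens('demo', 'Ser')
```

Binding the normalized prefix once, outside the predicate, avoids calling ft:normalize for every token in the index.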

The publisher has not yet agreed to make the search result pages publicly accessible. However, a 14-day trial is available at https://www.hogrefe.com/eu/free-trial-access

I don’t mean to brag, but this is a very nice example of what you can do with BaseX (and Saxon, and XML, and RDF).

Gerrit




On 31.08.2020 10:26, Imsieke, Gerrit, le-tex wrote:
Hi Michael,

Just to let you know that I finally and successfully used the full text index for autocompletion, based on your prototype code.

My endpoint for the lookup is like this:

declare
  %rest:path("/chpd/{$work}/complete/{$what}")
  %rest:single (: only run 1 query per client :)
  %output:method("json")
  %rest:GET
function chpd:complete($work, $what) {
  element json {
    attribute type {"array"},
    let $tokens := ft:tokens('CHPD_' || $work || '_hobots_FT')[starts-with(., $what)]
    return
      for $t in $tokens
      let $c := number($t/@count)
      order by $c descending
      count $rank
      where $rank le 10
      return element _ {
        attribute type {"object"},
        element label {string($t)},
        element value {string($t)}
      }
  }
};
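
The ranking part of this query can be tried without a database or RESTXQ. The following is a minimal sketch under stated assumptions: the three entry elements and their counts are made up for illustration and only mimic the shape the query relies on (an element with a count attribute and the token as its text), and json:serialize stands in for the %output:method("json") annotation.

```xquery
(: hypothetical tokens; real ones would come from ft:tokens(...) :)
let $entries := (
  <entry count="3">sertraline</entry>,
  <entry count="12">serotonin</entry>,
  <entry count="1">sedation</entry>
)
let $what := 'ser'
return json:serialize(
  element json {
    attribute type {"array"},
    for $t in $entries[starts-with(., $what)]
    let $c := number($t/@count)
    order by $c descending   (: most frequent tokens first :)
    count $rank              (: position after sorting :)
    where $rank le 10        (: keep at most ten suggestions :)
    return element _ {
      attribute type {"object"},
      element label {string($t)},
      element value {string($t)}
    }
  }
)
```

Because the count clause comes after order by, $rank numbers the tokens in sorted order, so the where clause keeps the ten most frequent matches rather than the first ten encountered.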

As I previously wrote, the site will (re-)launch later in September, and I will post a link then. Although it is behind a paywall, I will look into making the full text search and navigation lists available publicly in order to lure people into subscriptions.

The full text search is satisfyingly fast, at least when I’m the only user on the server.

Most other pages will be cached as HTML (with some placeholders for login/logout & user name) by the access control application, written in Ruby on Rails. This is because I render them dynamically from BITS with Saxon, and although the rendering is fast, it would be a waste of resources and a worse user experience if serving each page took an additional ~200 ms of XSLT rendering time.

Among other reasons, I do render them dynamically (instead of serving statically rendered HTML) because there is a drug comparison functionality that takes 2 or more drug description pages (CHPD = Clinical Handbook of Psychotropic Drugs) and presents a side-by-side diff. The possible drug combinations for diffing are too numerous to pre-render them as HTML. Diffing the HTML instead of the BITS XML was not an option due to accidental changes in the output (the document structure of the BITS sources is more stable than the HTML output). Since all other pages use the same rendering XSLT as the comparison pages, I thought it would be too complicated to serve specific pages as pre-rendered HTML while dynamically rendering other pages. Therefore this HTML rendering happens dynamically, and it is also cached for the comparison pages.

Some more details: The BITS→HTML rendering is powered by our jats2html library (https://github.com/transpect/jats2html/blob/master/xsl/jats2html.xsl). In the XSLT you see imports like <xsl:import href="http://transpect.io/xslt-util/lengths/xsl/lengths.xsl"/>. These URIs don’t resolve directly; they need a catalog resolver that maps them to local resources. A shoutout to Liam Quin (and to Christian) for making catalog resolution available to xsl:import and xsl:include, https://github.com/BaseXdb/basex/issues/1719, which has been quite a hairy issue.

If there is an XML Prague next year and if it features a BaseX user meeting (nudge nudge), I will be happy to present the application in greater detail.

Gerrit


On 28.06.2020 22:56, Michael Seiferle wrote:
You’re welcome.
Glad I could help save some time. I agree it looks simple, yet wrapping one’s head around those small details can be a real showstopper sometimes :-)
Feel free to ask for more details anytime.

Looking forward to seeing said search portal!


Best from Konstanz
Michael
Sent from my iPhone

On 27.06.2020 at 14:13, Imsieke, Gerrit, le-tex <gerrit.imsi...@le-tex.de> wrote:

This looks quite simple but you probably saved me two to four hours of figuring out which lib to use, how to invoke the completion and how to shape the server response. Will try to use it in my app tomorrow.


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsi...@le-tex.de, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer / Managing Directors:
Gerrit Imsieke, Svea Jelonek, Thomas Schmidt
