Update:
The site didn’t relaunch in September 2020; it relaunched this Thursday.
I needed to wrap $what (below) in ft:normalize as per Christian’s
suggestion, so that upper-case input will be matched against the
normalized, lower-case, index.
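As a rough illustration of that change (a sketch only: the token list and the input value are made-up sample data, whereas the real endpoint reads its tokens from the full-text index via ft:tokens), the predicate with ft:normalize applied could look like this in BaseX:

```xquery
(: Sketch only: $tokens and $what are made-up sample values. In the real
   endpoint the tokens come from ft:tokens(...) on the full-text index. :)
let $what := 'Cloz'   (: mixed-case user input :)
let $tokens := ('clozapine', 'clonazepam', 'citalopram')
(: ft:normalize is BaseX-specific; with its default options it is expected
   to lower-case the input so that it matches the lower-case index tokens :)
let $prefix := ft:normalize($what)
return $tokens[starts-with(., $prefix)]
```

Without the ft:normalize call, the upper-case "C" in the input would prevent starts-with from matching any of the lower-case tokens.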
The publisher hasn’t yet agreed to make the search result pages publicly
accessible. However, there’s a 14-day test access available at
https://www.hogrefe.com/eu/free-trial-access
I don’t mean to brag, but this is a very nice example of what you can do
with BaseX (and Saxon, and XML, and RDF).
Gerrit
On 31.08.2020 10:26, Imsieke, Gerrit, le-tex wrote:
Hi Michael,
Just to let you know that I finally and successfully used the full text
index for autocompletion, based on your prototype code.
My endpoint for the lookup is like this:
declare
  %rest:path("/chpd/{$work}/complete/{$what}")
  %rest:single (: only run 1 query per client :)
  %output:method("json")
  %rest:GET
function chpd:complete($work, $what) {
  element json {
    attribute type {"array"},
    let $tokens := ft:tokens('CHPD_' || $work || '_hobots_FT')[starts-with(., $what)]
    return
      for $t in $tokens
      let $c := number($t/@count)
      order by $c descending
      count $rank
      where $rank le 10
      return element _ {
        attribute type {"object"},
        element label {string($t)},
        element value {string($t)}
      }
  }
};
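To make the ranking and the response shape easier to follow, here is a minimal, self-contained sketch of the same logic run against in-memory data. The entry elements and their counts are made-up stand-ins for what ft:tokens returns, and the limit of 2 is only there to keep the sample short; the serialize call assumes BaseX's JSON serialization method.

```xquery
(: Sketch only: these entries are made-up stand-ins for the result of
   ft:tokens(), i.e. elements carrying a token and its @count. :)
let $entries := (
  <entry count="3">clozapine</entry>,
  <entry count="12">clonazepam</entry>,
  <entry count="7">clomipramine</entry>
)
let $json :=
  element json {
    attribute type { "array" },
    for $t in $entries
    let $c := number($t/@count)
    order by $c descending   (: most frequent token first :)
    count $rank              (: position in the sorted sequence :)
    where $rank le 2         (: top 2 here; the endpoint keeps 10 :)
    return element _ {
      attribute type { "object" },
      element label { string($t) },
      element value { string($t) }
    }
  }
(: Serialize with the JSON method, as %output:method("json") would :)
return serialize($json, map { 'method': 'json' })
```

The result should be a JSON array of label/value objects, with clonazepam ahead of clomipramine, which is the kind of structure an autocompletion widget on the client side can consume.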
As I previously wrote, the site will (re-)launch later in September, and
I will post a link then. Although it is behind a paywall, I will look
into making the full text search and navigation lists available publicly
in order to lure people into subscriptions.
The full text search is satisfyingly fast, at least when I’m the only
user on the server.
Most other pages will be cached as HTML (with some placeholders for
login/logout & user name) by the access control application, written in
Ruby on Rails. This is because I render them dynamically from BITS with
Saxon, and although the rendering is fast, it would be a waste of
resources and a worse user experience if serving each page took an
additional ~200 ms of XSLT rendering time.
Among other reasons, I do render them dynamically (instead of serving
statically rendered HTML) because there is a drug comparison
functionality that takes 2 or more drug description pages (CHPD =
Clinical Handbook of Psychotropic Drugs) and presents a side-by-side
diff. The possible drug combinations for diffing are too numerous to
pre-render them as HTML. Diffing the HTML instead of the BITS XML was
not an option due to accidental changes in the output (the document
structure of the BITS sources is more stable than the HTML output).
Since all other pages use the same rendering XSLT as the comparison
pages, I thought it would be too complicated to serve specific pages as
pre-rendered HTML while dynamically rendering other pages. Therefore
all HTML rendering happens dynamically, and the result is cached, for
the comparison pages as well.
Some more details: The BITS→HTML rendering is powered by our jats2html
library
(https://github.com/transpect/jats2html/blob/master/xsl/jats2html.xsl).
In the XSLT you see imports like <xsl:import
href="http://transpect.io/xslt-util/lengths/xsl/lengths.xsl"/>. These
URIs don’t resolve immediately. They need a catalog resolver in order to
resolve to local resources. A shoutout to Liam Quin (and to Christian)
for making catalog resolution available to xsl:import and xsl:include,
https://github.com/BaseXdb/basex/issues/1719, which has been quite a
hairy issue.
If there is an XML Prague next year and if it features a BaseX user
meeting (nudge nudge), I will be happy to present the application in
greater detail.
Gerrit
On 28.06.2020 22:56, Michael Seiferle wrote:
You’re welcome.
Glad I could help save some time. I agree it looks simple, yet
wrapping one’s head around those small details can be a real
showstopper sometimes :-)
Feel free to ask for more details anytime.
Looking forward to seeing said search portal!
Best from Konstanz
Michael
On 27.06.2020 at 14:13, Imsieke, Gerrit, le-tex
<gerrit.imsi...@le-tex.de> wrote:
This looks quite simple but you probably saved me two to four hours
of figuring out which lib to use, how to invoke the completion and
how to shape the server response. Will try to use it in my app tomorrow.
--
Gerrit Imsieke