Re: State of Elastic/Open Search support in Fuseki

Andy Seaborne Fri, 16 Jun 2023 12:53:39 -0700

** Faceted search

From the documentation:

There is also the "One document equals one entity" model that might be more appropriate for faceted search. It returns the subject URI, with one Lucene document covering multiple triples.

"""

When using this integration model, text:query returns the subject URI for the document

"""

There then needs to be a facet property function. Would someone like to sketch one out as a GH issue?
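
To make the entity model concrete, here is a minimal Python sketch using plain data structures, not Jena or Lucene code. The sample triples, the `ex:` property names and the `facet_counts` helper are all invented for illustration; an actual facet property function would work against the Lucene index and might look quite different.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (subject, predicate, object) triples.
triples = [
    ("ex:r1", "rdfs:label", "Letter from Bergen"),
    ("ex:r1", "ex:type", "Letter"),
    ("ex:r2", "rdfs:label", "Photo of Bergen harbour"),
    ("ex:r2", "ex:type", "Photograph"),
    ("ex:r3", "rdfs:label", "Letter from Oslo"),
    ("ex:r3", "ex:type", "Letter"),
]

def entity_documents(triples):
    """Group triples by subject: one document per entity, keyed by subject URI."""
    docs = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        docs[s][p].append(o)
    return docs

def facet_counts(docs, text_field, term, facet_field):
    """Count facet values over the entities whose text field contains the term."""
    counts = Counter()
    for fields in docs.values():
        if any(term.lower() in v.lower() for v in fields.get(text_field, [])):
            counts.update(fields.get(facet_field, []))
    return counts

docs = entity_documents(triples)
result = facet_counts(docs, "rdfs:label", "bergen", "ex:type")
print(dict(result))
```

In this toy version the text match and the facet value sit in the same document, which is what makes counting per entity straightforward. With one document per triple, the label and the type of an entity end up in separate documents.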

** ElasticSearch - if we can negotiate the licensing issues (the client libs are OSS but testing them needs a server, so it impacts the build; there may be a testcontainers.io way round this, or optional tests - we need the build to be clean as well as the produced binaries), then this could be done and/or Solr. It does need someone or someones to take an interest in this both now and for keeping the code maintained, especially if any security issues arise.


    Andy

On 15/06/2023 12:49, Adrian Gschwend wrote:

On 14.06.23 14:45, Øyvind Gjesdal wrote:

Hi Øyvind,
Facet/aggregation was not implemented as extension functions in SPARQL and I believe that it also used the same abstraction described in the jena-text
docs:
  One Jena *triple* equals one Lucene *document*
which makes aggregations/facets not available or usable from the
Elasticsearch APIs either.
yes I saw that and I also thought that's probably not ideal. I don't know much about Elastic in practice, I have mainly read tutorials & documentation. What I had in mind was that we could define, for example via a SHACL shape (or something comparable), what a "document" contains. So it's shapes that would define how we see the document and we could use this abstraction for search. So the integration would take SHACL shapes, create a "document" out of them that is consumable by Elastic and then we could use this for search.
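
As a rough illustration of the shape-driven idea, the following Python sketch builds one JSON-ready document per focus node. It is only a sketch under assumptions: `record_shape` is a heavily simplified stand-in for a SHACL node shape (a target class plus a list of property paths), not parsed SHACL, and the graph, the `ex:` names and the `documents_for_shape` helper are hypothetical.

```python
import json

# Hypothetical, heavily simplified stand-in for a SHACL node shape:
# a target class plus the property paths that belong in the document.
record_shape = {
    "targetClass": "ex:Record",
    "properties": ["rdfs:label", "ex:creator", "ex:date"],
}

# Hypothetical sample graph as (subject, predicate, object) triples.
graph = [
    ("ex:r1", "rdf:type", "ex:Record"),
    ("ex:r1", "rdfs:label", "Letter from Bergen"),
    ("ex:r1", "ex:creator", "ex:alice"),
    ("ex:r1", "ex:internalNote", "not for search"),
    ("ex:p1", "rdf:type", "ex:Person"),
    ("ex:p1", "rdfs:label", "Alice"),
]

def documents_for_shape(graph, shape):
    """Build one JSON-ready document per focus node of the shape."""
    # Focus nodes are the instances of the shape's target class.
    focus = {s for s, p, o in graph
             if p == "rdf:type" and o == shape["targetClass"]}
    docs = {node: {"id": node} for node in focus}
    # Copy only the properties the shape lists; everything else is left out.
    for s, p, o in graph:
        if s in docs and p in shape["properties"]:
            docs[s].setdefault(p, []).append(o)
    return docs

shape_docs = documents_for_shape(graph, record_shape)
print(json.dumps(shape_docs["ex:r1"], indent=2))
```

Here the shape decides both which nodes become documents (`ex:p1` is skipped) and which properties are indexed (`ex:internalNote` is dropped).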
The second thing is that I'm mainly interested in an integration where we don't have to update the Elastic index on our own. I guess that the Fuseki integration takes care of that so it's "in sync" all the time. I would want the Elastic API available as well, as this is easier to use for the facet use-cases than pure SPARQL. Paging is not trivial in SPARQL for use-cases like this; the Elastic API, however, is built for that.
We switched to jena-text with Lucene after some weeks, which didn't have
aggregations either, but there was much more activity and usage for the
module, and the options for configuring from the assembler files were much
richer.
ok, any example of what you configure in there? I don't think I saw much in the documentation for that so far. Aggregations are definitely something I would like to have. One example is archival records, where we have a hierarchy in the data. I need to be able to show that hierarchy per record (which has its own IRI) and to browse by hierarchy levels as well. This is super easy to represent in RDF but super hard to query efficiently.
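
One way to picture the hierarchy case is to flatten each record's ancestor chain into per-level fields at indexing time, so that a search index can facet on them. The Python sketch below assumes this approach for illustration only; the `parent_of` links, the `level_N` field names and both helper functions are hypothetical and not part of jena-text or Elastic.

```python
# Hypothetical parent links, e.g. taken from an ex:partOf-style property.
parent_of = {
    "ex:item7": "ex:file3",
    "ex:file3": "ex:series1",
    "ex:series1": "ex:fonds",
}

def ancestor_path(node, parent_of):
    """Return the path from the top of the hierarchy down to node."""
    path = [node]
    seen = {node}
    while path[-1] in parent_of:
        parent = parent_of[path[-1]]
        if parent in seen:  # guard against cycles in bad data
            raise ValueError(f"cycle at {parent}")
        seen.add(parent)
        path.append(parent)
    return list(reversed(path))

def hierarchy_fields(node, parent_of):
    """Flatten the path into per-level fields that a search index can facet on."""
    path = ancestor_path(node, parent_of)
    return {f"level_{i}": iri for i, iri in enumerate(path)}

fields = hierarchy_fields("ex:item7", parent_of)
print(fields)
```

The trade-off is that the path is precomputed once per record instead of being walked at query time, so it has to be rebuilt when a record moves in the hierarchy.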
At the moment I'm unsure if I inspected and looked at the Elasticsearch
APIs directly to check the structure of the documents in the index itself,
after indexing.
What versions did you work on with Elastic?

regards

Adrian
