On 14.06.23 14:45, Øyvind Gjesdal wrote:
Hi Øyvind,
Facet/aggregation was not implemented as extension functions in SPARQL and
I believe that it also used the same abstraction described in the jena-text
docs:
One Jena *triple* equals one Lucene *document*
which means aggregations/facets are not available or usable from the
Elasticsearch APIs either.
Yes, I saw that and I also thought that's probably not ideal. I don't
know much about Elastic in practice, I have mainly read tutorials &
documentation. What I had in mind was that we could define, for example
via a SHACL shape (or something comparable), what a "document" contains.
So it's the shapes that would define how we see the document, and we
could use this abstraction for search. The integration would take SHACL
shapes, create "documents" out of them that are consumable by Elastic,
and then we could use these for search.
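
To make the idea a bit more concrete, here is a rough sketch in Python. It
is not SHACL: a plain dict stands in for a shape, and the predicate IRIs,
field names and the function triples_to_documents are all made up for
illustration. A real integration would presumably read sh:property /
sh:path from the shape instead.

```python
from collections import defaultdict

# Hypothetical "shape": maps predicate IRIs to field names in the search
# document. This dict is only a stand-in for a real SHACL shape.
RECORD_SHAPE = {
    "http://example.org/title": "title",
    "http://example.org/creator": "creator",
}


def triples_to_documents(triples, shape):
    """Group triples by subject, keeping only predicates named in the shape."""
    docs = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        field = shape.get(p)
        if field is not None:
            docs[s][field].append(o)
    # Plain dicts keyed by subject IRI, which would serve as the document id.
    return {s: dict(fields) for s, fields in docs.items()}


# Made-up example data: several triples about one subject become one document.
triples = [
    ("http://example.org/rec1", "http://example.org/title", "Letter, 1901"),
    ("http://example.org/rec1", "http://example.org/creator", "A. Person"),
    ("http://example.org/rec1", "http://example.org/internal", "not indexed"),
    ("http://example.org/rec2", "http://example.org/title", "Ledger"),
]
documents = triples_to_documents(triples, RECORD_SHAPE)
```

So several triples about the same subject end up in one document, instead
of one document per triple.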
The second thing is that I'm mainly interested in an integration where we
don't have to update the Elastic index ourselves. I guess that the
Fuseki integration takes care of that, so it's "in sync" all the time. I
would want the Elastic API available as well, since it is easier to use
for the facet use cases than pure SPARQL. Paging is not trivial in
SPARQL for use cases like this, whereas the Elastic API is built for that.
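
As an illustration of what I mean by paging plus facets, this is roughly
the kind of search request body I have in mind, built here as a Python
dict. "from", "size", "query" and "aggs" are standard parts of the
Elasticsearch search body; the field names "title" and "creator.keyword"
and the helper build_search_body are assumptions for the example only.

```python
import json


def build_search_body(text, page, page_size, facet_field):
    """Build a search request body with offset paging and one terms facet."""
    return {
        "from": page * page_size,  # offset of the first hit; page is 0-based
        "size": page_size,         # number of hits per page
        "query": {"match": {"title": text}},  # "title" is an assumed field
        "aggs": {
            # Bucket counts per distinct value of the facet field.
            "facet": {"terms": {"field": facet_field}}
        },
    }


body = build_search_body("letter", page=2, page_size=20,
                         facet_field="creator.keyword")
print(json.dumps(body, indent=2))
```

One request returns both the page of hits and the facet counts, which is
the part that gets awkward in pure SPARQL.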
We switched to jena-text with Lucene after some weeks, which didn't have
aggregations either, but there was much more activity and usage for the
module, and the options for configuring from the assembler files were much
richer.
OK, do you have an example of what you configure in there? I don't think
I have seen much about that in the documentation so far. Aggregations are
definitely something I would like to have. One example is archival
records, where we have a hierarchy in the data. I need to be able to show
that hierarchy per record (which has its own IRI) and to browse by
hierarchy levels as well. This is super easy to represent in RDF but
super hard to query efficiently.
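
One way this could work on the index side, as a sketch only: resolve the
chain of parents once at indexing time and store it on the document, so
that showing the hierarchy and browsing by level become simple lookups.
The "ex:" identifiers, the parent_of map and the "path" / "level" field
names are invented for the example.

```python
def ancestor_path(record, parent_of):
    """Return the IRIs from the top of the hierarchy down to the record."""
    path = [record]
    seen = {record}
    while path[-1] in parent_of:
        parent = parent_of[path[-1]]
        if parent in seen:
            # Guard against bad data that would otherwise loop forever.
            raise ValueError(f"cycle in hierarchy at {parent}")
        seen.add(parent)
        path.append(parent)
    path.reverse()  # root first, record last
    return path


# Hypothetical child -> parent links, as they might be read from the RDF.
parent_of = {
    "ex:item1": "ex:series1",
    "ex:series1": "ex:fonds1",
}

doc = {"id": "ex:item1", "path": ancestor_path("ex:item1", parent_of)}
doc["level"] = len(doc["path"]) - 1  # 0 for a top-level record
```

With "path" and "level" stored per record, browsing by hierarchy level is
a filter plus an aggregation rather than a recursive query.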
At the moment I'm unsure if I inspected and looked at the Elasticsearch
APIs directly to check the structure of the documents in the index itself,
after indexing.
What versions did you work on with Elastic?
regards
Adrian