Hi,

This is also something I've thought about, since we have a dated
Elasticsearch integration for creating facets from endpoints, and we use
aggregate SPARQL queries for counting, which sometimes become slow-ish
and have to be turned off for larger datasets.
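As a rough illustration of what the facet counting has to compute (whether it is done with SPARQL aggregates or a Lucene facet index), here is a minimal sketch in plain Python. It is not Jena or Lucene code: the record layout, the sample data and the function name facet_counts are all made up for the example.

```python
from collections import Counter


def facet_counts(records, selected=None):
    """Return (number of matching records, {attribute: Counter of values}).

    `records` is a list of dicts (hypothetical layout: attribute -> value).
    `selected` holds the attribute/value pairs the user has already chosen.
    """
    selected = selected or {}
    # Keep only records that carry every selected attribute/value pair.
    matching = [
        r for r in records
        if all(r.get(attr) == value for attr, value in selected.items())
    ]
    counts = {}
    for record in matching:
        for attr, value in record.items():
            if attr in selected:
                continue  # already chosen, so not offered as a further facet
            counts.setdefault(attr, Counter())[value] += 1
    return len(matching), counts


# Illustrative data only: sparse records, each with a few attributes.
records = [
    {"type": "book", "lang": "en", "year": 2020},
    {"type": "book", "lang": "no"},
    {"type": "map", "lang": "en"},
    {"type": "map", "year": 2020},
]

total, counts = facet_counts(records, {"type": "book"})
for attr, values in sorted(counts.items()):
    print(attr, dict(values))
```

Only attributes that actually occur in the matching records show up in the result, which is the "only facets for which records exist" behaviour. Doing this per request over millions of records is the part that gets slow with GROUP BY and COUNT.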
An idea I had around point 3, on how it could look, was to extend the
text query syntax with one named variable for facets, which could also
contain the results. Using the example from the jena-text documentation:

    (?s ?score ?literal ?g ?facets) text:query 'word'

would add "?facets" as an optional part of the syntax. I don't know what
the type of the ?facets list (categories and counts) should be. I
initially thought it would be nice to have it as JSON, but I see that one
graph database implements facet results as blank nodes.

Another option could be just adding an additional parsable string to the
text:query extension function, but it is already rather rich, so I think
text:facet is a good idea to avoid bloating text:query. There are
probably multiple use cases there as well, such as ranges and multiple
values on the same facet, so this idea may end up looking a bit hackish:

    ?s text:query ( property* 'query string' limit 'lang:xx' 'highlight:yy' 'facets: facet1: "value1", facet1: "value2"; facet2 : ...')

I'm very happy to see others also interested in this use case.

Best regards,
Øyvind

On Tue, Feb 14, 2023 at 6:52 AM David Habgood <[email protected]> wrote:

> Thanks for the link Andy,
>
> @Elie my specific use case is this:
>
> I have millions of records with perhaps 100 unique attributes across the
> records. Individual records may only have 5-10 attributes though. So a user
> who wishes to browse the data based on attributes can progressively filter
> the data to find individual or groups of records. When a user selects a
> facet, only those (additional) facets for which records exist are displayed
> as options, along with counts.
>
> It is possible with regular SPARQL GROUP BY and COUNT queries but not so
> performant.
>
> Cheers
>
> On Tue, Feb 14, 2023 at 2:58 AM Andy Seaborne <[email protected]> wrote:
>
> >
> > On 13/02/2023 12:59, David Habgood wrote:
> > > Hi Jena Users,
> > >
> > > I'm interested in extending the Jena Lucene capabilities to include
> > > Lucene's faceted search (
> > > https://javadoc.io/doc/org.apache.lucene/lucene-facet/latest/index.html
> > > ).
> > >
> >
> > https://lucene.apache.org/core/9_5_0/demo/org/apache/lucene/demo/facet/package-summary.html
> >
> > >
> > > As far as I can tell from searching the mailing list (and github) the
> > > Lucene faceted search capability hasn't been exposed in Jena before.
> > >
> > > I think it could be exposed as follows:
> > > 1. Defining how faceted search concepts can be expressed in the Jena
> > > dataset configuration
> > > 2. Extending the current indexing code to also generate the facet index
> > > based on definitions in 1.
> > > 3. Adding a new query function for faceted search e.g. text:facet
> > >
> > > Keen to hear if anyone can see issues with this approach or has other
> > > feedback.
> > >
> > > Thanks
> > > David
> > >
> >
>
