On 29.08.2014, at 03:10, Ard Schrijvers <[email protected]> wrote:
> 1) When exposing faceting from Jackrabbit, we wouldn't use virtual
> layers any more to expose them over pure JCR spec API's. Instead, we
> would extend the jcr QueryResult to have next to getRows/getNodes/etc
> also expose for example methods on the QueryResult like
>
> public Map<String, Integer> getFacetValues(final String facet) {
> return result.getFacetValues(facet);
> }
>
> public QueryResult drilldown(final FacetValue facetValue) {
> // return current query result drilled down for facet value
> return ...
> }
We actually have a similar API in our CQ/AEM product:
Query => represents a query [1]
SearchResult result = query.getResult();
Map<String, Facet> facets = result.getFacets();
A facet is a list of "Buckets" [2] - same as FacetValue above, I assume - an
abstraction over different kinds of values. You could have distinct values (e.g.
"red", "green", "blue"), but also ranges ("last year", "last month" etc.). Each
bucket has a count, i.e. the number of times it occurs in the current result.
Then on Query you have a method
Query refine(Bucket bucket)
which is the same as the drilldown above.
So in the end it looks pretty much the same, and seems to be a good way to
represent this as an API. That doesn't say much about the implementation yet,
though :)
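
To make the shared shape concrete, here is a minimal in-memory sketch of the
facet / bucket / refine idea. It is not the AEM or Jackrabbit API: the class
FacetSketch, its nested Query and Bucket types, and the facet() and size()
methods are hypothetical names chosen for illustration, and a "result" is
simply a list of property maps. Only the refine(Bucket) idea is taken from the
description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch; NOT the AEM or Jackrabbit API.
final class FacetSketch {

    /** One facet value plus the number of hits it has in the current result. */
    static final class Bucket {
        final String facet;
        final String value;
        final long count;

        Bucket(String facet, String value, long count) {
            this.facet = facet;
            this.value = value;
            this.count = count;
        }
    }

    /** In this sketch a query is just the list of documents it matches. */
    static final class Query {
        private final List<Map<String, String>> docs;

        Query(List<Map<String, String>> docs) {
            this.docs = docs;
        }

        /** Groups the current result by one property and counts each value. */
        List<Bucket> facet(String property) {
            Map<String, Long> counts = new LinkedHashMap<>();
            for (Map<String, String> doc : docs) {
                String value = doc.get(property);
                if (value != null) {
                    counts.merge(value, 1L, Long::sum);
                }
            }
            List<Bucket> buckets = new ArrayList<>();
            counts.forEach((value, count) -> buckets.add(new Bucket(property, value, count)));
            return buckets;
        }

        /** Drill-down: a new query restricted to the documents in the bucket. */
        Query refine(Bucket bucket) {
            List<Map<String, String>> kept = new ArrayList<>();
            for (Map<String, String> doc : docs) {
                if (bucket.value.equals(doc.get(bucket.facet))) {
                    kept.add(doc);
                }
            }
            return new Query(kept);
        }

        int size() {
            return docs.size();
        }
    }
}
```

Calling facet("color") on a query and then refine(...) with one of the
returned buckets yields a smaller query whose own facets can be computed the
same way, which is the drilldown behaviour discussed above. Range buckets
("last year", "last month") would need a predicate instead of the plain value
comparison used here.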
> 2) Authorized counts....for faceting, it doesn't make sense to expose
> there are 314 results if you can only read 54 of them. Accounting for
> authorization through access manager can be way too slow.
> ...
> 3) If you support faceting through Oak, will that be competitive
> enough to what Solr and Elasticsearch offer? Customers these days have
> some expectations on search result quality and faceting capabilities,
> performance included.
> ...
> So, my take would be to invest time in easy integration with
> solr/elasticsearch and focus in Oak on the parts (hierarchy,
> authorization, merging, versioning) that aren't covered by already
> existing frameworks. Perhaps provide an extended JCR API as described
> in (1) which under the hood can delegate to a solr or es java client.
> In the end, you'll still end up having the authorized counts issue,
> but if you make the integration pluggable enough, it might be possible
> to leverage domain specific solutions to this (solr/es doesn't do
> anything with authorization either, it is a tough nut to crack)
Good points. When facets are used, the worst case (showing facets for all your
content) might actually be the very first thing you see, when something like a
product search/browse page is shown, before any actual search by the user is
done. Optimizing for performance right from the start is a must, I agree.
What I can imagine, though, is leveraging some kind of caching. In practice, if
you have a public site with content that does not change constantly, the facet
values are pretty much stable, and authorization shouldn't cost much.
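
As a rough illustration of the caching idea, the sketch below keeps computed
facet counts in memory and only recomputes them after content changes. All of
it is an assumption on my side: FacetCountCache, get() and invalidateAll() are
hypothetical names that exist in neither Oak, Jackrabbit nor AEM, and keying
the cache on the facet name plus the caller's set of principals is just one
possible way to keep counts from leaking across users with different read
access.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names exist in Oak, Jackrabbit or AEM.
final class FacetCountCache {

    // Key: facet name plus the sorted principal names of the caller.
    private final Map<String, Map<String, Long>> cache = new ConcurrentHashMap<>();

    /**
     * Returns cached counts for the facet as seen by the given principals.
     * The (expensive, authorization-aware) loader runs only on a cache miss
     * and must not touch this cache itself.
     */
    Map<String, Long> get(String facet, Set<String> principals,
                          Supplier<Map<String, Long>> loader) {
        // Sorting makes the key independent of the set's iteration order.
        String key = facet + "|" + String.join(",", new TreeSet<>(principals));
        return cache.computeIfAbsent(key, k -> loader.get());
    }

    /** Assumed to be called from some content-change hook; drops everything. */
    void invalidateAll() {
        cache.clear();
    }
}
```

On a public site nearly all requests share one principal set (anonymous), so
the authorized count for each facet would be computed once per content change
rather than once per page view. Finer-grained invalidation is left out here.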
[1]
http://docs.adobe.com/docs/en/aem/6-0/develop/ref/javadoc/com/day/cq/search/Query.html
[2]
http://docs.adobe.com/docs/en/aem/6-0/develop/ref/javadoc/com/day/cq/search/facets/Bucket.html
Cheers,
Alex