On Thu, Jul 24, 2014 at 4:53 PM, Julian Reschke <[email protected]> wrote:
> On 2014-07-24 16:40, Thomas Mueller wrote:
>>
>> Hi,
>>
>> I believe the JCR API specification allows to return -1 in all cases. The
>> specification doesn't say -1 must not be returned if you use "order by".
>> If you relied on that, then you have relied on an implementation detail.
>
>
> Indeed.

Well, if every Session.save() in Oak resulted in a
RepositoryException, then strictly speaking that would still be spec
compliant. So if my code relied on a Session.save() succeeding, did I
then rely on an implementation detail?

Of course I am not too serious here, but I don't think it really helps
in these cases to point to the specification and say that someone
relied on an implementation detail. Don't we all?

>
>
>>> Why I need to know the count is irrelevant.
>>
>>
>> Ehm, no, not to me. I'd like to understand the use case.

I do understand the use case (we have it as well), but I also
understand the enormous complexity of returning correct sizes. There
is a reason why Solr and Elasticsearch do not have any support for
authorization. With fine-grained ACLs, which can change on the fly, it
is virtually impossible to index (and to keep in sync) ACLs in an
inverted index, and thus you have to do the authorization after Lucene
has returned the results. This, however, implies that the
authorization typically has to fetch every JCR Node for every Lucene
hit, and then do an isGranted check. If the number of hits is large,
and you run through your bundle caches (JR 2), it becomes really
expensive and slow to get an accurate #getSize.
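
To make the cost concrete, here is a rough sketch of counting after the
index has answered. It is not Jackrabbit or Oak code: the class name,
the method name and the "/secret/" access rule are made up for
illustration, and plain strings stand in for JCR Nodes.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class PostFilterCount {
    // Counts the hits the current user may read. Every hit costs one
    // access check, so the work grows with the raw number of index hits.
    static <T> int authorizedSize(List<T> hits, Predicate<T> isGranted) {
        int count = 0;
        for (T hit : hits) {
            if (isGranted.test(hit)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> hits = List.of("/content/a", "/content/b", "/secret/c");
        AtomicInteger checks = new AtomicInteger();
        // Hypothetical access rule: everything under /secret/ is unreadable.
        Predicate<String> isGranted = path -> {
            checks.incrementAndGet();
            return !path.startsWith("/secret/");
        };
        System.out.println("authorized size: " + authorizedSize(hits, isGranted));
        System.out.println("access checks:   " + checks.get());
    }
}
```

The number of access checks always equals the number of index hits,
regardless of how many of them the user is allowed to see.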

What we added locally was a #getTotalSize, or better,
#getUnauthorizedSize, which returns the size from Lucene directly.
This already helped for quite a few general use cases. A second thing
we added is an authorization query, where the security model we have
is mapped to a Lucene query. However, this cannot be done generically,
only for a domain-specific implementation of the access manager.
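
Roughly, the first of these additions has the following shape. This is
only a sketch with hypothetical names (SizedResult is not a JCR, JR 2 or
Oak type); it assumes the raw index hits are already available as a
list.

```java
import java.util.List;

class SizedResult<T> {
    private final List<T> indexHits;

    SizedResult(List<T> indexHits) {
        this.indexHits = indexHits;
    }

    // Authorized size: reported as unknown (-1), which the
    // specification permits, because it would need one access
    // check per hit.
    long getSize() {
        return -1;
    }

    // Size as reported by the index, before any authorization.
    // It may include hits the current user is not allowed to read.
    long getUnauthorizedSize() {
        return indexHits.size();
    }

    public static void main(String[] args) {
        SizedResult<String> result =
                new SizedResult<>(List.of("/content/a", "/secret/b"));
        System.out.println("getSize():             " + result.getSize());
        System.out.println("getUnauthorizedSize(): " + result.getUnauthorizedSize());
    }
}
```

Callers that only need an upper bound can use the cheap number, and
callers that need the exact number still have to pay for the access
checks themselves.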

Anyway, I hope this gives Torgeir some feedback about the why and
the complexity.

The reason we need a correct #getSize count is typically that our
customers want to see a correct number of pages of hits. Faceted
navigation with counts also typically requires counts that are
correct. In the past years I frequently tried to explain to customers
that 'there are more than 100 hits' would be good enough, but we faced
the same request too often.
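
For reference, the 'more than 100 hits' compromise can be sketched as a
count that stops early. Again this is an illustration only: CappedCount,
cappedSize and label are hypothetical names, and integers stand in for
hits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class CappedCount {
    // Counts readable hits but gives up once more than `limit` have
    // been found. A return value of limit + 1 means "more than limit".
    static <T> int cappedSize(Iterable<T> hits, Predicate<T> isGranted, int limit) {
        int count = 0;
        for (T hit : hits) {
            if (isGranted.test(hit) && ++count > limit) {
                return count;
            }
        }
        return count;
    }

    // Turns the capped count into the text shown to the user.
    static String label(int cappedSize, int limit) {
        return cappedSize > limit
                ? "more than " + limit + " hits"
                : cappedSize + " hits";
    }

    public static void main(String[] args) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            hits.add(i);
        }
        // Hypothetical access rule: every hit is readable.
        int size = cappedSize(hits, hit -> true, 100);
        System.out.println(label(size, 100));
    }
}
```

The early exit bounds the number of access checks only when enough
hits are readable. If most hits are denied, the loop still walks the
whole result.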

Anyway, long story short: I can imagine that the default returned size
is -1 in Oak, since that is spec compliant, and I do understand that
Torgeir would like to get the JR 2 behavior back. Wouldn't it be an
option to expose #getSizeUnauthorized() (under a better name, of
course :-) ?

Regards Ard

>
>
> Absolutely.
>
> Best regards, Julian



-- 
Amsterdam - Oosteinde 11, 1017 WT Amsterdam
Boston - 1 Broadway, Cambridge, MA 02142

US +1 877 414 4776 (toll free)
Europe +31(0)20 522 4466
www.onehippo.com
