Hi, It's true that COUNT() queries are slow in Jackrabbit. Databases have the same problem: if you have no index (and in many cases even if you have one), the database internally still needs to read all rows that match, at least in a lot of cases.
Regards, Thomas On 4/4/12 11:08 AM, "Bart van der Schans" <[email protected]> wrote: >Hi, > >The problem with providing a COUNT feature is authorization. The get >to a correct count you need to fetch all the nodes from the PM and >authorize them one-by-one. This works fine for a few hundred nodes, >but will take forever when you scale up.For example "SELECT COUNT(*) >FROM nt:base" would directly translate to fetch every single bundle >from the persistence layer (database) and do that with a single query >for every node and then authorize each of those nodes. > >The only way I can see a well performing count feature is when the >authorization tokens are indexed as well, aka running authorized >queries. For JR 2 this is probably not an option. > >IMO providing a count feature in the query language would give the >wrong impression to developers. It would look like something you can >"just use", but in fact it will not work once you start putting >content in your production repository. > >Regard, >Bart > >On Wed, Apr 4, 2012 at 8:35 AM, Christian Stocker ><[email protected]> wrote: >> Hi >> >> Any opinions on this? I'm mainly wondering if there's a chance that >> something like this goes into jackrabbit or if there's opposition to >> that. Not the actual implementation, just the idea of it. I'll work more >> on it, if you think that could be something for jackrabbit-jcr-server >> and propose something more decent. >> >> If not, I'll pursue other challenges ;) >> >> Greetings >> >> chregu >> >> On 31.03.12 08:18, Christian Stocker wrote: >>> Hi >>> >>> As we all know, there's no COUNT feature in SQL2. This is not that bad, >>> if you're not trying to get a count via remoting, since you can get it >>> with .getSize() locally. But if you're working via the webdav layer, >>> it's not really efficient right now, especially if your resultsets are >>> larger. >>> >>> There are in my opinion 2 opinions to solve this. >>> >>> 1) add a count() feature to the SQL2 parser, there's already a ticket >>> for that (among other things) >>> https://issues.apache.org/jira/browse/JCR-2605 >>> That of course would be a great solutions, it solves the problem for >>>all >>> cases who need that >>> >>> 2) add an option to the SEARCH webdav interface to get the count if >>> results instead of the whole result set. >>> >>> For 2) I made a little prototype (which is really just that ;)) to see >>> if that would be feasible. That patch can be found here >>> >>> https://github.com/chregu/jackrabbit/compare/count-in-webdav >>> >>> It adds a D:count option to a SEARCH query like this >>> >>> <D:searchrequest xmlns:D="DAV:"> >>> <JCR-SQL2> >>> <![CDATA[SELECT * FROM [nt:unstructured] ]]> >>> </JCR-SQL2> >>> <D:count>1</D:count> >>> </D:searchrequest> >>> >>> and if that is set it returns only the count of results. Currently like >>> below, but that format has certainly to be changed (it was just the >>> shortest way to get something back): >>> >>> <D:multistatus xmlns:D="DAV:"> >>> <D:response> >>> <D:href>#count</D:href> >>> <D:responsedescription>433</D:responsedescription> >>> </D:response> >>> >>><D:responsedescription>nt:unstructured.jcr:primaryType</D:responsedescri >>>ption> >>> </D:multistatus> >>> >>> My questions: >>> >>> - Is there any chance that something like this goes into jackrabbit and >>> we can build on that feature? >>> - Or is there a better way to just get the count of a query? >>> - How should the response look like for something like this? The above >>> is certainly not right ;) >>> - And last but not least, are there any plans to implement COUNT() some >>> time in the future directly in SQL2? >>> >>> Thanks for any feedback. >>> >>> chregu >>> >>> >> >> -- >> Liip AG // Feldstrasse 133 // CH-8004 Zurich >> Tel +41 43 500 39 81 // Mobile +41 76 561 88 60 >> www.liip.ch // blog.liip.ch // GnuPG 0x0748D5FE >> > > > >-- >Amsterdam - Oosteinde 11, 1017 WT Amsterdam >Boston - 1 Broadway, Cambridge, MA 02142 > >US +1 877 414 4776 (toll free) >Europe +31(0)20 522 4466 >www.onehippo.com
