Hi,

> "What's the best way to query for LARGE NUMBERS of key/value pairs?"

As I wrote, I'm not aware of a limit (on the number of values or the query
length) in Oak. But sure, if it is possible to avoid having a large query,
then it should be avoided, for simplicity.

Regards,
Thomas


On 14/10/16 18:28, "Clay Ferguson" <[email protected]> wrote:

>The "Traversed 210000 nodes" warning is really telling you that it was
>unable to use your indexes to perform the search. (I think) This doesn't
>mean too many results were found, it just means you didn't create all the
>right indexes for the search. Just create an index for each property, and
>then search them the normal way (without the LIKE clause, but using '=')
>and I bet you will see good performance. If you genuinely have thousands
>of key/value pairs to search it is possible that your full-text approach
>is the best performing solution, but I'm not sure.
>
>However your general question is: "What's the best way to query for LARGE
>NUMBERS of key/value pairs?"
>
>Maybe some experts who know more than me about Oak can reply to that
>simplified version of your question.
>
>Best regards,
>Clay Ferguson
>[email protected]
>
>
>On Fri, Oct 14, 2016 at 10:29 AM, rachna <[email protected]> wrote:
>
>> Thanks Clay & Thomas.
>>
>> Taking a step back from our problem has helped to look at it in a
>> different way.
>>
>> The tag property also stores the values in a specific format that show
>> the tree structure.
>>
>> cq:tags
>> - location:europe
>> - type:waterfalls
>>
>> Therefore instead of traversing the repository to identify the
>> descendants of these tags, we could use a LIKE query.
>>
>> e.g. SELECT * FROM [cq:PageContent] AS b WHERE ISDESCENDANTNODE(b,
>> [/content/guides]) AND ([cq:tags] LIKE 'location:europe%' OR [cq:tags]
>> LIKE 'type:waterfalls%') ORDER BY [cq:lastModified]
>>
>> However, since our repository contains a large number of items that
>> match this criteria, we start to see warnings about traversing the index.
>>
>> org.apache.jackrabbit.oak.plugins.index.property.strategy.ContentMirrorStoreStrategy
>> Traversed 210000 nodes (210164 index entries) using index jcr:primaryType
>> with filter Filter(query=SELECT * FROM [cq:PageContent] AS b WHERE
>> ISDESCENDANTNODE(b, [/content/guides]) AND ([cq:tags] LIKE
>> 'location:europe%' OR [cq:tags] LIKE 'type:waterfalls%') ORDER BY
>> [cq:lastModified], path=/content/guides//*, property=[cq:tags=[is not
>> null]])
>>
>> Instead, I created a lucene index that indexes the cq:tags (/w full
>> text) and cq:lastModified (/w ordered support) property.
>>
>> e.g. SELECT [jcr:path] FROM [cq:PageContent] AS b WHERE
>> ISDESCENDANTNODE(b, [/content/guides]) AND (CONTAINS([cq:tags],
>> 'location:europe') OR CONTAINS([cq:tags], 'type:waterfalls')) ORDER BY
>> [cq:lastModified]
>>
>> That seems to be much faster than using a property index and should
>> solve most of the issues that we might have (hopefully avoiding creating
>> a new index).
>>
>> Is there any support with the lucene index to use something like
>> STARTSWITH rather CONTAINS?
>>
>> The maxClauseCount configuration parameter introduced the soft limit of
>> 1024 which is part of Jackrabbit 2.
>> We have been attempting to move to oak however our progress has been slow
>> due to repository inconsistencies.
>> I realise this value is configurable however constantly increasing it
>> doesn't sound the right thing to do.
>>
>> Thanks,
>> Rachna
>>
>> --
>> View this message in context:
>> http://jackrabbit.510166.n4.nabble.com/Custom-index-type-tp4665031p4665121.html
>> Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
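
For a longer list of tags, the CONTAINS query quoted above could be
generated rather than written by hand. The following is only a minimal
sketch in Python: build_tag_query is a hypothetical helper (not part of Oak
or Jackrabbit), the node type, property names and path are taken from the
quoted query, and it only doubles single quotes for the SQL-2 string
literal. It does not escape characters that are special in the full-text
search expression itself, so that part is left as an assumption.

```python
def build_tag_query(tags, root="/content/guides"):
    """Build a JCR-SQL2 query string matching pages with any of `tags`.

    Hypothetical helper for illustration; it only produces the query text
    and does not talk to a repository.
    """
    if not tags:
        raise ValueError("at least one tag is required")

    def quote(value):
        # JCR-SQL2 string literals escape a single quote by doubling it.
        return "'" + value.replace("'", "''") + "'"

    # One CONTAINS clause per tag, combined with OR as in the quoted query.
    clauses = " OR ".join(f"CONTAINS([cq:tags], {quote(t)})" for t in tags)
    return (
        "SELECT [jcr:path] FROM [cq:PageContent] AS b "
        f"WHERE ISDESCENDANTNODE(b, [{root}]) "
        f"AND ({clauses}) "
        "ORDER BY [cq:lastModified]"
    )


query = build_tag_query(["location:europe", "type:waterfalls"])
print(query)
```

The query string grows by one clause per tag, so with thousands of tags it
gets very long, which is the case where avoiding a large query is simpler.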
