Hi,

> "What's the best way to query for LARGE NUMBERS of key/value pairs?"

As I wrote, I'm not aware of a limit (on the number of values or the query
length) in Oak. But sure, if it is possible to avoid having a large query,
then it should be avoided, for simplicity.
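
For illustration only, here is a minimal sketch of how a query over many
tag values might be generated, using '=' comparisons as suggested further
down in this thread. The class name TagQueryBuilder and its methods are
hypothetical, not part of Oak or JCR. It assumes the exact tag values are
known up front, so unlike the LIKE 'x%' form it does not match descendant
tags. Whether the result performs well still depends on a suitable index
existing for cq:tags.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not an Oak or JCR API): builds a JCR-SQL2 query
// string that matches any of the given tag values exactly.
final class TagQueryBuilder {

    // Quote a value as a JCR-SQL2 string literal; embedded single
    // quotes are escaped by doubling them.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Build one query with one '=' condition per tag, joined by OR.
    // The node type and property names are taken from this thread.
    static String build(String path, List<String> tags) {
        if (tags.isEmpty()) {
            throw new IllegalArgumentException("at least one tag is required");
        }
        String conditions = tags.stream()
                .map(tag -> "[cq:tags] = " + quote(tag))
                .collect(Collectors.joining(" OR "));
        return "SELECT [jcr:path] FROM [cq:PageContent] AS b"
                + " WHERE ISDESCENDANTNODE(b, [" + path + "])"
                + " AND (" + conditions + ")"
                + " ORDER BY [cq:lastModified]";
    }
}
```

Calling TagQueryBuilder.build("/content/guides", List.of("location:europe",
"type:waterfalls")) yields a single statement with two OR'ed conditions.
The string grows linearly with the number of tags, which is why a shorter
query is preferable where one is possible.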

Regards,
Thomas




On 14/10/16 18:28, "Clay Ferguson" <wcl...@gmail.com> wrote:

>The "Traversed 210000 nodes" warning is really telling you that it was
>unable to use your indexes to perform the search. (I think) This doesn't
>mean too many results were found, it just means you didn't create all the
>right indexes for the search. Just create an index for each property, and
>then search them the normal way (without the LIKE clause, but using '=')
>and I bet you will see good performance.  If you genuinely have thousands
>of key/value pairs to search it is possible that your full-text approach
>is the best performing solution, but I'm not sure.
>
>However your general question is: "What's the best way to query for LARGE
>NUMBERS of key/value pairs?"
>
>Maybe some experts who know more than me about Oak can reply to that
>simplified version of your question.
>
>Best regards,
>Clay Ferguson
>wcl...@gmail.com
>
>
>On Fri, Oct 14, 2016 at 10:29 AM, rachna <rachana.me...@telegraph.co.uk>
>wrote:
>
>> Thanks Clay & Thomas.
>>
>> Taking a step back from our problem has helped us look at it in a
>> different way.
>>
>> The tag property also stores the values in a specific format that shows
>> the tree structure.
>>
>> cq:tags
>> - location:europe
>> - type:waterfalls
>>
>> Therefore, instead of traversing the repository to identify the
>> descendants of these tags, we could use a LIKE query.
>>
>> e.g. SELECT * FROM [cq:PageContent] AS b WHERE ISDESCENDANTNODE(b,
>> [/content/guides]) AND ([cq:tags] LIKE 'location:europe%' OR [cq:tags]
>> LIKE 'type:waterfalls%') ORDER BY [cq:lastModified]
>>
>> However, since our repository contains a large number of items that
>> match these criteria, we start to see warnings about traversing the index.
>>
>> org.apache.jackrabbit.oak.plugins.index.property.strategy.ContentMirrorStoreStrategy
>> Traversed 210000 nodes (210164 index entries) using index jcr:primaryType
>> with filter Filter(query=SELECT * FROM [cq:PageContent] AS b WHERE
>> ISDESCENDANTNODE(b, [/content/guides]) AND ([cq:tags] LIKE
>> 'location:europe%' OR [cq:tags] LIKE 'type:waterfalls%') ORDER BY
>> [cq:lastModified], path=/content/guides//*, property=[cq:tags=[is not
>> null]])
>>
>> Instead, I created a Lucene index that indexes the cq:tags (with full
>> text) and cq:lastModified (with ordered support) properties.
>>
>> e.g. SELECT [jcr:path] FROM [cq:PageContent] AS b WHERE
>> ISDESCENDANTNODE(b, [/content/guides]) AND
>> (CONTAINS([cq:tags], 'location:europe') OR
>> CONTAINS([cq:tags], 'type:waterfalls')) ORDER BY [cq:lastModified]
>>
>> That seems to be much faster than using a property index and should
>> solve most of the issues that we might have (hopefully avoiding creating
>> a new index).
>>
>> Is there any support in the Lucene index for something like STARTSWITH
>> rather than CONTAINS?
>>
>> The maxClauseCount configuration parameter, which is part of
>> Jackrabbit 2, introduced the soft limit of 1024.
>> We have been attempting to move to Oak; however, our progress has been
>> slow due to repository inconsistencies.
>> I realise this value is configurable; however, constantly increasing it
>> doesn't sound like the right thing to do.
>>
>> Thanks,
>> Rachna
>>
>>
>>
>> --
>> View this message in context:
>> http://jackrabbit.510166.n4.nabble.com/Custom-index-type-tp4665031p4665121.html
>> Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
>>
