I have backported the fix for OAK-6333 to 1.2 branch also. Would be
part of Oak 1.2.28
Chetan Mehrotra


On Wed, Aug 30, 2017 at 8:18 AM, Jason Bailey <jason.bai...@sas.com> wrote:
> Yes, that's how I patched the problem. You've answered my question by 
> explaining that the index value is not the correct value and that issue is 
> being addressed.
>
> Thank you.
> Jason
>
> -----Original Message-----
> From: Chetan Mehrotra [mailto:chetan.mehro...@gmail.com]
> Sent: Wednesday, August 30, 2017 10:56 AM
> To: users@jackrabbit.apache.org
> Subject: Re: slowness accessing lucene results
>
> EXTERNAL
>
>> "cost for lucene-property[/oak:index/lucene] is 1001.0", "property
>> cost for cqMaster is 1584.0",
>
> Problem here is global fultext index /oak:index/lucene is getting picked up 
> instead of cqMaster. This has been fixed with OAK-6333 (not backported to 1.2 
> yet). As a workaround you can set 'entryCount'
> property to some high value such that this lucene index is not preferred 
> Chetan Mehrotra
>
>
> On Wed, Aug 30, 2017 at 7:05 PM, Jason Bailey <jason.bai...@sas.com> wrote:
>> Hi Chetan,
>>
>> For the lucene index provider we have the following configured
>> Enabled:
>>   Copy on Read
>>   Copy on Write
>>   Open Index asynchronously
>>   Prefetch Index Files
>>
>> Thread pool is 20
>> Extracted text cache size is 0
>> Extracted text cache expiry 300
>> Boolean clause limit 1024
>>
>> Additional notes:
>> The timing is coming from a backup of our prod environment that I'm using 
>> for testing. Running the query in our production environment actually 
>> results in a response time that is closer to an hour. When we use a query 
>> that does not use Lucene and utilizes another index like cqMaster it is 
>> magnitudes faster, in the range of 400 milliseconds to count.
>>
>> I'm investigating the usage of ACL now.
>>
>> The following is the text from the explain query
>>
>> {
>>     "statement": "SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) 
>> AND [cq:master] IS NOT NULL",
>>     "language": "JCR-SQL2",
>>     "explain": {
>>         "logs": [
>>             "cost using filter Filter(query=explain SELECT * FROM [nt:base] 
>> WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, 
>> path=/content//*, property=[cq:master=[is not null]])",
>>             "cost for aggregate lucene is Infinity",
>>             "Evaluating plan with index definition Lucene Index : 
>> cqTag(/oak:index/cqTagLucene)",
>>             "Evaluating plan with index definition Lucene Index : 
>> workflow(/oak:index/workflowDataLucene)",
>>             "Evaluating plan with index definition Lucene Index : 
>> authorizables(/oak:index/authorizables)",
>>             "Evaluating plan with index definition Lucene Index : 
>> /oak:index/damAssetLucene",
>>             "Evaluating plan with index definition Lucene Index : 
>> tags(/oak:index/ntBaseLucene)",
>>             "Evaluating plan with index definition Lucene Index : 
>> /oak:index/lucene",
>>             "Evaluating plan with index definition Lucene Index : 
>> cq:Page(/oak:index/cqPageLucene)",
>>             "cost for lucene-property[/oak:index/lucene] is 1001.0",
>>             "cost for reference is Infinity",
>>             "getPlans(Filter, List<OrderEntry>, NodeState)",
>>             "getPlans() - filter: Filter(query=explain SELECT * FROM 
>> [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, 
>> path=/content//*, property=[cq:master=[is not null]]) - ",
>>             "getPlans() - sortOrder: null - ",
>>             "getPlans() - rootState: { jcr:primaryType = rep:root, 
>> jcr:mixinTypes = [rep:RepoAccessControllable, rep:AccessControllable], 
>> sling:target = /index.html, sling:resourceType = sling:redirect, :async : { 
>> async-temp = [fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, 
>> f3ec0fb4-6f32-488d-b751-be3179dd0a40], async = 
>> fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, async-LastIndexedTo = 
>> 2017-08-24T14:57:29.853-04:00 }, tmp = { ... }, home = { ... }, oak:index = 
>> { ... }, conf = { ... }, jcr:system = { ... }, etc = { ... }, system = { ... 
>> }, content = { ... }, bin : { jcr:primaryType = nt:folder, jcr:mixinTypes = 
>> [mix:versionable], jcr:createdBy = admin, jcr:versionHistory = 
>> e2777f7a-8fc5-4f23-9fc0-4478f1f063f6, jcr:predecessors = [], jcr:created = 
>> 2016-01-18T15:28:50.055-05:00, jcr:baseVersion = 
>> 777eb8f9-e606-4f29-bd6a-205994c9cd64, jcr:isCheckedOut = false, jcr:uuid = 
>> 40ed38c9-cb13-41dc-8602-0249e2c1b4c2 }, rep:policy = { ... }, libs = { ... 
>> }, rep:repoPolicy = { ... }, var = { ... }, apps = { ... } } - ",
>>             "cost for ordered is Infinity",
>>             "cost for nodeType is Infinity",
>>             "property cost for cqMaster is 1584.0",
>>             "cost for property is 1584.0",
>>             "cost for traverse is 1.0588708E7"
>>         ],
>>         "plan": "[nt:base] as [nt:base] /* lucene:lucene(/oak:index/lucene) 
>> :ancestors:/content where (isdescendantnode([nt:base], [/content])) and 
>> ([nt:base].[cq:master] is not null) */"
>>     },
>>     "heuristics": {
>>         "executeTime": 0,
>>         "getNodesTime": 3279,
>>         "count": 2949,
>>         "countTime": 349807,
>>         "totalTime": 353086
>>     }
>> }
>>
>> -Jason
>>
>> -----Original Message-----
>> From: Chetan Mehrotra [mailto:chetan.mehro...@gmail.com]
>> Sent: Wednesday, August 30, 2017 1:03 AM
>> To: users@jackrabbit.apache.org
>> Subject: Re: slowness accessing lucene results
>>
>> EXTERNAL
>>
>> The query can be slow due to multiple reasons
>>
>> 1. Copy-On-Write support is not enabled
>>
>> 2. Some of the JCR query constraints may not be getting mapped to
>> Lucene index i.e. index definition does not have full coverage of
>> query constraints causing quite a few paths to get filtered out. So
>> actual number of rows accessed may be large but only few of them
>> qualify all criteria at query engine level. This can be confirmed by
>> query explanation output
>>
>> 3. Or lots of filtering happening due to access control. Does the 
>> performance remains same with admin user?
>>
>> Chetan Mehrotra
>>
>>
>> On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <jason.bai...@sas.com> wrote:
>>> Hi all,
>>>
>>> I'm tracking down a problem I'm currently experiencing when my queries use 
>>> the lucene index on oak 1.2.18.  Here are the figures that I get from the 
>>> acs commons tool that breaks down queries.
>>>
>>> "executeTime": 0
>>> "getNodesTime": 3279
>>> "count": 2949
>>> "countTime": 349807
>>> "totalTime": 353086
>>>
>>> In essence, it's taking me about 6 minutes to iterate through less than 
>>> 3000 results. Does anyone have any ideas on root cause?
>>>
>>> -Jason

Reply via email to