[ 
https://issues.apache.org/jira/browse/OAK-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004494#comment-14004494
 ] 

Davide Giannella commented on OAK-1806:
---------------------------------------

Unfortunately, as of OAK-1570 we can't yet use an ordered index to
serve range queries on that amount of data, and if we ran on a
cluster we would hit OAK-1717.

I think the main problem is that, by selecting the last hour on
every {{nt:unstructured}} or {{oak:Unstructured}} node, the query engine
ends up filtering far more nodes than we actually need. Maybe
creating a specific node type for the query would help in getting more
consistent results.
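
To make the node type idea concrete, here is a minimal sketch of how the "last hour" range query could be built as a JCR-SQL2 string, once against {{nt:unstructured}} and once against a dedicated type. The property name {{jcr:lastModified}} and the node type {{bench:SearchItem}} are assumptions for illustration only; they are not taken from the benchmark code.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class LastHourQuery {

    /**
     * Builds a JCR-SQL2 range query selecting nodes of the given type whose
     * date property falls within the hour before nowMillis.
     * The node type and property names are supplied by the caller.
     */
    static String lastHourQuery(String nodeType, String property, long nowMillis) {
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
        iso.setTimeZone(TimeZone.getTimeZone("UTC"));
        String lowerBound = iso.format(new Date(nowMillis - 60L * 60L * 1000L));
        return "SELECT * FROM [" + nodeType + "] AS s WHERE s.[" + property
                + "] >= CAST('" + lowerBound + "' AS DATE)";
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Broad: every nt:unstructured node is a candidate for the filter.
        System.out.println(lastHourQuery("nt:unstructured", "jcr:lastModified", now));
        // Narrow: only nodes of a hypothetical benchmark-specific type are candidates.
        System.out.println(lastHourQuery("bench:SearchItem", "jcr:lastModified", now));
    }
}
```

The two queries differ only in the selector, so any difference in timings would come from how many nodes the engine has to consider for each type.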

For the other use cases of {{LastModified}} we see it scaling a bit
better as we hit fewer nodes. Only our nodes that are created with a
[random 
date|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/src/main/java/org/apache/jackrabbit/oak/scalability/ScalabilityBlobSearchSuite.java#L393]
will have such "old" dates.

I just realised that I didn't provide a seed for any of the random
extractions, so we will get slightly different distributions of property
values on each run.
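
A minimal sketch of what seeding could look like, assuming the random dates are drawn from a {{java.util.Random}}. The class name, the seed value and the {{randomDate}} helper are hypothetical and are not the code in {{ScalabilityBlobSearchSuite}}.

```java
import java.util.Calendar;
import java.util.Random;
import java.util.TimeZone;

public class SeededRandomDates {

    // Hypothetical fixed seed; any constant works as long as it is reused across runs.
    static final long SEED = 42L;

    /** Returns a date between 0 and maxDaysBack days before the reference instant. */
    static Calendar randomDate(Random random, long referenceMillis, int maxDaysBack) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        cal.setTimeInMillis(referenceMillis);
        cal.add(Calendar.DAY_OF_YEAR, -random.nextInt(maxDaysBack + 1));
        return cal;
    }

    public static void main(String[] args) {
        // Fixed reference instant, so that the generated dates are comparable across runs.
        long reference = 1400000000000L;
        Random first = new Random(SEED);
        Random second = new Random(SEED);
        for (int i = 0; i < 5; i++) {
            Calendar a = randomDate(first, reference, 365);
            Calendar b = randomDate(second, reference, 365);
            // Both generators share the seed, so each pair of dates is identical.
            System.out.println(a.getTimeInMillis() + " / " + b.getTimeInMillis());
        }
    }
}
```

With a shared seed, two runs draw the same sequence of dates, so the distribution of property values stays the same from one run to the next.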


> Benchmark for blob upload and search longevity
> ----------------------------------------------
>
>                 Key: OAK-1806
>                 URL: https://issues.apache.org/jira/browse/OAK-1806
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: run
>            Reporter: Amit Jain
>              Labels: benchmark, test
>         Attachments: OAK-1806-SDF.patch, OAK-1806-doc.patch, OAK-1806.patch
>
>
> Have a longevity test which incrementally increases the load by adding blobs 
> and then running full text search to measure the execution times and the 
> performance degradation for increased loads.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
