[ 
https://issues.apache.org/jira/browse/OAK-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16132091#comment-16132091
 ] 

Thomas Mueller commented on OAK-937:
------------------------------------

> support both by name and by tag lookup

[~chetanm] that's true. I don't want to document "by name" right now, because I
think I want to remove this. There is a relatively large risk that people
hardcode index names in the code, and then cannot easily migrate to other
indexes (for example, combine two indexes later on, or switch to Solr). The
only place where an index name might be needed is for the nodetype index, as I
found it hard to support tags there (the nodetype index is a "virtual" index
and not tied to an index node). The "index name" is (for Lucene and property
indexes) the name of the index definition node. You can specify both an index
name and a tag, and all indexes where either one matches can be used.
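
To illustrate the "either one matches" rule, here is a rough sketch. It is not
the actual Oak code: IndexHintSketch, IndexDef and candidates are hypothetical
names, and treating "no hint at all" as "every index is a candidate" is an
assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class IndexHintSketch {

    /** Hypothetical stand-in for an index definition node; not an Oak class. */
    public static final class IndexDef {
        public final String name;
        public final Set<String> tags;

        public IndexDef(String name, Set<String> tags) {
            this.name = name;
            this.tags = tags;
        }
    }

    /**
     * Returns the names of the indexes that may be used for a query.
     * An index qualifies if its name equals the name hint OR it carries the tag hint.
     * Assumption: with no hint at all, every index stays a candidate.
     */
    public static List<String> candidates(List<IndexDef> indexes, String hintName, String hintTag) {
        List<String> result = new ArrayList<>();
        for (IndexDef index : indexes) {
            boolean noHint = hintName == null && hintTag == null;
            boolean nameMatches = hintName != null && hintName.equals(index.name);
            boolean tagMatches = hintTag != null && index.tags.contains(hintTag);
            if (noHint || nameMatches || tagMatches) {
                result.add(index.name);
            }
        }
        return result;
    }
}
```

With a name hint and a tag hint given together, an index that matches only one
of the two is still a candidate.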

> refresh to true ... relevant for 1.6 onwards

[~catholicon] OK. My plan is to backport to 1.6; I'm not sure yet about earlier
Oak versions. The good thing is, even if someone sets refresh in an earlier
version, nothing bad will happen (it's just ignored).

> maybe, we should clarify how tags look like

I will document that only a-zA-Z0-9_ should be used. That makes sense. I didn't
test with special characters so far; I think the limit is in the parser. I
wouldn't use "-", even though that works right now (by accident I guess). At
some point we might want to change the parser to support + - * / and other
operations.
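
A check for the recommended character set could look roughly like this. It is
only a sketch: TagNameCheck and isSafeTag are hypothetical names, not part of
Oak, and rejecting the empty string is an assumption.

```java
import java.util.regex.Pattern;

public class TagNameCheck {

    // Only the characters recommended above: a-z, A-Z, 0-9 and underscore.
    private static final Pattern SAFE_TAG = Pattern.compile("[a-zA-Z0-9_]+");

    /** Returns true if the tag uses only the recommended characters (assumption: non-empty). */
    public static boolean isSafeTag(String tag) {
        return tag != null && SAFE_TAG.matcher(tag).matches();
    }
}
```

For example, "helloWorld" and "my_tag_2" pass, while "hello-world" is rejected
even though the parser currently happens to accept it.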

By the way, internally this is implemented using property restrictions. So you 
can see this in the index plan as follows:

{noformat}
explain //* option(index tag helloWorld)
...
cost using filter Filter(query=
explain select [jcr:path], [jcr:score], * from [nt:base] as a option(index tag 
[helloWorld]) 
/* xpath: //* option(index tag helloWorld) */, 
path=*, property=[:indexTag=[helloWorld]])
{noformat}

>  I wonder if no 'tagged' indices could answer the query, then instead of 
> falling down to traversal

Consider the case where a query uses a tag, but no index has that tag (or none
of the indexes with this tag knows how to deal with that query). Right now, it
will use traversal. That means you will get a traversal warning, and the query
will probably fail (if it traverses too many nodes, and if configured to fail
immediately). So the current behavior is "fail fast" (well, relatively fast).
You propose to not fail the query, but use a different index. I think failing
the query is actually better: it indicates something is not as expected. Maybe
a typo. Maybe refresh was not set. Maybe someone forgot to add the index. I
assume that if one specifies a tag in the query, then the given index(es) are
used, and not, behind the scenes, maybe a different one.
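
The behavior described above can be sketched as follows. This is a simplified
model, not the Oak implementation: TaggedPlanSketch, Candidate and choose are
hypothetical names, traversal is modelled as a plain fallback rather than a
plan with its own cost, and an index that cannot answer the query is assumed
to report an infinite cost.

```java
import java.util.List;
import java.util.Set;

public class TaggedPlanSketch {

    /** Marker returned when no index qualifies and the query falls back to traversal. */
    public static final String TRAVERSAL = "traverse";

    /** Hypothetical index with its tags and its estimated cost for one query. */
    public static final class Candidate {
        public final String name;
        public final Set<String> tags;
        public final double cost; // Double.POSITIVE_INFINITY: cannot answer the query

        public Candidate(String name, Set<String> tags, double cost) {
            this.name = name;
            this.tags = tags;
            this.cost = cost;
        }
    }

    /**
     * Picks the cheapest index among those carrying the requested tag.
     * Untagged or differently tagged indexes are never used as a silent fallback;
     * if nothing qualifies, the result is traversal.
     */
    public static String choose(List<Candidate> indexes, String tag) {
        String best = TRAVERSAL;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Candidate c : indexes) {
            if (tag != null && !c.tags.contains(tag)) {
                continue; // a tag was requested and this index does not have it
            }
            if (c.cost < bestCost) {
                best = c.name;
                bestCost = c.cost;
            }
        }
        return best;
    }
}
```

In this model a misspelled tag ends in traversal even when a cheaper untagged
index exists, which is what surfaces the mistake.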

> better chance to win

That would be tweaking the cost in favour of some index(es). That could be
done, but in this case I would probably use a different syntax, maybe
"option(index prefer tag x)". I suggest we hold off on implementing this for
now.

> Query engine index selection tweaks: shortcut and hint
> ------------------------------------------------------
>
>                 Key: OAK-937
>                 URL: https://issues.apache.org/jira/browse/OAK-937
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Alex Deparvu
>            Assignee: Thomas Mueller
>            Priority: Critical
>              Labels: candidate_oak_1_4, candidate_oak_1_6, performance
>             Fix For: 1.8, 1.7.6
>
>
> This issue covers 2 different changes related to the way the QueryEngine 
> selects a query index:
>  Firstly there could be a way to end the index selection process early via a 
> known constant value: if an index returns a known value token (like -1000) 
> then the query engine would effectively stop iterating through the existing 
> index impls and use that index directly.
>  Secondly it would be nice to be able to specify a desired index (if one is 
> known to perform better) thus skipping the existing selection mechanism (cost 
> calculation and comparison). This could be done via certain query hints [0].
> [0] http://en.wikipedia.org/wiki/Hint_(SQL)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
