Author: thomasm
Date: Thu Sep 20 13:41:58 2018
New Revision: 1841474
URL: http://svn.apache.org/viewvc?rev=1841474&view=rev
Log:
OAK-7768: Ability to deprecate an index
Modified:
jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md
Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md
URL:
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md?rev=1841474&r1=1841473&r2=1841474&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md
(original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md Thu
Sep 20 13:41:58 2018
@@ -43,17 +43,18 @@ grep "^#.*$" src/site/markdown/query/que
* [XPath to SQL-2 Transformation](#XPath_to_SQL-2_Transformation)
* [The Node Type Index](#The_Node_Type_Index)
* [Temporarily Disabling an Index](#Temporarily_Disabling_an_Index)
+ * [Deprecating Indexes](#Deprecating_Indexes)
* [The Deprecated Ordered Index](#The_Deprecated_Ordered_Index)
* [Index Storage and Manual
Inspection](#Index_Storage_and_Manual_Inspection)
* [SQL-2 Optimisation](#SQL-2_Optimisation)
* [Additional XPath and SQL-2
Features](#Additional_XPath_and_SQL-2_Features)
-
+
## Overview
-Oak does not index as much content by default as does Jackrabbit 2. You need
to create custom
-indexes when necessary, much like in traditional RDBMSs. If there is no index
for a
-specific query, then the repository will be traversed. That is, the query will
still
+Oak does not index as much content by default as does Jackrabbit 2. You need
to create custom
+indexes when necessary, much like in traditional RDBMSs. If there is no index
for a
+specific query, then the repository will be traversed. That is, the query will
still
work but probably be very slow.
Query Indices are defined under the `oak:index` node.
@@ -61,7 +62,7 @@ Query Indices are defined under the `oak
### Query Processing
Internally, the query engine uses a cost based query optimizer that asks all
the available
-query indexes for the estimated cost to process the query. It then uses the
index with the
+query indexes for the estimated cost to process the query. It then uses the
index with the
lowest cost.
By default, the following indexes are available:
@@ -72,16 +73,16 @@ By default, the following indexes are av
jcr:primaryType and jcr:mixins).
* A traversal index that iterates over a subtree.
-If no index can efficiently process the filter condition, the nodes in the
repository are
+If no index can efficiently process the filter condition, the nodes in the
repository are
traversed at the given subtree.
-Usually, data is read from the index and repository while traversing over the
query
+Usually, data is read from the index and repository while traversing over the
query
result. There are exceptions however, where all data is read in memory when
the query
-is executed. The most common case is when using an `order by` clause and
+is executed. The most common case is when using an `order by` clause and
the index can not provide a sorted result.
-There are other cases where paths of the results read so far are kept in
memory,
-in order to not return duplicate results.
-This is the case when using `or` conditions such that two indexes are used
+There are other cases where paths of the results read so far are kept in
memory,
+in order to not return duplicate results.
+This is the case when using `or` conditions such that two indexes are used
(internally a `union` query is executed).
If you enable debug logging for the module `org.apache.jackrabbit.oak.query`,
you may see this:
@@ -89,37 +90,37 @@ If you enable debug logging for the modu
cost for nodeType is 1638354.0
cost for property is 2.0
cost for traverse is 3451100.0
-
-This means the cost for the nodetype index _would_ be about 1638354.0,
-the cost for the property index _would_ be about 2,
-and the cost for traversal _would_ be about 3451100.0.
+
+This means the cost for the nodetype index _would_ be about 1638354.0,
+the cost for the property index _would_ be about 2,
+and the cost for traversal _would_ be about 3451100.0.
An index that can't deal with a certain condition will return the cost
"Infinity".
-It doesn't say traversal is actually used, it just lists the expected costs.
-The query engine will then pick the index with the lowest expected cost,
+It doesn't say traversal is actually used, it just lists the expected costs.
+The query engine will then pick the index with the lowest expected cost,
which is (in the case above) "property".
-The expected cost for traversal is, with Oak 1.0.x, really just a guess
looking at
-the length of the path. With Oak 1.2 and newer, the "counter" index is used
-(see mainly OAK-1907). There is an known issue with this, if you add and
remove a lot
+The expected cost for traversal is, with Oak 1.0.x, really just a guess
looking at
+the length of the path. With Oak 1.2 and newer, the "counter" index is used
+(see mainly OAK-1907). There is an known issue with this, if you add and
remove a lot
of nodes in a loop, you could end up with a too-low cost, see OAK-4065.
#### Cost Calculation
-Each query index is expected to estimate the worst-case cost to query with the
given filter.
-The returned value is between 1 (very fast; lookup of a unique node) and
-the estimated number of entries to traverse, if the cursor would be fully
read,
-and if there could in theory be one network round-trip or disk read operation
per node
+Each query index is expected to estimate the worst-case cost to query with the
given filter.
+The returned value is between 1 (very fast; lookup of a unique node) and
+the estimated number of entries to traverse, if the cursor would be fully read,
+and if there could in theory be one network round-trip or disk read operation
per node
(this method may return a lower number if the data is known to be fully in
memory).
-The returned value is supposed to be an estimate and doesn't have to be very
accurate.
-Please note this method is called on each index whenever a query is run,
+The returned value is supposed to be an estimate and doesn't have to be very
accurate.
+Please note this method is called on each index whenever a query is run,
so the method should be reasonably fast (not read any data itself, or at least
not read too much data).
If an index implementation can not query the data, it has to return `Infinity`
(`Double.POSITIVE_INFINITY`).
### Query Options
-With query options, you can enforce the usage of indexes (failing the query if
there is no index),
+With query options, you can enforce the usage of indexes (failing the query if
there is no index),
and you can limit which indexes should be considered.
#### Query Option Traversal
@@ -130,22 +131,22 @@ By default, queries without index will l
Traversal query (query without index): {statement}; consider creating an
index
This message is only logged if no index is available, and if the query
-potentially traverses many nodes.
+potentially traverses many nodes.
No message is logged if an index is available, but traversing is cheap.
By setting the JMX configuration `QueryEngineSettings.failTraversal` to true,
queries without index throw an exception instead of just logging a message.
-In the query itself, the syntax `option(traversal {ok|fail|warn})` is
supported
+In the query itself, the syntax `option(traversal {ok|fail|warn})` is supported
(at the very end of the statement, after `order by`).
-This is to override the default setting,
+This is to override the default setting,
for queries that traverse a well known number of nodes (for example 10 or 20
nodes).
-This is supported for both XPath and SQL-2, as follows:
+This is supported for both XPath and SQL-2, as follows:
/jcr:root/oak:index/*[@type='lucene'] option(traversal ok)
- select * from [nt:base]
- where ischildnode('/oak:index')
+ select * from [nt:base]
+ where ischildnode('/oak:index')
order by name()
option(traversal ok)
@@ -159,11 +160,11 @@ But in rare cases, it is needed to speci
* If there are multiple Lucene fulltext indexes with different aggregation
rules,
and the index data overlaps.
In the query, you want to specify which aggregation rule to use.
-* To temporarily work around limitations of the index implementation,
+* To temporarily work around limitations of the index implementation,
for example incorrect cost estimation.
Using index tags should be the exception, and should only be used temporarily.
-To use index tags,
+To use index tags,
add `tags` (a multi-valued String property) to the index(es) of choice, for
example:
/oak:index/lucene/tags = [x, y]
@@ -173,11 +174,11 @@ The syntax to limit a query to a certain
/jcr:root/content//element(*, nt:file)[jcr:contains(., 'test')]
option(index tag x)
- select * from [nt:file]
+ select * from [nt:file]
where ischildnode('/content')
and contains(*, 'test')
option(index tag [x])
-
+
The query will only consider the indexes that contain the specified tag (that
is, possibly multiple indexes).
Each query supports one tag only.
The tag name may only contain the characters `a-z, A-Z, 0-9, _`.
@@ -186,9 +187,9 @@ Limitations:
* This is currently supported in indexes of type `lucene` compatVersion 2, and
type `property`.
* For indexes of type `lucene`, when adding adding or changing the property
`tags`,
- you need to also set the property `refresh` to `true` (Boolean),
+ you need to also set the property `refresh` to `true` (Boolean),
so that the change is applied. No indexing is required.
-* Solr indexes, and the reference index, don't support tags yet.
+* Solr indexes, and the reference index, don't support tags yet.
That means they still might return a low cost, even if the tag does not
match.
* The nodetype index only partially supports this feature: if a tag is
specified in the query, then the nodetype index
is not used. However, tags in the nodetype index itself are ignored
currently.
@@ -206,10 +207,10 @@ Lucene result set size, including nodes
By default, Oak does not do this; it either returns the correct result size,
or -1.
Oak 1.2.x and newer supports a compatibility flag so that it works in a
similar way
as Jackrabbit 2.x, by returning an estimate (see OAK-2926).
-Specially, only query restrictions that are part of the used index are
considered when calculating the size.
-Additionally, ACLs are not applied to the results,
-so nodes which are not visible to the current session will still be included
in the count returned.
-As such, the count returned can be higher than the actual number of results
+Specially, only query restrictions that are part of the used index are
considered when calculating the size.
+Additionally, ACLs are not applied to the results,
+so nodes which are not visible to the current session will still be included
in the count returned.
+As such, the count returned can be higher than the actual number of results
and the accurate count can only be determined by iterating through the results.
This only works with the Lucene `compatVersion=2` right now,
@@ -237,23 +238,23 @@ the query parser is now generally more s
The following query used to work in Jackrabbit 2.x, but not in Oak,
because multiple way to quote the path are used at the same time:
- SELECT * FROM [nt:base] AS s
+ SELECT * FROM [nt:base] AS s
WHERE ISDESCENDANTNODE(s, ["/libs/sling/config"])
-
+
Instead, the query now needs to be:
- SELECT * FROM [nt:base] AS s
+ SELECT * FROM [nt:base] AS s
WHERE ISDESCENDANTNODE(s, [/libs/sling/config])
-
+
#### Equality for Path Constraints
In Jackrabbit 2.x, the following condition was interpreted as a LIKE condition:
SELECT * FROM nt:base WHERE jcr:path = '/abc/%'
-
+
Therefore, the query behaves exactly the same as if LIKE was used.
In Oak, this is no longer the case, and such queries search for an exact path
match.
-
+
### Slow Queries and Read Limits
Slow queries are logged as follows:
@@ -261,7 +262,7 @@ Slow queries are logged as follows:
*WARN* Traversed 10000 nodes with filter Filter(query=select ...)
consider creating an index or changing the query
-If this is the case, an index might need to be created, or the condition
+If this is the case, an index might need to be created, or the condition
of the query might need to be changed to take advantage of an existing index.
Queries that traverse many nodes, or that read many nodes in memory, can be
cancelled.
@@ -269,23 +270,23 @@ The limits can be set at runtime (also w
domain "org.apache.jackrabbit.oak", type "QueryEngineSettings", attribute names
"LimitInMemory" and "LimitReads".
These setting are not persisted, so in the next restart, the default values
(unlimited) are used.
-As a workaround, these limits can be changed using the system properties
+As a workaround, these limits can be changed using the system properties
"oak.queryLimitInMemory" and "oak.queryLimitReads".
-Queries that exceed one of the limits are cancelled with an
UnsupportedOperationException saying that
+Queries that exceed one of the limits are cancelled with an
UnsupportedOperationException saying that
"The query read more than x nodes... To avoid running out of memory,
processing was stopped."
-"LimitReads" applies to the number of nodes read by a query.
+"LimitReads" applies to the number of nodes read by a query.
It applies whether or not an index is used.
As an example, if a query has just two conditions, as in `a=1 and b=2`, and if
there is an index on `a`,
then all nodes with `a=1` need to be read while traversing the result.
If more nodes are read than the set limit, then an exception is thrown.
-If the query also has a path condition (for example descendants of `/home`),
+If the query also has a path condition (for example descendants of `/home`),
and if the index supports path conditions (which is the case for all property
indexes, and
also for Lucene indexes if `evaluatePathRestrictions` is set), then only nodes
in the given subtree
are read.
-"LimitInMemory" applies to nodes read in memory, in order to sort the result,
-and in order to ensure the same node is only returned once.
+"LimitInMemory" applies to nodes read in memory, in order to sort the result,
+and in order to ensure the same node is only returned once.
It applies whether or not an index is used.
As an example, if a query uses `order by c`, and if the index used for this
query does not
support ordering by this property, then all nodes that match the condition are
read in memory first,
@@ -293,11 +294,11 @@ even before the first node is returned.
Property indexed can not order, and Lucene indexes can only order when
enabling `ordered` for a property.
Ensuring the same node is only returned once: this is needed for queries that
contain `union`
(it is not needed when using `union all`). It is also needed if a query uses
`or` conditions
-on different properties. For example, if a query uses `a=1 or b=2`,
+on different properties. For example, if a query uses `a=1 or b=2`,
then the conversion to `union` would be `select ... where a=1 union select ...
where b=2`.
-The query is converted to a `union` so that both indexes can be used,
+The query is converted to a `union` so that both indexes can be used,
in case there are separate indexes for `a` and `b`.
-For XPath queries, such conversion to `union` is always made,
+For XPath queries, such conversion to `union` is always made,
and for SQL-2 queries such a conversion is only made if the `union` query has
a lower expected cost.
When using `or` in combination with the same property, as in `a=1 or a=2`,
then no conversion to `union` is made.
@@ -305,24 +306,24 @@ When using `or` in combination with the
The full-text syntax supported by Jackrabbit Oak is a superset of the JCR
specification.
-By default (that is, using a Lucene index with `compatVersion` 2), Jackrabbit
Oak uses the
+By default (that is, using a Lucene index with `compatVersion` 2), Jackrabbit
Oak uses the
[Apache Lucene grammar for fulltext
search](https://lucene.apache.org/core/4_7_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html).
[See also how to escape
queries.](https://wiki.apache.org/jackrabbit/EncodingAndEscaping).
However, the words `AND` and `NOT` are treated as search terms, and only `OR`
is supported as a keyword. Instead of `NOT`, use `-`.
Instead of `AND`, just write the terms next to each other (instead of `hello
AND world`, just write `hello world`).
-For older Lucene indexes (`compatVersion` 1),
+For older Lucene indexes (`compatVersion` 1),
the following syntax is supported within `contains` queries.
This is a subset of the Apache Lucene syntax:
FullTextSearch ::= Or
- Or ::= And { ' OR ' And }*
+ Or ::= And { ' OR ' And }*
And ::= Term { ' ' Term }*
Term ::= ['-'] { SimpleTerm | PhraseTerm } [ '^' Boost ]
SimpleTerm ::= Word
PhraseTerm ::= '"' Word { ' ' Word }* '"'
Boost ::= <number>
-
+
Please note that `OR` needs to be written in uppercase.
Characters within words can be escaped using a backslash.
@@ -330,7 +331,7 @@ Examples:
jcr:contains(., 'jelly sandwich^4')
jcr:contains(@jcr:title, 'find this')
-
+
In the first example, the word "sandwich" has weight four times more than the
word "jelly."
For details of boosting, see the Apache Lucene documentation about Score
Boosting.
@@ -344,11 +345,11 @@ The Lucene index can be configured to pr
See <a href="lucene.html#Property_Definitions">useInExcerpt</a> for details
on how to configure excerpt generation.
-For queries to use those excerpts, the query needs to use the Lucene index
where
+For queries to use those excerpts, the query needs to use the Lucene index
where
this is configured. The queries also needs to contain the "excerpt" property,
as follows:
/jcr:root/content//*[jcr:contains(., 'test')]/(rep:excerpt(.))
-
+
The excerpt is then read using the JCR API call
`row.getValue("rep:excerpt(.)")`.
Since Oak version 1.10 (OAK-7151), optionally a property name can be specified
in the query:
@@ -360,13 +361,13 @@ and the excerpt for the node using (as b
#### SimpleExcerptProvider
-The SimpleExcerptProvider is a fallback mechanism for excerpts and
highlighting.
+The SimpleExcerptProvider is a fallback mechanism for excerpts and
highlighting.
This mechanism has limitations, and should only be used if really needed.
The SimpleExcerptProvider is independent of the index configuration.
Highlighting is limited, for example stopwords are ignored.
Highlighting is case insensitive since Oak versions 1.2.30, 1.4.22, 1.6.12,
1.8.3, and 1.10 (OAK-7437).
-The SimpleExcerptProvider is used when reading an excerpt
+The SimpleExcerptProvider is used when reading an excerpt
if the query doesn't contain an excerpt property, as in:
/jcr:root/content//*[jcr:contains(., 'test')]
@@ -377,7 +378,7 @@ when using `row.getValue("rep:excerpt(@t
this property as an excerpt property, as in:
/jcr:root/content//*[jcr:contains(., 'test')]/(rep:excerpt(.))
-
+
The SimpleExcerptProvider is also used for queries that don't use
a Lucene index, or if the query uses a Lucene index, but excerpts are not
configured there.
@@ -393,29 +394,29 @@ For SQL-2, the selector name (if needed)
Examples:
//*[rep:native('solr', 'name:(Hello OR World)')]
-
- select [jcr:path] from [nt:base]
+
+ select [jcr:path] from [nt:base]
where native('solr', 'name:(Hello OR World)')
- select [jcr:path] from [nt:base] as a
+ select [jcr:path] from [nt:base] as a
where native(a, 'solr', 'name:(Hello OR World)')
This also allows to use the Solr
[MoreLikeThis](http://wiki.apache.org/solr/MoreLikeThis)
feature. An example query is:
- select [jcr:path] from [nt:base]
+ select [jcr:path] from [nt:base]
where native('solr',
'mlt?q=id:UTF8TEST&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1')
If no full-text implementation is available, those queries will fail.
### Similarity Queries
-Oak supports similarity queries when using the Lucene or Solr indexes.
+Oak supports similarity queries when using the Lucene or Solr indexes.
For example, the following query will return nodes that have similar content
than
the node /test/a:
//element(*, nt:base)[rep:similar(., '/test/a')]
-
+
Compared to Jackrabbit 2.x, support for rep:similar has the following
limitations:
Full-text aggregation is not currently supported.
@@ -424,34 +425,34 @@ Full-text aggregation is not currently s
`@since Oak 1.1.17, 1.0.13`
Oak supports spellcheck queries when using the Lucene or Solr indexes.
-Unlike most queries, spellcheck queries won't return a JCR `Node` as
-the outcome of such queries will be text terms
+Unlike most queries, spellcheck queries won't return a JCR `Node` as
+the outcome of such queries will be text terms
that come from content as written into JCR `properties`.
For example, the following query will return spellchecks for the (wrongly
spelled) term `helo`:
/jcr:root[rep:spellcheck('helo')]/(rep:spellcheck())
-
-The result of such a query will be a JCR `Row` which will contain the
corrected terms,
-as spellchecked by the used underlying
+
+The result of such a query will be a JCR `Row` which will contain the
corrected terms,
+as spellchecked by the used underlying
index, in a special property named `rep:spellcheck()`.
Clients wanting to obtain spellchecks could use the following JCR code:
`@until Oak 1.3.10, 1.2.13` spellchecks are returned flat.
-
+
QueryManager qm = ...;
String xpath = "/jcr:root[rep:spellcheck('helo')]/(rep:spellcheck())";
QueryResult result = qm.createQuery(xpath, Query.XPATH).execute();
RowIterator it = result.getRows();
String spellchecks = "";
if (it.hasNext()) {
- spellchecks = row.getValue("rep:spellcheck()").getString()
+ spellchecks = row.getValue("rep:spellcheck()").getString()
}
-
+
The `spellchecks` String would be have the following pattern
`\[[\w|\W]+(\,\s[\w|\W]+)*\]`, e.g.:
[hello, hold]
-
+
`@since Oak 1.3.11, 1.2.14` each spellcheck would be returned per row.
QueryManager qm = ...;
@@ -460,15 +461,15 @@ The `spellchecks` String would be have t
RowIterator it = result.getRows();
List<String> spellchecks = new LinkedList<String>();
while (it.hasNext()) {
- spellchecks.add(row.getValue("rep:spellcheck()").getString());
+ spellchecks.add(row.getValue("rep:spellcheck()").getString());
}
-
-If either Lucene or Solr were configured to provide the spellcheck feature,
see
+
+If either Lucene or Solr were configured to provide the spellcheck feature, see
[Enable spellchecking in Lucene](lucene.html#Spellchecking) and [Enable
spellchecking in Solr](solr.html#Spellchecking)
-Note that spellcheck terms come already filtered according to calling user
privileges,
-so that users could see spellcheck
+Note that spellcheck terms come already filtered according to calling user
privileges,
+so that users could see spellcheck
corrections only coming from indexed content they are allowed to read.
### Suggestions
@@ -476,36 +477,36 @@ corrections only coming from indexed con
`@since Oak 1.1.17, 1.0.15`
Oak supports search suggestions when using the Lucene or Solr indexes.
-Unlike most queries, suggest queries won't return a JCR `Node` as the outcome
of such queries
+Unlike most queries, suggest queries won't return a JCR `Node` as the outcome
of such queries
will be text terms that come from content as written into JCR `properties`.
For example, the following query will return search suggestions for the (e.g.
user entered) term `in `:
/jcr:root[rep:suggest('in ')]/(rep:suggest())
-
-The result of such a query will be a JCR `Row` which will contain the
suggested terms,
-together with their score, as
+
+The result of such a query will be a JCR `Row` which will contain the
suggested terms,
+together with their score, as
suggested and scored by the used underlying index, in a special property named
`rep:suggest()`.
Clients wanting to obtain suggestions could use the following JCR code:
`@until Oak 1.3.10, 1.2.13` suggestions are returned flat.
-
+
QueryManager qm = ...;
String xpath = "/jcr:root[rep:suggest('in ')]/(rep:suggest())";
QueryResult result = qm.createQuery(xpath, Query.XPATH).execute();
RowIterator it = result.getRows();
String suggestions = "";
if (it.hasNext()) {
- suggestions = row.getValue("rep:suggest()").getString()
+ suggestions = row.getValue("rep:suggest()").getString()
}
-The `suggestions` String would be have the following pattern
+The `suggestions` String would be have the following pattern
`\[\{(term\=)[\w|\W]+(\,weight\=)\d+\}(\,\{(term\=)[\w|\W]+(\,weight\=)\d+\})*\]`,
e.g.:
- [{term=in 2015 a red fox is still a fox,weight=1.5}, {term=in 2015 my fox
is red,
+ [{term=in 2015 a red fox is still a fox,weight=1.5}, {term=in 2015 my fox
is red,
like mike's fox and john's fox,weight=0.7}]
-
-
+
+
`@since Oak 1.3.11, 1.2.14` each suggestion would be returned per row.
QueryManager qm = ...;
@@ -514,20 +515,20 @@ The `suggestions` String would be have t
RowIterator it = result.getRows();
List<String> suggestions = new LinkedList<String>();
while (it.hasNext()) {
- suggestions.add(row.getValue("rep:suggest()").getString());
+ suggestions.add(row.getValue("rep:suggest()").getString());
}
-
-If either Lucene or Solr were configured to provide the suggestions feature,
+
+If either Lucene or Solr were configured to provide the suggestions feature,
see [Enable suggestions in Lucene](lucene.html#Suggestions) and [Enable
suggestions in Solr](solr.html#Suggestions).
-Note that suggested terms come already filtered according to calling user
privileges,
+Note that suggested terms come already filtered according to calling user
privileges,
so that users could see suggested
terms only coming from indexed content they are allowed to read.
### Facets
-`@since Oak 1.3.14` Oak has support for
[facets](https://en.wikipedia.org/wiki/Faceted_search).
-Once enabled (see details for [Lucene](lucene.html#Facets) and/or
[Solr](solr.html#Suggestions) indexes)
+`@since Oak 1.3.14` Oak has support for
[facets](https://en.wikipedia.org/wiki/Faceted_search).
+Once enabled (see details for [Lucene](lucene.html#Facets) and/or
[Solr](solr.html#Suggestions) indexes)
facets can be retrieved on properties (backed by a proper
field in Lucene / Solr) using the following snippet:
@@ -543,35 +544,35 @@ field in Lucene / Solr) using the follow
int count = facet.getCount();
...
}
-
+
Nodes/Rows can still be retrieved from within the QueryResult object the usual
way.
### XPath to SQL-2 Transformation
-To support the XPath query language, such queries are internally converted to
SQL-2.
+To support the XPath query language, such queries are internally converted to
SQL-2.
-Every conversion is logged in `debug` level under the
+Every conversion is logged in `debug` level under the
`org.apache.jackrabbit.oak.query.QueryEngineImpl` logger:
- org.apache.jackrabbit.oak.query.QueryEngineImpl Parsing xpath statement:
+ org.apache.jackrabbit.oak.query.QueryEngineImpl Parsing xpath statement:
//element(*)[@sling:resourceType = 'slingevent:Lock')]
- org.apache.jackrabbit.oak.query.QueryEngineImpl XPath > SQL2:
- select [jcr:path], [jcr:score], * from [nt:base] as a
- where [sling:resourceType] = 'slingevent:Lock'
- /* xpath: //element(*)[@sling:resourceType = 'slingevent:Lock'
+ org.apache.jackrabbit.oak.query.QueryEngineImpl XPath > SQL2:
+ select [jcr:path], [jcr:score], * from [nt:base] as a
+ where [sling:resourceType] = 'slingevent:Lock'
+ /* xpath: //element(*)[@sling:resourceType = 'slingevent:Lock'
and @lock.created < xs:dateTime('2013-09-02T15:44:05.920+02:00')] */
_Each transformed SQL-2 query contains the original XPath query as a comment._
When converting from XPath to SQL-2, `or` conditions are automatically
converted to
-`union` queries, so that indexes can be used for conditions of the form
+`union` queries, so that indexes can be used for conditions of the form
`a = 'x' or b = 'y'`.
### The Node Type Index
-The `NodeTypeIndex` implements a `QueryIndex` using `PropertyIndexLookup`s on
+The `NodeTypeIndex` implements a `QueryIndex` using `PropertyIndexLookup`s on
`jcr:primaryType` `jcr:mixinTypes` to evaluate a node type restriction on the
filter.
-The cost for this index is the sum of the costs of the `PropertyIndexLookup`
+The cost for this index is the sum of the costs of the `PropertyIndexLookup`
for queries on `jcr:primaryType` and `jcr:mixinTypes`.
### Temporarily Disabling an Index
@@ -580,6 +581,13 @@ To temporarily disable an index (for exa
Please note that while the index type is not set, the index is not updated, so
if you enable it again,
it might not be correct. This is specially important for synchronous indexes.
+### Deprecating Indexes
+
+If an index is still needed for backward compatibility, but should not longer
be used
+for queries, then set the property `deprecated` in the index definition.
+This will log a warning "This index is deprecated" if the index is used for
any queries.
+This will allow to detect any usage of the index in the application.
+
### The Deprecated Ordered Index
This index has been deprecated in favour of Lucene Property
@@ -595,19 +603,19 @@ For historical information around the in
### Index Storage and Manual Inspection
Sometimes there is a need to inspect the index content for debugging (or pure
curiosity).
-The index content is generally stored as content under the index definition as
hidden nodes
+The index content is generally stored as content under the index definition as
hidden nodes
(this doesn't apply to the solr index).
-In order to be able to browse down into an index content you need a low level
repository tool
+In order to be able to browse down into an index content you need a low level
repository tool
that allows NodeStore level access.
-There are currently 2 options: the oak-console (command line tool,
+There are currently 2 options: the oak-console (command line tool,
works will all existing NodeStore implementations) and the oak-explorer
-(gui based on java swing, works only with the SegmentNodeStore),
-both available as run modes of the
+(gui based on java swing, works only with the SegmentNodeStore),
+both available as run modes of the
[oak-run](https://github.com/apache/jackrabbit-oak/blob/trunk/oak-run/README.md)
module
-The structure of the index is specific to each implementation and is subject
to change.
+The structure of the index is specific to each implementation and is subject
to change.
What is worth mentioning is that all the _*PropertyIndex_
-flavors store the content as unstructured nodes (clear readable text),
+flavors store the content as unstructured nodes (clear readable text),
the _Lucene_ index is stored as binaries, so one would need to export the
entire Lucene directory to the local file system and browse it using a
dedicated tool.
@@ -627,16 +635,15 @@ To disable it provide `-Doak.query.sql2o
The Oak implementation supports some features that are not part of the JCR
specification:
@since 1.5.12
-
+
Union for XPath and SQL-2 queries. Examples:
/jcr:root/(content|lib)/*
/jcr:root/content//*[@a] | /jcr:root/lib//*[@b]) order by @c
- select * from [nt:base] as a where issamenode(a, '/content')
+ select * from [nt:base] as a where issamenode(a, '/content')
union select * from [nt:base] as a where issamenode(a, '/lib')
XPath functions "fn:string-length" and "fn:local-name".
-
-
\ No newline at end of file
+