Author: alexparvulescu
Date: Mon Sep  2 13:35:27 2013
New Revision: 1519440

URL: http://svn.apache.org/r1519440
Log:
https://issues.apache.org/jira/browse/OAK-301
 - added some query docs

Added:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/query.md
Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/differences.md
    
jackrabbit/oak/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/differences.md
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/differences.md?rev=1519440&r1=1519439&r2=1519440&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/differences.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/differences.md Mon Sep  2 13:35:27 2013
@@ -71,7 +71,7 @@ Oak does not index content by default as
 necessary, much like in traditional RDBMSs. If there is no index for a specific query then the
 repository will be traversed. That is, the query will still work but probably be very slow.
 
-See TODO for how to create a custom index.
+See the [query overview page](/query/) for how to create a custom index.
 
 Observation
 -----------

Added: jackrabbit/oak/trunk/oak-doc/src/site/markdown/query.md
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/query.md?rev=1519440&view=auto
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/query.md (added)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/query.md Mon Sep  2 13:35:27 2013
@@ -0,0 +1,99 @@
+## Query
+
+Oak does not index content by default, as Jackrabbit 2 does. You need to create custom indexes when
+necessary, much like in traditional RDBMSs. If there is no index for a specific query then the
+repository will be traversed. That is, the query will still work but will probably be very slow.
+
+Query Indices are defined under the `oak:index` node.
+
+### Cost calculation
+
+Each query index is expected to estimate the worst-case cost of running a query with the given filter.
+The returned value is between 1 (very fast; lookup of a unique node) and the estimated number of entries to traverse, assuming the cursor is read fully and each node could in theory require one network round-trip or disk read operation. The method may return a lower number if the data is known to be fully in memory.
+
+The returned value is an estimate and does not have to be very accurate. Note that this method is called on each index whenever a query is run, so it should be reasonably fast (it should not read any data itself, or at least not much).
+
+If an index implementation cannot query the data, it has to return `Double.POSITIVE_INFINITY`.
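
The cost contract above can be illustrated with a minimal, self-contained sketch. It does not use Oak's `QueryIndex` interface: the class `CostSelectionSketch`, the method `selectIndex`, the `"traversal"` label and the string-based filter are all hypothetical names made up for this example. The rule of picking the index with the lowest estimate is also an assumption about how the estimates are consumed, not something this page states.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

public class CostSelectionSketch {

    /** Hypothetical label returned when no index can answer the query. */
    static final String TRAVERSAL = "traversal";

    /**
     * Asks every index for its cost estimate and returns the name of the
     * cheapest one. An index that cannot answer the filter reports
     * Double.POSITIVE_INFINITY and is therefore never selected.
     */
    static String selectIndex(Map<String, ToDoubleFunction<String>> indexes, String filter) {
        String best = TRAVERSAL;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, ToDoubleFunction<String>> entry : indexes.entrySet()) {
            // Called once per index per query, so the estimate must be cheap.
            double cost = entry.getValue().applyAsDouble(filter);
            if (cost < bestCost) {
                bestCost = cost;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, ToDoubleFunction<String>> indexes = new LinkedHashMap<>();
        // Unique lookup: cost 1 if the filter restricts jcr:uuid, otherwise unusable.
        indexes.put("uuid", f -> f.contains("jcr:uuid") ? 1.0 : Double.POSITIVE_INFINITY);
        // Rough (assumed) number of entries a node type index would traverse.
        indexes.put("nodetype", f -> 5000.0);

        System.out.println(selectIndex(indexes, "jcr:uuid = 'abc'")); // prints "uuid"
        System.out.println(selectIndex(indexes, "title = 'x'"));      // prints "nodetype"
    }
}
```

If every index returns `Double.POSITIVE_INFINITY`, the sketch falls back to the traversal label, mirroring the statement that a query without a usable index still works by traversing the repository.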
+
+### Property index
+
+To define a property index on a subtree you have to add an index definition node that:
+
+* must be of type `oak:queryIndexDefinition`
+* must have the `type` property set to __`property`__
+* contains the `propertyNames` property that indicates what properties will be stored in the index.
+
+    `propertyNames` can be a list of properties, and it is optional. If it is missing, the node name will be used as the property name reference value.
+
+_Optionally_ you can specify
+
+* a uniqueness constraint on a property index by setting the `unique` flag to `true`
+* that the property index only applies to a certain node type by setting the `declaringNodeTypes` property
+* the `reindex` flag, which, when set to `true`, triggers a full content re-index.
+
+Example:
+
+    {
+      NodeBuilder index = root.child("oak:index");
+      index.child("uuid")
+        .setProperty("jcr:primaryType", "oak:queryIndexDefinition", Type.NAME)
+        .setProperty("type", "property")
+        .setProperty("propertyNames", "jcr:uuid")
+        .setProperty("declaringNodeTypes", "mix:referenceable")
+        .setProperty("unique", true)
+        .setProperty("reindex", true);
+    }
+
+Or, to simplify, you can use one of the existing `IndexUtils#createIndexDefinition` helper methods:
+
+    {
+      NodeBuilder index = IndexUtils.getOrCreateOakIndex(root);
+      IndexUtils.createIndexDefinition(index, "myProp", true, false, ImmutableList.of("myProp"), null);
+    }
+
+
+### Node type index
+
+The `NodeTypeIndex` implements a `QueryIndex` using `PropertyIndexLookup`s on `jcr:primaryType` and `jcr:mixinTypes` to evaluate a node type restriction on the filter.
+The cost for this index is the sum of the costs of the `PropertyIndexLookup` for queries on `jcr:primaryType` and `jcr:mixinTypes`.
+
+
+### Lucene full-text index
+
+The full-text index is updated asynchronously by a background thread, see `Oak#withAsyncIndexing`.
+
+This means that some full-text searches will not see the latest changes for a small window of time: the 5-second interval of the background thread, plus the time it takes to run the diff and the text-extraction process. The async update status is now reflected on the `oak:index` node with the help of a few properties, see [OAK-980](https://issues.apache.org/jira/browse/OAK-980).
+
+TODO Node aggregation [OAK-828](https://issues.apache.org/jira/browse/OAK-828)
+
+The index definition node for a lucene-based full-text index:
+
+* must be of type `oak:queryIndexDefinition`
+* must have the `type` property set to __`lucene`__
+* must contain the `async` property set to the value `async`; this is what sends the index update process to a background thread
+
+_Optionally_ you can add
+
+ * the subset of property types to include in the index, via the `includePropertyTypes` property
+ * a blacklist of property names to exclude from the index, via the `excludePropertyNames` property
+ * the `reindex` flag, which, when set to `true`, triggers a full content re-index.
+
+Example:
+
+    {
+      NodeBuilder index = root.child("oak:index");
+      index.child("lucene")
+        .setProperty("jcr:primaryType", "oak:queryIndexDefinition", Type.NAME)
+        .setProperty("type", "lucene")
+        .setProperty("async", "async")
+        .setProperty(PropertyStates.createProperty("includePropertyTypes", ImmutableSet.of(
+            PropertyType.TYPENAME_STRING, PropertyType.TYPENAME_BINARY), Type.STRINGS))
+        .setProperty(PropertyStates.createProperty("excludePropertyNames", ImmutableSet.of(
+            "jcr:createdBy", "jcr:lastModifiedBy"), Type.STRINGS))
+        .setProperty("reindex", true);
+    }
+
+
+### Solr full-text index
+
+`TODO`

Modified: 
jackrabbit/oak/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java?rev=1519440&r1=1519439&r2=1519440&view=diff
==============================================================================
--- 
jackrabbit/oak/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java
 (original)
+++ 
jackrabbit/oak/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java
 Mon Sep  2 13:35:27 2013
@@ -100,16 +100,19 @@ import org.slf4j.LoggerFactory;
  * Under it follows the index definition node that:
  * <ul>
  * <li>must be of type <code>oak:queryIndexDefinition</code></li>
- * <li>must have the <code>type</code> property set to <b><code>lucene</code>
+ * <li>must have the <code>type</code> property set to 
<b><code>lucene</code></b></li>
+ * <li>must have the <code>async</code> property set to 
<b><code>async</code></b></li>
  * </b></li>
  * </ul>
  * </p>
- * 
  * <p>
- * Note: <code>reindex<code> is a property that when set to <code>true</code>,
- * triggers a full content reindex.
+ * Optionally you can add
+ * <ul>
+ * <li>what subset of property types to be included in the index via the <code>includePropertyTypes</code> property</li>
+ * <li>a blacklist of property names: what property to be excluded from the index via the <code>excludePropertyNames</code> property</li>
+ * <li>the <code>reindex</code> flag which when set to <code>true</code>, triggers a full content re-index.</li>
+ * </ul>
  * </p>
- * 
  * <pre>
  * <code>
  * {
@@ -117,6 +120,7 @@ import org.slf4j.LoggerFactory;
  *     index.child("lucene")
  *         .setProperty("jcr:primaryType", "oak:queryIndexDefinition", 
Type.NAME)
  *         .setProperty("type", "lucene")
+ *         .setProperty("async", "async")
  *         .setProperty("reindex", "true");
  * }
  * </code>

