Author: chetanm
Date: Tue Jul 25 08:29:08 2017
New Revision: 1802899

URL: http://svn.apache.org/viewvc?rev=1802899&view=rev
Log:
OAK-6471 - Support adding or updating index definitions via oak-run

Update docs

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md

Modified: 
jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md?rev=1802899&r1=1802898&r2=1802899&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md 
(original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md 
Tue Jul 25 08:29:08 2017
@@ -34,6 +34,8 @@
         * [B - Online indexing](#online-indexing)
             * [Step 1 - Text PreExtraction](#online-indexing-pre-extract)
             * [Step 2 - Perform reindexing](#online-indexing-perform-reindex)
+        * [Updating or Adding New Index Definitions](#index-definition-updates)
+        * [JSON File Format](#json-file-format)
         * [Tika Setup](#tika-setup)
 
 `@since Oak 1.7.0`
@@ -150,6 +152,7 @@ Here following options can be used
 * `--index-paths` - This command requires an explicit set of index paths which 
need to be indexed (required)
 * `--checkpoint` - The checkpoint up to which the index is updated, when 
indexing in read only mode. For
   testing purpose, it can be set to 'head' to indicate that the head state 
should be used. (required)
+* `--index-definitions-file` - path of a JSON file which contains updated index definitions
   
 If the index does not support fulltext indexing then you can omit providing 
BlobStore details
   
@@ -199,7 +202,116 @@ In this step we configure oak-run to con
 checkpoint creation, indexing and import
 
     java -jar oak-run*.jar index --reindex --index-paths=/oak:index/lucene 
--read-write --fds-path=/path/to/datastore /path/to/segmentstore
+
+### <a name="index-definition-updates"></a> Updating or Adding New Index 
Definitions
+
+`@since Oak 1.7.5`
+
+Index tooling supports updating existing index definitions and adding new ones in existing setups. This is done by
+passing the path of a JSON file which contains the index definitions
+
+    java -jar oak-run*.jar index --reindex --index-paths=/oak:index/newAssetIndex \
+    --index-definitions-file=index-definitions.json \
+    --fds-path=/path/to/datastore /path/to/segmentstore
+   
+Where `index-definitions.json` has the following structure
+
+    {
+      "/oak:index/newAssetIndex": {
+        "evaluatePathRestrictions": true,
+        "compatVersion": 2,
+        "type": "lucene",
+        "async": "async",
+        "jcr:primaryType": "oak:QueryIndexDefinition",
+        "indexRules": {
+          "jcr:primaryType": "nt:unstructured",
+          "dam:Asset": {
+            "jcr:primaryType": "nt:unstructured",
+            "properties": {
+              "jcr:primaryType": "nt:unstructured",
+              "valid": {
+                "name": "valid",
+                "propertyIndex": true,
+                "jcr:primaryType": "nt:unstructured",
+                "notNullCheckEnabled": true
+              },
+              "mimetype": {
+                "name": "mimetype",
+                "analyzed": true,
+                "jcr:primaryType": "nt:unstructured"
+              }
+            }
+          }
+        }
+      }
+    }
     
+Some points to note about this JSON file
+* Each key of the top-level object is an index path
+* The value of each such key is the complete index definition
+* If the index path is not present in the existing repository then a new index is created
+* For a new index, the parent path structure must already exist in the repository.
+  So if a new index is being created at `/content/en/oak:index/contentIndex` then the path up to `/content/en/oak:index`
+  should already exist in the repository
+
+You can also use the JSON file generated by [Oakutils](http://oakutils.appspot.com/generate/index). It needs to be
+modified to conform to the above structure, i.e. enclose the whole definition under the intended index path key.
+
+In general, index definitions do not need any special encoding of values, as index definitions in Oak mostly use
+only String, Long and Double types. However, if the index refers to binary config such as a Tika config, then
+the binary data needs to be encoded. Refer to the next section for more details.
+    
+This option is supported in both online and out-of-band indexing.
+
+For more details refer to [OAK-6471][OAK-6471]
+    
+### <a name="json-file-format"></a> JSON File Format
+
+Some of the standard types used in Oak, such as names and blobs, are not directly supported by JSON. Values of those
+types need to be encoded in a specific format.
+
+Below are the encoding rules
+
+LONG
+: No encoding required
+: _"compatVersion": 2_
+
+BOOLEAN
+: No encoding required
+: _"propertyIndex": true_
+
+DOUBLE
+: No encoding required
+: _"weight": 1.5_ 
+
+STRING
+: Prefix the value with `str:`
+: Generally the value need not be encoded. Encoding is only required if the string starts with three letters followed by a colon
+: _"pathPropertyName": "str:jcr:path"_  
+
+DATE
+: Prefix the value with `dat:`. The value is an ISO 8601 formatted date string
+: _"created": "dat:2017-07-20T13:23:21.196+05:30"_  
+
+NAME
+: Prefix the value with `nam:`.
+: For `jcr:primaryType` and `jcr:mixins` no encoding is required. Any property 
with these names would be converted to
+  NAME type
+: _"nodetype": "nam:nt:base"_ 
+
+PATH
+: Prefix the value with `pat:`
+: _"imagePath": "pat:/content/assets/book.jpg"_  
+
+URI
+: Prefix the value with `uri:`
+: _"serverURI": "uri:http://foo"_  
+
+BINARY
+: By default, binary values smaller than 1 MB are encoded as a Base64 string. The encoded value is
+  prefixed with `:blobId:`
+: _"jcr:data": ":blobId:axygz="_  
+
 
 ### <a name="tika-setup"></a> Tika Setup
 
@@ -215,5 +327,4 @@ Then modify the index command like below
     java -cp oak-run.jar:tika-app-1.15.jar org.apache.jackrabbit.oak.run.Main 
index
     
 
-
-
+[OAK-6471]: https://issues.apache.org/jira/browse/OAK-6471
\ No newline at end of file
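
As an illustration of the encoding rules and the file structure described in the diff above, here is a minimal Python sketch that builds an `index-definitions.json` payload. The helper names `encode_string`, `encode_typed` and `wrap_definition` are hypothetical and not part of oak-run; the prefixes and the "three letters followed by a colon" rule are taken from the documentation text, and how oak-run actually parses the file is not verified here.

```python
import json
import re

# Type prefixes as listed in the "JSON File Format" section of the diff.
TYPE_PREFIXES = {"date": "dat:", "name": "nam:", "path": "pat:", "uri": "uri:"}

# Per the docs, a plain string only needs the "str:" prefix when it
# starts with three letters followed by a colon (e.g. "jcr:path").
LOOKS_LIKE_PREFIX = re.compile(r"^[A-Za-z]{3}:")


def encode_string(value):
    """Return a STRING value, adding "str:" only when it is required."""
    if LOOKS_LIKE_PREFIX.match(value):
        return "str:" + value
    return value


def encode_typed(kind, value):
    """Prefix a DATE, NAME, PATH or URI value with its documented marker."""
    return TYPE_PREFIXES[kind] + value


def wrap_definition(index_path, definition):
    """Enclose a bare index definition under its index path key."""
    return {index_path: definition}


if __name__ == "__main__":
    # Values of jcr:primaryType are left unencoded, as the docs state
    # that this property is converted to the NAME type automatically.
    definition = {
        "jcr:primaryType": "oak:QueryIndexDefinition",
        "type": encode_string("lucene"),
        "compatVersion": 2,
        "pathPropertyName": encode_string("jcr:path"),
        "nodetype": encode_typed("name", "nt:base"),
    }
    payload = wrap_definition("/oak:index/newAssetIndex", definition)
    print(json.dumps(payload, indent=2))
```

The output of such a script could be saved as the file passed via `--index-definitions-file`. LONG, BOOLEAN and DOUBLE values are written as plain JSON values, so they need no helper.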

