pre-extract-text.md

chetanm Wed, 21 Jun 2017 23:56:27 -0700

Author: chetanm
Date: Thu Jun 22 06:55:40 2017
New Revision: 1799543

URL: http://svn.apache.org/viewvc?rev=1799543&view=rev
Log:
OAK-6370 - Improve documentation for text pre-extraction


Note on classpath order. Otherwise it causes issue with commons-csv version

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/pre-extract-text.md

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/pre-extract-text.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/pre-extract-text.md?rev=1799543&r1=1799542&r2=1799543&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/pre-extract-text.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/pre-extract-text.md Thu Jun 22 06:55:40 2017
@@ -73,7 +73,7 @@ with Oak 1.7.2 jar.
 
 To perform the text extraction use the `--extract` action
 
-        java -cp tika-app-1.15.jar:oak-run.jar \
+        java -cp oak-run.jar:tika-app-1.15.jar \
         org.apache.jackrabbit.oak.run.Main tika \
         --data-file binary-stats.csv \
         --store-path ./store  \
@@ -98,7 +98,8 @@ extraction. One can also split the csv i
 stores later. Just ensure that at merge time blobs*.txt files are also merged
 
 Note that we need to launch the command with `-cp` instead of `-jar` as we need to include classes outside of oak-run jar 
-like tika-app
+like tika-app. Also ensure that oak-run comes before in classpath. This is required due to some old classes being packaged 
+in tika-app 
 
 ## B - PreExtractedTextProvider

