Author: buildbot
Date: Fri Feb 22 21:31:32 2019
New Revision: 1040794
Log:
Staging update by buildbot for jena
Modified:
websites/staging/jena/trunk/content/ (props changed)
websites/staging/jena/trunk/content/documentation/query/text-query.html
Propchange: websites/staging/jena/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Feb 22 21:31:32 2019
@@ -1 +1 @@
-1854173
+1854174
Modified:
websites/staging/jena/trunk/content/documentation/query/text-query.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/query/text-query.html
(original)
+++ websites/staging/jena/trunk/content/documentation/query/text-query.html Fri
Feb 22 21:31:32 2019
@@ -856,54 +856,93 @@ itself.</p>
<p>For simple RDF use, there will be one field, mapping a property to a text
index field. More complex setups, with multiple properties per entity
(URI) are possible.</p>
+<p>The assembler file can be either the default configuration file
(.../run/config.ttl)
+or a custom file in the ...run/configuration folder. Note that you can use several
files
+simultaneously.</p>
+<p>You have to edit the file (see comments in the assembler code below):</p>
+<ol>
+<li>provide values for paths and a fixed URI for tdb:DatasetTDB</li>
+<li>modify the entity map: add the fields you want to index and the desired
options (filters, tokenizers...)</li>
+</ol>
+<p>If your assembler file is run/config.ttl, you can index the dataset with
this command:</p>
+<div class="codehilite"><pre><span class="n">java</span> <span
class="o">-</span><span class="n">cp</span> <span class="o">./</span><span
class="n">fuseki</span><span class="o">-</span><span
class="n">server</span><span class="p">.</span><span class="n">jar</span> <span
class="n">jena</span><span class="p">.</span><span class="n">textindexer</span>
<span class="o">--</span><span class="n">desc</span><span
class="p">=</span><span class="n">run</span><span class="o">/</span><span
class="n">config</span><span class="p">.</span><span class="n">ttl</span>
+</pre></div>
+
+
<p>Once configured, any data added to the text dataset is automatically
-indexed as well.</p>
+indexed as well (see <a href="#building-a-text-index">Building a Text
Index</a>).</p>
<h3 id="text-dataset-assembler">Text Dataset Assembler<a class="headerlink"
href="#text-dataset-assembler" title="Permanent link">¶</a></h3>
-<p>The following is an example of a TDB dataset with a text index.</p>
-<div class="codehilite"><pre><span class="n">PREFIX</span> <span
class="p">:</span> <span class="o"><</span><span
class="n">http</span><span class="p">:</span><span class="o">//</span><span
class="n">localhost</span><span class="o">/</span><span
class="n">jena_example</span><span class="o">/</span>#<span
class="o">></span>
-<span class="n">PREFIX</span> <span class="n">rdf</span><span
class="p">:</span> <span class="o"><</span><span
class="n">http</span><span class="p">:</span><span class="o">//</span><span
class="n">www</span><span class="p">.</span><span class="n">w3</span><span
class="p">.</span><span class="n">org</span><span class="o">/</span>1999<span
class="o">/</span>02<span class="o">/</span>22<span class="o">-</span><span
class="n">rdf</span><span class="o">-</span><span class="n">syntax</span><span
class="o">-</span><span class="n">ns</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">rdfs</span><span
class="p">:</span> <span class="o"><</span><span
class="n">http</span><span class="p">:</span><span class="o">//</span><span
class="n">www</span><span class="p">.</span><span class="n">w3</span><span
class="p">.</span><span class="n">org</span><span class="o">/</span>2000<span
class="o">/</span>01<span class="o">/</span><span class="n">rdf</span><span
class="o">-</span><span class="n">schema</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">tdb</span><span
class="p">:</span> <span class="o"><</span><span
class="n">http</span><span class="p">:</span><span class="o">//</span><span
class="n">jena</span><span class="p">.</span><span class="n">hpl</span><span
class="p">.</span><span class="n">hp</span><span class="p">.</span><span
class="n">com</span><span class="o">/</span>2008<span class="o">/</span><span
class="n">tdb</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">ja</span><span
class="p">:</span> <span class="o"><</span><span
class="n">http</span><span class="p">:</span><span class="o">//</span><span
class="n">jena</span><span class="p">.</span><span class="n">hpl</span><span
class="p">.</span><span class="n">hp</span><span class="p">.</span><span
class="n">com</span><span class="o">/</span>2005<span class="o">/</span>11<span
class="o">/</span><span class="n">Assembler</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">text</span><span
class="p">:</span> <span class="o"><</span><span
class="n">http</span><span class="p">:</span><span class="o">//</span><span
class="n">jena</span><span class="p">.</span><span class="n">apache</span><span
class="p">.</span><span class="n">org</span><span class="o">/</span><span
class="n">text</span>#<span class="o">></span>
-
-## <span class="n">Example</span> <span class="n">of</span> <span
class="n">a</span> <span class="n">TDB</span> <span class="n">dataset</span>
<span class="n">and</span> <span class="n">text</span> <span
class="n">index</span>
-
-# <span class="n">A</span> <span class="n">TextDataset</span> <span
class="n">is</span> <span class="n">a</span> <span class="n">regular</span>
<span class="n">dataset</span> <span class="n">with</span> <span
class="n">a</span> <span class="n">text</span> <span
class="n">index</span><span class="p">.</span>
-<span class="n">text</span><span class="p">:</span><span
class="n">TextDataset</span> <span class="n">rdfs</span><span
class="p">:</span><span class="n">subClassOf</span> <span
class="n">ja</span><span class="p">:</span><span class="n">RDFDataset</span>
<span class="p">.</span>
-# <span class="n">Lucene</span> <span class="n">index</span>
-<span class="n">text</span><span class="p">:</span><span
class="n">TextIndexLucene</span> <span class="n">rdfs</span><span
class="p">:</span><span class="n">subClassOf</span> <span
class="n">text</span><span class="p">:</span><span class="n">TextIndex</span>
<span class="p">.</span>
-# <span class="n">Elasticsearch</span> <span class="n">index</span>
-<span class="n">text</span><span class="p">:</span><span
class="n">TextIndexES</span> <span class="n">rdfs</span><span
class="p">:</span><span class="n">subClassOf</span> <span
class="n">text</span><span class="p">:</span><span class="n">TextIndex</span>
<span class="p">.</span>
-
-## <span
class="o">---------------------------------------------------------------</span>
-## <span class="n">This</span> <span class="n">URI</span> <span
class="n">must</span> <span class="n">be</span> <span class="n">fixed</span>
<span class="o">-</span> <span class="n">it</span><span
class="o">'</span><span class="n">s</span> <span class="n">used</span>
<span class="n">to</span> <span class="n">assemble</span> <span
class="n">the</span> <span class="n">text</span> <span
class="n">dataset</span><span class="p">.</span>
-
-<span class="p">:</span><span class="n">text_dataset</span> <span
class="n">rdf</span><span class="p">:</span><span class="n">type</span>
<span class="n">text</span><span class="p">:</span><span
class="n">TextDataset</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">dataset</span> <span class="o"><</span>#<span
class="n">dataset</span><span class="o">></span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">index</span> <span class="o"><</span>#<span
class="n">indexLucene</span><span class="o">></span> <span class="p">;</span>
- <span class="p">.</span>
-
-# <span class="n">A</span> <span class="n">TDB</span> <span
class="n">dataset</span> <span class="n">used</span> <span class="k">for</span>
<span class="n">RDF</span> <span class="n">storage</span>
-<span class="o"><</span>#<span class="n">dataset</span><span
class="o">></span> <span class="n">rdf</span><span class="p">:</span><span
class="n">type</span> <span class="n">tdb</span><span
class="p">:</span><span class="n">DatasetTDB</span> <span class="p">;</span>
- <span class="n">tdb</span><span class="p">:</span><span
class="n">location</span> "<span class="n">DB</span>" <span
class="p">;</span>
- <span class="n">tdb</span><span class="p">:</span><span
class="n">unionDefaultGraph</span> <span class="n">true</span> <span
class="p">;</span> # <span class="n">Optional</span>
- <span class="p">.</span>
-
-# <span class="n">Text</span> <span class="n">index</span> <span
class="n">description</span>
-<span class="o"><</span>#<span class="n">indexLucene</span><span
class="o">></span> <span class="n">a</span> <span class="n">text</span><span
class="p">:</span><span class="n">TextIndexLucene</span> <span
class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">directory</span> <span class="o"><</span><span
class="n">file</span><span class="p">:</span><span class="o">/</span><span
class="n">some</span><span class="o">/</span><span class="n">path</span><span
class="o">/</span><span class="n">lucene</span><span class="o">-</span><span
class="n">index</span><span class="o">></span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">entityMap</span> <span class="o"><</span>#<span
class="n">entMap</span><span class="o">></span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">storeValues</span> <span class="n">true</span> <span
class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">analyzer</span> <span class="p">[</span> <span class="n">a</span>
<span class="n">text</span><span class="p">:</span><span
class="n">StandardAnalyzer</span> <span class="p">]</span> <span
class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">queryAnalyzer</span> <span class="p">[</span> <span
class="n">a</span> <span class="n">text</span><span class="p">:</span><span
class="n">KeywordAnalyzer</span> <span class="p">]</span> <span
class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">queryParser</span> <span class="n">text</span><span
class="p">:</span><span class="n">AnalyzingQueryParser</span> <span
class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">defineAnalyzers</span> <span class="p">[</span> <span
class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span
class="p">]</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span
class="n">multilingualSupport</span> <span class="n">true</span> <span
class="p">;</span>
- <span class="p">.</span>
+<p>The following is an example of an assembler file defining a TDB dataset
with a Lucene text index.</p>
+<div class="codehilite"><pre><span class="c1">######## Example of a TDB
dataset and text index #########################</span>
+<span class="c1"># The main doc sources are:</span>
+<span class="c1"># -
https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html</span>
+<span class="c1"># -
https://jena.apache.org/documentation/assembler/assembler-howto.html</span>
+<span class="c1"># -
https://jena.apache.org/documentation/assembler/assembler.ttl</span>
+<span class="c1"># See
https://jena.apache.org/documentation/fuseki2/fuseki-layout.html for the
destination of this file.</span>
+<span
class="c1">#########################################################################</span>
+
+<span class="p">@</span>prefix : <span class="o"><</span>http:<span
class="o">//</span>localhost<span class="o">/</span>jena_example<span
class="o">/</span><span class="c1">#> .</span>
+<span class="p">@</span>prefix rdf: <span class="o"><</span>http:<span
class="o">//</span>www.w3.org<span class="o">/</span><span
class="m">1999</span><span class="o">/</span><span class="m">02</span><span
class="o">/</span><span class="m">22</span><span class="o">-</span>rdf<span
class="o">-</span>syntax<span class="o">-</span>ns<span class="c1">#>
.</span>
+<span class="p">@</span>prefix rdfs: <span class="o"><</span>http:<span
class="o">//</span>www.w3.org<span class="o">/</span><span
class="m">2000</span><span class="o">/</span><span class="m">01</span><span
class="o">/</span>rdf<span class="o">-</span>schema<span class="c1">#>
.</span>
+<span class="p">@</span>prefix tdb: <span class="o"><</span>http:<span
class="o">//</span>jena.hpl.hp.com<span class="o">/</span><span
class="m">2008</span><span class="o">/</span>tdb<span class="c1">#> .</span>
+<span class="p">@</span>prefix text: <span class="o"><</span>http:<span
class="o">//</span>jena.apache.org<span class="o">/</span>text<span
class="c1">#> .</span>
+<span class="p">@</span>prefix skos: <span class="o"><</span>http:<span
class="o">//</span>www.w3.org<span class="o">/</span><span
class="m">2004</span><span class="o">/</span><span class="m">02</span><span
class="o">/</span>skos<span class="o">/</span>core<span class="c1">#> .</span>
+<span class="p">@</span>prefix fuseki: <span class="o"><</span>http:<span
class="o">//</span>jena.apache.org<span class="o">/</span>fuseki<span
class="c1">#> .</span>
+
+<span class="p">[]</span> rdf:type fuseki:Server <span class="p">;</span>
+ fuseki:services <span class="p">(</span>
+ :myservice
+ <span class="p">)</span> <span class="m">.</span>
+
+:myservice rdf:type fuseki:Service <span class="p">;</span>
+ fuseki:name <span class="s">"myds"</span>
<span class="p">;</span> <span class="c1"># e.g. `s-query
--service=http://localhost:3030/myds "select * ..."`</span>
+ fuseki:serviceQuery <span class="s">"query"</span>
<span class="p">;</span> <span class="c1"># SPARQL query service</span>
+ fuseki:serviceUpdate <span
class="s">"update"</span> <span class="p">;</span> <span
class="c1"># SPARQL update service</span>
+ fuseki:serviceUpload <span
class="s">"upload"</span> <span class="p">;</span> <span
class="c1"># Non-SPARQL upload service</span>
+ fuseki:serviceReadWriteGraphStore <span class="s">"data"</span>
<span class="p">;</span> <span class="c1"># SPARQL Graph store protocol
(read and write)</span>
+ fuseki:dataset :text_dataset <span class="p">;</span>
+ <span class="m">.</span>
+
+<span class="c1">##
---------------------------------------------------------------</span>
+
+<span class="c1"># A TextDataset is a regular dataset with a text index.</span>
+:text_dataset rdf:type text:TextDataset <span class="p">;</span>
+ text:dataset :mydataset <span class="p">;</span> <span class="c1">#
<-- replace `:mydataset` with the desired URI</span>
+ text:index <span class="o"><</span><span
class="c1">#indexLucene> ;</span>
+<span class="m">.</span>
+
+<span class="c1"># A TDB dataset used for RDF storage</span>
+:mydataset rdf:type tdb:DatasetTDB <span class="p">;</span> <span
class="c1"># <-- replace `:mydataset` with the desired URI - as above</span>
+ tdb:location <span class="s">"DB"</span> <span class="p">;</span>
+ tdb:unionDefaultGraph true <span class="p">;</span> <span class="c1">#
Optional</span>
+<span class="m">.</span>
+
+<span class="c1"># Text index description</span>
+<span class="o"><</span><span class="c1">#indexLucene> a
text:TextIndexLucene ;</span>
+ text:directory <span class="o"><</span>file:path<span
class="o">></span> <span class="p">;</span> <span class="c1"># <--
replace `<file:path>` with your path (e.g.,
`<file:/.../fuseki/run/databases/MY_INDEX>`)</span>
+ text:entityMap <span class="o"><</span><span class="c1">#entMap>
;</span>
+ text:storeValues true <span class="p">;</span>
+ text:analyzer <span class="p">[</span> a text:StandardAnalyzer <span
class="p">]</span> <span class="p">;</span>
+ text:queryAnalyzer <span class="p">[</span> a text:KeywordAnalyzer <span
class="p">]</span> <span class="p">;</span>
+ text:queryParser text:AnalyzingQueryParser <span class="p">;</span>
+ text:defineAnalyzers <span class="p">[</span> <span class="m">.</span>
<span class="m">.</span> <span class="m">.</span> <span class="p">]</span>
<span class="p">;</span>
+ text:multilingualSupport true <span class="p">;</span> <span class="c1">#
optional</span>
+<span class="m">.</span>
+<span class="c1"># Entity map (see documentation for other options)</span>
+<span class="o"><</span><span class="c1">#entMap> a text:EntityMap
;</span>
+ text:defaultField <span class="s">"label"</span> <span
class="p">;</span>
+ text:entityField <span class="s">"uri"</span> <span
class="p">;</span>
+ text:uidField <span class="s">"uid"</span> <span
class="p">;</span>
+ text:langField <span class="s">"lang"</span> <span
class="p">;</span>
+ text:graphField <span class="s">"graph"</span> <span
class="p">;</span>
+ text:map <span class="p">(</span>
+ <span class="p">[</span> text:field <span
class="s">"label"</span> <span class="p">;</span>
+ text:predicate skos:prefLabel <span class="p">]</span>
+ <span class="p">)</span> <span class="m">.</span>
</pre></div>
+<p>See below for <a href="#entity-map-definition">more on defining an entity
map</a>.</p>
<p>The <code>text:TextDataset</code> has two properties:</p>
<ul>
<li>