Re: CMS diff: Jena Full Text Search

2019-01-28 Thread vincent ventresque

Hi Ajs6f

Thanks for including me in the conversation, but I have to confess I've 
never looked at java classes (I only use command line tools).


On 28/01/2019 at 21:05, ajs6f wrote:

>> On Jan 28, 2019, at 2:57 PM, Chris Tomlinson wrote:
>>
>> Hi Adam,
>>
>> I haven’t seen that error. What I’ve done in the past is to replace the
>> jena-text doc file with the new contents in Eclipse in an SVN checkout of the
>> jena-doc-site and then committed.
>
> I can definitely do that (and will when we're happy with the patch), but see
> below.
>
>> Out of curiosity when is it necessary to use the
>>
>> [] ja:loadClass "org.apache.jena.tdb.TDB" .
>>
>> and
>>
>> [] ja:loadClass   "org.apache.jena.query.text.TextQuery" .
>>
>> ? I do not use them in the config when running fuseki war in tomcat.
>
> I have no idea whatsoever! :grin: I wouldn't have thought them needed either.
>
> Vincent-- any comment?
>
> ajs6f
>
>> Regards,
>> Chris
>>
>>> On Jan 28, 2019, at 11:11 AM, ajs6f wrote:
>>>
>>> Recently Vincent offered a nice patch to our text indexing documentation, as
>>> shown below. Oddly, when I now go to merge it (a bit late, sorry!), I get an
>>> error: "Can't locate anonymous's tree to clone". Is anyone familiar with
>>> that? I know very little about the SVN-based CMS, so I'm not even sure where
>>> to start looking...
>>>
>>> ajs6f


Re: CMS diff: Jena Full Text Search

2019-01-28 Thread ajs6f


> On Jan 28, 2019, at 2:57 PM, Chris Tomlinson wrote:
> 
> Hi Adam,
> 
> I haven’t seen that error. What I’ve done in the past is to replace the 
> jena-text doc file with the new contents in Eclipse in an SVN checkout of the 
> jena-doc-site and then committed.

I can definitely do that (and will when we're happy with the patch), but see 
below.

> Out of curiosity when is it necessary to use the
> 
> [] ja:loadClass "org.apache.jena.tdb.TDB" .
> 
> and
> 
> [] ja:loadClass   "org.apache.jena.query.text.TextQuery" .
> 
> ? I do not use them in the config when running fuseki war in tomcat.

I have no idea whatsoever! :grin: I wouldn't have thought them needed either.

Vincent-- any comment?

ajs6f


> Regards,
> Chris
> 
> 
> 
>> On Jan 28, 2019, at 11:11 AM, ajs6f wrote:
>> 
>> Recently Vincent offered a nice patch to our text indexing documentation, as 
>> shown below. Oddly, when I now go to merge it (a bit late, sorry!), I get an 
>> error: "Can't locate anonymous's tree to clone". Is anyone familiar with 
>> that? I know very little about the SVN-based CMS, so I'm not even sure where 
>> to start looking...
>> 
>> ajs6f
> 



Re: CMS diff: Jena Full Text Search

2019-01-28 Thread Chris Tomlinson
Hi Adam,

I haven’t seen that error. What I’ve done in the past is to replace the 
jena-text doc file with the new contents in Eclipse in an SVN checkout of the 
jena-doc-site and then committed.

Out of curiosity when is it necessary to use the

[] ja:loadClass "org.apache.jena.tdb.TDB" .

and

[] ja:loadClass   "org.apache.jena.query.text.TextQuery" .

? I do not use them in the config when running fuseki war in tomcat.

Regards,
Chris



> On Jan 28, 2019, at 11:11 AM, ajs6f wrote:
> 
> Recently Vincent offered a nice patch to our text indexing documentation, as 
> shown below. Oddly, when I now go to merge it (a bit late, sorry!), I get an 
> error: "Can't locate anonymous's tree to clone". Is anyone familiar with 
> that? I know very little about the SVN-based CMS, so I'm not even sure where 
> to start looking...
> 
> ajs6f



Re: CMS diff: Jena Full Text Search

2019-01-28 Thread ajs6f
Recently Vincent offered a nice patch to our text indexing documentation, as 
shown below. Oddly, when I now go to merge it (a bit late, sorry!), I get an 
error: "Can't locate anonymous's tree to clone". Is anyone familiar with that? 
I know very little about the SVN-based CMS, so I'm not even sure where to start 
looking...

ajs6f

> On Jan 23, 2019, at 12:01 PM, vincent.ventres...@ens-lyon.fr wrote:
> 
> Clone URL (Committers only):
> https://cms.apache.org/redirect?new=anonymous;action=diff;uri=http://jena.apache.org/documentation%2Fquery%2Ftext-query.mdtext
> 
> vincent.ventres...@ens-lyon.fr
> 
> Index: trunk/content/documentation/query/text-query.mdtext
> ===
> --- trunk/content/documentation/query/text-query.mdtext   (revision 1851871)
> +++ trunk/content/documentation/query/text-query.mdtext   (working copy)
> @@ -609,21 +609,47 @@
> index field. More complex setups, with multiple properties per entity
> (URI) are possible.
> 
> +The assembler file can be either default configuration file (.../run/config.ttl)
> +or a custom file in ...run/configuration folder. Note that you can use several files
> +simultaneously.
> +
> +You have to edit the file (see comments in the assembler code below):
> +
> +1. provide values for paths and a fixed URI for tdb:DatasetTDB
> +2. modify the entity map : add the fields you want to index and desired options (filters, tokenizers...)
> +
> +If your assembler file is run/config.ttl, you can index the dataset with this command :
> +
> +java -cp ./fuseki-server.jar jena.textindexer --desc=run/config.ttl
> +
> Once configured, any data added to the text dataset is automatically
> -indexed as well.
> +indexed as well : https://jena.apache.org/documentation/query/text-query.html#building-a-text-index
> 
> +When you change the jena-text in significant ways, such as changing what analyzer
> +is used for a given property and so on, then you’ll need to rebuild the Lucene index
> +via reloading the dataset or using the textIndexer.
> +
> ### Text Dataset Assembler
> 
> The following is an example of a TDB dataset with a text index.
> 
> + Example of a TDB dataset and text index#
> +# The main doc sources are:
> +#  - https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html
> +#  - https://jena.apache.org/documentation/assembler/assembler-howto.html
> +#  - https://jena.apache.org/documentation/assembler/assembler.ttl
> +# See https://jena.apache.org/documentation/fuseki2/fuseki-layout.html for the destination of this file.
> +#
> +
> @prefix : .
> @prefix rdf:  .
> @prefix rdfs: .
> @prefix tdb:  .
> @prefix ja:   .
> @prefix text: .
> +@prefix skos: 
> +@prefix fuseki:   .
> 
> -## Example of a TDB dataset and text index
> ## Initialize TDB
> [] ja:loadClass "org.apache.jena.tdb.TDB" .
> tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
> @@ -631,39 +657,64 @@
> 
> ## Initialize text query
> [] ja:loadClass   "org.apache.jena.query.text.TextQuery" .
> +
> # A TextDataset is a regular dataset with a text index.
> text:TextDataset  rdfs:subClassOf   ja:RDFDataset .
> +
> # Lucene index
> text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .
> -# Elasticsearch index
> -text:TextIndexESrdfs:subClassOf   text:TextIndex .
> 
> +
> ## ---
> -## This URI must be fixed - it's used to assemble the text dataset.
> 
> :text_dataset rdf:type text:TextDataset ;
> -text:dataset   <#dataset> ;
> +text:dataset   :my_dataset ; # <-- replace `:my_dataset` with the desired URI
> text:index <#indexLucene> ;
> -.
> +.
> 
> # A TDB dataset used for RDF storage
> -<#dataset> rdf:type  tdb:DatasetTDB ;
> -tdb:location "DB" ;
> -tdb:unionDefaultGraph true ; # Optional
> -.
> 
> -# Text index description
> +:my_dataset rdf:type  tdb:DatasetTDB ;   # <-- replace `:my_dataset` with the desired URI
> +tdb:location "/tmp/tdb-dataset/" ;   # <-- replace `/tmp/tdb-dataset/` with your path (`.../fuseki/run/databases/MY_DATASET`)
> +#tdb:unionDefaultGraph true ; # Optional
> +.
> +
> +# Text index description (see documentation for other options)
> +
> <#indexLucene> a