# Fuseki configuration for BDRC, configures two endpoints:
# - /bdrc is read-only
# - /bdrcrw is read-write
#
# This was painful to come up with, but the web interface basically
# allows no options, and there is no subclass inference by default,
# so such a configuration file is necessary.
#
# The main doc sources are:
# - https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html
# - https://jena.apache.org/documentation/assembler/assembler-howto.html
# - https://jena.apache.org/documentation/assembler/assembler.ttl
#
# See https://jena.apache.org/documentation/fuseki2/fuseki-layout.html
# for the destination of this file.
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix tdb2: <http://jena.apache.org/2016/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix : <http://base/#> .
@prefix text: <http://jena.apache.org/text#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix adm: <http://purl.bdrc.io/ontology/admin/> .
@prefix bdd: <http://purl.bdrc.io/data/> .
@prefix bdo: <http://purl.bdrc.io/ontology/core/> .
@prefix bdr: <http://purl.bdrc.io/resource/> .
@prefix f: <java:io.bdrc.ldspdi.sparql.functions.> .
# [] ja:loadClass "org.seaborne.tdb2.TDB2" .
# tdb2:DatasetTDB2 rdfs:subClassOf ja:RDFDataset .
# tdb2:GraphTDB2 rdfs:subClassOf ja:Model .
[] rdf:type fuseki:Server ;
fuseki:services (
:bdrcrw
) .
:bdrcrw rdf:type fuseki:Service ;
fuseki:name "bdrcrw" ; # name of the dataset in the url
fuseki:serviceQuery "query" ; # SPARQL query service
fuseki:serviceUpdate "update" ; # SPARQL update service
fuseki:serviceUpload "upload" ; # Non-SPARQL upload service
fuseki:serviceReadWriteGraphStore "data" ; # SPARQL Graph Store protocol (read and write)
fuseki:dataset :bdrc_text_dataset ;
.
# using TDB
:dataset_bdrc rdf:type tdb:DatasetTDB ;
tdb:location "/usr/local/fuseki/base/databases/bdrc" ;
tdb:unionDefaultGraph true ;
.
# using TDB2
# :dataset_bdrc rdf:type tdb2:DatasetTDB2 ;
# tdb2:location "/usr/local/fuseki/base/databases/bdrc" ;
# tdb2:unionDefaultGraph true ;
# .
:bdrc_text_dataset rdf:type text:TextDataset ;
text:dataset :dataset_bdrc ;
text:index :bdrc_lucene_index ;
.
# Text index description
:bdrc_lucene_index a text:TextIndexLucene ;
text:directory <file:/usr/local/fuseki/base/lucene-bdrc> ;
text:storeValues true ;
text:multilingualSupport true ;
text:entityMap :bdrc_entmap ;
text:defineAnalyzers (
[ text:defineAnalyzer :romanWordAnalyzer ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.sa.SanskritAnalyzer" ;
text:params (
[ text:paramName "mode" ;
text:paramValue "word" ]
[ text:paramName "inputEncoding" ;
text:paramValue "roman" ]
[ text:paramName "mergePrepositions" ;
text:paramValue true ]
[ text:paramName "filterGeminates" ;
text:paramValue true ]
)
] ;
]
[ text:defineAnalyzer :devaWordAnalyzer ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.sa.SanskritAnalyzer" ;
text:params (
[ text:paramName "mode" ;
text:paramValue "word" ]
[ text:paramName "inputEncoding" ;
text:paramValue "deva" ]
[ text:paramName "mergePrepositions" ;
text:paramValue true ]
[ text:paramName "filterGeminates" ;
text:paramValue true ]
)
] ;
]
[ text:defineAnalyzer :slpWordAnalyzer ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.sa.SanskritAnalyzer" ;
text:params (
[ text:paramName "mode" ;
text:paramValue "word" ]
[ text:paramName "inputEncoding" ;
text:paramValue "SLP" ]
[ text:paramName "mergePrepositions" ;
text:paramValue true ]
[ text:paramName "filterGeminates" ;
text:paramValue true ]
)
] ;
]
[ text:defineAnalyzer :romanLenientIndexAnalyzer ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.sa.SanskritAnalyzer" ;
text:params (
[ text:paramName "mode" ;
text:paramValue "syl" ]
[ text:paramName "inputEncoding" ;
text:paramValue "roman" ]
[ text:paramName "mergePrepositions" ;
text:paramValue false ]
[ text:paramName "filterGeminates" ;
text:paramValue true ]
[ text:paramName "lenient" ;
text:paramValue "index" ]
)
] ;
]
[ text:defineAnalyzer :devaLenientIndexAnalyzer ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.sa.SanskritAnalyzer" ;
text:params (
[ text:paramName "mode" ;
text:paramValue "syl" ]
[ text:paramName "inputEncoding" ;
text:paramValue "deva" ]
[ text:paramName "mergePrepositions" ;
text:paramValue false ]
[ text:paramName "filterGeminates" ;
text:paramValue true ]
[ text:paramName "lenient" ;
text:paramValue "index" ]
)
] ;
]
[ text:defineAnalyzer :slpLenientIndexAnalyzer ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.sa.SanskritAnalyzer" ;
text:params (
[ text:paramName "mode" ;
text:paramValue "syl" ]
[ text:paramName "inputEncoding" ;
text:paramValue "SLP" ]
[ text:paramName "mergePrepositions" ;
text:paramValue false ]
[ text:paramName "filterGeminates" ;
text:paramValue true ]
[ text:paramName "lenient" ;
text:paramValue "index" ]
)
] ;
]
[ text:defineAnalyzer :romanLenientQueryAnalyzer ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.sa.SanskritAnalyzer" ;
text:params (
[ text:paramName "mode" ;
text:paramValue "syl" ]
[ text:paramName "inputEncoding" ;
text:paramValue "roman" ]
[ text:paramName "mergePrepositions" ;
text:paramValue false ]
[ text:paramName "filterGeminates" ;
text:paramValue false ]
[ text:paramName "lenient" ;
text:paramValue "query" ]
)
] ;
]
[ text:defineAnalyzer :hanzAnalyzer ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.zh.ChineseAnalyzer" ;
text:params (
[ text:paramName "profile" ;
text:paramValue "TC2SC" ]
[ text:paramName "stopwords" ;
text:paramValue false ]
[ text:paramName "filterChars" ;
text:paramValue 0 ]
)
] ;
]
[ text:defineAnalyzer :han2pinyin ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.zh.ChineseAnalyzer" ;
text:params (
[ text:paramName "profile" ;
text:paramValue "TC2PYstrict" ]
[ text:paramName "stopwords" ;
text:paramValue false ]
[ text:paramName "filterChars" ;
text:paramValue 0 ]
)
] ;
]
[ text:defineAnalyzer :pinyin ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.zh.ChineseAnalyzer" ;
text:params (
[ text:paramName "profile" ;
text:paramValue "PYstrict" ]
)
] ;
]
[ text:addLang "bo" ;
text:searchFor ( "bo" "bo-x-ewts" "bo-alalc97" ) ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.bo.TibetanAnalyzer" ;
text:params (
[ text:paramName "segmentInWords" ;
text:paramValue false ]
[ text:paramName "lemmatize" ;
text:paramValue true ]
[ text:paramName "filterChars" ;
text:paramValue false ]
[ text:paramName "inputMode" ;
text:paramValue "unicode" ]
[ text:paramName "stopFilename" ;
text:paramValue "" ]
)
] ;
]
[ text:addLang "bo-x-ewts" ;
text:searchFor ( "bo" "bo-x-ewts" "bo-alalc97" ) ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.bo.TibetanAnalyzer" ;
text:params (
[ text:paramName "segmentInWords" ;
text:paramValue false ]
[ text:paramName "lemmatize" ;
text:paramValue true ]
[ text:paramName "filterChars" ;
text:paramValue false ]
[ text:paramName "inputMode" ;
text:paramValue "ewts" ]
[ text:paramName "stopFilename" ;
text:paramValue "" ]
)
] ;
]
[ text:addLang "bo-alalc97" ;
text:searchFor ( "bo" "bo-x-ewts" "bo-alalc97" ) ;
text:analyzer [
a text:GenericAnalyzer ;
text:class "io.bdrc.lucene.bo.TibetanAnalyzer" ;
text:params (
[ text:paramName "segmentInWords" ;
text:paramValue false ]
[ text:paramName "lemmatize" ;
text:paramValue true ]
[ text:paramName "filterChars" ;
text:paramValue false ]
[ text:paramName "inputMode" ;
text:paramValue "alalc" ]
[ text:paramName "stopFilename" ;
text:paramValue "" ]
)
] ;
]
[ text:addLang "zh-hans" ;
text:searchFor ( "zh-hans" "zh-hant" ) ;
text:auxIndex ( "zh-aux-han2pinyin" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :hanzAnalyzer ] ;
]
[ text:addLang "zh-hant" ;
text:searchFor ( "zh-hans" "zh-hant" ) ;
text:auxIndex ( "zh-aux-han2pinyin" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :hanzAnalyzer
] ;
]
[ text:addLang "zh-latn-pinyin" ;
text:searchFor ( "zh-latn-pinyin" "zh-aux-han2pinyin" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :pinyin
] ;
]
[ text:addLang "zh-aux-han2pinyin" ;
text:searchFor ( "zh-latn-pinyin" "zh-aux-han2pinyin" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :pinyin
] ;
text:indexAnalyzer :han2pinyin ;
]
[ text:addLang "sa-x-ndia" ;
text:searchFor ( "sa-x-ndia" "sa-aux-deva2Ndia"
"sa-aux-roman2Ndia" "sa-aux-slp2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :romanLenientQueryAnalyzer
] ;
]
[ text:addLang "sa-aux-deva2Ndia" ;
text:searchFor ( "sa-x-ndia" "sa-aux-roman2Ndia"
"sa-aux-slp2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :romanLenientQueryAnalyzer
] ;
text:indexAnalyzer :devaLenientIndexAnalyzer ;
]
[ text:addLang "sa-aux-roman2Ndia" ;
text:searchFor ( "sa-x-ndia" "sa-aux-deva2Ndia"
"sa-aux-slp2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :romanLenientQueryAnalyzer
] ;
text:indexAnalyzer :romanLenientIndexAnalyzer ;
]
[ text:addLang "sa-aux-slp2Ndia" ;
text:searchFor ( "sa-x-ndia" "sa-aux-deva2Ndia"
"sa-aux-roman2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :romanLenientQueryAnalyzer
] ;
text:indexAnalyzer :slpLenientIndexAnalyzer ;
]
[ text:addLang "sa-deva" ;
text:searchFor ( "sa-deva" "sa-x-iast" "sa-x-slp1"
"sa-x-iso" "sa-alalc97" ) ;
text:auxIndex ( "sa-aux-deva2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :devaWordAnalyzer ] ;
]
[ text:addLang "sa-x-iso" ;
text:searchFor ( "sa-x-iso" "sa-x-iast" "sa-x-slp1"
"sa-deva" "sa-alalc97" ) ;
text:auxIndex ( "sa-aux-roman2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :romanWordAnalyzer ] ;
]
[ text:addLang "sa-x-slp1" ;
text:searchFor ( "sa-x-slp1" "sa-x-iast" "sa-x-iso"
"sa-deva" "sa-alalc97" ) ;
text:auxIndex ( "sa-aux-slp2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :slpWordAnalyzer ] ;
]
[ text:addLang "sa-x-iast" ;
text:searchFor ( "sa-x-iast" "sa-x-slp1" "sa-x-iso"
"sa-deva" "sa-alalc97" ) ;
text:auxIndex ( "sa-aux-roman2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :romanWordAnalyzer ] ;
]
[ text:addLang "sa-alalc97" ;
text:searchFor ( "sa-alalc97" "sa-x-slp1" "sa-x-iso"
"sa-deva" "sa-iast" ) ;
text:auxIndex ( "sa-aux-roman2Ndia" ) ;
text:analyzer [
a text:DefinedAnalyzer ;
text:useAnalyzer :romanWordAnalyzer ] ;
]
) ;
.
# Index mappings
:bdrc_entmap a text:EntityMap ;
text:entityField "uri" ;
text:uidField "uid" ;
text:defaultField "label" ;
text:langField "lang" ;
text:graphField "graph" ; ## enable graph-specific indexing
text:map (
[ text:field "label" ;
text:predicate skos:prefLabel ]
[ text:field "altLabel" ;
text:predicate skos:altLabel ; ]
[ text:field "rdfsLabel" ;
text:predicate rdfs:label ; ]
[ text:field "chunkContents" ;
text:predicate bdo:chunkContents ; ]
[ text:field "eTextTitle" ;
text:predicate bdo:eTextTitle ; ]
[ text:field "logMessage" ;
text:predicate adm:logMessage ; ]
[ text:field "noteText" ;
text:predicate bdo:noteText ; ]
[ text:field "workAuthorshipStatement" ;
text:predicate bdo:workAuthorshipStatement ; ]
[ text:field "workColophon" ;
text:predicate bdo:workColophon ; ]
[ text:field "workEditionStatement" ;
text:predicate bdo:workEditionStatement ; ]
[ text:field "workPublisherLocation" ;
text:predicate bdo:workPublisherLocation ; ]
[ text:field "workPublisherName" ;
text:predicate bdo:workPublisherName ; ]
[ text:field "workSeriesName" ;
text:predicate bdo:workSeriesName ; ]
) ;
.
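With the entity map above in place, a jena-text query against this dataset would look roughly like the following sketch. This is illustrative, not part of the original config: the search string, language tag, and LIMIT are made up, and the query would be sent to the service defined above (e.g. the "query" endpoint of the "bdrcrw" dataset).

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?s ?score ?label
WHERE {
  # Searches the "label" field (mapped to skos:prefLabel above);
  # the @bo-x-ewts tag routes the string through the EWTS Tibetan
  # analyzer registered via text:addLang "bo-x-ewts".
  (?s ?score ?label) text:query ( skos:prefLabel "sgrol ma"@bo-x-ewts ) .
}
LIMIT 10
```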
On Mar 11, 2019, at 11:42 AM, Sorin Gheorghiu
<[email protected]> wrote:
Hi Chris,
have you had time to look into my results, by chance? Would this help
to isolate the issue?
Let me know if you need me to collect any other data, please.
Best regards,
Sorin
-------- Forwarded Message --------
Subject: Re: Text Index build with empty fields
Date: Mon, 4 Mar 2019 17:35:56 +0100
From: Sorin Gheorghiu <[email protected]>
To: [email protected]
CC: Chris Tomlinson <[email protected]>
Hi Chris,
when I reduce the entity map to 3 fields:
[ text:field "oldgndid";
text:predicate gndo:oldAuthorityNumber
]
[ text:field "prefName";
text:predicate gndo:preferredNameForThePerson
]
[ text:field "varName";
text:predicate gndo:variantNameForThePerson
]
then only the *oldgndid* field contains data (see
textindexer_3params_040319.pcap attached):
ES...|..........\*.......gnd_fts_es_131018_index.Y6BxYm-hT6qL0_NX10HrZQ..GndSubjectheadings.http://d-nb.info/gnd/4000002-3........
ES...B..........\*.....transport_client.indices:data/write/update..gnd_fts_es_131018_index.........GndSubjectheadings.http://d-nb.info/gnd/4000023-0......painless..if((ctx._source
== null) || (ctx._source.oldgndid == null) ||
(ctx._source.oldgndid.empty == true))
{ctx._source.oldgndid=[params.fieldValue] } else
{ctx._source.oldgndid.add(params.fieldValue)}..fieldValue..(DE-588c)4000023-0...............gnd_fts_es_131018_index....GndSubjectheadings..http://d-nb.info/gnd/4000023-0..>{"varName":[],"prefName":[],"oldgndid":["(DE-588c)4000023-0"]}.............
moreover with 2 fields:
[ text:field "prefName";
text:predicate gndo:preferredNameForThePerson
]
[ text:field "varName";
text:predicate gndo:variantNameForThePerson
]
then only the *prefName* field contains data (see
textindexer_2params_040319.pcap attached):
ES...|..........\*.......gnd_fts_es_131018_index.Y6BxYm-hT6qL0_NX10HrZQ..GndSubjectheadings.http://d-nb.info/gnd/134316541........
ES...$..........\*.....transport_client.indices:data/write/update..gnd_fts_es_131018_index.........GndSubjectheadings.http://d-nb.info/gnd/1153446294......painless..if((ctx._source
== null) || (ctx._source.prefName == null) ||
(ctx._source.prefName.empty == true))
{ctx._source.prefName=[params.fieldValue] } else
{ctx._source.prefName.add(params.fieldValue)}..fieldValue.
Pharmakon...............gnd_fts_es_131018_index....GndSubjectheadings..http://d-nb.info/gnd/1153446294..'{"varName":[],"prefName":["Pharmakon"]}.................
Regards,
Sorin
On 01.03.2019 at 18:06, Chris Tomlinson wrote:
Hi Sorin,
tcpdump -A -r works fine to view the pcap file; however, I don’t have the time
to delve into the data. I’ll take your word for it that the whole setup worked
in 3.8.0, and I encourage you to try simplifying the entity map, perhaps by
having a unique field per property, to see whether the problem is related to
the prefName and varName fields mapping to multiple properties.
I do notice that the field oldgndid maps to only a single property, but not
knowing the data, I have no idea whether any of that data occurs in your tests.
Since you indicate that only the field gndtype has data (per the pcap file):
if there is oldgndid data (i.e., occurrences of gndo:oldAuthorityNumber), that
suggests some rather generic issue with textindexer; however, if there is no
oldgndid data, then there may be a problem that has crept in since 3.8.0,
affecting data for multiple properties assigned to a single field, which I
would guess might be related to the com.google.common.collect.Multimap that
holds the results of parsing the entity map.
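The failure mode guessed at here can be sketched in a few lines. This is illustrative only, not Jena's actual code: it contrasts a plain dict (where repeated field names silently overwrite earlier entries) with a multimap (where all predicates for a field are retained). The field and predicate names are taken from the entity map quoted later in this thread.

```python
from collections import defaultdict

# (field, predicate) pairs as they would be parsed from the entity map;
# note that "prefName" and "varName" each appear more than once.
pairs = [
    ("prefName", "gndo:preferredNameForThePerson"),
    ("varName", "gndo:variantNameForThePerson"),
    ("prefName", "gndo:preferredNameForTheFamily"),
    ("varName", "gndo:variantNameForTheFamily"),
]

# Buggy flattening: a plain dict keyed by field name, so each repeated
# field overwrites the previous predicate and only the last one survives.
flat = {}
for field, predicate in pairs:
    flat[field] = predicate

# Multimap behaviour: every predicate for a field is retained.
multi = defaultdict(list)
for field, predicate in pairs:
    multi[field].append(predicate)

print(flat["prefName"])   # only the Family predicate survives
print(multi["prefName"])  # both Person and Family predicates retained
```

If the entity-map parsing (or its consumer) flattened the multimap this way, occurrences of the dropped predicates would be indexed with empty field values, which matches the symptom described in this thread.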
I have no idea how to enable debug logging when running the standalone
textindexer; perhaps someone else can answer that.
Regards,
Chris
On Mar 1, 2019, at 2:57 AM, Sorin Gheorghiu <[email protected]>
wrote:
Hi Chris,
1) As I said before, this entity map worked in 3.8.0.
The pcap file I sent you is proof that Jena delivers inconsistent data. You
may open it with Wireshark
or read it with tcpick:
# tcpick -C -yP -r textindexer_280219.pcap | more
ES...}..........\*.......gnd_fts_es_131018_index.cp-dFuCVTg-dUwvfyREG2w..GndSubjectheadings.http://d-nb.info/gnd/102968225.........
ES..............\*.....transport_client.indices:data/write/update..gnd_fts_es_131018_index.........GndSubjectheadings.http://d-nb.info/gnd/102968438......painless..if((ctx._source
== null) || (ctx._source.gndtype == null) || (ctx._source.gndtype.empty ==
true)) {ctx._source.gndtype=[params.fieldValue] } else
{ctx._source.gndtype.add(params.fieldValue)}
..fieldValue..Person...............gnd_fts_es_131018_index....GndSubjectheadings..http://d-nb.info/gnd/102968438....{"varName":[],"varName":[],"varName":[],"varName":[],"varName":[],"varName":[],"varName":[],"prefName":[],"prefName":[],"prefName":[],"prefName":[],"prefName":[],"prefName":[],"prefName":[],"oldgndid":[],"gndtype":["Person"]}..................................
As a remark, Jena sends the whole text index data within one TCP packet for
each Elasticsearch document.
3) fuseki.log collects logs while the Fuseki server is running, but for the
text indexer we have to run the Java command line, i.e.
java -cp ./fuseki-server.jar:<other_jars> jena.textindexer
--desc=run/config.ttl
The question is how to activate debug logging when running the text indexer.
Regards,
Sorin
On 28.02.2019 at 21:41, Chris Tomlinson wrote:
Hi Sorin,
1) I suggest trying to simplify the entity map. I assume there’s data for each
of the properties other than skos:altLabel in the entity map:
[ text:field "gndtype";
text:predicate skos:altLabel
]
[ text:field "oldgndid";
text:predicate gndo:oldAuthorityNumber
]
[ text:field "prefName";
text:predicate gndo:preferredNameForTheSubjectHeading
]
[ text:field "varName";
text:predicate gndo:variantNameForTheSubjectHeading
]
[ text:field "prefName";
text:predicate gndo:preferredNameForThePlaceOrGeographicName
]
[ text:field "varName";
text:predicate gndo:variantNameForThePlaceOrGeographicName
]
[ text:field "prefName";
text:predicate gndo:preferredNameForTheWork
]
[ text:field "varName";
text:predicate gndo:variantNameForTheWork
]
[ text:field "prefName";
text:predicate gndo:preferredNameForTheConferenceOrEvent
]
[ text:field "varName";
text:predicate gndo:variantNameForTheConferenceOrEvent
]
[ text:field "prefName";
text:predicate gndo:preferredNameForTheCorporateBody
]
[ text:field "varName";
text:predicate gndo:variantNameForTheCorporateBody
]
[ text:field "prefName";
text:predicate gndo:preferredNameForThePerson
]
[ text:field "varName";
text:predicate gndo:variantNameForThePerson
]
[ text:field "prefName";
text:predicate gndo:preferredNameForTheFamily
]
[ text:field "varName";
text:predicate gndo:variantNameForTheFamily
]
2) You might try a TextIndexLucene.
3) Adding the line log4j.logger.org.apache.jena.query.text.es=DEBUG should
work. I see no problem with it.
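For reference, a minimal log4j.properties along these lines might look like the following sketch. Only the DEBUG line comes from this thread; the root logger, console appender, and the -Dlog4j.configuration flag in the comment are assumptions based on the log4j 1.x setup shipped with Jena 3.x distributions.

```properties
# Pass explicitly when running the standalone indexer, e.g.:
#   java -Dlog4j.configuration=file:log4j.properties -cp fuseki-server.jar jena.textindexer --desc=config.ttl
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{HH:mm:ss} %-5p %c - %m%n
# Debug output from the jena-text Elasticsearch integration:
log4j.logger.org.apache.jena.query.text.es=DEBUG
```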
Sorry to be of little help,
Chris
On Feb 28, 2019, at 8:53 AM, Sorin Gheorghiu <[email protected]>
wrote:
Hi Chris,
Thank you for answering. I am replying to you directly because users@jena
doesn't accept messages larger than 1 MB.
Our previous successful text index attempt was with 3.8.0, not 3.9.0; sorry
for the misinformation.
Attached is the assembler file for 3.10.0 as requested, as well as the packet
capture file to see that only the 'gndtype' field has data.
I tried to enable the debug logs in log4j.properties with
log4j.logger.org.apache.jena.query.text.es=DEBUG, but there is no output in
the log file.
Regards,
Sorin
On 27.02.2019 at 20:01, Chris Tomlinson wrote:
Hi Sorin,
Please provide the assembler file for Elasticsearch that has the problematic
entity map definitions.
There haven’t been any changes to textindexer in over a year, since well
before 3.9. I don’t see any relevant changes to the handling of entity maps
either, so I can’t begin to pursue the issue further without seeing your
current assembler file.
I don't have any experience with Elasticsearch or with using jena-text-es
beyond a simple change to TextIndexES.java to change
org.elasticsearch.common.transport.InetSocketTransportAddress to
org.elasticsearch.common.transport.TransportAddress as part of the upgrade to
Lucene 7.4.0 and Elasticsearch 6.4.2.
Regards,
Chris
On Feb 25, 2019, at 2:37 AM, Sorin Gheorghiu <[email protected]>
wrote:
Correction: only the *last field* from the /text:map/ list contains a value.
To reformulate:
* if there are 3 fields in /text:map/, then during indexing the first
two are empty (let's name them 'text1' and 'text2') and the last
field contains data (let's name it 'text3')
* if on the next attempt the field 'text3' is commented out, then
'text1' is empty and 'text2' contains data
On 22.02.2019 at 15:01, Sorin Gheorghiu wrote:
In addition:
* if there are 3 fields in /text:map/, then during indexing one
contains data (let's name it 'text1'), the others are empty (let's
name them 'text2' and 'text3'),
* if on the next attempt the field 'text1' is commented out, then
'text2' contains data and 'text3' is empty
-------- Forwarded Message --------
Subject: Text Index build with empty fields
Date: Fri, 22 Feb 2019 14:01:18 +0100
From: Sorin Gheorghiu <[email protected]>
Reply-To: [email protected]
To: [email protected]
Hi,
When building the text index with the /jena.textindexer/ tool in Jena 3.10 for
an external full-text search engine (Elasticsearch, of course) and having
multiple fields with different names in /text:map/, just *one field is indexed*
(more precisely, one field contains data and the others are empty). It doesn't
look like an issue with Elasticsearch: in the logs generated during indexing,
all fields but one are already missing their values. The same setup worked in
Jena 3.9. Changing the Java version from 8 to 9 or 11 didn't change anything.
Could it be that changes in the new release have affected this tool and we are
dealing with a bug?
--
Sorin Gheorghiu Tel: +49 7531 88-3198
Universität Konstanz Raum: B705
78464 [email protected]
- KIM: Abteilung Contentdienste -