On 22/12/15 18:22, Andy Seaborne wrote:
JENA-1104 suggests there is an ordering/timing issue and that it is not
Fuseki1/Fuseki2, except that things happen in a different order.
I have investigated this further and I think I understand what is
happening.
If we have a configuration with the same dataset+text-index shared
between two services, then when the first service is built,
TextIndexLuceneAssembler is called to create TextIndexLucene object.
When the second service is built, TextIndexLuceneAssembler is called
again and creates another TextIndexLucene object.
Both of these TextIndexLucene objects create a Lucene IndexWriter object
on the same directory. That doesn't work because they both try to grab
the same lock and one fails.
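As a rough illustration of that failure mode, the sketch below uses plain java.nio file locks rather than Lucene itself. It assumes the index write lock behaves like an exclusive file lock on write.lock; the class and method names (WriteLockDemo, secondLockFails) are hypothetical and not part of Jena or Lucene.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: stands in for two writers opening the same index directory.
final class WriteLockDemo {

    /**
     * Takes an exclusive lock on lockFile, then tries to take it again
     * through a second channel in the same JVM.
     * Returns true if the second attempt is refused.
     */
    static boolean secondLockFails(Path lockFile) throws IOException {
        try (FileChannel first = FileChannel.open(lockFile,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(lockFile,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock held = first.tryLock();
            if (held == null) {
                throw new IllegalStateException("lock already held by another process");
            }
            try {
                // Same file, same JVM: the lock is already held, so this is refused.
                FileLock again = second.tryLock();
                if (again != null) {
                    again.release();
                    return false;
                }
                return true;
            } catch (OverlappingFileLockException e) {
                return true;
            } finally {
                held.release();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path lockFile = Files.createTempFile("write", ".lock");
        System.out.println("second lock refused: " + secondLockFails(lockFile));
        Files.deleteIfExists(lockFile);
    }
}
```

The second writer has no way to succeed while the first still holds the lock, which matches the LockObtainFailedException on write.lock in the log.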
I am happy to offer a pull request to change this behaviour. There are
broadly two strategies that I can see, and I'm wondering if there is a
preferred approach from the Jena team.
The first approach is to change the way the assemblers work to
only create one TextIndexLucene object per node in the configuration graph.
A second approach is to modify the TextIndexLucene so that two or more
objects can operate on the same directory.
My default approach would be to make the change in the assembler code.
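A minimal sketch of the first approach, assuming the assembler can key on the URI of the configuration node. SharedIndexRegistry and getOrCreate are hypothetical names for illustration only; this is not Jena's assembler API, and the real change would have to fit into however the assemblers are invoked.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical registry: hands out one built object per configuration node.
final class SharedIndexRegistry<T> {

    // Keyed by the node's URI in the configuration graph, e.g. "#indexLucene".
    private final Map<String, T> byNode = new ConcurrentHashMap<>();

    /**
     * Returns the object already built for nodeUri, or builds it once
     * with the supplied builder and remembers it.
     */
    T getOrCreate(String nodeUri, Function<String, T> builder) {
        return byNode.computeIfAbsent(nodeUri, builder);
    }

    public static void main(String[] args) {
        SharedIndexRegistry<Object> registry = new SharedIndexRegistry<>();
        // Two services referring to the same node get the same instance.
        Object first = registry.getOrCreate("#indexLucene", uri -> new Object());
        Object second = registry.getOrCreate("#indexLucene", uri -> new Object());
        System.out.println("same instance: " + (first == second));
    }
}
```

With something like this in place, the second service would reuse the existing index object rather than opening a second writer on the same directory.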
Brian
I'm not sure that a shared index across two different datasets will
work if updates are involved. Maybe someone else can help with that.
The configuration I'm looking at is not an index shared across two data
sets - there is one index+tdb-dataset pair in the configuration.
What's fuseki:allowTimeoutOverride? Is this a local build with the
code for that uncommented?
Andy
On 21/12/15 14:53, Brian McBride wrote:
The fuseki configuration below sets up two services with a shared
dataset. The dataset has a lucene text index.
This configuration works on Fuseki 1.3.1. Fuseki 2.3.1 fails to start.
The log output is shown below. Looks like the lucene index may be
trying to grab a lock for the dataset twice.
If I change the second fuseki:dataset line to:
[[
fuseki:dataset <#ds> ;
]]
then it works on Fuseki 2.3.1 and, unexpectedly, both services have
access to the text index, which doesn't seem right, though it suits me for
the moment as I need both services to have access to the index.
Is there some configuration change I need to make between Fuseki 1 and
Fuseki 2?
Brian
Fuseki 2.3.1 log output
[[
2015-12-21 14:42:20.940 WARN Config :: Fuseki v2: Management functions are always on the same port as the server. --mgtPort ignored.
2015-12-21 14:42:21.062 INFO Server :: Fuseki 2.3.1 2015-12-08T09:24:07+0000
2015-12-21 14:42:21.229 INFO Config :: FUSEKI_HOME=/usr/share/fuseki
2015-12-21 14:42:21.230 INFO Config :: FUSEKI_BASE=/etc/fuseki
2015-12-21 14:42:21.233 INFO Servlet :: Initializing Shiro environment
2015-12-21 14:42:21.233 INFO EnvironmentLoader :: Starting Shiro environment initialization.
2015-12-21 14:42:21.242 INFO Config :: Shiro file: file:///etc/fuseki/shiro.ini
2015-12-21 14:42:21.415 INFO EnvironmentLoader :: Shiro environment initialized in 181 ms.
2015-12-21 14:42:21.415 INFO Config :: Configuration file: /etc/fuseki/config.ttl
2015-12-21 14:42:22.193 WARN AssemblerHelp :: ja:loadClass: Migration to Jena3: Converting com.hp.hpl.jena.tdb.TDB to org.apache.jena.tdb.TDB
2015-12-21 14:42:23.557 ERROR Server :: Exception in initialization: caught: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/var/lib/fuseki/databases/ds-lucene/write.lock
2015-12-21 14:42:23.577 INFO Server :: Started 2015/12/21 14:42:23 UTC on port 3030
]]
Fuseki configuration.
[[
# Licensed under the terms of http://www.apache.org/licenses/LICENSE-2.0
@prefix : <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
[] rdf:type fuseki:Server ;
fuseki:services (
<#service_ds>
<#service_ds_timeout_override>
) .
# TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
tdb:GraphTDB rdfs:subClassOf ja:Model .
<#service_ds> rdf:type fuseki:Service ;
rdfs:label "TDB Service (RW)" ;
fuseki:name "ds" ;
fuseki:serviceQuery "query" ;
fuseki:dataset <#ds-with-lucene> ;
.
<#service_ds_timeout_override>
rdfs:label "TDB Service Query with timeout override" ;
fuseki:name "ds_to" ;
fuseki:allowTimeoutOverride true;
fuseki:serviceQuery "query" ;
fuseki:dataset <#ds-with-lucene> ;
.
<#ds> rdf:type tdb:DatasetTDB ;
tdb:location "/var/lib/fuseki/databases/ds" ;
.
@prefix text: <http://jena.apache.org/text#> .
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
text:TextDataset rdfs:subClassOf ja:RDFDataset .
text:TextIndexLucene rdfs:subClassOf text:TextIndex .
<#ds-with-lucene>
rdf:type text:TextDataset;
text:dataset <#ds> ;
text:index <#indexLucene> ;
.
<#indexLucene> a text:TextIndexLucene ;
text:directory <file:///var/lib/fuseki/databases/ds-lucene>;
text:entityMap <#entMap> ;
.
<#entMap> a text:EntityMap ;
text:entityField "uri" ;
text:defaultField "text" ;
text:map (
[
text:field "text" ;
text:predicate rdfs:label ;
]
) .
]]
--
Epimorphics Ltd, http://www.epimorphics.com
Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
Epimorphics Ltd. is a limited company registered in England (number 7016688)