Starting Fuseki with an externally created TDB

2019-08-13 Thread Walker, Andreas
Dear all,


I have finally moved on from uploading Turtle files through the Web-GUI by hand 
to creating my TDB locally with tdbloader2. I have also succeeded in loading 
the database into Fuseki by running fuseki-server with the --loc=DATABASE 
parameter. However, there are some small things I haven't quite figured out 
yet. Maybe you can help me out?
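
The two steps described above can be sketched as command lines assembled in
Python. This is only an illustration: the directory name DATABASE comes from
the message, while the data file name and the dataset name "/ds" are
hypothetical, and the tools are assumed to be on the PATH.

```python
import shlex


def tdb_commands(db_dir, data_files, dataset_name="/ds"):
    """Return argv lists for the bulk load and for starting the server.

    dataset_name is a hypothetical service name; pick whatever your
    application expects.
    """
    # Bulk-load the files into a TDB database directory.
    load = ["tdbloader2", "--loc", db_dir, *data_files]
    # Serve that directory with the standalone server.
    serve = ["fuseki-server", f"--loc={db_dir}", dataset_name]
    return load, serve


if __name__ == "__main__":
    load_cmd, serve_cmd = tdb_commands("DATABASE", ["data.ttl"])
    print(shlex.join(load_cmd))
    print(shlex.join(serve_cmd))
```

Each list can be passed to subprocess.run(..., check=True) to execute it.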


[1] Is there a way to pass the --loc parameter to the Fuseki Start/Stop script 
as well? fuseki start --loc=DATABASE does not do anything, and I haven't been 
able to figure out a different way to let the server know about this database 
from the documentation. (I have also tried creating the database in the Web GUI 
and then replacing its contents with the locally created data, but this has 
only thrown errors)


[2] Does Fuseki care, for any purposes, whether the database is stored under 
fuseki/run/databases/ or anywhere else on the server?


Thanks in advance for your help,

Andreas


AW: AW: AW: AW: Error 500: No conversion to a Node:

2019-03-20 Thread Walker, Andreas
Hi Andy,


unfortunately (in terms of reproducing the error, not in terms of my project), 
starting with a fresh database seems to have solved the problem. It is quite
possible that the previous database was corrupted by an aborted transaction,
and that this only happened once. I assumed that dropping the graph would also
get rid of any corrupted data, but if I understand you correctly, that may not 
have been the case, and the error appearing again and again may just have been 
a consequence of the initial corruption.


If you still want me to run a test, I'd be happy to, but I guess this may not 
be necessary in that case.


Best,

Andreas


From: Andy Seaborne
Sent: Wednesday, 20 March 2019 00:15:48
To: users@jena.apache.org
Subject: Re: AW: AW: AW: Error 500: No conversion to a Node:

Hi Andreas,

Do you have a reproducible test case even if it happens only occasionally?

Having looked at the code, I can't see a risk point and certainly not
like earlier problems with TDB1.

Have you had any aborted transactions? These happen in the Fuseki UI only
if you browse away from a file upload or there is a data error.

I can build a special version for you with a layer of node write caching
removed if it's easier to run a test case in your environment rather than
try to extract one.

 Andy

On 11/03/2019 21:59, Andy Seaborne wrote:
> Hi Andreas,
>
> On 11/03/2019 14:37, Walker, Andreas wrote:
>> Hi Andy,
>>
>>
>> the database was created from the web interface, and I've only used
>> the web interface to add data to it, so no other version has ever
>> touched it.
>
> OK - so you have only run v3.10.0.
>
>> If I understand you correctly, the problem is with the database as a
>> whole, so dropping and reloading the graph might not solve the
>> problem. I have now switched to a fresh database and am currently
>> reloading the data, so I can see whether the problem persists beyond
>> the original database.
>
> If it does then we have a reproducible test case.  That said, I can't
> think of a way that a single load, or a single load and a sequence of
> updates not in parallel, can break the node table.
>
> The email of Osma's is about a compaction - have you compacted the database?
> (Fuseki must not be running at the time - this is supposed to be caught
> using OS file locks but ... I'm told VMs can get this wrong, though I don't
> know which and when.)
>
>> The only backup I have done is by making snapshots of the entire
>> virtual server this is running on, so I don't think that is related in
>> any way.
>
> Probably not related but is it an instantaneous backup of the
> filesystem? If not, then it isn't a reliable backup (in the same way
> that copying all the files isn't a safe backup procedure).
>
> The problem is that if the copy is done while a write transaction is
> running, some files may be copied before the commit point and some
> after, which risks chaos.
>
>  Andy
>
>>
>>
>> Thanks again for your help,
>>
>> Andreas
>>
>> 
>> From: Andy Seaborne
>> Sent: Friday, 8 March 2019 10:50:28
>> To: users@jena.apache.org
>> Subject: Re: AW: AW: Error 500: No conversion to a Node:
>>
>> Hi Andreas,
>>
>> Is this a database that has only ever been used with 3.10.0 or was the
>> data loaded with a previous version at some time in the past?
>>
>> The problem occurs silently during loading. There is no sign of the
>> problem at the time and the system works just fine while the RDF term,
>> or terms, are also in the node table cache.
>>
>> Then the system is restarted.
>>
>> Then the RDF term is needed for a query and the errors are reported.
>>
>> But the problem originated back when the data was loaded or updated, maybe
>> several restarts ago.
>>
>> Of course, it may be a different issue, but the error
>> message is consistent with the known bug.
>>
>> Have you been backing up the server on a regular basis? A backup is
>> NQuads so it is pulling every RDF term from disk (subject to already
>> being cached).
>>
>>   Andy
>>
>> On 07/03/2019 20:47, Walker, Andreas wrote:
>>> Hi Andy,
>>>
>>>
>>> I am running Version 3.10.0. The problem with reloading the database
>>> is the regular (multiple times a day) recurrence of the problem, so
>>> if there are any strategies to avoid it, I'd appreciate any advice.
>>>
>>>
>>> Best,
>>>
>>> Andreas
>>>
>>>
>>> 
>>> From: Andy Seaborne

AW: AW: AW: Error 500: No conversion to a Node:

2019-03-11 Thread Walker, Andreas
Hi Andy,


the database was created from the web interface, and I've only used the web 
interface to add data to it, so no other version has ever touched it.


If I understand you correctly, the problem is with the database as a whole, so 
dropping and reloading the graph might not solve the problem. I have now 
switched to a fresh database and am currently reloading the data, so I can see 
whether the problem persists beyond the original database.


The only backup I have done is by making snapshots of the entire virtual server 
this is running on, so I don't think that is related in any way.


Thanks again for your help,

Andreas


From: Andy Seaborne
Sent: Friday, 8 March 2019 10:50:28
To: users@jena.apache.org
Subject: Re: AW: AW: Error 500: No conversion to a Node:

Hi Andreas,

Is this a database that has only ever been used with 3.10.0 or was the
data loaded with a previous version at some time in the past?

The problem occurs silently during loading. There is no sign of the
problem at the time and the system works just fine while the RDF term,
or terms, are also in the node table cache.

Then the system is restarted.

Then the RDF term is needed for a query and the errors are reported.

But the problem originated back when the data was loaded or updated, maybe
several restarts ago.

Of course, it may be a different issue, but the error
message is consistent with the known bug.

Have you been backing up the server on a regular basis? A backup is
NQuads so it is pulling every RDF term from disk (subject to already
being cached).
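
The backup mentioned here can be requested over HTTP through the Fuseki
webapp's admin interface, which accepts a POST to /$/backup/{name}. A minimal
sketch of building that request in Python follows; the server URL and the
dataset name "ds" are hypothetical, and the admin endpoints may be restricted
to localhost or require credentials depending on the server's security
configuration.

```python
import urllib.request


def backup_request(server_url, dataset):
    """Build a POST request asking Fuseki to back up one dataset.

    server_url and dataset are placeholders for your own installation.
    """
    name = dataset.strip("/")
    url = f"{server_url.rstrip('/')}/$/backup/{name}"
    # An empty body is enough; the dataset is named in the URL path.
    return urllib.request.Request(url, data=b"", method="POST")


if __name__ == "__main__":
    req = backup_request("http://localhost:3030", "ds")
    print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen() sends it. Because the backup
reads every RDF term from disk, a backup that fails partway through is one way
to notice a damaged node table before a user query does.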

 Andy

On 07/03/2019 20:47, Walker, Andreas wrote:
> Hi Andy,
>
>
> I am running Version 3.10.0. The problem with reloading the database is the 
> regular (multiple times a day) recurrence of the problem, so if there are any 
> strategies to avoid it, I'd appreciate any advice.
>
>
> Best,
>
> Andreas
>
>
> 
> From: Andy Seaborne
> Sent: Thursday, 7 March 2019 21:12
> To: users@jena.apache.org
> Subject: Re: AW: Error 500: No conversion to a Node:
>
> Hi Andreas - which version are you running?
>
> It does not look like the  corruption problem, which is now fixed.
>
> The best thing to do is reload the database again. Whatever terms were
> messed up are permanently damaged I'm afraid.
>
>   Andy
>
> On 07/03/2019 10:49, Walker, Andreas wrote:
>> Dear all,
>>
>>
>> as a quick follow-up which might be helpful in identifying the error; I can 
>> currently run a SPARQL query (just listing any triples) with LIMIT 80, but 
>> no higher, before I run into the error, so it seems like there might indeed 
>> be a particular part of the database that is corrupted.
>>
>>
>> Best,
>>
>> Andreas
>>
>> 
>> From: Walker, Andreas
>> Sent: Wednesday, 6 March 2019 10:42:32
>> To: users@jena.apache.org
>> Subject: Error 500: No conversion to a Node:
>>
>> Dear all,
>>
>>
>> from time to time, my Fuseki server starts throwing the following error 
>> message on any SPARQL query I pose to one of my graphs:
>>
>>
>> "Error 500: No conversion to a Node: "
>>
>>
>> Unfortunately, I couldn't find any explanation of this error message, beyond 
>> a discussion of a corrupted TDB2 database.
>>
>>
>> (https://users.jena.apache.narkive.com/LF4XE801/corrupted-tdb2-database)
>>
>>
>> Once this happens, the only thing I could do so far is to drop the entire 
>> afflicted graph and rebuild it, but of course that isn't going to be a 
>> viable solution in the long term.
>>
>>
>> The only way I interact with Fuseki is by starting and stopping it, querying 
>> it via the SPARQL endpoint (and sometimes through the web interface, e.g. 
>> when troubleshooting my application), and uploading new triples (as turtle 
>> files) via the web interface. I haven't been able to find a pattern in when 
>> the error appears so far.
>>
>>
>> Any insights into why this error appears, and what to do in order to avoid 
>> it? I'd appreciate any help.
>>
>>
>> Best,
>>
>> Andreas
>>
>


AW: AW: Error 500: No conversion to a Node:

2019-03-07 Thread Walker, Andreas
Hi Andy,


I am running Version 3.10.0. The problem with reloading the database is the 
regular (multiple times a day) recurrence of the problem, so if there are any 
strategies to avoid it, I'd appreciate any advice.


Best,

Andreas



From: Andy Seaborne
Sent: Thursday, 7 March 2019 21:12
To: users@jena.apache.org
Subject: Re: AW: Error 500: No conversion to a Node:

Hi Andreas - which version are you running?

It does not look like the  corruption problem, which is now fixed.

The best thing to do is reload the database again. Whatever terms were
messed up are permanently damaged I'm afraid.

 Andy

On 07/03/2019 10:49, Walker, Andreas wrote:
> Dear all,
>
>
> as a quick follow-up which might be helpful in identifying the error; I can 
> currently run a SPARQL query (just listing any triples) with LIMIT 80, but no 
> higher, before I run into the error, so it seems like there might indeed be a 
> particular part of the database that is corrupted.
>
>
> Best,
>
> Andreas
>
> ________
> From: Walker, Andreas
> Sent: Wednesday, 6 March 2019 10:42:32
> To: users@jena.apache.org
> Subject: Error 500: No conversion to a Node:
>
> Dear all,
>
>
> from time to time, my Fuseki server starts throwing the following error 
> message on any SPARQL query I pose to one of my graphs:
>
>
> "Error 500: No conversion to a Node: "
>
>
> Unfortunately, I couldn't find any explanation of this error message, beyond 
> a discussion of a corrupted TDB2 database.
>
>
> (https://users.jena.apache.narkive.com/LF4XE801/corrupted-tdb2-database)
>
>
> Once this happens, the only thing I could do so far is to drop the entire 
> afflicted graph and rebuild it, but of course that isn't going to be a viable 
> solution in the long term.
>
>
> The only way I interact with Fuseki is by starting and stopping it, querying 
> it via the SPARQL endpoint (and sometimes through the web interface, e.g. 
> when troubleshooting my application), and uploading new triples (as turtle 
> files) via the web interface. I haven't been able to find a pattern in when 
> the error appears so far.
>
>
> Any insights into why this error appears, and what to do in order to avoid 
> it? I'd appreciate any help.
>
>
> Best,
>
> Andreas
>


AW: Error 500: No conversion to a Node:

2019-03-07 Thread Walker, Andreas
Dear all,


as a quick follow-up which might be helpful in identifying the error; I can 
currently run a SPARQL query (just listing any triples) with LIMIT 80, but no 
higher, before I run into the error, so it seems like there might indeed be a 
particular part of the database that is corrupted.
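
The LIMIT probing described above can be automated with a bisection. The
sketch below assumes that once a LIMIT fails, every larger LIMIT fails too
(which holds only if the result order is stable between runs). The query_ok
callable is a hypothetical hook that runs a query with the given LIMIT and
reports whether it succeeded.

```python
def probe_query(limit):
    """The 'list any triples' query for a given LIMIT."""
    return f"SELECT * WHERE {{ ?s ?p ?o }} LIMIT {limit}"


def largest_working_limit(query_ok, upper):
    """Return the largest LIMIT in [0, upper] for which query_ok is true.

    Assumes LIMIT 0 always works and that failures are monotonic:
    if LIMIT n fails, so does every LIMIT above n.
    """
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if query_ok(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

In practice query_ok would send probe_query(n) to the SPARQL endpoint and
return False on an HTTP 500 response.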


Best,

Andreas


From: Walker, Andreas
Sent: Wednesday, 6 March 2019 10:42:32
To: users@jena.apache.org
Subject: Error 500: No conversion to a Node:

Dear all,


from time to time, my Fuseki server starts throwing the following error message 
on any SPARQL query I pose to one of my graphs:


"Error 500: No conversion to a Node: "


Unfortunately, I couldn't find any explanation of this error message, beyond a 
discussion of a corrupted TDB2 database.


(https://users.jena.apache.narkive.com/LF4XE801/corrupted-tdb2-database)


Once this happens, the only thing I could do so far is to drop the entire 
afflicted graph and rebuild it, but of course that isn't going to be a viable 
solution in the long term.


The only way I interact with Fuseki is by starting and stopping it, querying it 
via the SPARQL endpoint (and sometimes through the web interface, e.g. when 
troubleshooting my application), and uploading new triples (as turtle files) 
via the web interface. I haven't been able to find a pattern in when the error 
appears so far.


Any insights into why this error appears, and what to do in order to avoid it? 
I'd appreciate any help.


Best,

Andreas


AW: AW: AW: Using FROM on external RDF files in Fuseki

2019-02-14 Thread Walker, Andreas
Dear Andy,


thanks for looking into it. Too bad it doesn't work, but I can work around the 
problem (parsing the RDF directly in Python instead) for my current purposes, 
so at least we got a bug report out of it.


Best,

Andreas




From: Andy Seaborne
Sent: Thursday, 14 February 2019 12:17:56
To: users@jena.apache.org
Subject: Re: AW: AW: Using FROM on external RDF files in Fuseki

I looked - there are a couple of problems blocking its use which need fixing:

https://issues.apache.org/jira/browse/JENA-1671

Sorry about that,

 Andy

On 13/02/2019 13:42, Andy Seaborne wrote:
> Andreas,
>
> I'll take a look - one of the problems of running that service has been
> (unsurprisingly) it getting trashed.  A public endpoint reading files
> from the web is a DOS vector. People trying to load DBpedia etc. (no,
> that does not work!).
>
> (The VM it runs on is also quite locked down which may be the issue as
> well).
>
> If you are using the apache-jena-fuseki download and service scripts, that
> is the webapp server.  The standalone form is Jetty+Fuseki as webapp.
>
> It will be better to run the general query endpoint in a separate server
> - that isolates it from the handling of the datasets.
>
>  Andy
>
> On 13/02/2019 10:51, Walker, Andreas wrote:
>> Hi Andy,
>>
>>
>> I am running Fuseki as a service (i.e. with "fuseki start"), without
>> any modifications to the configuration. I have added a dataset with
>> multiple named graphs through the web interface, and I tried running
>> the queries through the web interface as well, although I am later
>> planning to query the SPARQL endpoint via Python.
>>
>>
>> When I try the following minimal query in
>> <http://www.sparql.org/sparql.html>, it also doesn't work:
>>
>>
>> SELECT ?s ?p ?o
>> FROM <http://id.loc.gov/vocabulary/iso639-2/deu.rdf>
>> WHERE { ?s ?p ?o }
>>
>> Do I understand you correctly that this should work? If yes, any idea
>> what I might be doing wrong? And how would I tell my own server to
>> allow queries like that?
>>
>> Thanks for your help,
>> Andreas
>>
>>
>> 
>> From: Andy Seaborne
>> Sent: Wednesday, 13 February 2019 11:20:16
>> To: users@jena.apache.org
>> Subject: Re: AW: Using FROM on external RDF files in Fuseki
>>
>> Fuseki does support loading FROM URLs from the web in the "main" version
>> of Fuseki.  If the service is backed by a dataset it will use the graphs
>> from the dataset but there is the "general" query service as well.
>>
>> It is --general in Fuseki.main and that is what
>> http://www.sparql.org/sparql.html
>> is using.
>>
>> Andreas - were you looking for it in the webapp version?
>>
>>   Andy
>>
>> On 12/02/2019 19:52, Charles Abela wrote:
>>> Hehe
>>>
>>> On 12 Feb 2019 20:14, "Walker, Andreas" <
>>> andreas.wal...@sub.uni-goettingen.de> wrote:
>>>
>>> Hi ajs6f,
>>>
>>> yes, what I am aiming for is what the FROM statement would do if it
>>> could
>>> target graphs outside my own dataset.
>>>
>>> The only thing I have come up with so far - as an idea, not tested - is
>>> loading the RDF file into a named graph, querying it and then
>>> dropping it
>>> again. What I am asking is whether there is a simpler way of instructing
>>> Fuseki to temporarily load that graph just for the purposes of one
>>> query.
>>>
>>> Best,
>>> Andreas
>>> 
>>> From: ajs6f
>>> Sent: Tuesday, 12 February 2019 19:06:36
>>> To: users@jena.apache.org
>>> Subject: Re: Using FROM on external RDF files in Fuseki
>>>
>>> I'm not quite sure what you are asking about here: Do you mean to query
>>> both a new graph and the main dataset at the same time, and to do that
>>> without using anything other than SPARQL, and without loading the new
>>> graph
>>> into your dataset?
>>>
>>>
>>> ajs6f
>>>
>>>> On Feb 12, 2019, at 10:58 AM, Walker, Andreas <
>>> andreas.wal...@sub.uni-goettingen.de> wrote:
>>>>
>>>> Dear all,
>>>>
>>>>
>>>> after trying for a while, I found out that Fuseki does not temporarily
>>> add external RDF files to the default graph when they are included
>>> through
>>> a FROM statement in the SPARQL query, which was also confirmed on
>>> StackExchange [1].
>>>>
>>>>
>>>> Since this option isn't available, is there a good way of querying an
>>> external RDF file without permanently adding it to the graph by
>>> loading it?
>>> For example, my own graph might contain a link to an RDF file like
>>> [2], and
>>> I want to query that file from my application (through Fuseki), but not
>>> store it in my own triple store.
>>>>
>>>>
>>>> Any help and/or advice would be welcome,
>>>>
>>>>
>>>> Andreas
>>>>
>>>>
>>>> [1]
>>> https://stackoverflow.com/questions/36532737/sparql-queries-with-from-clause-in-fuseki2
>>>
>>>>
>>>> and
>>> https://stackoverflow.com/questions/54358099/fuseki-sparql-service-not-able-to-refer-to-external-rdf-resources
>>>
>>>>
>>>> [2] http://id.loc.gov/vocabulary/iso639-2/deu.rdf
>>>> 
>>>
>>


AW: AW: Using FROM on external RDF files in Fuseki

2019-02-13 Thread Walker, Andreas
Hi Andy,


I am running Fuseki as a service (i.e. with "fuseki start"), without any 
modifications to the configuration. I have added a dataset with multiple named 
graphs through the web interface, and I tried running the queries through the 
web interface as well, although I am later planning to query the SPARQL 
endpoint via Python.


When I try the following minimal query in <http://www.sparql.org/sparql.html>, 
it also doesn't work:


SELECT ?s ?p ?o
FROM <http://id.loc.gov/vocabulary/iso639-2/deu.rdf>
WHERE { ?s ?p ?o }

Do I understand you correctly that this should work? If yes, any idea what I 
might be doing wrong? And how would I tell my own server to allow queries like 
that?
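
For querying from Python, the minimal query above can be sent with a plain
SPARQL Protocol GET request. This is a sketch using only the standard
library; the endpoint URL in the usage line is hypothetical, and whether the
FROM clause is honoured depends on the server, as discussed in this thread.

```python
import urllib.parse
import urllib.request

QUERY = """SELECT ?s ?p ?o
FROM <http://id.loc.gov/vocabulary/iso639-2/deu.rdf>
WHERE { ?s ?p ?o }"""


def sparql_get(endpoint, query):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    # The protocol carries the query text in the 'query' URL parameter.
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    # Hypothetical endpoint; replace it with your own dataset's query service.
    req = sparql_get("http://localhost:3030/ds/sparql", QUERY)
    print(req.full_url)
```

Passing the request to urllib.request.urlopen() and reading the response with
json.load() yields the result bindings.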

Thanks for your help,
Andreas



From: Andy Seaborne
Sent: Wednesday, 13 February 2019 11:20:16
To: users@jena.apache.org
Subject: Re: AW: Using FROM on external RDF files in Fuseki

Fuseki does support loading FROM URLs from the web in the "main" version
of Fuseki.  If the service is backed by a dataset it will use the graphs
from the dataset but there is the "general" query service as well.

It is --general in Fuseki.main and that is what
   http://www.sparql.org/sparql.html
is using.

Andreas - were you looking for it in the webapp version?

 Andy

On 12/02/2019 19:52, Charles Abela wrote:
> Hehe
>
> On 12 Feb 2019 20:14, "Walker, Andreas" <
> andreas.wal...@sub.uni-goettingen.de> wrote:
>
> Hi ajs6f,
>
> yes, what I am aiming for is what the FROM statement would do if it could
> target graphs outside my own dataset.
>
> The only thing I have come up with so far - as an idea, not tested - is
> loading the RDF file into a named graph, querying it and then dropping it
> again. What I am asking is whether there is a simpler way of instructing
> Fuseki to temporarily load that graph just for the purposes of one query.
>
> Best,
> Andreas
> 
> From: ajs6f
> Sent: Tuesday, 12 February 2019 19:06:36
> To: users@jena.apache.org
> Subject: Re: Using FROM on external RDF files in Fuseki
>
> I'm not quite sure what you are asking about here: Do you mean to query
> both a new graph and the main dataset at the same time, and to do that
> without using anything other than SPARQL, and without loading the new graph
> into your dataset?
>
>
> ajs6f
>
>> On Feb 12, 2019, at 10:58 AM, Walker, Andreas <
> andreas.wal...@sub.uni-goettingen.de> wrote:
>>
>> Dear all,
>>
>>
>> after trying for a while, I found out that Fuseki does not temporarily
> add external RDF files to the default graph when they are included through
> a FROM statement in the SPARQL query, which was also confirmed on
> StackExchange [1].
>>
>>
>> Since this option isn't available, is there a good way of querying an
> external RDF file without permanently adding it to the graph by loading it?
> For example, my own graph might contain a link to an RDF file like [2], and
> I want to query that file from my application (through Fuseki), but not
> store it in my own triple store.
>>
>>
>> Any help and/or advice would be welcome,
>>
>>
>> Andreas
>>
>>
>> [1]
> https://stackoverflow.com/questions/36532737/sparql-queries-with-from-clause-in-fuseki2
>>
>> and
> https://stackoverflow.com/questions/54358099/fuseki-sparql-service-not-able-to-refer-to-external-rdf-resources
>>
>> [2] http://id.loc.gov/vocabulary/iso639-2/deu.rdf
>> 
>


AW: Using FROM on external RDF files in Fuseki

2019-02-12 Thread Walker, Andreas
Hi ajs6f,

yes, what I am aiming for is what the FROM statement would do if it could 
target graphs outside my own dataset.

The only thing I have come up with so far - as an idea, not tested - is loading 
the RDF file into a named graph, querying it and then dropping it again. What I 
am asking is whether there is a simpler way of instructing Fuseki to 
temporarily load that graph just for the purposes of one query.
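
The untested load, query, drop idea can be written down as three SPARQL
strings. The sketch below only builds the text; the temporary graph name is
hypothetical, and the LOAD and DROP steps would go to the dataset's update
endpoint while the SELECT goes to its query endpoint.

```python
def temporary_graph_steps(source_url, graph_uri):
    """Return (load, query, drop) SPARQL strings for a throwaway named graph."""
    # SPARQL 1.1 Update: fetch the remote document into a named graph.
    load = f"LOAD <{source_url}> INTO GRAPH <{graph_uri}>"
    # Query only that graph.
    query = f"SELECT ?s ?p ?o WHERE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}"
    # SPARQL 1.1 Update: remove the graph again.
    drop = f"DROP GRAPH <{graph_uri}>"
    return load, query, drop


if __name__ == "__main__":
    steps = temporary_graph_steps(
        "http://id.loc.gov/vocabulary/iso639-2/deu.rdf",
        "urn:example:tmp",  # hypothetical temporary graph name
    )
    for step in steps:
        print(step)
```

The triples are stored between the LOAD and the DROP, so this is a workaround
rather than a true one-query temporary load.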

Best,
Andreas

From: ajs6f
Sent: Tuesday, 12 February 2019 19:06:36
To: users@jena.apache.org
Subject: Re: Using FROM on external RDF files in Fuseki

I'm not quite sure what you are asking about here: Do you mean to query both a 
new graph and the main dataset at the same time, and to do that without using 
anything other than SPARQL, and without loading the new graph into your dataset?


ajs6f

> On Feb 12, 2019, at 10:58 AM, Walker, Andreas 
>  wrote:
>
> Dear all,
>
>
> after trying for a while, I found out that Fuseki does not temporarily add 
> external RDF files to the default graph when they are included through a FROM 
> statement in the SPARQL query, which was also confirmed on StackExchange [1].
>
>
> Since this option isn't available, is there a good way of querying an 
> external RDF file without permanently adding it to the graph by loading it? 
> For example, my own graph might contain a link to an RDF file like [2], and I 
> want to query that file from my application (through Fuseki), but not store 
> it in my own triple store.
>
>
> Any help and/or advice would be welcome,
>
>
> Andreas
>
>
> [1] 
> https://stackoverflow.com/questions/36532737/sparql-queries-with-from-clause-in-fuseki2
>
> and 
> https://stackoverflow.com/questions/54358099/fuseki-sparql-service-not-able-to-refer-to-external-rdf-resources
>
> [2] http://id.loc.gov/vocabulary/iso639-2/deu.rdf
> 



Using FROM on external RDF files in Fuseki

2019-02-12 Thread Walker, Andreas
Dear all,


after trying for a while, I found out that Fuseki does not temporarily add 
external RDF files to the default graph when they are included through a FROM 
statement in the SPARQL query, which was also confirmed on StackExchange [1].


Since this option isn't available, is there a good way of querying an external 
RDF file without permanently adding it to the graph by loading it? For example, 
my own graph might contain a link to an RDF file like [2], and I want to query 
that file from my application (through Fuseki), but not store it in my own 
triple store.


Any help and/or advice would be welcome,


Andreas


[1] 
https://stackoverflow.com/questions/36532737/sparql-queries-with-from-clause-in-fuseki2

and 
https://stackoverflow.com/questions/54358099/fuseki-sparql-service-not-able-to-refer-to-external-rdf-resources

[2] http://id.loc.gov/vocabulary/iso639-2/deu.rdf