Re: Jena/Fuseki graph sync

2017-11-24 Thread Dan Davis
In terms of UNIX utilities, there's a command called "comm" which outputs
three columns:
* lines only in the first file (column 1)
* lines only in the second file (column 2)
* lines in common (column 3)

Then arguments can suppress columns:
* comm -23 a b  - will show lines only in a
* comm -13 a b - will show lines only in b
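
For anyone who wants the same three-column logic outside the shell, here is a minimal sketch in Python of what comm does on two sorted inputs. The function name `comm` and the sample triples are illustrative assumptions. It also assumes both inputs were sorted in plain byte/code-point order (e.g. `LC_ALL=C sort`), so that Python's string comparison agrees with the sort order.

```python
from typing import Iterable, Iterator, Tuple


def comm(a: Iterable[str], b: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (column, line) pairs like comm(1).

    Column 1 = only in a, 2 = only in b, 3 = in both.
    Both inputs must already be sorted in the same order.
    """
    it_a, it_b = iter(a), iter(b)
    x, y = next(it_a, None), next(it_b, None)
    while x is not None and y is not None:
        if x == y:
            yield 3, x
            x, y = next(it_a, None), next(it_b, None)
        elif x < y:
            yield 1, x
            x = next(it_a, None)
        else:
            yield 2, y
            y = next(it_b, None)
    # One input is exhausted; everything left is unique to the other.
    while x is not None:
        yield 1, x
        x = next(it_a, None)
    while y is not None:
        yield 2, y
        y = next(it_b, None)


# Hypothetical old and new dumps, one N-Triples statement per line.
old = sorted(["<s1> <p> <o1> .", "<s2> <p> <o2> ."])
new = sorted(["<s2> <p> <o2> .", "<s3> <p> <o3> ."])

removed = [line for col, line in comm(old, new) if col == 1]  # like comm -23 old new
added = [line for col, line in comm(old, new) if col == 2]    # like comm -13 old new
```

Because it only ever holds one line from each input, the same function can be fed two open file objects instead of lists.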

Of course checksums would not work on the whole graph, but on a sub-graph
defined by a DESCRIBE query, e.g. one subject aka owl:Thing, it could be
perfectly feasible.  Especially because you are essentially comparing a
graph digest and do not need to load the data.
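
A rough sketch of such a sub-graph digest, assuming both sides can serialize the DESCRIBE result as N-Triples in the same way and that it contains no blank nodes. The function name `subgraph_digest` and the sample lines are made up for illustration.

```python
import hashlib


def subgraph_digest(ntriples_lines):
    """Order-independent SHA-256 digest of a set of N-Triples lines.

    Assumes no blank nodes and an identical serialization on both sides.
    """
    h = hashlib.sha256()
    # Sort and de-duplicate so that line order does not affect the digest.
    for line in sorted({ln.strip() for ln in ntriples_lines if ln.strip()}):
        h.update(line.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


# Same triples in a different order give the same digest.
a = ["<s> <p> <o1> .", "<s> <p> <o2> ."]
b = ["<s> <p> <o2> .", "<s> <p> <o1> ."]
# One changed object gives a different digest.
c = ["<s> <p> <o1> .", "<s> <p> <o3> ."]
same = subgraph_digest(a) == subgraph_digest(b)
changed = subgraph_digest(a) != subgraph_digest(c)
```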



On Fri, Nov 24, 2017 at 10:02 AM, Osma Suominen 
wrote:

> Dan Davis wrote on 24.11.2017 at 16:53:
>
>> Rdflib has a graph_diff function that returns the triples in common, those
>> only in the left graph, and those only in the right. It lives in
>> rdflib.compare alongside the IsomorphicGraph class, so it should handle
>> blank nodes.
>>
>
> Good luck running that on something like Wikidata though. It's far too big
> to fit in memory.
>
> I'd use N-Triple files (old and new) sorted using the unix command sort,
> then use diff to determine added and removed triples, and finally turn
> those into INSERT DATA and DELETE DATA update operations. Assuming there
> are no blank nodes.
>
> -Osma
>
> (speaking as the author of the current rdflib in-memory store, IOMemory)
>
>
> --
> Osma Suominen
> D.Sc. (Tech), Information Systems Specialist
> National Library of Finland
> P.O. Box 26 (Kaikukatu 4)
> 00014 HELSINGIN YLIOPISTO
> Tel. +358 50 3199529
> osma.suomi...@helsinki.fi
> http://www.nationallibrary.fi
>


Re: Jena/Fuseki graph sync

2017-11-24 Thread ajs6f
Wikimedia does offer a sort of general procedure for this: you can check to see 
the updates since the last dump and do per-resource changes.

https://www.wikidata.org/wiki/Wikidata:Data_access#Incremental_updates

But perhaps more efficiently for yourself, you could use their incremental 
dumps:

https://dumps.wikimedia.org/other/incr/wikidatawiki/

which are for some reason only provided in XML. 

ajs6f

> On Nov 24, 2017, at 12:24 PM, Laura Morales  wrote:
> 
>> Laura, can you tell us a little more about why you are trying to avoid 
>> transmitting the whole graph? Is it because of an unreliable network between 
>> your client and Fuseki or because of something else?
> 
> Wikidata is about 4 billion triples, and it takes a lot of time to create the 
> TDB store from the nt file. They release a new dump about once a week, and I 
> would like to update my local copy when they release a new dump. Reloading 
> the entire graph from scratch every time seems very inefficient (as well as 
> an intensive process) considering that only a tiny % of the wikidata graph 
> changes in a week.



Re: Jena/Fuseki graph sync

2017-11-24 Thread Laura Morales
> Laura, can you tell us a little more about why you are trying to avoid 
> transmitting the whole graph? Is it because of an unreliable network between 
> your client and Fuseki or because of something else?

Wikidata is about 4 billion triples, and it takes a lot of time to create the 
TDB store from the nt file. They release a new dump about once a week, and I 
would like to update my local copy when they release a new dump. Reloading the 
entire graph from scratch every time seems very inefficient (as well as an 
intensive process) considering that only a tiny % of the wikidata graph changes 
in a week.


Re: Jena/Fuseki graph sync

2017-11-24 Thread ajs6f
Hey, Lorenz--

Laura specifically asked for ways to avoid transmitting the whole graph; Osma's 
solution (sort NTriples) is better than Model::difference in the absence of 
bnodes (actually, difference() seems to work based on triple equality anyway), 
and Andy's (RDF Patch) would be good if you want to automate the process. Andy 
wrote a whole system for that. [1]

Laura, can you tell us a little more about why you are trying to avoid 
transmitting the whole graph? Is it because of an unreliable network between 
your client and Fuseki or because of something else?

ajs6f

[1] https://github.com/afs/rdf-delta

> On Nov 24, 2017, at 10:21 AM, Lorenz Buehmann 
>  wrote:
> 
> Doesn't that mean loading the whole new Wikidata dump first?
> 
> In which case the newly loaded dataset can simply be used.
> 
> 
> On 24.11.2017 15:57, ajs6f wrote:
>> You can use Model.difference(Model m) to do these calculations. 
>> 
>> ajs6f
>> 
>>> On Nov 24, 2017, at 7:21 AM, Laura Morales  wrote:
>>> 
>>>> The s-put tool that comes with Fuseki (or just doing a HTTP PUT to the
>>>> SPARQL Graph Store endpoint using e.g. curl) does exactly this -
>>>> replaces a graph with a new one in a single operation.
>>> Deleting a whole graph, and pushing all the new triples over HTTP doesn't 
>>> look like a good fit for large graphs (even for graphs of a few GBs).
>> 
> 



Assembler for GenericRuleEngine Custom Builtin

2017-11-24 Thread Nouwt, B. (Barry)
Hi all,

Does anyone know whether there is an Assembler for loading a custom Builtin (for 
use in rules for the GenericRuleEngine) from a Fuseki configuration .ttl 
file? I assume not, because I cannot find one in: 
https://github.com/apache/jena/tree/master/jena-core/src/main/java/org/apache/jena/assembler/assemblers

I do see a RuleSetAssembler and a ReasonerFactory, but neither seems to have 
an option for loading a Builtin. In code I use the BuiltinRegistry class, so it 
would probably not be too difficult to add this feature myself.

Any pointers?

Regards, Barry






Re: Jena/Fuseki graph sync

2017-11-24 Thread Andy Seaborne



On 24/11/17 15:02, Osma Suominen wrote:
> Dan Davis wrote on 24.11.2017 at 16:53:
>> Rdflib has a graph_diff function that returns the triples in common, those
>> only in the left graph, and those only in the right. It lives in
>> rdflib.compare alongside the IsomorphicGraph class, so it should handle
>> blank nodes.
>
> Good luck running that on something like Wikidata though. It's far too 
> big to fit in memory.
>
> I'd use N-Triples files (old and new) sorted using the unix command sort, 
> then use diff to determine added and removed triples, and finally turn 
> those into INSERT DATA and DELETE DATA update operations. Assuming there 
> are no blank nodes.
>
> -Osma
>
> (speaking as the author of the current rdflib in-memory store, IOMemory)


This is where RDF Patch comes in:

https://afs.github.io/rdf-delta/rdf-patch.html

Send just the adds and removes.

If it is just additions, you can POST the new RDF to the dataset and it 
gets added.

If there are deletes, then you need something else: either SPARQL Update 
or RDF Patch.

Reloading the whole thing (offline) may be slow but it is reliable. 
Load it (batch job) then swap the datasets over (brief outage or get 
clever with a load balancer).


Andy


Re: Jena/Fuseki graph sync

2017-11-24 Thread Lorenz Buehmann
Doesn't that mean loading the whole new Wikidata dump first?

In which case the newly loaded dataset can simply be used.


On 24.11.2017 15:57, ajs6f wrote:
> You can use Model.difference(Model m) to do these calculations. 
>
> ajs6f
>
>> On Nov 24, 2017, at 7:21 AM, Laura Morales  wrote:
>>
>>> The s-put tool that comes with Fuseki (or just doing a HTTP PUT to the
>>> SPARQL Graph Store endpoint using e.g. curl) does exactly this -
>>> replaces a graph with a new one in a single operation.
>> Deleting a whole graph, and pushing all the new triples over HTTP doesn't 
>> look like a good fit for large graphs (even for graphs of a few GBs).
>



Re: Jena/Fuseki graph sync

2017-11-24 Thread Osma Suominen

Dan Davis wrote on 24.11.2017 at 16:53:

> Rdflib has a graph_diff function that returns the triples in common, those
> only in the left graph, and those only in the right. It lives in
> rdflib.compare alongside the IsomorphicGraph class, so it should handle
> blank nodes.


Good luck running that on something like Wikidata though. It's far too 
big to fit in memory.


I'd use N-Triple files (old and new) sorted using the unix command sort, 
then use diff to determine added and removed triples, and finally turn 
those into INSERT DATA and DELETE DATA update operations. Assuming there 
are no blank nodes.
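
The last step (turning the diff into update operations) could look roughly like the sketch below. It assumes the added and removed triples are already available as N-Triples lines without blank nodes. The function name `to_sparql_update`, the `graph` parameter and the example IRIs are illustrative, not part of any tool mentioned in this thread.

```python
def to_sparql_update(added, removed, graph=None):
    """Build one SPARQL Update request string from N-Triples lines.

    Assumes the lines contain no blank nodes. If `graph` is given, the
    triples are wrapped in a GRAPH block for that named graph.
    """
    def block(keyword, lines):
        body = "\n".join(lines)
        if graph is not None:
            body = f"GRAPH <{graph}> {{\n{body}\n}}"
        return f"{keyword} DATA {{\n{body}\n}}"

    ops = []
    if removed:
        ops.append(block("DELETE", removed))
    if added:
        ops.append(block("INSERT", added))
    # Operations in one request are separated by ';'.
    return " ;\n".join(ops)


update = to_sparql_update(
    added=["<http://example.org/s3> <http://example.org/p> <http://example.org/o3> ."],
    removed=["<http://example.org/s1> <http://example.org/p> <http://example.org/o1> ."],
)
```

For a real dump the lists would need to be cut into batches, since a single request with millions of triples is unlikely to be practical.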


-Osma

(speaking as the author of the current rdflib in-memory store, IOMemory)



Re: Jena/Fuseki graph sync

2017-11-24 Thread Dan Davis
What I do is load the new N-Triples file into an updates graph and then use
the capabilities of the triple store to compare. For me, that's Virtuoso, so
SQL queries are also available.

On Nov 24, 2017 9:53 AM, "Dan Davis"  wrote:

> Rdflib has a graph_diff function that returns the triples in common, those
> only in the left graph, and those only in the right. It lives in
> rdflib.compare alongside the IsomorphicGraph class, so it should handle
> blank nodes.
>
> On Nov 24, 2017 7:19 AM, "Laura Morales"  wrote:
>
>> > What about simply deleting the old graph and loading the triples of the
>> > .nt file into the graph afterwards? I don't see any benefit of such a
>> > "tool" - you could just write your own bash script for this if you need
>> > this quite often.
>>
>> The advantage is with large graphs, such as wikidata. If I download their
>> dumps once a week, it's much more efficient to only change a few triples
>> instead of deleting the entire graph and recreating the whole TDB store.
>>
>


Re: Jena/Fuseki graph sync

2017-11-24 Thread ajs6f
You can use Model.difference(Model m) to do these calculations. 

ajs6f

> On Nov 24, 2017, at 7:21 AM, Laura Morales  wrote:
> 
>> The s-put tool that comes with Fuseki (or just doing a HTTP PUT to the
>> SPARQL Graph Store endpoint using e.g. curl) does exactly this -
>> replaces a graph with a new one in a single operation.
> 
> Deleting a whole graph, and pushing all the new triples over HTTP doesn't 
> look like a good fit for large graphs (even for graphs of a few GBs).



Re: Jena/Fuseki graph sync

2017-11-24 Thread Dan Davis
Rdflib has a graph_diff function that returns the triples in common, those
only in the left graph, and those only in the right. It lives in
rdflib.compare alongside the IsomorphicGraph class, so it should handle
blank nodes.

On Nov 24, 2017 7:19 AM, "Laura Morales"  wrote:

> > What about simply deleting the old graph and loading the triples of the
> > .nt file into the graph afterwards? I don't see any benefit of such a
> > "tool" - you could just write your own bash script for this if you need
> > this quite often.
>
> The advantage is with large graphs, such as wikidata. If I download their
> dumps once a week, it's much more efficient to only change a few triples
> instead of deleting the entire graph and recreating the whole TDB store.
>


Re: Connection reset

2017-11-24 Thread Mohammad Noorani Bakerally
The same query:

PREFIX dcat: 
PREFIX data: 
PREFIX skos: 
PREFIX : 
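CONSTRUCT { <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o . }
WHERE { <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o . }

If %2C%20 is removed from
https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs, it works
even with the prefixes unexpanded.

So the query below works: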

PREFIX dcat: 
PREFIX data: 
PREFIX skos: 
PREFIX : 
CONSTRUCT { > ?p ?o
. } WHERE {  > ?p ?o
. }



‌


Re: Connection reset

2017-11-24 Thread Mohammad Noorani Bakerally
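A temporary workaround for my problem is to expand the prefixes in the query
before sending it. It's only temporary because, for logging or for any
analysis of the prefixes, it may not be the best thing to do.

// Parse the query with the shared prefix declarations prepended.
Query gq = QueryFactory.create(Global.prefixes + queryStr);
gq.setPrefixMapping(Global.prefixMap);
// Clear the prefix map so that serialize() writes full IRIs.
gq.getPrologue().getPrefixMapping().clearNsPrefixMap();
// Send the serialized (prefix-free) query string to the remote endpoint.
QueryExecution qe = QueryExecutionFactory.sparqlService(this.location,
        gq.serialize());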


Re: Jena/Fuseki graph sync

2017-11-24 Thread Lorenz Buehmann
But you would have to do an expensive computation anyway: the diff still
has to be computed, which means comparing two big datasets somehow.

Input: existing graph G and new set of triples T

For each triple t in T
  If !(t in G)
    G := G union {t}

For each triple t in G
  If !(t in T)
    G := G \ {t}



And it becomes more complex once blank nodes occur.
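
For small graphs that fit in memory, the two loops above reduce to two set differences. The sketch below is only an illustration of that idea: the name `sync_plan` and the sample triples are assumptions, triples are modelled as plain tuples, and blank nodes are assumed absent.

```python
def sync_plan(existing, new):
    """Return (to_add, to_delete) such that existing + to_add - to_delete == new.

    Triples are compared by plain equality, so this is only valid for
    ground triples (no blank nodes).
    """
    existing, new = set(existing), set(new)
    return new - existing, existing - new


# G is the graph currently in the store, T is the content of the new dump.
G = {("s1", "p", "o1"), ("s2", "p", "o2")}
T = {("s2", "p", "o2"), ("s3", "p", "o3")}

to_add, to_delete = sync_plan(G, T)
```

Both sets have to be held in memory here, which is exactly the expensive part for a dataset the size of Wikidata.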

The better way would be for the source to provide incremental changesets.
For example, DBpedia did this some time ago.


On 24.11.2017 13:19, Laura Morales wrote:
>> What about simply deleting the old graph and loading the triples of the
>> .nt file into the graph afterwards? I don't see any benefit of such a
>> "tool" - you could just write your own bash script for this if you need
>> this quite often.
> The advantage is with large graphs, such as wikidata. If I download their 
> dumps once a week, it's much more efficient to only change a few triples 
> instead of deleting the entire graph and recreating the whole TDB store.



Re: Jena/Fuseki graph sync

2017-11-24 Thread Laura Morales
> The s-put tool that comes with Fuseki (or just doing a HTTP PUT to the
> SPARQL Graph Store endpoint using e.g. curl) does exactly this -
> replaces a graph with a new one in a single operation.

Deleting a whole graph, and pushing all the new triples over HTTP doesn't look 
like a good fit for large graphs (even for graphs of a few GBs).


Re: Jena/Fuseki graph sync

2017-11-24 Thread Laura Morales
> What about simply deleting the old graph and loading the triples of the
> .nt file into the graph afterwards? I don't see any benefit of such a
> "tool" - you could just write your own bash script for this if you need
> this quite often.

The advantage is with large graphs, such as wikidata. If I download their dumps 
once a week, it's much more efficient to only change a few triples instead of 
deleting the entire graph and recreating the whole TDB store.


Re: Connection reset

2017-11-24 Thread Mohammad Noorani Bakerally
I think the problem is in how the client creates the request: one client can
send it and get a reply, while the other either cannot send it or builds it
improperly, so that the lower-level network code cannot send it. I'm not
sure, though.



Re: Connection reset

2017-11-24 Thread Andy Seaborne



On 24/11/17 09:30, Mohammad Noorani Bakerally wrote:
> Just checked something: I've used the SPARQL client YASGUI and the SPARQL
> query is answered properly. Can we deduce that the issue is with the client?

That would seem most likely, something on the network path from client 
to server - it is so hard to be definite about these low-level errors.

    Andy

> On Fri, Nov 24, 2017 at 10:22 AM, Mohammad Noorani Bakerally <
> noorani.bakera...@gmail.com> wrote:
>
>> Yes, I'm going to check the logs, but so far a query like SELECT * WHERE
>> { ?s ?p ?o . } LIMIT 10 is properly handled and results are returned. I can
>> share the SPARQL endpoint: it is http://opensensingcity.emse.fr/sparql/bistro
>> (it's just for some testing purposes). So if I understand correctly, if a
>> query is answered, Fuseki must be properly configured with Apache. The
>> resets happen immediately, with no delay. I've not checked the log, but it
>> seems the request doesn't even reach the server.
>>
>> On Fri, Nov 24, 2017 at 10:03 AM, Andy Seaborne wrote:
>>
>>>> HttpException: -1 Unexpected error making the query:
>>>> java.net.SocketException: Connection reset
>>>
>>> This is a problem at a low level in the networking stack (the fake status
>>> code -1 from Jena also says it's not an HTTP error).  The other end
>>> responded with a TCP RST (the connection reset bit), which is the response
>>> to an attempt to use a connection the other end thinks is closed or does
>>> not exist.
>>>
>>> There are many reasons that can cause this - some kind of network
>>> environmental issue between client and server.
>>>
>>> Having a reverse proxy (RP) in front of the Fuseki server is one possible
>>> cause, e.g. when Fuseki isn't there but the reverse proxy is, there can be
>>> a rejection at the TCP level. Or the RP has rebooted.
>>>
>>> There are many reasons (StackOverflow has many questions about this).
>>>
>>> Check the Fuseki server log - did the query even reach the server? Resets
>>> usually happen at the start, e.g. after a long period of no use when the RP
>>> has timed the connection out (Fuseki, standalone, hasn't configured Jetty
>>> to do this).
>>>
>>> If it did reach the server, then some intermediary may have forcefully
>>> closed the connection.
>>>
>>>     Andy
>>>
>>> On 23/11/17 22:38, Mohammad Noorani Bakerally wrote:
>>>
>>>> I am getting an exception when executing the following valid CONSTRUCT
>>>> query on Fuseki via Jena. Any idea about this problem?
>>>>
>>>> The query:
>>>> ==
>>>> PREFIX dcat: 
>>>> PREFIX data: 
>>>> PREFIX skos: 
>>>> PREFIX : 
>>>> CONSTRUCT {
>>>>   <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o .
>>>> } WHERE {
>>>>   <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o .
>>>> }
>>>>
>>>> The exception:
>>>>
>>>> HttpException: -1 Unexpected error making the query:
>>>> java.net.SocketException: Connection reset
>>>>
>>>> at org.apache.jena.sparql.engine.http.HttpQuery.rewrap(HttpQuery.java:374)
>>>> at org.apache.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:337)
>>>> at org.apache.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:288)
>>>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstructWorker(QueryEngineHTTP.java:465)
>>>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execModel(QueryEngineHTTP.java:428)
>>>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:389)
>>>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:384)
>>>> at loader.configuration.SPARQLDataSource.executeGraphQuery(SPARQLDataSource.java:43)
>>>> at genPLDPD.Evaluation.evalRM(Evaluation.java:136)


Re: Jena/Fuseki graph sync

2017-11-24 Thread Osma Suominen

Lorenz Buehmann wrote on 24.11.2017 at 11:53:

> Ok, but there is no magic behind the tool I guess. I mean, it's not a
> tool like incrementally updating a dataset by doing some diffs, etc.


No, there's no magic. It's all in the SPARQL HTTP Graph Store spec. The 
update is not incremental, it's a replacement in a single atomic 
operation, so perhaps somewhat simpler than "deleting the old graph and 
loading the triples of the .nt file into the graph afterwards" that you 
suggested.


-Osma



Re: Jena/Fuseki graph sync

2017-11-24 Thread Lorenz Buehmann
Ok, but there is no magic behind the tool I guess. I mean, it's not a
tool like incrementally updating a dataset by doing some diffs, etc.

Or am I wrong?


On 24.11.2017 10:51, Osma Suominen wrote:
> Lorenz Buehmann wrote on 24.11.2017 at 11:46:
>
>> What about simply deleting the old graph and loading the triples of the
>> .nt file into the graph afterwards? I don't see any benefit of such a
>> "tool" - you could just write your own bash script for this if you need
>> this quite often.
>
> The s-put tool that comes with Fuseki (or just doing a HTTP PUT to the
> SPARQL Graph Store endpoint using e.g. curl) does exactly this -
> replaces a graph with a new one in a single operation.
>
> In the original scenario, blank nodes can be a problem if you have
> them in your data. There is no way (at least not efficiently) to
> compare blank nodes in two graphs.
>
> -Osma
>
>



Re: Jena/Fuseki graph sync

2017-11-24 Thread Osma Suominen

Lorenz Buehmann wrote on 24.11.2017 at 11:46:

> What about simply deleting the old graph and loading the triples of the
> .nt file into the graph afterwards? I don't see any benefit of such a
> "tool" - you could just write your own bash script for this if you need
> this quite often.


The s-put tool that comes with Fuseki (or just doing a HTTP PUT to the 
SPARQL Graph Store endpoint using e.g. curl) does exactly this - 
replaces a graph with a new one in a single operation.
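
As a rough illustration of what that PUT looks like, here is a sketch using only the Python standard library. It builds the request without sending it. The endpoint `http://localhost:3030/ds/data`, the graph IRI and the helper name `graph_store_put` are assumptions for the example, so substitute whatever your own Fuseki service is configured with.

```python
from urllib.parse import urlencode
from urllib.request import Request


def graph_store_put(endpoint, graph_uri, ntriples_bytes):
    """Build (but do not send) a Graph Store Protocol PUT for one named graph.

    A PUT replaces the whole content of the target graph with the payload.
    """
    url = endpoint + "?" + urlencode({"graph": graph_uri})
    return Request(
        url,
        data=ntriples_bytes,
        method="PUT",
        headers={"Content-Type": "application/n-triples"},
    )


# Hypothetical endpoint and graph name.
req = graph_store_put(
    "http://localhost:3030/ds/data",
    "http://example.org/g",
    b"<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n",
)
# urllib.request.urlopen(req) would actually send it.
```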


In the original scenario, blank nodes can be a problem if you have them 
in your data. There is no way (at least not efficiently) to compare 
blank nodes in two graphs.


-Osma




Re: Jena/Fuseki graph sync

2017-11-24 Thread Lorenz Buehmann
> process the graph in the dataset has the exact same triples of the .nt file?

What about simply deleting the old graph and loading the triples of the
.nt file into the graph afterwards? I don't see any benefit of such a
"tool" - you could just write your own bash script for this if you need
this quite often.


On 24.11.2017 10:03, Laura Morales wrote:
> Does Fuseki have any tool to "synchronize" a graph in the dataset with a .nt 
> file? In other words, some tool that given a dataset/graph and a .nt file as 
> input, will parse the triples in the .nt file and automatically add/delete 
> triples in the dataset/graph such that at the end of the process the graph in 
> the dataset has the exact same triples of the .nt file?
>
> If no such tool exists, how could I achieve something like this with the 
> existing tools?
>
> Thank you.



Re: Connection reset

2017-11-24 Thread Mohammad Noorani Bakerally
I just checked something: I ran the query with the SPARQL client YASGUI and
it is answered properly. Can we deduce that the issue is with the client?
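
One way to narrow this down is to send the same query from a third, very plain HTTP client and compare. The following Python sketch only builds the GET request (nothing is sent) so that the exact URL can be inspected or pasted into curl. The endpoint is the one given in this thread; the Accept header value and the function name build_query_request are assumptions for illustration.

```python
import urllib.parse
import urllib.request

# Endpoint quoted in this thread.
ENDPOINT = "http://opensensingcity.emse.fr/sparql/bistro"

THEME = "https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs"
QUERY = "CONSTRUCT { <%s> ?p ?o . } WHERE { <%s> ?p ?o . }" % (THEME, THEME)


def build_query_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol GET request for the query."""
    # urlencode percent-encodes the query text, including the '%' characters
    # already present in the IRI, so the server sees the IRI unchanged after
    # decoding the query string once.
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": "application/n-triples"})


if __name__ == "__main__":
    request = build_query_request(ENDPOINT, QUERY)
    print(request.full_url)
    # To actually send it: urllib.request.urlopen(request).read()
```

If this request succeeds while the Jena client still gets a reset, that points towards the client side or towards something between that client and the server.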




On Fri, Nov 24, 2017 at 10:22 AM, Mohammad Noorani Bakerally <
noorani.bakera...@gmail.com> wrote:

> Yes, I'm going to check the logs. So far, a query like SELECT * WHERE
> { ?s ?p ?o .} LIMIT 10 is handled properly and results are returned. I can
> share the SPARQL endpoint: it is
> http://opensensingcity.emse.fr/sparql/bistro and is just for testing
> purposes. So, if I understand correctly, if one query is answered, Fuseki
> must be properly configured with Apache. The resets happen immediately,
> with no delay. I've not checked the log yet, but it seems the request
> doesn't even reach the server.
>
>
>
>
> On Fri, Nov 24, 2017 at 10:03 AM, Andy Seaborne  wrote:
>
>> > HttpException: -1 Unexpected error making the query:
>> > java.net.SocketException: Connection reset
>>
>> This is a problem at a low level in the networking stack (the fake status
>> code -1 from Jena also says it's not an HTTP error).  The other end
>> responded with a TCP RST (the connection reset bit), which is the response
>> to an attempt to use a connection the other end thinks is closed or does
>> not exist.
>>
>> There are many reasons that can cause this - some kind of network
>> environmental issue between client and server.
>>
>> Having a reverse proxy (RP) in front of the Fuseki server is one possible
>> cause, e.g. when Fuseki isn't there but the reverse proxy is, there can be a
>> rejection at the TCP level. Or the RP has rebooted.
>>
>> There are many reasons (StackOverflow has many questions about this).
>>
>> Check the Fuseki server log - did the query even reach the server? Resets
>> usually happen at the start, e.g. after a long period of no use when the RP
>> has timed the connection out (Fuseki, standalone, does not configure
>> Jetty to do this).
>>
>> If it did reach the server, then some intermediary may have forcefully
>> closed the connection.
>>
>> Andy
>>
>>
>> On 23/11/17 22:38, Mohammad Noorani Bakerally wrote:
>>
>>> I am getting an exception when executing the following valid CONSTRUCT
>>> query on Fuseki via Jena. Any idea what causes this problem?
>>>
>>> The query:
>>> ==
>>> PREFIX dcat: 
>>> PREFIX data: 
>>> PREFIX skos: 
>>> PREFIX : 
>>> CONSTRUCT { <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o .
>>> } WHERE { <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o .
>>> }
>>>
>>>
>>>
>>> The exception:
>>> 
>>> HttpException: -1 Unexpected error making the query:
>>> java.net.SocketException: Connection reset
>>>
>>> at org.apache.jena.sparql.engine.http.HttpQuery.rewrap(HttpQuery.java:374)
>>> at org.apache.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:337)
>>> at org.apache.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:288)
>>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstructWorker(QueryEngineHTTP.java:465)
>>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execModel(QueryEngineHTTP.java:428)
>>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:389)
>>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:384)
>>> at loader.configuration.SPARQLDataSource.executeGraphQuery(SPARQLDataSource.java:43)
>>> at genPLDPD.Evaluation.evalRM(Evaluation.java:136)
>>>
>>
>


Re: Connection reset

2017-11-24 Thread Mohammad Noorani Bakerally
Yes, I'm going to check the logs. So far, a query like SELECT * WHERE {
?s ?p ?o .} LIMIT 10 is handled properly and results are returned. I can
share the SPARQL endpoint: it is
http://opensensingcity.emse.fr/sparql/bistro and is just for testing
purposes. So, if I understand correctly, if one query is answered, Fuseki
must be properly configured with Apache. The resets happen immediately,
with no delay. I've not checked the log yet, but it seems the request
doesn't even reach the server.



On Fri, Nov 24, 2017 at 10:03 AM, Andy Seaborne  wrote:

> > HttpException: -1 Unexpected error making the query:
> > java.net.SocketException: Connection reset
>
> This is a problem at a low level in the networking stack (the fake status
> code -1 from Jena also says it's not an HTTP error).  The other end
> responded with a TCP RST (the connection reset bit), which is the response
> to an attempt to use a connection the other end thinks is closed or does
> not exist.
>
> There are many reasons that can cause this - some kind of network
> environmental issue between client and server.
>
> Having a reverse proxy (RP) in front of the Fuseki server is one possible
> cause, e.g. when Fuseki isn't there but the reverse proxy is, there can be a
> rejection at the TCP level. Or the RP has rebooted.
>
> There are many reasons (StackOverflow has many questions about this).
>
> Check the Fuseki server log - did the query even reach the server? Resets
> usually happen at the start, e.g. after a long period of no use when the RP
> has timed the connection out (Fuseki, standalone, does not configure
> Jetty to do this).
>
> If it did reach the server, then some intermediary may have forcefully
> closed the connection.
>
> Andy
>
>
> On 23/11/17 22:38, Mohammad Noorani Bakerally wrote:
>
>> I am getting an exception when executing the following valid CONSTRUCT
>> query on Fuseki via Jena. Any idea what causes this problem?
>>
>> The query:
>> ==
>> PREFIX dcat: 
>> PREFIX data: 
>> PREFIX skos: 
>> PREFIX : 
>> CONSTRUCT { <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o .
>> } WHERE { <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o .
>> }
>>
>>
>>
>> The exception:
>> 
>> HttpException: -1 Unexpected error making the query:
>> java.net.SocketException: Connection reset
>>
>> at org.apache.jena.sparql.engine.http.HttpQuery.rewrap(HttpQuery.java:374)
>> at org.apache.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:337)
>> at org.apache.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:288)
>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstructWorker(QueryEngineHTTP.java:465)
>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execModel(QueryEngineHTTP.java:428)
>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:389)
>> at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:384)
>> at loader.configuration.SPARQLDataSource.executeGraphQuery(SPARQLDataSource.java:43)
>> at genPLDPD.Evaluation.evalRM(Evaluation.java:136)
>>
>


Jena/Fuseki graph sync

2017-11-24 Thread Laura Morales
Does Fuseki have any tool to "synchronize" a graph in the dataset with a .nt 
file? In other words, some tool that given a dataset/graph and a .nt file as 
input, will parse the triples in the .nt file and automatically add/delete 
triples in the dataset/graph such that at the end of the process the graph in 
the dataset has the exact same triples of the .nt file?

If no such tool exists, how could I achieve something like this with the 
existing tools?

Thank you.
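
For data without blank nodes, the add/delete behaviour described above can be approximated by a line-based diff of the old and new N-Triples files, with the differences turned into SPARQL Update operations. The following is a minimal Python sketch of that idea, not an existing Fuseki tool. It assumes both files are serialized consistently (one triple per line, same spacing and escaping) and small enough to hold in memory; the function names and the graph IRI are made up for illustration.

```python
def parse_ntriples(text):
    """Return the set of triple lines, ignoring blank lines and comments."""
    return {
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }


def sync_update(old_nt, new_nt, graph_iri):
    """Build a SPARQL Update that turns the old graph content into the new one.

    Assumption: no blank nodes, since they cannot be matched line by line.
    """
    old, new = parse_ntriples(old_nt), parse_ntriples(new_nt)
    removed = sorted(old - new)  # triples only in the old file
    added = sorted(new - old)    # triples only in the new file
    parts = []
    if removed:
        parts.append(
            "DELETE DATA { GRAPH <%s> {\n%s\n} }" % (graph_iri, "\n".join(removed))
        )
    if added:
        parts.append(
            "INSERT DATA { GRAPH <%s> {\n%s\n} }" % (graph_iri, "\n".join(added))
        )
    # An empty string means the two files already contain the same triples.
    return " ;\n".join(parts)


if __name__ == "__main__":
    old = "<http://ex.org/a> <http://ex.org/p> <http://ex.org/b> .\n"
    new = "<http://ex.org/a> <http://ex.org/p> <http://ex.org/c> .\n"
    print(sync_update(old, new, "http://example.org/graph"))
```

The resulting string could then be posted to the dataset's SPARQL Update endpoint. For very large files, an approach based on sorted files would be needed instead of in-memory sets.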


Re: Connection reset

2017-11-24 Thread Andy Seaborne

> HttpException: -1 Unexpected error making the query:
> java.net.SocketException: Connection reset

This is a problem at a low level in the networking stack (the fake status
code -1 from Jena also says it's not an HTTP error).  The other end
responded with a TCP RST (the connection reset bit), which is the response
to an attempt to use a connection the other end thinks is closed or does
not exist.


There are many reasons that can cause this - some kind of network 
environmental issue between client and server.


Having a reverse proxy (RP) in front of the Fuseki server is one
possible cause, e.g. when Fuseki isn't there but the reverse proxy is,
there can be a rejection at the TCP level. Or the RP has rebooted.


There are many reasons (StackOverflow has many questions about this).

Check the Fuseki server log - did the query even reach the server?
Resets usually happen at the start, e.g. after a long period of no use
when the RP has timed the connection out (Fuseki, standalone, does not
configure Jetty to do this).


If it did reach the server, then some intermediary may have forcefully
closed the connection.
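
If the reset does turn out to come from a stale connection that was timed out along the way, one possible client-side mitigation is to retry the request once. The sketch below shows the idea in Python rather than in the Java client from the stack trace; call_with_retry and its parameters are hypothetical names, and retrying is only appropriate for requests that are safe to repeat, such as queries.

```python
import time


def call_with_retry(action, attempts=2, delay_seconds=0.0):
    """Run action(); retry when the transport reports a connection reset.

    Only ConnectionResetError is retried. Other errors, including HTTP
    errors reported by the server, propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return action()
        except ConnectionResetError as err:
            last_error = err
            # Pause before the next try, but not after the final failure.
            if attempt + 1 < attempts:
                time.sleep(delay_seconds)
    raise last_error


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_query():
        # Simulates a first request hitting a stale connection.
        state["calls"] += 1
        if state["calls"] == 1:
            raise ConnectionResetError("Connection reset")
        return "query result"

    print(call_with_retry(flaky_query))
```

A retry hides the symptom only; the server and proxy logs are still the place to find out where the reset originates.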


Andy

On 23/11/17 22:38, Mohammad Noorani Bakerally wrote:

I am getting an exception when executing the following valid CONSTRUCT
query on Fuseki via Jena. Any idea what causes this problem?

The query:
==
PREFIX dcat: 
PREFIX data: 
PREFIX skos: 
PREFIX : 
CONSTRUCT { <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o .
} WHERE { <https://bistrotdepays.opendatasoft.com/id/theme/Sport%2C%20Loisirs> ?p ?o .
}



The exception:

HttpException: -1 Unexpected error making the query:
java.net.SocketException: Connection reset

at org.apache.jena.sparql.engine.http.HttpQuery.rewrap(HttpQuery.java:374)
at org.apache.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:337)
at org.apache.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:288)
at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstructWorker(QueryEngineHTTP.java:465)
at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execModel(QueryEngineHTTP.java:428)
at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:389)
at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execConstruct(QueryEngineHTTP.java:384)
at loader.configuration.SPARQLDataSource.executeGraphQuery(SPARQLDataSource.java:43)
at genPLDPD.Evaluation.evalRM(Evaluation.java:136)