On 28/12/17 04:43, Stefano Cossu wrote:
Hi Andy,
By doing a straight POST on Fuseki/TDB/Jetty, on a quad-core i7 laptop
with 12 GB RAM:
time curl -i --data-binary 'WITH <info:graph/__root__> DELETE {}
INSERT{<info:s#1> <info:p#1> <info:o#1> . } WHERE {}'
-H'Content-Type:application/sparql-update'
'http://localhost:3030/lakesuperior-dev/update'
HTTP/1.1 204 No Content
Date: Thu, 28 Dec 2017 04:34:56 GMT
Fuseki-Request-ID: 12
real 0m0.144s
user 0m0.010s
sys 0m0.002s
Repeated calls over a kept-open connection would be faster; when curl is
invoked like that, a new connection is created for each call, so you pay
the maximum HTTP overhead.
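To illustrate the difference, here is a minimal, self-contained sketch of reusing one persistent connection for several updates. It uses only the Python standard library, and a tiny stub server stands in for Fuseki so the example runs anywhere; in practice you would point the connection at your Fuseki host (e.g. localhost:3030 as in the curl example), and the endpoint path here is just the one from this thread.

```python
import http.client
import http.server
import threading

# Stand-in endpoint so the sketch is self-contained; a real client would
# talk to Fuseki instead (hypothetical stub, not part of Fuseki).
class _Stub(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"          # enable keep-alive
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)            # consume the update body
        self.send_response(204)            # Fuseki answers 204 No Content
        self.end_headers()
    def log_message(self, *args):          # keep the demo quiet
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), _Stub)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One persistent connection reused for every update; repeated one-shot
# curl invocations instead pay a fresh TCP handshake per request.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
statuses = []
for i in range(3):
    body = f"INSERT DATA {{ <info:s#{i}> <info:p#{i}> <info:o#{i}> }}"
    conn.request("POST", "/lakesuperior-dev/update", body,
                 {"Content-Type": "application/sparql-update"})
    resp = conn.getresponse()
    resp.read()                            # drain before reusing the socket
    statuses.append(resp.status)
conn.close()
server.shutdown()
print(statuses)
```

Timing the loop with and without recreating the connection each iteration would show the per-call setup cost that repeated curl invocations incur.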
That operation is the same as doing:
INSERT DATA {}
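Spelled out with the graph from the example above, the equivalent direct form would be something like:

```sparql
INSERT DATA {
  GRAPH <info:graph/__root__> {
    <info:s#1> <info:p#1> <info:o#1> .
  }
}
```

The empty WHERE {} matches exactly once with an empty binding and DELETE {} removes nothing, so the original update reduces to just instantiating the INSERT template into the WITH graph.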
Fuseki log:
[2017-12-27 22:34:56] Fuseki INFO [12] POST
http://localhost:3030/lakesuperior-dev/update
[2017-12-27 22:34:56] Fuseki INFO [12] POST /lakesuperior-dev ::
'update' :: [application/sparql-update] ?
[2017-12-27 22:34:56] Fuseki INFO [12] 204 No Content (125 ms)
There is indeed some HTTP overhead, but as you suggest it seems to be
mostly Fuseki doing work. The time for sending the same request ranges
from 120 to 189 ms. Would you consider this normal, and should I settle
for it?
When sending from Python?
Sorry - I have no experience using python with or without rdflib where
this type of performance matters.
This is important for me because so far I have bundled more complex
requests in one SPARQL update or query request to avoid the HTTP tax,
but if that were less severe than having Fuseki parse one complex query
I could rethink my application code.
A sequence of SPARQL Update requests can be sent in one request by using
";" between them.
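As a sketch of that, here is a small helper (hypothetical, not part of any library) that joins several update operations into one request body, so the client pays the HTTP round trip only once:

```python
def batch_updates(operations):
    """Join several SPARQL Update operations into one request body.

    The SPARQL 1.1 Update grammar allows multiple operations in a
    single request, separated by ';'.
    """
    return " ;\n".join(op.strip().rstrip(";").rstrip()
                       for op in operations)

body = batch_updates([
    "DELETE WHERE { GRAPH <info:graph/__root__> { <info:s#1> ?p ?o } }",
    "INSERT DATA { GRAPH <info:graph/__root__> "
    "{ <info:s#1> <info:p#1> <info:o#1> } }",
])
print(body)
```

The resulting body is POSTed as one application/sparql-update request, exactly like the single-operation example earlier in the thread.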
Thanks,
Stefano
On 12/26/2017 11:50 AM, Andy Seaborne wrote:
I suspect due to the HTTP overhead: profiling shows a large chunk of
time waiting for sockets.
If it is waiting, then either Fuseki is doing work (see the log file,
which has entries at the start and end of an operation), or the client
is waiting (maybe connection-management issues?).
Fuseki does keep the connection open (connection caching). If log
looks correct, how long is the client waiting?
Andy
On 26/12/17 03:25, Stefano Cossu wrote:
Dick,
I am interested in hearing the reasons behind your developers dropping
RDFLib. I find it very convenient for de/serializing RDF, but it feels
somewhat brittle and quite opaque in the back-end connection part. I
think your approach of using straight HTTP calls for that may be a
better choice.
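For query results, straight HTTP needs no RDF library at all: Fuseki can return the SPARQL 1.1 Query Results JSON Format (Accept: application/sparql-results+json), which the standard json module handles. A minimal sketch, using a hypothetical response body shaped like the triple from earlier in the thread:

```python
import json

# Hypothetical response body in the SPARQL 1.1 Query Results JSON
# Format, as Fuseki would return for a SELECT query.
raw = """{
  "head": { "vars": ["s", "p", "o"] },
  "results": { "bindings": [
    { "s": {"type": "uri", "value": "info:s#1"},
      "p": {"type": "uri", "value": "info:p#1"},
      "o": {"type": "uri", "value": "info:o#1"} }
  ] }
}"""

def rows(result_json):
    """Flatten SPARQL JSON results into plain variable -> value dicts."""
    doc = json.loads(result_json)
    for binding in doc["results"]["bindings"]:
        yield {var: binding[var]["value"] for var in binding}

for row in rows(raw):
    print(row["s"], row["p"], row["o"])
```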
Also, thanks for the tip on Thrift. I am not familiar with it but I
would be interested in knowing how your team is building Python
bindings for the Jena API if it is meant to become a public project
at some point.
Best,
Stefano
On 12/24/2017 04:33 PM, dandh988 wrote:
We use Python against Jena/Fuseki/CustomHTTP and find direct SPARQL
against the endpoint to be "fast". The Python devs dropped using
RDFLib.
We also have a Thrift connection in development, which is proving
useful for low-level Jena API access.
Dick
-------- Original message --------
From: Stefano Cossu <[email protected]>
Date: 24/12/2017 22:10 (GMT+00:00)
To: [email protected]
Subject: Python bindings?
Hello,
I am writing an LDP server using Python's RDFlib and Fuseki/TDB as a
back-end store.
Right now my application is very slow, I suspect due to the HTTP
overhead: profiling shows a large chunk of time waiting for sockets.
Is there a reliable way to write Python code against the Fuseki Java
API? I understand that Fuseki is written in Java and there are no
native
Python bindings. I have looked at options such as Jython, Jpype and
PyJnius but I am wondering how reliable these options are. Any
suggestions?
Thanks,
Stefano