"""
How can I get the Fuseki API via SPARQLWrapper to properly report a detailed error message e.g. with something like "error in line # cr:Event__102140gtm20003 cr:Event_location "M\\"unster, Germany". is not a valid triple?
"""

This is a Q about SPARQLWrapper, not Fuseki.

Look in the response body because, for Fuseki, it has the details of the error in plain text.

You can also print the query out in Python and parse it with Jena locally, or send it with curl, which prints the body.
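
A minimal sketch of the "look in the response body" suggestion, using only the Python standard library rather than SPARQLWrapper. The function name post_update and the endpoint URL in the usage comment are assumptions for illustration; the update is sent as a form-encoded "update" parameter, which is one of the ways the SPARQL 1.1 Protocol allows.

```python
import urllib.error
import urllib.parse
import urllib.request


def post_update(endpoint, update, timeout=30):
    """POST a SPARQL update and return (status, body_text).

    The body is returned for error responses too, because that is
    where the server puts its explanation of what went wrong.
    """
    data = urllib.parse.urlencode({"update": update}).encode("utf-8")
    request = urllib.request.Request(endpoint, data=data, method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # HTTPError is itself a response object, so its body can be read.
        return error.code, error.read().decode("utf-8", "replace")


# Usage (hypothetical endpoint and dataset name):
# status, text = post_update("http://localhost:3030/example/update", insert_data)
# if status >= 400:
#     print(text)
```

Returning the status together with the text keeps the caller free to log the server's message next to the batch that triggered it.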


    Andy

On 19/08/2020 13:18, Wolfgang Fahl wrote:
Dear Apache Jena Users,

You'll also find this mail at https://stackoverflow.com/questions/63486767/how-can-i-get-the-fuseki-api-via-sparqlwrapper-to-properly-report-a-detailed-err

In the last few weeks I tried out some graph databases in the Python environment, namely:

- weaviate see http://wiki.bitplan.com/index.php/Weaviate

- dgraph http://wiki.bitplan.com/index.php/Dgraph

- ruruki https://pypi.org/project/ruruki/

and created a test project documented at http://wiki.bitplan.com/index.php/DgraphAndWeaviateTest and open source at:
https://github.com/WolfgangFahl/DgraphAndWeaviateTest

After some ups and downs in the evaluation process I decided to try out Apache Jena / Fuseki / SPARQL as an alternative and added:

https://github.com/WolfgangFahl/DgraphAndWeaviateTest/blob/master/storage/sparql.py
and
https://github.com/WolfgangFahl/DgraphAndWeaviateTest/blob/master/tests/testSPARQL.py

to allow for a "round trip" operation between python list of dicts and Jena/SPARQL based storage.

The approach performs very well for my use case, and after trying it out for a while I am getting into more details that need to be addressed.

The stackoverflow question https://stackoverflow.com/questions/63435157/listofdict-to-rdf-conversion-in-python-targeting-apache-jena-fuseki/63440396#63440396 addresses the initial issues and https://github.com/WolfgangFahl/DgraphAndWeaviateTest/issues?q=is%3Aissue+is%3Aclosed issues 2-5 show some detail problems that were already fixed.

Now I am working with some 180000 records I'd like to import from 6 different data sources, and each data source seems to have new exotic records that make the approach fail.

E.g. one batch of records gives me the following log:

read 45601 events in   0.6 s
storing 45601 events to sparql
  batch for         1 -      2000 of     45601 cr:Event in    0.6 s ->    0.6 s
  batch for      2001 -      4000 of     45601 cr:Event in    0.5 s ->    1.1 s
  batch for      4001 -      6000 of     45601 cr:Event in    0.5 s ->    1.6 s
  batch for      6001 -      8000 of     45601 cr:Event in    0.5 s ->    2.1 s
  batch for      8001 -     10000 of     45601 cr:Event in    0.5 s ->    2.6 s
  batch for     10001 -     12000 of     45601 cr:Event in    0.7 s ->    3.2 s
======================================================================
ERROR: testCrossref (tests.test_Crossref.TestCrossref)
test loading crossref data
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/wf/Library/Python/3.8/lib/python/site-packages/SPARQLWrapper/Wrapper.py", line 1073, in _query
     response = urlopener(request)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
     return opener.open(url, data, timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 531, in open
     response = meth(req, response)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 640, in http_response
     response = self.parent.error(
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 569, in error
     return self._call_chain(*args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
     result = func(*args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 649, in http_error_default
     raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

SPARQLWrapper.SPARQLExceptions.QueryBadFormed: QueryBadFormed: a bad request has been sent to the endpoint, probably the sparql query is bad formed.

Response:
b'Error 400: Bad Request\n'

Since I don't get any details on what the problem is, I am resorting to a binary search. From the error above I only know that the problem is in a record with a batchIndex between 12000 and 14000, so I am setting the limit to 14000 and the batchSize to 100 to get closer.

 batch for     13301 -     13400 of     14000 cr:Event in    0.0 s ->    4.3 s

is now the last successful batch. So I am using a binary search: 13450 fail, 13425 fail, 13412 ok, 13418 ok, 13422 fail, 13420 ok, 13421 ok. So record 13422 is the culprit, and I switch on debug mode to see the INSERT DATA created for the record:

   cr:Event__102140gtm20003 cr:Event_name "Higher local fields".
   cr:Event__102140gtm20003 cr:Event_location "M\\"unster, Germany".
   cr:Event__102140gtm20003 cr:Event_source "crossref".
   cr:Event__102140gtm20003 cr:Event_eventId "10.2140/gtm.2000.3".
   cr:Event__102140gtm20003 cr:Event_title "Invitation to higher local fields".
   cr:Event__102140gtm20003 cr:Event_startDate "1999-08-29"^^<http://www.w3.org/2001/XMLSchema#date>.
   cr:Event__102140gtm20003 cr:Event_year 1999.
   cr:Event__102140gtm20003 cr:Event_month 9.
   cr:Event__102140gtm20003 cr:Event_endDate "1999-09-05"^^<http://www.w3.org/2001/XMLSchema#date>.
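
The manual narrowing described above (13450 fail, 13425 fail, 13412 ok, ...) could be automated. A minimal sketch, assuming a hypothetical callback store_batch(records) that returns True when the endpoint accepts the batch and False when it rejects it, and assuming that a batch fails only because it contains at least one bad record:

```python
def find_first_bad(records, store_batch):
    """Return the index of the first record that makes store_batch fail.

    store_batch is a hypothetical callback: it takes a list of records
    and returns True if the whole list was accepted. Returns None when
    the complete list is accepted.
    """
    if store_batch(records):
        return None
    # Invariant: records[low:high] contains the first bad record.
    low, high = 0, len(records)
    while high - low > 1:
        mid = (low + high) // 2
        if store_batch(records[low:mid]):
            low = mid   # left half is fine, so the culprit is on the right
        else:
            high = mid  # left half already fails, keep narrowing there
    return low
```

This needs roughly log2(n) extra requests for a failing batch of n records. Good records get stored more than once along the way, which should be harmless for INSERT DATA because an RDF graph is a set of triples.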

So the umlaut encoding \\"u in the location "Münster" is the culprit here. I will work around this issue. The real question is:

*How can I get the Fuseki API via SPARQLWrapper to properly report a detailed error message, e.g. with something like "error in line # cr:Event__102140gtm20003 cr:Event_location "M\\"unster, Germany". is not a valid triple"?*
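
On the workaround itself: in the triple shown above the backslash appears to be escaped while the double quote is not, so the string literal ends right after M\\. A minimal sketch of escaping a Python string before it is placed into INSERT DATA, using the escape sequences the SPARQL grammar defines for string literals. The helper name sparql_string_literal is an assumption for illustration and is not part of SPARQLWrapper:

```python
def sparql_string_literal(value):
    """Return value as a double-quoted SPARQL string literal.

    The backslash is escaped first so that the backslashes added by
    the later replacements are not escaped a second time.
    """
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    return '"' + escaped + '"'


print(sparql_string_literal(r'M\"unster, Germany'))
# prints "M\\\"unster, Germany"
```

With this, the location value would be emitted as one complete literal instead of ending at the embedded quote.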


Yours

    Wolfgang

--

BITPlan - smart solutions
Wolfgang Fahl
Pater-Delp-Str. 1, D-47877 Willich Schiefbahn
Tel. +49 2154 811-480, Fax +49 2154 811-481
Web:http://www.bitplan.de
BITPlan GmbH, Willich - HRB 6820 Krefeld, Steuer-Nr.: 10258040548, 
Geschäftsführer: Wolfgang Fahl
