Andy, thanks for the response.

I tried:

curl -X POST -H "Content-Type:application/sparql-update" -d @error.data localhost:3030/cr/update

Error 400: Bad Request

and

curl -X POST -H "Content-Type:application/sparql-update" -d @insert.data localhost:3030/cr/update

cat insert.data 
PREFIX cr: <http://cr.bitplan.com/>
        INSERT DATA { 
          cr:version cr:author "Wolfgang Fahl". 
        }

cat error.data

PREFIX cr: <http://cr.bitplan.com/Event/0.1/>

INSERT DATA {

  
  cr:Event__BioinformaticsofGenomeRegulationandStructureSystemsBiologyBGRSSB2018 cr:Event_title "“Bioinformatics of Genome Regulation and Structure\Systems Biology” – BGRS\SB-2018".

  
  cr:Event__BioinformaticsofGenomeRegulationandStructureSystemsBiologyBGRSSB2018 cr:Event_url "https://thenode.biologists.com/event/11th-international-multiconference-bioinformatics-genome-regulation-structuresystems-biology-bgrssb-2018/";.

}

and followed the hint of Stanislav Kralin at
https://stackoverflow.com/questions/63486767/how-can-i-get-the-fuseki-api-via-sparqlwrapper-to-properly-report-a-detailed-err
to add a new test

    def testSPARQLErrorMessage(self):
        '''
        test error handling
        see https://stackoverflow.com/questions/63486767/how-can-i-get-the-fuseki-api-via-sparqlwrapper-to-properly-report-a-detailed-err
        '''
        listOfDicts=[{
            'title': '“Bioinformatics of Genome Regulation and Structure\Systems Biology” – BGRS\SB-2018',
            'url': 'https://thenode.biologists.com/event/11th-international-multiconference-bioinformatics-genome-regulation-structuresystems-biology-bgrssb-2018/'}]
        entityType="cr:Event"
        primaryKey='title'
        prefixes="PREFIX cr: <http://cr.bitplan.com/Event/0.1/>"
        jena=self.getJena(mode='update',typedLiterals=False,debug=True)
        errors=jena.insertListOfDicts(listOfDicts,entityType,primaryKey,prefixes)
        self.checkErrors(errors,1)
        error=errors[0]
        self.assertTrue("probably the sparql query is bad formed" in error)

which gives:

Response:

b'Error 400: Bad Request\n'

ERRORS:

QueryBadFormed: a bad request has been sent to the endpoint, probably the 
sparql query is bad formed. 

Response:

b'Error 400: Bad Request\n' for record 0

The response body of the HTTP 400 error doesn't seem to contain any more
data, and I would not know how to get extra information via the curl request.
The question is IMHO still unsolved, and I am not sure whether
SPARQLWrapper could do better, or how.
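
For reference, here is a minimal sketch of how the body of the error response could be read from Python without SPARQLWrapper, using only urllib from the standard library. The function name post_sparql_update is made up for illustration, and the endpoint is assumed to be the one from the curl calls above. Whether the body contains more than "Error 400: Bad Request" depends on what the server sends.

```python
import urllib.error
import urllib.request


def post_sparql_update(endpoint, update, timeout=10.0):
    """POST a SPARQL update and return (status, body_text).

    Hypothetical helper, not part of SPARQLWrapper or Fuseki.
    HTTP errors are not raised; their status and body are returned instead.
    """
    request = urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update; charset=utf-8"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # HTTPError is file-like: read() returns whatever body the server sent.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling post_sparql_update("http://localhost:3030/cr/update", update_text) and printing both return values shows exactly what the endpoint answered for a rejected update.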
Cheers

  Wolfgang

On 19.08.20 at 16:15, Andy Seaborne wrote:
> """
> How can i get the Fuseki API via SPARQLWrapper to properly report a
> detailed error message e.g. with something like "error in line #
> cr:Event__102140gtm20003 cr:Event_location "M\\"unster, Germany". is
> not a valid triple?
> """
>
> This is a Q about SPARQLWrapper, not Fuseki.
>
> Look in the response body because, for Fuseki, it has the details of
> the error in plain text.
>
> You can also print the query out in Python and parse it with Jena
> locally. Or send it with curl which prints the body.
>
>
>     Andy
>
> On 19/08/2020 13:18, Wolfgang Fahl wrote:
>> Dear Apache Jena Users,
>>
>> you'll find this mail also as
>> https://stackoverflow.com/questions/63486767/how-can-i-get-the-fuseki-api-via-sparqlwrapper-to-properly-report-a-detailed-err
>>
>> In the last few weeks I tried out some graph databases in the Python
>> environment, namely:
>>
>> - weaviate see http://wiki.bitplan.com/index.php/Weaviate
>>
>> - dgraph http://wiki.bitplan.com/index.php/Dgraph
>>
>> - ruruki https://pypi.org/project/ruruki/
>>
>> and created a test project documented at
>> http://wiki.bitplan.com/index.php/DgraphAndWeaviateTest and open
>> source at:
>> https://github.com/WolfgangFahl/DgraphAndWeaviateTest
>>
>> After some ups and downs in the evaluation process I decided to try
>> out Apache Jena / Fuseki / SPARQL as an alternative and added:
>>
>> https://github.com/WolfgangFahl/DgraphAndWeaviateTest/blob/master/storage/sparql.py
>>
>> and
>> https://github.com/WolfgangFahl/DgraphAndWeaviateTest/blob/master/tests/testSPARQL.py
>>
>>
>> to allow for a "round trip" operation between a Python list of dicts
>> and Jena/SPARQL based storage.
>>
>> The approach performs very well for my use case, and after trying it
>> out for a while I am running into more details that need to be addressed.
>>
>> The stackoverflow question
>> https://stackoverflow.com/questions/63435157/listofdict-to-rdf-conversion-in-python-targeting-apache-jena-fuseki/63440396#63440396
>> addresses the initial issues and
>> https://github.com/WolfgangFahl/DgraphAndWeaviateTest/issues?q=is%3Aissue+is%3Aclosed
>> issues 2-5 show some detail problems that were already fixed.
>>
>> Now I am working with some 180000 records I'd like to import from 6
>> different data sources, and each data source seems to have new exotic
>> records that make the approach fail.
>>
>> E.g. one batch of records gives me the following log:
>>
>> read 45601 events in   0.6 s
>> storing 45601 events to sparql
>>    batch for         1 -      2000 of     45601 cr:Event in    0.6 s ->    0.6 s
>>    batch for      2001 -      4000 of     45601 cr:Event in    0.5 s ->    1.1 s
>>    batch for      4001 -      6000 of     45601 cr:Event in    0.5 s ->    1.6 s
>>    batch for      6001 -      8000 of     45601 cr:Event in    0.5 s ->    2.1 s
>>    batch for      8001 -     10000 of     45601 cr:Event in    0.5 s ->    2.6 s
>>    batch for     10001 -     12000 of     45601 cr:Event in    0.7 s ->    3.2 s
>> ======================================================================
>> ERROR: testCrossref (tests.test_Crossref.TestCrossref)
>> test loading crossref data
>> ----------------------------------------------------------------------
>> Traceback (most recent call last):
>>    File "/Users/wf/Library/Python/3.8/lib/python/site-packages/SPARQLWrapper/Wrapper.py", line 1073, in _query
>>      response = urlopener(request)
>>    File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
>>      return opener.open(url, data, timeout)
>>    File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 531, in open
>>      response = meth(req, response)
>>    File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 640, in http_response
>>      response = self.parent.error(
>>    File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 569, in error
>>      return self._call_chain(*args)
>>    File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
>>      result = func(*args)
>>    File "/opt/local/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 649, in http_error_default
>>      raise HTTPError(req.full_url, code, msg, hdrs, fp)
>> urllib.error.HTTPError: HTTP Error 400: Bad Request
>>
>> SPARQLWrapper.SPARQLExceptions.QueryBadFormed: QueryBadFormed: a bad
>> request has been sent to the endpoint, probably the sparql query is
>> bad formed.
>>
>> Response:
>> b'Error 400: Bad Request\n'
>>
>> Now since I don't get any details on what the problem is, I am working
>> with a binary search. With the error above I only know the problem
>> is with a record with a batchIndex between 12000 and 14000, so I am
>> setting the limit to 14000 and batchSize to 100 to get closer.
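
The manual search described above could be automated roughly like this. It is only a sketch: find_bad_record and batch_ok are made-up names, batch_ok stands for any callable that sends one INSERT DATA request for the given records and returns True when the endpoint accepts it, and the code assumes a batch fails exactly when it contains at least one bad record. Re-sending records that were already accepted is harmless here, because inserting a triple that is already in the graph changes nothing.

```python
def find_bad_record(records, batch_ok):
    """Return the index of the first record that makes a batch fail, or None.

    Hypothetical helper: batch_ok(list_of_records) -> bool is supplied by the
    caller and is assumed to fail only because of individual bad records.
    """
    if batch_ok(records):
        return None
    # Invariant: records[lo:hi] contains a bad record, records[:lo] does not.
    lo, hi = 0, len(records)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if batch_ok(records[lo:mid]):
            lo = mid  # first half is accepted, so the culprit is in the second
        else:
            hi = mid  # first half is rejected, keep narrowing it down
    return lo
```

For a batch of 2000 records this needs about a dozen requests instead of one request per record.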
>>
>>   batch for     13301 -     13400 of     14000 cr:Event in    0.0 s ->    4.3 s
>>
>> is now the last successful batch. So I am using a binary search:
>> 13450 fail, 13425 fail, 13412 ok, 13418 ok, 13422 fail, 13420 ok,
>> 13421 ok.
>> So record 13422 is the culprit, and I switch on debug mode to see the
>> INSERT DATA created for the record:
>>
>>    cr:Event__102140gtm20003 cr:Event_name "Higher local fields".
>>    cr:Event__102140gtm20003 cr:Event_location "M\\"unster, Germany".
>>    cr:Event__102140gtm20003 cr:Event_source "crossref".
>>    cr:Event__102140gtm20003 cr:Event_eventId "10.2140/gtm.2000.3".
>>    cr:Event__102140gtm20003 cr:Event_title "Invitation to higher local fields".
>>    cr:Event__102140gtm20003 cr:Event_startDate "1999-08-29"^^<http://www.w3.org/2001/XMLSchema#date>.
>>    cr:Event__102140gtm20003 cr:Event_year 1999.
>>    cr:Event__102140gtm20003 cr:Event_month 9.
>>    cr:Event__102140gtm20003 cr:Event_endDate "1999-09-05"^^<http://www.w3.org/2001/XMLSchema#date>.
>>
>> So the Umlaut-encoding "\\u" in the location "Münster" is the culprit
>> here. I will work around this issue. The real question is:
>>
>> How can I get the Fuseki API via SPARQLWrapper to properly report a
>> detailed error message, e.g. with something like "error in line #
>> cr:Event__102140gtm20003 cr:Event_location "M\\"unster, Germany". is
>> not a valid triple"?
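
A minimal sketch of what a workaround for such values could look like, assuming the raw value is the string M\"unster, Germany and that the generated literal is written in double quotes. The helper name sparql_string_literal is hypothetical and not part of SPARQLWrapper or Jena. The escape sequences used are among those the SPARQL 1.1 grammar allows inside string literals (\t \b \n \r \f \\ \" \'); any other backslash sequence makes the update invalid.

```python
# Characters that must not appear raw inside a double-quoted SPARQL literal,
# plus the tab, which is escaped here only for readability.
_SPARQL_ESCAPES = {
    "\\": "\\\\",
    '"': '\\"',
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
}


def sparql_string_literal(text):
    """Return text as a double-quoted SPARQL string literal (hypothetical helper)."""
    escaped = "".join(_SPARQL_ESCAPES.get(ch, ch) for ch in text)
    return '"' + escaped + '"'
```

With this, both the backslash and the embedded double quote of the location value are escaped, so the literal no longer ends early.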
>>
>>
>> Yours
>>
>>     Wolfgang
>>
>> -- 
>>
>> BITPlan - smart solutions
>> Wolfgang Fahl
>> Pater-Delp-Str. 1, D-47877 Willich Schiefbahn
>> Tel. +49 2154 811-480, Fax +49 2154 811-481
>> Web:http://www.bitplan.de
>> BITPlan GmbH, Willich - HRB 6820 Krefeld, Steuer-Nr.: 10258040548,
>> Geschäftsführer: Wolfgang Fahl
>>
>
-- 

BITPlan - smart solutions
Wolfgang Fahl
Pater-Delp-Str. 1, D-47877 Willich Schiefbahn
Tel. +49 2154 811-480, Fax +49 2154 811-481
Web: http://www.bitplan.de
BITPlan GmbH, Willich - HRB 6820 Krefeld, Steuer-Nr.: 10258040548, 
Geschäftsführer: Wolfgang Fahl 
