As Andy said, this is not a parsing issue: neither riot nor rapper
<http://librdf.org/raptor/rapper.html> reports anything.
The issue is with how TDB renders the URI once it has been stored in
TDB.
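
For illustration only, here is a minimal Python sketch of the kind of failure the stack trace shows (a hex decoder indexing past the end of a string). This is not Jena's code, and I am only assuming the mechanism is similar; the names decode_u_escapes and naive_decode are hypothetical.

```python
import re

# Matches one complete \uXXXX escape (exactly four hex digits).
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def decode_u_escapes(text: str) -> str:
    """Replace every complete \\uXXXX escape with its character.

    Incomplete escapes are left untouched instead of raising.
    """
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def naive_decode(text: str) -> str:
    """Decoder without bounds checks (hypothetical, for illustration).

    It reads four hex digits one character at a time after each \\u,
    so a truncated escape makes it index past the end of the string.
    """
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and text[i + 1] == "u":
            # Character-by-character access, raises IndexError when short.
            digits = "".join(text[i + k] for k in range(2, 6))
            out.append(chr(int(digits, 16)))
            i += 6
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    escaped = "http://es.dbpedia.org/resource/Gu\\u00EDa_del_usuario"
    print(decode_u_escapes(escaped))
    try:
        naive_decode("http://es.dbpedia.org/resource/Gu\\u00E")
    except IndexError as exc:
        print("truncated escape:", exc)
```

With complete escapes both functions give the same URI; only a string cut in the middle of an escape makes the unchecked version fail.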

Jean-Marc Vanel
<http://semantic-forms.cc:9112/display?displayuri=http://jmvanel.free.fr/jmv.rdf%23me>
+33 (0)6 89 16 29 52
Twitter: @jmvanel , @jmvanel_fr ; chat: irc://irc.freenode.net#eulergui
 Chroniques jardin
<http://semantic-forms.cc:1952/history?uri=http%3A%2F%2Fdbpedia.org%2Fresource%2FChronicle>


On Sat, 25 Apr 2020 at 09:34, Lorenz Buehmann <[email protected]> wrote:

> Hi,
>
> I tried manually with the cURL and riot CLI tools and can't reproduce the
> parsing issue, with either RDF/XML or Turtle.
>
> curl -L -H "Accept: text/turtle" http://dbpedia.org/resource/User_guide > /tmp/test.ttl
> curl -L -H "Accept: application/rdf+xml" http://dbpedia.org/resource/User_guide > /tmp/test.rdf
>
>
> I know that a few years ago DBpedia (or rather its Virtuoso backend) had
> some issues with serialization, but this was fixed a long time ago.
>
> Also, I don't understand what you mean by "suspicious". The parser can
> easily convert the UTF-8 encoded URIs as expected:
>
> riot --check /tmp/test.ttl
>
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://nl.dbpedia.org/resource/Handleiding> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://cs.dbpedia.org/resource/Uživatelská_příručka> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://wikidata.dbpedia.org/resource/Q1057179> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://www.wikidata.org/entity/Q1057179> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://ko.dbpedia.org/resource/사용_설명서> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://es.dbpedia.org/resource/Guía_del_usuario> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://ja.dbpedia.org/resource/マニュアル> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://it.dbpedia.org/resource/Manuale> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://rdf.freebase.com/ns/m.04mqbf> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://fr.dbpedia.org/resource/Mode_d'emploi> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://yago-knowledge.org/resource/User_guide> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://de.dbpedia.org/resource/Gebrauchsanleitung> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://id.dbpedia.org/resource/Manual_pengguna> .
> <http://dbpedia.org/resource/User_guide>
> <http://www.w3.org/2002/07/owl#sameAs>
> <http://dbpedia.org/resource/User_guide> .
>
> On 24.04.20 22:33, Jean-Marc Vanel wrote:
> > On Fri, 24 Apr 2020 at 22:17, Andy Seaborne <[email protected]> wrote:
> >
> >> On 24/04/2020 15:17, Jean-Marc Vanel wrote:
> >>> How to reproduce with 3.14.0
> >>>
> >>> bin/tdbloader --loc TDB --graph=http://dbpedia.org/resource/User_guide \
> >>>    --verbose http://dbpedia.org/resource/User_guide
> >> Did the log say anything?
> >>
> > No, nothing special, not even with --debug.
> >
> >> As this is a remote URL, did it all arrive and parse without warnings?
> > No warning.
> >
> >> Was the database fresh or was there data in it to start with?
> > The database was fresh, of course.
> >
> >
> >>> echo "
> >>> CONSTRUCT {
> >>>   <http://dbpedia.org/resource/User_guide>
> >>>    ?P ?O . }
> >>> WHERE { GRAPH ?G {
> >>>   <http://dbpedia.org/resource/User_guide>
> >>>    ?P ?O . } }
> >>> LIMIT
> >>> # 30 # OK
> >>> 35 # KO !!!
> >>> " > /tmp/const.ql
> >>>
> >>> bin/tdbquery --debug --loc=TDB --query /tmp/const.ql
> >>>
> >>> And here is the *stack*:
> >>>
> >>> 16:14:23 ERROR BindingTDB           :: get1(?O)
> >>> java.lang.StringIndexOutOfBoundsException: String index out of range: 39
> >>> at java.lang.String.charAt(String.java:658)
> >>> at org.apache.jena.atlas.lib.StrUtils.decodeHex(StrUtils.java:212)
> >>> at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:121)
> >> If the load was clean, the database is intact and it is a decoding bug
> >> in Jena for a URI. The data has a lot of encoded \u terms, but it's a URI
> >> in the object position that is causing the problem.  (I don't see why
> >> these are encoded - it's not necessary).
> >>
> > Indeed, these URIs are suspect:
> >
> > <http://fr.dbpedia.org/resource/Mode_d\u0027emploi> ,
> > <http://es.dbpedia.org/resource/Gu\u00EDa_del_usuario> .
> >
> > <http://ja.dbpedia.org/resource/\u30DE\u30CB\u30E5\u30A2\u30EB> ,
> > <http://cs.dbpedia.org/resource/U\u017Eivatelsk\u00E1_p\u0159\u00EDru\u010Dka> ,
> > <http://ko.dbpedia.org/resource/\uC0AC\uC6A9_\uC124\uBA85\uC11C> .
> >
> >
> >>      Andy
> >>
> >> ...
> >>> at tdb.tdbquery.main(tdbquery.java:33)
> >>>
> >>> NOTE : no problem with apache-jena-3.10.0-SNAPSHOT !?
> >>>
> >>>
