> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:dbpedia-
> [EMAIL PROTECTED] On Behalf Of Georgi Kobilarov
> Sent: 20 August 2008 00:17
> To: [EMAIL PROTECTED]; [email protected]
> Subject: Re: [Dbpedia-discussion] Ampersand in dbpedia returned URI
> breakingJena code
>
> Marvin,
>
> yes, it's a bug in our dataset. In particular in the Yago dataset, which
> has been contributed externally and wasn't created with the DBpedia
> framework (but hey, we've got many similar bugs in datasets created by
> our framework ;))
>
> Yago URIs have not been url-encoded. So as a workaround, you can
> url_encode all URIs starting with http://dbpedia.org/class/yago/ in the
> yago_en.nt file before loading it into your Jena model. That should do
> it.
>
> And we'll fix that bug for the future.
>
> Best,
> Georgi
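For anyone wanting to try that workaround before the dataset is fixed, here is a minimal sketch, not a tested tool. It assumes "url_encode" means percent-encoding only the part of the URI after http://dbpedia.org/class/yago/, that the local names are not already percent-encoded (otherwise % would be encoded twice), and that Java 10 or later is available for URLEncoder.encode(String, Charset). The class name YagoUriFixer and the example.org subject are made up for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YagoUriFixer {
    // Prefix named in the message above; the rest of the URI is treated as the local name.
    private static final String YAGO_PREFIX = "http://dbpedia.org/class/yago/";

    // Matches <PREFIX + local name> as it appears in an N-Triples line.
    private static final Pattern YAGO_URI =
            Pattern.compile("<" + Pattern.quote(YAGO_PREFIX) + "([^>]*)>");

    /** Percent-encodes the local name of every Yago class URI found in one N-Triples line. */
    public static String encodeYagoUris(String line) {
        Matcher m = YAGO_URI.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption: form-style encoding is acceptable here (& becomes %26).
            String encoded = URLEncoder.encode(m.group(1), StandardCharsets.UTF_8);
            m.appendReplacement(out,
                    Matcher.quoteReplacement("<" + YAGO_PREFIX + encoded + ">"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical triple; only the object URI comes from this thread.
        String line = "<http://example.org/s> "
                + "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
                + "<http://dbpedia.org/class/yago/Bill&MelindaGatesFoundationPeople> .";
        System.out.println(encodeYagoUris(line));
    }
}
```

Running each line of yago_en.nt through encodeYagoUris and writing the result to a new file would give a copy to load instead of the original. Lines without a Yago URI pass through unchanged.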
Fixing the dataset will be a great help. This is the second report I have
received recently, but both are actually about the XML, not the RDF.
There are two things here: the use of & in the URI (from the data, as you say),
and the DBpedia endpoint emitting illegal XML. It is the second that
causes the exception.
For the query :
select distinct ?Concept where {[] a ?Concept}
The OP got:
<result>
<binding
name="Concept"><uri>http://dbpedia.org/class/yago/Bill&MelindaGatesFoundationPeople</uri></binding>
</result>
which uses a bare & in XML where it should be &amp;, so the XML is bad at the
entity level. That is what breaks the StAX parser in the stack trace, not the
bad URI. It didn't get as far as knowing it was a URI!
I'd guess that a legal use of & in a URL will also cause problems.
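To see the entity-level problem in isolation, here is a small sketch using the StAX API that ships with the JDK (not Jena's own reader, and not necessarily the Woodstox implementation in the stack trace). The class name AmpersandCheck and the example.org URL are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class AmpersandCheck {
    /** Returns true if a StAX parser reads the whole document without error. */
    public static boolean isWellFormed(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException | RuntimeException e) {
            // Some StAX implementations may report the error lazily as an
            // unchecked exception, so both kinds are treated as "not well-formed".
            return false;
        }
    }

    public static void main(String[] args) {
        // Bare & as returned by the endpoint in this thread.
        String raw = "<uri>http://dbpedia.org/class/yago/Bill&MelindaGatesFoundationPeople</uri>";
        // Same URI with the & escaped as XML requires.
        String escaped = "<uri>http://dbpedia.org/class/yago/Bill&amp;MelindaGatesFoundationPeople</uri>";
        // Hypothetical URL where & is a perfectly legal query separator.
        String query = "<uri>http://example.org/page?a=1&b=2</uri>";

        System.out.println("bare & in path:  " + isWellFormed(raw));
        System.out.println("escaped &amp;:   " + isWellFormed(escaped));
        System.out.println("bare & in query: " + isWellFormed(query));
    }
}
```

Only the escaped form should parse. The third case is the "legal use of & in a URL" mentioned above: the URL itself is fine, but written unescaped into XML it fails the same way.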
The SPARQL endpoint needs fixing as well. I thought this had been fixed - is it
just a case of upgrading, or is it still broken?
Andy
>
> --
> Georgi Kobilarov
> Freie Universität Berlin
> www.georgikobilarov.com
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> [mailto:dbpedia-
> > [EMAIL PROTECTED] On Behalf Of Marvin Lugair
> > Sent: Wednesday, August 20, 2008 12:57 AM
> > To: [email protected]
> > Subject: [Dbpedia-discussion] Ampersand in dbpedia returned URI
> > breakingJena code
> >
> >
> > Hi,
> >
> > The following sparql query:
> > select distinct ?Concept where {[] a ?Concept}
> >
> > Is the default query at the dbpedia endpoint http://dbpedia.org/sparql
> > It returns several URI's including the following one (notice the and
> > sign):
> >
> > http://dbpedia.org/class/yago/Bill&MelindaGatesFoundationPeople
> >
> > So DBPedia is returning URI's containing an ampersand. This is causing
> > an exception in the Jena parser.
> >
> > How do I fix this? None of Jena's methods will work; I can't transform
> > the resultset into a model or even print it with the resultformatter.
> > If I iterate over it, I can print the results one by one till I get to
> > the malformed URI. How do I check in my code for malformed URIs?
> >
> >
> > Any ideas?
> > Thanks!
> > Marv
> > -------------
> >
> > The code below works till i get a URI with an ampersand.
> > The exception is coming from results.nextSolution(). Other Jena
> > methods to convert the retrieved resultset to a model directly or
> > format it produce the same exception (I assume they have a similar
> > iterator inside)
> >
> >
> > QueryExecution qexec =
> > QueryExecutionFactory.sparqlService("http://DBpedia.org/sparql",
> > "select distinct ?Concept where {[] a ?Concept}");
> >
> > try {
> > ResultSet results = qexec.execSelect();
> > for ( ; results.hasNext() ; )
> > {
> > QuerySolution soln = results.nextSolution() ;
> > String x = soln.get("Concept").toString();
> > System.out.print(x +"\n");
> > }
> > }
> >
> > finally {
> > System.out.println("closing!");
> > qexec.close() ;
> > }
> >
> >
> > This will result in the following error:
> >
> >
> > [com.ctc.wstx.exc.WstxLazyException]
> > com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '<'
> > (code 60); expected a semi-colon after the reference for entity
> > 'MelindaGatesFoundationPeople'
> > at [row,col {unknown-source}]: [2609,96]
> > at com.ctc.wstx.exc.WstxLazyException.throwLazily(WstxLazyException.java:45)
> > at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:671)
> > at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3505)
> > at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:804)
> > at com.ctc.wstx.sr.BasicStreamReader.getElementText(BasicStreamReader.java:674)
> > at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.getOneSolution(XMLInputStAX.java:472)
> > at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.hasNext(XMLInputStAX.java:213)
> >
> >
> >
> > I also posted this on the Jena group but some seem to suggest it is a
> > dbpedia issue: http://tech.groups.yahoo.com/group/jena-dev/message/36210
> >
> >
> >
> >
> >
> >
> -----------------------------------------------------------------------
> > --
> > This SF.Net email is sponsored by the Moblin Your Move Developer's
> > challenge
> > Build the coolest Linux based applications with Moblin SDK & win great
> > prizes
> > Grand prize is a trip for two to an Open Source event anywhere in the
> > world
> > http://moblin-contest.org/redirect.php?banner_id=100&url=/
> > _______________________________________________
> > Dbpedia-discussion mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>