Peter Ansell wrote:
> Are any of the http://dbpedia.org/resource/Category:ABCD... redirects
> displaying this behaviour? DBpedia doesn't seem to encode the :
> between "Category" and the category name, even if it percent encodes
> the category name.

Yes, this also concerns Category:XXX resources:

curl -v -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Category:City-states
< HTTP/1.1 303 See Other
< Location: http://dbpedia.org/data/Category%3ACity-states.xml

curl -v -H "Accept: application/rdf+xml" http://dbpedia.org/data/Category%3ACity-states.xml
(only some foaf:primaryTopic triples there)

curl -v -H "Accept: application/rdf+xml" http://dbpedia.org/data/Category:City-states.xml
(this returns the correct data)

curl -v http://dbpedia.org/resource/Category:City-states
< HTTP/1.1 303 See Other
< Location: http://dbpedia.org/page/Category:City-states

Note that the ":" does not get percent-encoded for this redirect (HTML).

The same also applies to parentheses (as Gunnar pointed out):

curl -v -H "Accept: application/rdf+xml" 'http://dbpedia.org/resource/The_Good_Shepherd_(film)'
< HTTP/1.1 303 See Other
< Location: http://dbpedia.org/data/The_Good_Shepherd_%28film%29.xml

...so here the () got URL-encoded for some reason...

curl -v 'http://dbpedia.org/resource/The_Good_Shepherd_(film)'
< HTTP/1.1 303 See Other
< Location: http://dbpedia.org/page/The_Good_Shepherd_(film)

...while that's not done for normal HTML requests.

Additionally, as Gunnar said, there seem to be related (but separate)
bugs in the DBpedia extraction framework causing triples for one
resource to get scattered over multiple resources with different URI
encodings.

Regards
Malte

>
> On 6 May 2010 01:05, Malte Kiesel <[email protected]> wrote:
>> Hi!
>>
>> Apparently there's something odd with the 303 redirects for resources
>> with ":" in their title. Basically, that seems to work from, for example,
>> curl, but it fails from Java. I'm not sure what component is buggy there.
>>
>> Example:
>>
>> $ curl -v -H "Accept: application/rdf+xml" http://dbpedia.org/resource/X-Men:_Evolution
>> ...
>> < HTTP/1.1 303 See Other
>> < Content-Location: /data/X-Men%3A_Evolution.xml
>>
>> $ curl -H "Accept: application/rdf+xml" http://dbpedia.org/data/X-Men:_Evolution
>> ...is fine.
>>
>> $ curl -H "Accept: application/rdf+xml" http://dbpedia.org/data/X-Men%3A_Evolution
>> ...isn't - that strangely returns some foaf triples though (seems these
>> are returned for whatever data/ URI you request).
>>
>> Java seems to get redirected to the latter (broken) URI:
>>
>> String url = "http://dbpedia.org/resource/X-Men:_Evolution";
>> URL urlU = new URL(url);
>> HttpURLConnection uc = (HttpURLConnection) urlU.openConnection();
>> uc.setInstanceFollowRedirects(true);
>> uc.setRequestProperty("Accept", "application/rdf+xml");
>> uc.connect();
>> InputStream is = uc.getInputStream();
>> int read;
>> while ((read = is.read()) != -1) { System.out.write(read); }
>> ...outputs the triples the last (broken) curl command also fetches.
>>
>> Bug in Java? Bug in Virtuoso?
>>
>> I found a related discussion at [1] but that didn't cover the ":" case.
>>
>> Regards
>> Malte
>>
>> [1] http://www.mail-archive.com/[email protected]/msg00776.html
>>
>> --
>> Malte Kiesel, DFKI GmbH

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
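
Until the redirects are fixed on the server side, a client could follow the 303 by hand and undo the percent-encoding of ":" and "()" in the redirect target before requesting it. The following is only a minimal sketch of that idea, not a tested fix: the class and method names (RedirectLocationFix, decodeDelimiters, fetchRdf) are made up for illustration, and it assumes that the unencoded /data/ URL keeps returning the correct triples, as it did in the curl runs above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not part of DBpedia, Virtuoso or the JDK.
final class RedirectLocationFix {

    /** Turns %3A, %28 and %29 back into ":", "(" and ")"; other escapes are left alone. */
    static String decodeDelimiters(String location) {
        return location
                .replaceAll("(?i)%3A", ":") // hex digits may be upper or lower case
                .replace("%28", "(")
                .replace("%29", ")");
    }

    /** Follows a single 303 manually, rewriting the redirect target before requesting it. */
    static InputStream fetchRdf(String resourceUrl) throws IOException {
        URI resource = URI.create(resourceUrl);
        HttpURLConnection first = (HttpURLConnection) resource.toURL().openConnection();
        first.setInstanceFollowRedirects(false); // we want to inspect the Location header
        first.setRequestProperty("Accept", "application/rdf+xml");
        if (first.getResponseCode() != HttpURLConnection.HTTP_SEE_OTHER) {
            return first.getInputStream(); // no redirect, use the response as it is
        }
        String location = first.getHeaderField("Location");
        first.disconnect();
        if (location == null) {
            throw new IOException("303 response without a Location header");
        }
        // Resolve first (the header may be relative), then undo the encoding.
        URI target = URI.create(decodeDelimiters(resource.resolve(location).toString()));
        HttpURLConnection second = (HttpURLConnection) target.toURL().openConnection();
        second.setRequestProperty("Accept", "application/rdf+xml");
        return second.getInputStream();
    }

    public static void main(String[] args) {
        // Offline demonstration on the two redirect targets quoted in this thread.
        System.out.println(decodeDelimiters("http://dbpedia.org/data/Category%3ACity-states.xml"));
        System.out.println(decodeDelimiters("http://dbpedia.org/data/The_Good_Shepherd_%28film%29.xml"));
    }
}
```

The rewrite is deliberately limited to the three characters discussed here, so any other escapes in the redirect target pass through unchanged. Whether the encoded or the unencoded form is the "right" one is still the open question of this thread; the sketch merely picks the form that returned data.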
