Hi, 

cc'ing Sören as he authored the paper (sorry for the spam).

<snip/> 

On 30 Sep 2010, at 08:52, Sebastian Hellmann wrote:

> Hi,
> here is more to read 
> http://www.informatik.uni-leipzig.de/~auer/publication/I18n.pdf
> I think there might be an answer to your question on page 15....

I have had a read of this paper, and it has left me slightly confused; I am 
hoping someone can help me figure this out. The Introduction (paragraph 4) 
mentions that the "currently prevalent resource identification mechanism on 
the Semantic Web i.e. URIs, are not sufficient ...". But I am not sure exactly 
what is meant by the term URI.

In terms of RDF/XML[1] and SPARQL[2] (sorry, I am leaving out RDFa, for I am 
not that clued up on it), the terms URI Reference[3] and IRI[4] are used 
within the context of URI encoding. Is the paper talking about "URI Refs"[3] or 
something else?

From what I gathered when I was looking at this before (see the thread on the 
swig mailing list [5]), they are both Unicode, but with a small and subtle 
difference. Quoting the SPARQL specification: 

QUOTE 1: 

"The set of RDF terms defined in RDF Concepts and Abstract Syntax includes RDF 
URI references while SPARQL terms include IRIs. RDF URI references containing 
"<", ">", '"' (double quote), space, "{", "}", "|", "\", "^", and "`" are not 
IRIs. The behavior of a SPARQL query against RDF statements composed of such 
RDF URI references is not defined." [4]
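
To make QUOTE 1 concrete, here is a minimal Python sketch (my own, not from 
either spec) that reports which of the listed characters appear in a given 
RDF URI reference. The names NON_IRI_CHARS and non_iri_chars are made up for 
illustration, and the check covers only the characters named in QUOTE 1, not 
full IRI syntax validation.

```python
# The ten characters QUOTE 1 lists as turning an RDF URI reference
# into something that is not an IRI.
NON_IRI_CHARS = set('<>" {}|\\^`')


def non_iri_chars(uri_ref):
    """Return the characters of uri_ref that QUOTE 1 excludes from IRIs.

    This only looks for the characters named in the quote; it does not
    validate the string against the full IRI grammar.
    """
    return [ch for ch in uri_ref if ch in NON_IRI_CHARS]


if __name__ == "__main__":
    for candidate in ("http://foo.com/#년", "http://foo.com/a b|c"):
        print(candidate, non_iri_chars(candidate))
```

An empty list means none of the characters from QUOTE 1 were found, so a 
non-ASCII character such as 년 is not a problem on this particular check.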

But I guess there is more to it than this, quoting [3] : 

QUOTE 2: 

"The encoding consists of:
1. encoding the Unicode string as UTF-8 [RFC-2279], giving a sequence of octet 
   values.
2. %-escaping octets that do not correspond to permitted US-ASCII characters.
The disallowed octets that must be %-escaped include all those that do not 
correspond to US-ASCII characters, and the excluded characters listed in 
Section 2.4 of [URI], except for the number sign (#), percent sign (%), and the 
square bracket characters re-allowed in [RFC-2732]." 
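
The two steps in QUOTE 2 can be sketched in Python as below. This is my own 
reading of the quote, not a reference implementation: the function name 
encode_uri_ref is hypothetical, and the set of excluded ASCII characters is my 
interpretation of Section 2.4 of [URI] (control characters, space, and the 
"delims" and "unwise" characters) minus the exceptions the quote names.

```python
# ASCII characters assumed to be "excluded" per Section 2.4 of [URI],
# after removing '#', '%', '[' and ']' as QUOTE 2 says.
EXCLUDED_ASCII = set('<>" {}|\\^`')


def encode_uri_ref(uri_ref):
    """Apply the encoding described in QUOTE 2 to a Unicode string."""
    out = []
    # Step 1: encode the Unicode string as UTF-8, giving octets.
    for octet in uri_ref.encode("utf-8"):
        ch = chr(octet)
        is_non_ascii = octet >= 0x80
        is_control = octet <= 0x1F or octet == 0x7F
        # Step 2: %-escape octets that are not permitted US-ASCII.
        if is_non_ascii or is_control or ch in EXCLUDED_ASCII:
            out.append("%{:02X}".format(octet))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(encode_uri_ref("http://foo.com/#년"))
```

Note that '#' and '%' pass through untouched, as the quote requires, so an 
already-escaped string is not escaped a second time.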

Mmm, how annoying, as (and I think this is what the paper is pointing out) 
there is a bigger discrepancy between URI Refs (which I think is what the 
paper is talking about) and IRIs. For example: 

Given that SPARQL, and eventually SPARQL Update, support IRIs, I will be able 
to write a SPARQL Update request such as: 

    INSERT DATA { graph <http://foo.com> {
        <http://foo.com/#년> <http://example.com/hasA> <http://example.com/bar>
    } }

which has valid IRIs, but not valid URI Refs? This is in turn a valid SPARQL 
Update (pending syntax) request, but if I attempt a SPARQL CONSTRUCT query 
after the data has been imported into a triplestore, will it result in invalid 
RDF/XML? Mmm ...
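
As a small illustration of why this matters, the Python sketch below (my own, 
using only the standard library's urllib.parse) shows that the IRI form of the 
subject in the example and its %-escaped form are two different strings that 
convert into each other. Whether a given triplestore treats them as the same 
resource is an assumption I have not checked; a plain string comparison does 
not.

```python
from urllib.parse import quote, unquote

# IRI form of the subject used in the example above.
iri = "http://foo.com/#년"

# %-escape the non-ASCII character; ':', '/' and '#' are kept as-is.
escaped = quote(iri, safe=":/#")

# Going back from the %XX form to the Unicode form.
round_tripped = unquote(escaped)

if __name__ == "__main__":
    print(escaped)
    print(round_tripped == iri)   # the conversion is reversible here
    print(escaped == iri)         # but the two forms differ as strings
```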

So, this is my take on things; please do let me know if I am wrong (it would 
help me get my head around this), as I am trying to figure out the situations 
where things can be written in RDF (specifically RDF/XML here; sorry, it has a 
spec [I do dislike the serialisation, but it is kind of official]) which can't 
be queried in SPARQL, or vice versa. 

Regards, 

Mischa 

[1] http://www.w3.org/TR/REC-rdf-syntax/

[2] http://www.w3.org/TR/rdf-sparql-query/

[3] http://www.w3.org/TR/rdf-concepts/#dfn-URI-reference

[4] http://www.w3.org/TR/rdf-sparql-query/#QSynIRI

[5] http://lists.w3.org/Archives/Public/semantic-web/2010Jul/0430.html

> The Korean DBpedia is also available here:
> http://ko.dbpedia.org/
> Unfortunately, the endpoint is down.
> Would you like to host the Greek DBpedia? http://gr.dbpedia.org? 
> Did you extract all articles or just those that have an English language 
> link?
> Cheers,
> Sebastian
> 
> 
> 
> On 29.09.2010 18:18, Dimitris Kontokostas wrote:
>> 
>> Hi,
>> 
>> I ran the DBpedia dump for the Greek Wikipedia a couple of months ago and 
>> set up a local Virtuoso server with the Greek DBpedia on it,
>> but the "page" and "resource" pages were mostly unreadable because of the 
>> %XX characters (URIs).
>> 
>> I tried to search around the problem, but I didn't find anything specific 
>> until I came across an article about the Korean DBpedia and noticed that 
>> they publish their contents in both URI and IRI form and claim to use IRI as 
>> their default "URL encoding"
>> (if I understood correctly, IRI uses "Unicode" characters instead of %XX).
>> 
>> My questions are:
>> 1) Is the IRI form acceptable to Virtuoso and DBpedia (i.e. will the 
>> Wikipedia/DBpedia "links" continue to work)?
>> 2) If the answer to #1 is yes, is there a configuration parameter in the 
>> extraction framework that creates IRI triples (instead of URIs), or do I 
>> have to create one manually?
>> 
>> thanks a lot
>> Jim
>> 
>> ------------------------------------------------------------------------------
>> Start uncovering the many advantages of virtual appliances
>> and start using them to simplify application deployment and
>> accelerate your shift to cloud computing.
>> http://p.sf.net/sfu/novell-sfdev2dev
>> 
>> _______________________________________________
>> Dbpedia-discussion mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>   
> 
> 
> -- 
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org

___________________________________
Mischa Tuffield PhD
Email: [email protected]
Homepage - http://mmt.me.uk/
Garlik Limited, 1-3 Halford Road, Richmond, TW10 6AW
+44(0)845 652 2824  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
