Workshop: Why, Where and How? of Linked Data
Hi all, The DNF Expert Group is pleased to announce the publication of the agenda for the Why, Where and How? of Linked Data workshop being run in conjunction with UK Location and the Chartered Institute of IT. Registration is now open through this site at http://www.dnf.org/events/register The event is taking place in London on 10/02/11 and is free of charge and is the follow on event to the original workshop back in September 2010. As the venue imposes some strict limits on us as to audience size, places will be allocated on a strictly first come, first served basis. John Dr John Goodwin Research Scientist, Research, Ordnance Survey Adanac Drive, SOUTHAMPTON, UK, SO16 0AS Phone: +44 (0) 23 8005 5761 www.ordnancesurvey.co.uk| john.good...@ordnancesurvey.co.uk
Re: URI Comparisons: RFC 2616 vs. RDF
Alan Ruttenberg wrote: On Wed, Jan 19, 2011 at 4:45 PM, Nathan nat...@webr3.org wrote: David Wood wrote: On Jan 19, 2011, at 10:59, Nathan wrote: ps: as an illustration of how engrained URI normalization is, I've capitalized the domain names in the to: and cc: fields, I do hope the mail still come through, and hope that you'll accept this email as being sent to you. Hopefully we'll also find this mail in the archives shortly at htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd hope that any statements made using these URIs (asserted by man or machine) would remain valid regardless of the (incorrect?-)casing. Heh. OK, I'll bite. Domain names in email addressing are defined in IETF RFC 2822 (and its predecessor RFC 822), which defers the interpretation to RFC 1035 (Domain names - implementation and specification). RFC 1035 section 2.3.3 states that domain names in DNS, and therefore in (E)SMTP, are to be compared in a case-insensitive manner. As far as I know, the W3C specs do not so refer to RFC 1035. And I'll bite in the other direction, why not treat URIs as URIs? why go against both the RDF Specification [1] and the URI specification when they say /not/ to encode permitted US-ASCII characters (like ~ %7E)? why force case-sensitive matching on the scheme and domain on URIs matching the generic syntax when the specs say must be compared case insensitively? and so on and so forth. [AR] Which specs? The various URI/IRI specs and previous revisions of. http://www.w3.org/TR/REC-xml-names/#NSNameComparison URI references identifying namespaces .. In a namespace declaration, the URI reference is .. The URI references below are all different for the purposes of identifying namespaces .. The URI references below are also all different for the purposes of identifying namespaces .. So here is another spec that *explicitly* disagrees with the idea that URI normalization should be a built-in processing. 
As far as I can see, that's only for a URI reference used within a namespace, and does not govern usage or normalization when you join the URI reference up with the local name to make the full URI. Out of interest, where is that process defined? I was looking for it the other day - for instance in the quoted specification we have the example: <edi:price xmlns:edi='http://ecommerce.example.org/schema' units='Euro'>32.18</edi:price> Where's the bit of the XML specification which says you join them up by concatenating 'http://ecommerce.example.org/schema' with #(?assumed?) and 'Euro' to get 'http://ecommerce.example.org/schema#Euro'? And finally, this is why I specifically asked if the non-normalization of RDF URI References had XML Namespace heritage, which had then filtered down through OWL, SPARQL and RIF. [AR] More to document, please: Which data is being junked and scrapped? will document, but essentially every statement made using a non normalized URI when other statements are also being made about the same resource using normalized URIs - the two most common cases for this will be when people are using CMS systems and enter their domain name as uppercase in some admin, only to have that filter through to URIs in serialized RDF/RDFa, and where bugs in software have led to inconsistent URIs over time (for instance where % encoding has been fixed, or a :80 has been removed from a URI). [AR] Hmm. Are you suggesting that the behavior of libraries and clients should have precedence over specification? My view is that one first looks to specifications, and then only if specifications are poor or do not speak to the issue do we look at existing behavior. 
Yes I am, that specification should standardize the behaviour of libraries and clients - the level of normalization in URIs published, consumed or used by these tools is often determined by non sem web stack components, and the sem web components are blocked from normalizing these should-not-be-differing-URIs by the sem web specifications. [AR] I think there are many ways to lose in this scenario. For instance, if the server redirects then the base is the last in the chain of redirects. http://tools.ietf.org/html/rfc3986#page-29, 5.1.3. Base URI from the Retrieval URI. My conclusion - don't engineer this way. That would be my conclusion too, but as RDF(a) moves in to the realms of the CMS systems and out of the hands of the sem web community, it will be increasingly engineered this way, it's a very common pattern when working with (X)HTML (allows people to test locally or on dev servers without changing the content). Further, essentially all RDFa ever encountered by a browser has the casing on all URIs in href and src, and all these which are resolved, automatically normalized - so even if you set the base to htTp://EXAMPLE.org/ or use it in a URI, browser tools, extensions, and js based libraries will only ever see the normalized URIs (and thus be incompatible with the rest
Nice domain name for the take
Hi all, Last year, Christophe Gueret and I registered the domain name linkeddatamarketplace.com, thinking we might find the time to set up a market place for ... eh... linked data. The idea was to bring together publishers and users of data (along the lines of http://www.datamarketplace.com). As it turned out we didn't have the time to do any of this (apart from setting up a 'coming soon' picture), and now the domain name is about to expire. Question: if anyone's interested in using it, please let us know and we can transfer it to your name. The domain is registered at geni.com and will expire in 2 months. Cheers, Rinke --- Dr Rinke Hoekstra AI Department | Leibniz Center for Law Faculty of Sciences | Faculty of Law Vrije Universiteit| Universiteit van Amsterdam De Boelelaan 1081a| Kloveniersburgwal 48 1081 HV Amsterdam | 1012 CX Amsterdam +31-(0)20-5987752 | +31-(0)20-5253497 r.j.hoeks...@vu.nl| hoeks...@uva.nl Homepage: http://www.few.vu.nl/~hoekstra
Re: URI Comparisons: RFC 2616 vs. RDF
On 1/19/11 11:27 PM, Alan Ruttenberg wrote: On Wed, Jan 19, 2011 at 11:11 AM, Kingsley Idehen kide...@openlinksw.com wrote: On 1/19/11 10:59 AM, Nathan wrote: htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd hope that any statements made using these URIs (asserted by man or machine) would remain valid regardless of the (incorrect?-)casing. Okay for Data Source Address Ref. (URL), no good for Entity (Data Item or Data Object) Name Ref., bar system specific handling via IFP property or owl:sameAs :-) Kingsley, same for you as Nathan. To what specification do you refer to for the definitions and behavior of: - Data source address ref - Entity - Statement. -Alan Alan, My response is purely about managing Identifiers that are used as functional unambiguous Name or Address References. Not quoting a W3C spec. Basically, expressing a view based on my understanding of what's practical. A system (e.g. a database or client app.) can (should) make a decision about how it handles resolvable Identifiers when used as Name or Address references. Kingsley -- Regards, Kingsley Idehen President CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter/Identi.ca: kidehen
Re: URI Comparisons: RFC 2616 vs. RDF
On Wed, 2011-01-19 at 21:45 +, Nathan wrote: David Wood wrote: On Jan 19, 2011, at 10:59, Nathan wrote: ps: as an illustration of how engrained URI normalization is, I've capitalized the domain names in the to: and cc: fields, I do hope the mail still come through, and hope that you'll accept this email as being sent to you. Hopefully we'll also find this mail in the archives shortly at htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd hope that any statements made using these URIs (asserted by man or machine) would remain valid regardless of the (incorrect?-)casing. Heh. OK, I'll bite. Domain names in email addressing are defined in IETF RFC 2822 (and its predecessor RFC 822), which defers the interpretation to RFC 1035 (Domain names - implementation and specification). RFC 1035 section 2.3.3 states that domain names in DNS, and therefore in (E)SMTP, are to be compared in a case-insensitive manner. As far as I know, the W3C specs do not so refer to RFC 1035. And I'll bite in the other direction, why not treat URIs as URIs? It seems to me the underlying question here is whether aliasing of URIs (whether they dereference to the same resource) should imply semantic equality (i.e. use as an identifier in a web logic language like RDF or OWL). The position so far in RDF, OWL and RIF has been no As far as the specifications for those languages are concerned a URI is just a convenient spelling for an identifier and they require comparison of identifiers to be stable and context-independent. Those specs don't constrain what you get back from dereferencing some URI U to include statements about U. The URI spec (rfc3986[1]) does allow this usage. In particular Section 6 Normalization and Comparison says: URI comparison is performed for some particular purpose. 
Protocols or implementations that compare URIs for different purposes will often be subject to differing design trade-offs in regards to how much effort should be spent in reducing aliased identifiers. This section describes various methods that may be used to compare URIs, the trade-offs between them, and the types of applications that might use them. and We use the terms different and equivalent to describe the possible outcomes of such comparisons, but there are many application-dependent versions of equivalence. While RDF predates this spec it seems to me that the RDF usage remains consistent with it. The purpose of comparison in RDF is different from that of cache retrieval of web pages or message delivery of email. This quote also makes clear that there is no single definitive normalization. There are different levels of normalization possible depending on your needs. Earlier you pointed out that the place where the URI specs and RDF do collide is in resolving relative URIs into absolute URIs. Again rfc3986 does not preclude the RDF usage. Section 5.2.1 says: Normalization of the base URI, as described in Sections 6.2.2 and 6.2.3, is optional. So I claim that in terms of formal published specifications: (1) RDF, OWL and RIF do not require any normalization of URIs (beyond the character encoding level) and compare URIs by simple string comparison. (2) This usage is *not* precluded by the URI specs, at least by 3986 which sets the current framework for the application of scheme-specific specs. ** Now we turn to linked data ... As we've already mentioned :) there are no specs for linked data so we move onto more subjective grounds. The linked data convention is that dereferencing some URI U in your RDF document should return information about U, including further onward links. So if data set A spells a URI hTTp://example.com/foo but the data you get from dereferencing that URI talks only about http://example.com/foo then someone has a problem somewhere. 
The question is who, where and how to fix it. It seems to me that this is primarily a issue with publishing, and a little about being sensible about how you pass on links. If I'm going to put up some linked data I should mint normalized URIs; I should use the same spelling of the URIs throughout my data; I'll make sure those URIs dereference and that the data that comes back is stable and useful. If someone else refers to my resources using an aliased URI (such as a different case for the protocol) and makes statements about those aliases then they have simply made a mistake. To make sure that dereference returns what I expect, independent of aliasing, then I should publish data with explicit base URIs (or just absolute URIs). Publishing with relative URIs and no base is a recipe for having your data look different from different places. Just don't do it. No surprise there. None of this requires us to force URI normalization into the heart of identifier comparison in RDF itself. It is not a necessary solution and it is not a sufficient one because there is no universal
Re: URI Comparisons: RFC 2616 vs. RDF
Hi Dave, Generally I agree, will address a few specific points in line (just to address them) then summarize my intended goals at the end (being the substance of the mail). Dave Reynolds wrote: The URI spec (rfc3986[1]) does allow this usage. In particular Section 6 Normalization and Comparison says: URI comparison is performed for some particular purpose. Protocols or implementations that compare URIs for different purposes will often be subject to differing design trade-offs in regards to how much effort should be spent in reducing aliased identifiers. This section describes various methods that may be used to compare URIs, the trade-offs between them, and the types of applications that might use them. and We use the terms different and equivalent to describe the possible outcomes of such comparisons, but there are many application-dependent versions of equivalence. While RDF predates this spec it seems to me that the RDF usage remains consistent with it. The purpose of comparison in RDF is different from that of cache retrieval of web pages or message delivery of email. Indeed, I also read though: For all URIs, the hexadecimal digits within a percent-encoding triplet (e.g., %3a versus %3A) are case-insensitive and therefore should be normalized to use uppercase letters for the digits A-F. When a URI uses components of the generic syntax, the component syntax equivalence rules always apply; namely, that the scheme and host are case-insensitive and therefore should be normalized to lowercase... - http://tools.ietf.org/html/rfc3986#section-6.2.2.1 And took the For all and always to literally mean for all and always. Unsure where this leaves things, and which takes precedence. This quote also makes clear that there is no single definitive normalization. There are different levels of normalization possible depending on your needs. 
agree So I claim that in terms of formal published specifications: (1) RDF, OWL and RIF do not require any normalization of URIs (beyond the character encoding level) and compare URIs by simple string comparison. One potential issue on the % encoding, clarified further down. (2) This usage is *not* precluded by the URI specs, at least by 3986 which sets the current framework for the application of scheme-specific specs. Not a 100% sure but tempted to agree with you, would make sense not to preclude it. As we've already mentioned :) there are no specs for linked data so we move onto more subjective grounds. Would be nice to get some specs at some point... The linked data convention is that dereferencing some URI U in your RDF document should return information about U, including further onward links. So if data set A spells a URI hTTp://example.com/foo but the data you get from dereferencing that URI talks only about http://example.com/foo then someone has a problem somewhere. The question is who, where and how to fix it. agree, good way of putting it. against both the RDF Specification [1] and the URI specification when they say /not/ to encode permitted US-ASCII characters (like ~ %7E)? Where did that example come from? The encoding consists of... %-escaping octets that do not correspond to permitted US-ASCII characters. - http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by URI producers and, when found in a URI, should be decoded to their corresponding unreserved characters by URI normalizers. - http://tools.ietf.org/html/rfc3986#section-2.3 I read those quotes as saying do not encode permitted US-ASCII characters in RDF URI References. At what point have we suggested doing that? 
As above why force case-sensitive matching on the scheme and domain on URIs matching the generic syntax when the specs say must be compared case insensitively? No, the specs do not say that, see above. See for all and always quote earlier on. So use normalized URIs in the first place. ... RDF/OWL/RIF aren't designed the way they are because someone thought it would be a good idea to allow such things to be used side by side or because they *want* people to use denormalized URIs. ... The point is that there is no single, simple, universal (i.e. across all schemes) normalization algorithm that could be used. The current approach gives stable, well-defined behaviour which doesn't change as people invent new URI schemes. The RDF serializations give you enough control to enable you to be certain about what URI you are talking about. Job done. Okay, I agree, and I'm really not looking to create a lot of work here, the general gist of what I'm hoping for is along the lines of: RDF Publishers MUST perform Case Normalization and Percent-Encoding Normalization on all
Re: URI Comparisons: RFC 2616 vs. RDF
On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote: [ . . . ] It seems to me that this is primarily a issue with publishing, and a little about being sensible about how you pass on links. If I'm going to put up some linked data I should mint normalized URIs; I should use the same spelling of the URIs throughout my data; I'll make sure those URIs dereference and that the data that comes back is stable and useful. If someone else refers to my resources using an aliased URI (such as a different case for the protocol) and makes statements about those aliases then they have simply made a mistake. To make sure that dereference returns what I expect, independent of aliasing, then I should publish data with explicit base URIs (or just absolute URIs). Publishing with relative URIs and no base is a recipe for having your data look different from different places. Just don't do it. This advice sounds like an excellent candidate for publication in a best practices document. And if it is merely best practice guidance, perhaps that *is* something that the new RDF working group could address. -- David Booth, Ph.D. http://dbooth.org/ Opinions expressed herein are those of the author and do not necessarily reflect those of his employer.
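The advice about relative URIs and a missing base can be made concrete with a small sketch. It assumes Python's standard urllib.parse.urljoin as a stand-in for whatever resolver a given RDF parser actually uses, and the hostnames and the relative reference are hypothetical, chosen only for illustration.

```python
from urllib.parse import urljoin

# A relative reference as it might appear in a published document (hypothetical).
relative_ref = "people/alice#me"

# Hypothetical retrieval URIs: the same file served from two different places.
retrieved_from = [
    "http://example.org/data/",
    "http://dev.example.org/data/",
]

# With no explicit base, the retrieval URI acts as the base,
# so the same document yields different absolute identifiers.
without_base = [urljoin(uri, relative_ref) for uri in retrieved_from]

# With an explicit base declared inside the document, the result is the
# same wherever the document was fetched from.
explicit_base = "http://example.org/data/"
with_base = [urljoin(explicit_base, relative_ref) for _ in retrieved_from]

print(without_base)
print(with_base)
```

Under simple string comparison the two identifiers in without_base are different names, which is the "data looks different from different places" problem described above.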
Re: URI Comparisons: RFC 2616 vs. RDF
David Booth wrote: On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote: [ . . . ] It seems to me that this is primarily a issue with publishing, and a little about being sensible about how you pass on links. If I'm going to put up some linked data I should mint normalized URIs; I should use the same spelling of the URIs throughout my data; I'll make sure those URIs dereference and that the data that comes back is stable and useful. If someone else refers to my resources using an aliased URI (such as a different case for the protocol) and makes statements about those aliases then they have simply made a mistake. To make sure that dereference returns what I expect, independent of aliasing, then I should publish data with explicit base URIs (or just absolute URIs). Publishing with relative URIs and no base is a recipe for having your data look different from different places. Just don't do it. This advice sounds like an excellent candidate for publication in a best practices document. And if it is merely best practice guidance, perhaps that *is* something that the new RDF working group could address. +1 from me, address at the publishing phase, allow at the consuming phase, keep comparison simple.
Re: URI Comparisons: RFC 2616 vs. RDF
* [2011-01-20 14:29:35 +] Nathan nat...@webr3.org wrote:
] RDF Publishers MUST perform Case Normalization and Percent-Encoding
] Normalization on all URIs prior to publishing. When using relative URIs
] publishers SHOULD include a well defined base using a serialization
] specific mechanism. Publishers are advised to perform additional
] normalization steps as specified by URI (RFC 3986) where possible.
]
] RDF Consumers MAY normalize URIs they encounter and SHOULD perform
] Case Normalization and Percent-Encoding Normalization.
]
] Two RDF URIs are equal if and only if they compare as equal,
] character by character, as Unicode strings.
]
] For many reasons it would be good to solve this at the publishing phase,
] allow normalization at the consuming phase (can't be precluded as
] intermediary components may normalize), and keep simple case sensitive
] string comparison throughout the stack and specs (so implementations
] remain simple and fast.)
]
] Does anybody find the above disagreeable?

Sounds about right to me, but what about port numbers, http://example.org/ vs http://example.org:80/?

-w
--
William Waites mailto:w...@styx.org
http://eris.okfn.org/ww/ sip:w...@styx.org
F4B3 39BF E775 CF42 0BAB 3DF0 BE40 A6DF B06F FD45
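As a rough illustration of the two steps named in the proposal, here is a minimal sketch of Case Normalization and Percent-Encoding Normalization followed by plain string comparison. The helper name normalize is hypothetical, the code relies only on Python's standard library, and it deliberately does nothing scheme-specific, so it also shows why the port question is a separate matter.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _normalize_pct(match):
    char = chr(int(match.group(1), 16))
    # Decode escapes of unreserved characters; otherwise uppercase the hex digits.
    return char if char in UNRESERVED else "%" + match.group(1).upper()


def normalize(uri):
    """Case and percent-encoding normalization only (hypothetical helper).

    Limitations of this sketch: percent-encoded letters inside the host are
    not re-lowercased, and empty components such as a trailing '#' may be
    dropped when the URI is reassembled.
    """
    parts = urlsplit(uri)
    # Only the host is case-insensitive; any userinfo before '@' is kept as written.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    rebuilt = urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )
    return _PCT.sub(_normalize_pct, rebuilt)


print(normalize("hTTp://EXAMPLE.org/%7euser?q=%3a"))
# Equality stays a plain string comparison, so the default port still matters.
print(normalize("http://example.org:80/") == normalize("http://example.org/"))
```

Dropping a default port is scheme-based normalization, which RFC 3986 treats as a further, optional level; the two URIs from the question above therefore remain distinct under the rules as proposed.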
Re: Nice domain name for the take
Hi all, Thanks for the responses! This call for applicants is now closed ;) Cheers, Rinke PS Did I say geni.com? Surely I meant gandi.net ... --- Dr Rinke Hoekstra AI Department | Leibniz Center for Law Faculty of Sciences | Faculty of Law Vrije Universiteit| Universiteit van Amsterdam De Boelelaan 1081a| Kloveniersburgwal 48 1081 HV Amsterdam | 1012 CX Amsterdam +31-(0)20-5987752 | +31-(0)20-5253497 r.j.hoeks...@vu.nl| hoeks...@uva.nl Homepage: http://www.few.vu.nl/~hoekstra
How to declare in a web app's interface which kind of app/version/features and or interfaces or formats it exposes
Hi. I'm considering the different options that could help embed (with the slightest modifications possible) in the HTML interface of a Web app, a description of which app it is and/or which interfaces it exposes, so that this would be discoverable and lead to exploitation of such data by SemWeb apps, or existing harvesters. Which SemWeb standards could be used to do so? Thanks in advance. Best regards, -- Olivier BERGER olivier.ber...@it-sudparis.eu http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8 Ingénieur Recherche - Dept INF Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)
Re: How to declare in a web app's interface which kind of app/version/features and or interfaces or formats it exposes
Olivier, I'm considering the different options that could help embed (with slightest modifications possible) in the in HTML interface of a Web app, a description of which app it is and/or which interfaces it exposes, so that this would be discoverable and lead to exploitation of such data by SemWeb apps, or existing harvesters. You might find my blog post 'Announcing Application Metadata on the Web of Data' [1] along with the template [2] useful for this purpose. Cheers, Michael [1] http://webofdata.wordpress.com/2010/01/06/announcing-application-metadata [2] http://lab.linkeddata.deri.ie/2010/res/web-app-metadata-template.html -- Dr. Michael Hausenblas, Research Fellow LiDRC - Linked Data Research Centre DERI - Digital Enterprise Research Institute NUIG - National University of Ireland, Galway Ireland, Europe Tel. +353 91 495730 http://linkeddata.deri.ie/ http://sw-app.org/about.html
Re: URI Comparisons: RFC 2616 vs. RDF
Hi: On 20.01.2011, at 15:40, Nathan wrote: David Booth wrote: On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote: [ . . . ] To make sure that dereference returns what I expect, independent of aliasing, then I should publish data with explicit base URIs (or just absolute URIs). Publishing with relative URIs and no base is a recipe for having your data look different from different places. Just don't do it. This advice sounds like an excellent candidate for publication in a best practices document. And if it is merely best practice guidance, perhaps that *is* something that the new RDF working group could address. +1 from me, address at the publishing phase, allow at the consuming phase, keep comparison simple. I am not sure whether you are also talking of RDFa, but in case you are, I would like to add the following: Our experience with helping about 2,000 sites add GoodRelations via our form-based tools shows that 1. RDFa is in many cases the only viable way for people to publish RDF 2. They can often not control and not even predict the exact URI of the page that will contain the markup (imagine uncool URIs loaded with parameters etc.) In those scenarios, relative URIs are essential. We even recommend that people include an empty <div rel="foaf:page" resource=""></div> at the proper position in the nesting so that there will be a link between the data entity and the page that contains it. Martin
Re: URI Comparisons: RFC 2616 vs. RDF
Martin Hepp wrote: On 20.01.2011, at 15:40, Nathan wrote: David Booth wrote: On Thu, 2011-01-20 at 13:08 +, Dave Reynolds wrote: [ . . . ] To make sure that dereference returns what I expect, independent of aliasing, then I should publish data with explicit base URIs (or just absolute URIs). Publishing with relative URIs and no base is a recipe for having your data look different from different places. Just don't do it. This advice sounds like an excellent candidate for publication in a best practices document. And if it is merely best practice guidance, perhaps that *is* something that the new RDF working group could address. +1 from me, address at the publishing phase, allow at the consuming phase, keep comparison simple. I am not sure whether you are also talking of RDFa, but in case you do, I would like to add the following: Hi Martin, Yes (re RDFa), see: http://webr3.org/urinorm/2 - all the browsers do the normalization so you can't even get to the non-normalized URI. In a browser you'll note that all the URIs get normalized automatically, in that it's impossible to programmatically access the original casing. That's a problem. If you run it through the RDFa distiller at w3.org [2] you'll find: <htTp://WEBR3.org/urinorm/2> dc:creator <http://WEBR3.org/nathan#me> . <http://WEBR3.org/urinorm/2#example> dc:title "URI Normalization Example 2" . Note that one of the URIs (the one which required relative path resolution) has the scheme normalised. If you run it through check.rdfa.info you'll find that all the URIs are normalized. [3] If you run it through sigma [4] you'll find everything has been normalized. You can also see an RDF view of this [5]. If you run it through URI Burner [6], you'll find that /some/ URIs have been normalized. 
It's also worth noting that this caused all kinds of problems - I ended up having to create a new resource at this point w/ some RDF N3 to test URI Burner: http://webr3.org/urinorm/3 which led to the empty [7]. Then I figured I'd try [8], and if you click the creator ( htTp://WEBR3.org/nathan#me ), since in this case there's no normalization (note it was normalized in [6]), you get a 400 Bad Request [9]. And so on and so forth - far from ideal. Best, Nathan [1] http://www.rdfabout.com/demo/validator/ (normalizes all RDF URIs) [2] http://www.w3.org/2007/08/pyRdfa/ [3] http://check.rdfa.info/check?url=http://webr3.org/urinorm/2version=1.0 [4] http://sig.ma/search?q=http://webr3.org/urinorm/2 [5] http://sig.ma/entity/e6a2c8319bb3bf21f4b4639216f114a4.rdf#this [6] http://linkeddata.uriburner.com/about/html/http/webr3.org/urinorm/2%01this [7] http://linkeddata.uriburner.com/about/html/http/webr3.org/urinorm/3 [8] http://linkeddata.uriburner.com/about/html/htTp://WEBR3.org/urinorm/3 [9] http://linkeddata.uriburner.com/about/html/htTp/WEBR3.org/nathan%01me
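The kind of partial normalization reported above (scheme lowercased only on URIs that went through reference resolution) is easy to reproduce with a general-purpose URI library. The sketch below only shows how Python's standard urllib.parse behaves on the test URI from this message; it is offered as one plausible way such mixed output can arise, and is not a claim about how any of the tools listed above are implemented.

```python
from urllib.parse import urljoin, urlsplit

base = "htTp://WEBR3.org/urinorm/2"

# Parsing lowercases the scheme but keeps the authority as written.
parts = urlsplit(base)
print(parts.scheme)
print(parts.netloc)

# An empty reference returns the base string untouched...
same_document = urljoin(base, "")
# ...while a fragment-only reference is rebuilt from the parsed components,
# which silently lowercases the scheme and leaves the host alone.
resolved = urljoin(base, "#example")

print(same_document)
print(resolved)
```

A consumer that compares URIs as plain strings would then treat the document URI and the prefix of the resolved fragment URI as unrelated names, even though both came from the same page.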
Re: How to declare in a web app's interface which kind of app/version/features and or interfaces or formats it exposes
On Thursday 20 January 2011 at 15:50 +, Michael Hausenblas wrote: Olivier, I'm considering the different options that could help embed (with slightest modifications possible) in the in HTML interface of a Web app, a description of which app it is and/or which interfaces it exposes, so that this would be discoverable and lead to exploitation of such data by SemWeb apps, or existing harvesters. You might find my blog post 'Announcing Application Metadata on the Web of Data' [1] along with the template [2] useful for this purpose. Cheers, Michael [1] http://webofdata.wordpress.com/2010/01/06/announcing-application-metadata [2] http://lab.linkeddata.deri.ie/2010/res/web-app-metadata-template.html Thanks for sharing this. The only critique I could have is about the use of DOAP, where there's a confusion between a project (community) and a piece of software (developed by that community) behind doap:Project, IMHO... but that's a common problem with DOAP, which is counterbalanced by its popularity (perfect model vs. available data). I was thinking of something maybe less intrusive: adding RDFa to existing apps is maybe too hard, as it requires changing their code (in particular as some (X)HTML may not convert so easily to XHTML+RDFa). Anything in the domain of HTML headers, maybe? Such things could more easily be added by quick+dirty patches / sysadmin configuration. Thanks in advance. -- Olivier BERGER olivier.ber...@it-sudparis.eu http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8 Ingénieur Recherche - Dept INF Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)
2nd CfP: USEWOD2011 - 1st International Workshop on Usage Analysis and the Web of Data
Just a quick reminder that the deadline for USEWOD2011 is approaching fast!

=== Second Call for Papers ===

Workshop on: USAGE ANALYSIS AND THE WEB OF DATA (USEWOD2011)
USEWOD DATA CHALLENGE
Workshop at WWW 2011 – Hyderabad, India, 28 or 29 March 2011
http://data.semanticweb.org/usewod/2011/

Important dates
===
* Release of Dataset for the USEWOD Challenge: 21 December 2010
* Paper submission deadline: 8 February 2011
* Workshop and Prize for USEWOD Challenge: 28 or 29 March 2011

Submission
===
* Long papers: up to 8 pages
* Short papers: up to 4 pages
* Data Challenge papers (see below): up to 4 pages
All in ACM format (http://www.acm.org/sigs/publications/proceedings-templates)

Overview
===
This workshop will investigate the synergy between semantics and semantic-web technology on the one hand and analysis and mining of usage data on the other hand. The two fields are a promising combination.

First, semantics can be used to enhance the analysis of usage data. Usage logs contain information that can help to better understand users or to adapt a system to a user’s needs and preferences. Now that more and more explicit knowledge is represented on the Web, in the form of ontologies, folksonomies, or linked data, the question arises how these semantics can be used to aid large scale web usage analysis and mining.

Second, usage data analysis can enhance semantic resources as well as Semantic Web applications. Traces of users can be used to evaluate, adapt or personalize Semantic Web applications. Since logs record real-life users, they provide an opportunity to create gold standards for search or recommendation tools. In addition, logs can form valuable resources from which knowledge (e.g. in the form of ontologies or thesauri) can be extracted bottom-up.

Also, the emerging Web of Data demands a re-evaluation of existing usage mining techniques; new ways of accessing information enabled by the Web of Data imply the need to develop or adapt algorithms, methods, and techniques to analyze and interpret the usage of Web data instead of Web pages. An important question at this time is how the Web of Data is being used: how are datasets being accessed by human users and how by machines, what kinds of queries are being performed, and what can we learn about the usage of semantic applications?

The primary goals of this workshop are to foster a new community of researchers from various fields sharing an interest in usage mining and semantics and to create a roadmap for future research in this direction.

Data Challenge
===
In addition to regular papers, we will release a dataset of usage data (server log files) from two Linked Open Data sources: Semantic Web Dog Food (data.semanticweb.org) and DBpedia (dbpedia.org). Participants are invited to present interesting analyses, applications, alignments, etc. for these datasets, and to submit their findings as a Data Challenge paper. The best Data Challenge paper will get a prize.

Topics of interest
===
include, but are not limited to:
• Analysis and mining of usage logs of semantic resources and applications
• Inferring semantic information from usage logs
• Methods and tools for semantic analysis of usage logs
• Representing and enriching usage logs with semantic information
• Usage-based evaluation methods and frameworks; gold standards for evaluation of semantic web applications
• Specifics and semantics of logs for content-consumption and content-creation
• Using semantics for recommendation, personalization and adaptation
• Usage-based recommendation, personalization and adaptation of semantic web applications
• Exploiting usage logs for semantic search
• Data sharing, privacy, and privacy-protecting policies and techniques

Workshop chairs
===
* Bettina Berendt, K.U. Leuven, Belgium
* Laura Hollink, Delft University of Technology, The Netherlands
* Vera Hollink, Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
* Markus Luczak-Roesch, Freie Universitaet Berlin, Germany
* Knud Moeller, DERI / National University of Ireland, Galway, Ireland
* David Vallet, Universidad Autonoma de Madrid, Spain
---
Please contact us at usewod2011-cha...@googlegroups.com

Program committee
===
see the workshop web page: http://data.semanticweb.org/usewod/2011/

-
Knud Möller, PhD
+353 - 91 - 495086
Smile Group: http://smile.deri.ie
Digital Enterprise Research Institute
National University of Ireland, Galway
Institiúid Taighde na Fiontraíochta Digití
Ollscoil na hÉireann, Gaillimh
Re: How to declare in a web app's interface which kind of app/version/features and or interfaces or formats it exposes
Hi Olivier, Do you know http://LinkedOpenServices.org/? This might be what you're looking for (assuming your website is your API, where website reads like Web app). Cheers, Tom Thank God not sent from a BlackBerry, but from my iPhone
Re: URI Comparisons: RFC 2616 vs. RDF
Hi Nathan, I largely agree but have a few quibbles :) On 20/01/2011 2:29 PM, Nathan wrote: Dave Reynolds wrote: The URI spec (rfc3986[1]) does allow this usage. In particular Section 6 Normalization and Comparison says: URI comparison is performed for some particular purpose. Protocols or implementations that compare URIs for different purposes will often be subject to differing design trade-offs in regards to how much effort should be spent in reducing aliased identifiers. This section describes various methods that may be used to compare URIs, the trade-offs between them, and the types of applications that might use them. and We use the terms different and equivalent to describe the possible outcomes of such comparisons, but there are many application-dependent versions of equivalence. While RDF predates this spec it seems to me that the RDF usage remains consistent with it. The purpose of comparison in RDF is different from that of cache retrieval of web pages or message delivery of email. Indeed, I also read though: For all URIs, the hexadecimal digits within a percent-encoding triplet (e.g., %3a versus %3A) are case-insensitive and therefore should be normalized to use uppercase letters for the digits A-F. When a URI uses components of the generic syntax, the component syntax equivalence rules always apply; namely, that the scheme and host are case-insensitive and therefore should be normalized to lowercase... - http://tools.ietf.org/html/rfc3986#section-6.2.2.1 And took the For all and always to literally mean for all and always. Those quotes come from section (6.2.2) describing normalization but the earlier quote is from the start of section 6 saying that choice of normalization is application dependent. I interpret the two together as *if* you are normalizing then always ...blah That was certainly the RIF position where we explicitly said that sections 6.2.2 and 6.2.3 of rfc3986 were not applicable. 
against both the RDF Specification [1] and the URI specification when they say /not/ to encode permitted US-ASCII characters (like ~ %7E)? Where did that example come from? The encoding consists of... %-escaping octets that do not correspond to permitted US-ASCII characters. - http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by URI producers and, when found in a URI, should be decoded to their corresponding unreserved characters by URI normalizers. - http://tools.ietf.org/html/rfc3986#section-2.3 I read those quotes as saying do not encode permitted US-ASCII characters in RDF URI References. At what point have we suggested doing that? As above Sorry, I didn't mean to dispute that you shouldn't %-encode ~, I was wondering where the suggestion that you should do so came from. I believe there are some corner cases, such as the handling of spaces, which differ between the RDF spec and the IRI spec. This was down to timing. The RDF Core WG was doing its best to anticipate what the IRI spec would look like but couldn't wait until that was finalized. Resolving any such small discrepancies between that anticipation and the actual IRI specs is something I believe to be in scope for the proposed new RDF WG. So use normalized URIs in the first place. ... RDF/OWL/RIF aren't designed the way they are because someone thought it would be a good idea to allow such things to be used side by side or because they *want* people to use denormalized URIs. ... The point is that there is no single, simple, universal (i.e. across all schemes) normalization algorithm that could be used. The current approach gives stable, well-defined behaviour which doesn't change as people invent new URI schemes. 
The RDF serializations give you enough control to enable you to be certain about what URI you are talking about. Job done. Okay, I agree, and I'm really not looking to create a lot of work here, the general gist of what I'm hoping for is along the lines of: RDF Publishers MUST perform Case Normalization and Percent-Encoding Normalization on all URIs prior to publishing. When using relative URIs publishers SHOULD include a well defined base using a serialization specific mechanism. Publishers are advised to perform additional normalization steps as specified by URI (RFC 3986) where possible. RDF Consumers MAY normalize URIs they encounter and SHOULD perform Case Normalization and Percent-Encoding Normalization. Two RDF URIs are equal if and only if they compare as equal, character by character, as Unicode strings. I'm sort of OK with that but ... Terms like RDF Publisher and RDF Consumer need to be defined in order to make formal statements like these. The RDF/OWL/RIF specs are careful to define what sort of processors are subject to conformance statements and I don't think RDF
Standardizing linked data - was Re: URI Comparisons: RFC 2616 vs. RDF
Dave Reynolds wrote: Okay, I agree, and I'm really not looking to create a lot of work here, the general gist of what I'm hoping for is along the lines of: RDF Publishers MUST perform Case Normalization and Percent-Encoding Normalization on all URIs prior to publishing. When using relative URIs publishers SHOULD include a well defined base using a serialization specific mechanism. Publishers are advised to perform additional normalization steps as specified by URI (RFC 3986) where possible. RDF Consumers MAY normalize URIs they encounter and SHOULD perform Case Normalization and Percent-Encoding Normalization. Two RDF URIs are equal if and only if they compare as equal, character by character, as Unicode strings. I'm sort of OK with that but ... Terms like RDF Publisher and RDF Consumer need to be defined in order to make formal statements like these. The RDF/OWL/RIF specs are careful to define what sort of processors are subject to conformance statements and I don't think RDF Publisher is a conformance point for the existing specs. This may sound like nit-picking, but that's life with specifications. You need to be clear how the last para about RDF URIs relates to notions like RDF Consumer. I wonder whether you might want to instead define notions of Linked Data Publisher and Linked Data Consumer to which these MUST/MAY/SHOULD conformance statements apply. That way it is clear that a component such as an RDF store or RDF parser is correct in following the existing RDF specs and not doing any of these transformations, but that in order to construct a Linked Data Consumer/Publisher some other component can be introduced to perform the normalizations. That is, Linked Data as a set of constraints and conventions layered on top of the RDF/OWL specs. Fully agree, had the same conversation with DanC this afternoon and he too immediately suggested changing RDF Publisher/Consumer to Linked Data Publisher/Consumer.
This also ties in with earlier comments about standardizing Linked Data. However it's done, or worded, my only concern here is that it positively impacts the current situation and doesn't negatively impact anybody else. The specific point on the normalization ladder would have to be defined, of course, and you would need to define how to handle schemes unknown to the consumer. All this presupposes some work to formalize and specify linked data. Is there anything like that planned? In some ways Linked Data is an engineering experiment and benefits from that freedom to experiment. On the other hand interoperability eventually needs clear specifications. Unsure, but I'll also ask the question, is there anything planned? I'd certainly +1 standardization and do anything I could to help the process along. For many reasons it would be good to solve this at the publishing phase, allow normalization at the consuming phase (can't be precluded as intermediary components may normalize), and keep simple case-sensitive string comparison throughout the stack and specs (so implementations remain simple and fast). Agreed. Cool, thanks again Dave, Nathan
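The proposal quoted above (normalize at publishing time, then compare as plain strings) can be sketched roughly as follows. This is only an illustration under stated assumptions, not a conforming RFC 3986 implementation: normalize_uri and its helpers are hypothetical names, the regular expressions cover only the generic syntax, a percent-encoded host is not treated specially, and the example.org URI is made up for the test. It applies Case Normalization (lowercase scheme and host, uppercase hex digits in percent-encoded triplets) and Percent-Encoding Normalization (decode triplets that stand for unreserved characters), and leaves everything else alone.

```python
import re

# Unreserved characters, RFC 3986 section 2.3.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

# scheme ":" [ "//" authority ] rest-of-uri
URI_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):(?://([^/?#]*))?(.*)$", re.DOTALL)
PCT_RE = re.compile(r"%([0-9A-Fa-f]{2})")


def _normalize_triplet(match):
    """Decode %XX if it encodes an unreserved character, else uppercase the hex."""
    char = chr(int(match.group(1), 16))
    if char in UNRESERVED:
        return char
    return "%" + match.group(1).upper()


def normalize_uri(uri):
    """Apply case and percent-encoding normalization to an absolute URI."""
    match = URI_RE.match(uri)
    if match is None:
        return uri  # relative reference or not a URI: left untouched
    scheme, authority, rest = match.groups()
    result = scheme.lower() + ":"
    if authority is not None:
        # userinfo is case-sensitive; only the host (and port) is lowercased
        userinfo, at, hostport = authority.rpartition("@")
        result += "//" + userinfo + at + hostport.lower()
    return PCT_RE.sub(_normalize_triplet, result + rest)


# A publisher normalizes once; consumers keep using simple string equality.
print(normalize_uri("htTp://WEBR3.org/nathan#me"))         # http://webr3.org/nathan#me
print(normalize_uri("http://example.org/%7Euser/a%3ab"))   # http://example.org/~user/a%3Ab
```

The path, query and fragment keep their case, so the comparison rule in the RDF specs does not need to change; only what gets published does.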
Re: Standardizing linked data - was Re: URI Comparisons: RFC 2616 vs. RDF
Nathan wrote: Dave Reynolds wrote: All this presupposes some work to formalize and specify linked data. Is there anything like that planned? In some ways Linked Data is an engineering experiment and benefits from that freedom to experiment. On the other hand interoperability eventually needs clear specifications. Unsure, but I'll also ask the question, is there anything planned? I'd certainly +1 standardization and do anything I could to help the process along. Or perhaps an IG/XG follow-up to the SWEO, taking into account the Read Write Web of Data, hopefully with some protocol or best-practice report giving a migration path to standardization? There are certainly plenty of other groups to take into account and consider in all of this, like the WebID XG. Best, Nathan
Re: URI Comparisons: RFC 2616 vs. RDF
On Thu, Jan 20, 2011 at 5:15 AM, Nathan nat...@webr3.org wrote: As far as I can see, that's only for a URI reference used within a namespace, and does not govern usage or normalization when you join the URI reference up with the local name to make the full URI. Out of interest, where is that process defined? I was looking for it the other day - for instance in the quoted specification we have the example: <edi:price xmlns:edi='http://ecommerce.example.org/schema' units='Euro'>32.18</edi:price> Where's the bit of the XML specification which says you join them up by concatenating 'http://ecommerce.example.org/schema' with #(?assumed?) and 'Euro' to get 'http://ecommerce.example.org/schema#Euro'? My understanding is that this is governed by the definition of qnames. As I understand things, the concatenation you write would happen only if the attribute was defined in the schema to be an xsi:type (http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/structures.html#xsi_type), and without the #. The only case where a # would be added is when rdf:ID or xml:id is used. And finally, this is why I specifically asked if the non-normalization of RDF URI References had XML Namespace heritage, which had then filtered down through OWL, SPARQL and RIF. I don't believe so. I believe its genesis lies in the reasons that I discussed earlier - the difficulty of actually implementing it combined with the indeterminacy. But I would be glad if someone else has better information and can either confirm or deny this. -Alan
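For what it's worth, a small sketch of what a namespace-aware XML parser makes of the quoted example, assuming Python's standard xml.etree.ElementTree (the variable names are just for illustration). The parser reports the element name as a namespace/local-name pair and leaves the attribute value 'Euro' as an ordinary string; any joining into a single URI is a separate step done by whoever wants one, and a bare concatenation adds no '#'.

```python
import xml.etree.ElementTree as ET

doc = (
    "<edi:price xmlns:edi='http://ecommerce.example.org/schema' "
    "units='Euro'>32.18</edi:price>"
)
elem = ET.fromstring(doc)

# ElementTree writes the expanded name as "{namespace}localname".
print(elem.tag)     # {http://ecommerce.example.org/schema}price

# Split it back into the (namespace, local name) pair.
namespace, local = elem.tag[1:].split("}", 1)
print(namespace)    # http://ecommerce.example.org/schema
print(local)        # price

# The attribute value is plain text; XML Namespaces does not turn it into a URI.
print(elem.attrib)  # {'units': 'Euro'}

# Concatenating namespace name and local name is a choice made outside XML
# itself, and a bare join inserts no '#'.
joined = namespace + local
print(joined)       # http://ecommerce.example.org/schemaprice
```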
Re: URI Comparisons: RFC 2616 vs. RDF
On Thu, Jan 20, 2011 at 11:15 AM, Nathan nat...@webr3.org wrote: Alan Ruttenberg wrote: On Wed, Jan 19, 2011 at 4:45 PM, Nathan nat...@webr3.org wrote: David Wood wrote: On Jan 19, 2011, at 10:59, Nathan wrote: ps: as an illustration of how engrained URI normalization is, I've capitalized the domain names in the to: and cc: fields, I do hope the mail still come through, and hope that you'll accept this email as being sent to you. Hopefully we'll also find this mail in the archives shortly at htTp://lists.W3.org/Archives/Public/public-lod/2011Jan/ - Personally I'd hope that any statements made using these URIs (asserted by man or machine) would remain valid regardless of the (incorrect?-)casing. Heh. OK, I'll bite. Domain names in email addressing are defined in IETF RFC 2822 (and its predecessor RFC 822), which defers the interpretation to RFC 1035 (Domain names - implementation and specification). RFC 1035 section 2.3.3 states that domain names in DNS, and therefore in (E)SMTP, are to be compared in a case-insensitive manner. As far as I know, the W3C specs do not so refer to RFC 1035. And I'll bite in the other direction, why not treat URIs as URIs? why go against both the RDF Specification [1] and the URI specification when they say /not/ to encode permitted US-ASCII characters (like ~ %7E)? why force case-sensitive matching on the scheme and domain on URIs matching the generic syntax when the specs say must be compared case insensitively? and so on and so forth. [AR] Which specs? The various URI/IRI specs and previous revisions of. http://www.w3.org/TR/REC-xml-names/#NSNameComparison URI references identifying namespaces .. In a namespace declaration, the URI reference is .. The URI references below are all different for the purposes of identifying namespaces .. The URI references below are also all different for the purposes of identifying namespaces .. 
So here is another spec that *explicitly* disagrees with the idea that URI normalization should be a built-in processing step. As far as I can see, that's only for a URI reference used within a namespace, and does not govern usage or normalization when you join the URI reference up with the local name to make the full URI. Out of interest, where is that process defined? I was looking for it the other day - for instance in the quoted specification we have the example: <edi:price xmlns:edi='http://ecommerce.example.org/schema' units='Euro'>32.18</edi:price> Where's the bit of the XML specification which says you join them up by concatenating 'http://ecommerce.example.org/schema' with #(?assumed?) and 'Euro' to get 'http://ecommerce.example.org/schema#Euro'? Actually you don't. A namespace is just that - a tuple (namespace, localname) in XML. That's why namespaces in XML are for all intents and purposes broken and why, to a large extent, Web browser developers in HTML stopped using them and hate implementing them in the DOM, and so refuse to have them in HTML5. And that's one reason RDF(A) will probably continue getting a sort of bad rap in the HTML world, as prefixes are not associated with just making URIs, but with this terrible namespace tuple. For an archeology of the relevant standards, check out the section "What Namespaces Do" of this paper. While the paper is focussed on why namespace documents are a mess, the relevant information is in that section and extensively referenced, with examples: http://xml.coverpages.org/HHalpinXMLVS-Extreme.html And finally, this is why I specifically asked if the non-normalization of RDF URI References had XML Namespace heritage, which had then filtered down through OWL, SPARQL and RIF. Indeed, they should be normalized in a sane manner across all Semantic Web specs, and dependencies on XML Namespaces should obviously be dropped IMHO. [AR] More to document, please: Which data is being junked and scrapped?
will document, but essentially every statement made using a non normalized URI when other statements are also being made about the same resource using normalized URIs - the two most common cases for this will be when people are using CMS systems and enter their domain name as uppercase in some admin, only to have that filter through to URIs in serialized RDF/RDFa, and where bugs in software have led to inconsistent URIs over time (for instance where % encoding has been fixed, or a :80 has been removed from a URI). [AR] Hmm. Are you suggesting that the behavior of libraries and clients should have precedence over specification? My view is that one first looks to specifications, and then only if specifications are poor or do not speak to the issue do we look at existing behavior. Which is the case with namespaces and URI normalization :) Yes I am, that specification should standardize the behaviour of libraries and clients - the level of normalization in URIs published, consumed or used by these tools is often