Re: [whatwg] Link rot is not dangerous
On 20 May 2009, at 05:23, Tab Atkins Jr. wrote:

> Specifically, people can use a search engine to find information about foaf. I know that typing foaf into my browser's address bar and clicking on the first likely link is *way* faster than digging into a document with a foaf namespace declared, finding the URL, and copy/pasting that into the location bar.

FOAF is a very famous vocabulary, so this happens to work quite well for FOAF. Consider Dublin Core though. Typing dc into Google brings up results for DC Comics, DC Shoes, Washington DC and a file-sharing application called Direct Connect, all ahead of Dublin Core, which is the ninth result. Even if I spot that result, clicking through takes me to the Dublin Core Metadata Initiative's homepage, which is mostly full of conference and event information - not the definitions I'm looking for. On the other hand, typing http://purl.org/dc/terms/issued into my browser's address bar gives me an RDFS definition of the term immediately.

Your suggestion also makes the assumption that there is a single correct answer that Google/Yahoo/whatever could give to such a query - that any given string used as a prefix will only ever be legitimately bound to one vocabulary. That is simply not the case: dc, for example, is most often used with Dublin Core Elements 1.1, but is still occasionally seen as a prefix for the older 1.0 version, and is increasingly being used with the new Dublin Core Terms collection. While Elements 1.0 and 1.1 are largely compatible (the latter introduces two extra terms, IIRC), Dublin Core Terms has significant differences. bio is another string commonly bound to different vocabularies - both the biographical vocab often used in conjunction with FOAF, plus various life-science-related vocabularies.

--
Toby A Inkster
mailto:m...@tobyinkster.co.uk
http://tobyinkster.co.uk
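To make Toby's last point concrete, here is a minimal sketch of dereferencing that term URI programmatically, assuming Python with the rdflib library; rdflib's parse() performs the HTTP fetch, and the server is assumed to still answer with RDF, as described above:

    from rdflib import Graph, URIRef, RDFS

    # Dereference the term URI itself; per Toby, the server answers
    # with an RDFS definition of the term.
    term = URIRef("http://purl.org/dc/terms/issued")
    g = Graph()
    g.parse(str(term))

    # Pull out the human-readable parts of the definition.
    print(g.value(term, RDFS.label))
    print(g.value(term, RDFS.comment))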
Re: [whatwg] Link rot is not dangerous
Kristof Zelechovski wrote:

> Following the URL to discover the semantics of properties is not only useful but can also be necessary for CURIE, e.g. when the author uses a paradoxical prefix just for the fun of it. A language without CURIE would not expose the users to this necessity. If you have to CURIE, you have to FYN. Just my POV, Chris

CURIEs vs. URIs is only a syntactic difference; you don't need to FYN as long as you are happy with a URI as an identifier.

BR, Julian
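A minimal sketch of Julian's point that the difference is purely syntactic - expanding a CURIE is string concatenation against a per-document prefix map, with no network access involved (Python; the prefix bindings here are illustrative):

    # CURIE -> URI expansion: prefix + ':' + reference, nothing more.
    prefixes = {
        "dc": "http://purl.org/dc/terms/",
        "foaf": "http://xmlns.com/foaf/0.1/",
    }

    def expand_curie(curie):
        prefix, _, reference = curie.partition(":")
        return prefixes[prefix] + reference

    assert expand_curie("dc:issued") == "http://purl.org/dc/terms/issued"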
Re: [whatwg] Link rot is not dangerous
On Wed, May 20, 2009 at 2:35 AM, Toby A Inkster m...@tobyinkster.co.uk wrote:

> On 20 May 2009, at 05:23, Tab Atkins Jr. wrote:
>> Specifically, people can use a search engine to find information about foaf. I know that typing foaf into my browser's address bar and clicking on the first likely link is *way* faster than digging into a document with a foaf namespace declared, finding the URL, and copy/pasting that into the location bar.
>
> FOAF is a very famous vocabulary, so this happens to work quite well for FOAF. Consider Dublin Core though. Typing dc into Google brings up results for DC Comics, DC Shoes, Washington DC and a file-sharing application called Direct Connect, all ahead of Dublin Core, which is the ninth result. Even if I spot that result, clicking through takes me to the Dublin Core Metadata Initiative's homepage, which is mostly full of conference and event information - not the definitions I'm looking for. On the other hand, typing http://purl.org/dc/terms/issued into my browser's address bar gives me an RDFS definition of the term immediately.

As Kristof said, while typing dc isn't very helpful, typing pretty much any relevant property works great. dc:title, dc:creator, whatever. It all brings up some decent results right at the top of a Google search.

> Your suggestion also makes the assumption that there is a single correct answer that Google/Yahoo/whatever could give to such a query - that any given string used as a prefix will only ever be legitimately bound to one vocabulary. That is simply not the case: dc, for example, is most often used with Dublin Core Elements 1.1, but is still occasionally seen as a prefix for the older 1.0 version, and is increasingly being used with the new Dublin Core Terms collection. While Elements 1.0 and 1.1 are largely compatible (the latter introduces two extra terms, IIRC), Dublin Core Terms has significant differences. bio is another string commonly bound to different vocabularies - both the biographical vocab often used in conjunction with FOAF, plus various life-science-related vocabularies.

And yet, given an example use of the vocabulary, I'm quite certain I can easily find the page I want describing the vocab, even when there are overlaps in prefixes such as with bio. FYN is nearly never necessary for humans. We have the intelligence to craft search queries and decide which returned result is correct.

~TJ
Re: [whatwg] Link rot is not dangerous
On 20/5/09 22:54, Tab Atkins Jr. wrote:

> On Wed, May 20, 2009 at 2:35 AM, Toby A Inkster m...@tobyinkster.co.uk wrote:
>
> And yet, given an example use of the vocabulary, I'm quite certain I can easily find the page I want describing the vocab, even when there are overlaps in prefixes such as with bio. FYN is nearly never necessary for humans. We have the intelligence to craft search queries and decide which returned result is correct.

What happens in practice is that many of these perfectly intelligent humans ask in email or IRC questions that are clearly answered directly in the relevant documentation. You can lead humans to the documentation, but you can't make 'em read...

cheers,
Dan
Re: [whatwg] Link rot is not dangerous
On Wed, May 20, 2009 at 4:02 PM, Dan Brickley dan...@danbri.org wrote:

> On 20/5/09 22:54, Tab Atkins Jr. wrote:
>> On Wed, May 20, 2009 at 2:35 AM, Toby A Inkster m...@tobyinkster.co.uk wrote:
>>
>> And yet, given an example use of the vocabulary, I'm quite certain I can easily find the page I want describing the vocab, even when there are overlaps in prefixes such as with bio. FYN is nearly never necessary for humans. We have the intelligence to craft search queries and decide which returned result is correct.
>
> What happens in practice is that many of these perfectly intelligent humans ask in email or IRC questions that are clearly answered directly in the relevant documentation. You can lead humans to the documentation, but you can't make 'em read...

This is an unfortunate reality, and one which cannot be cured simply by embedding a URL reasonably close to the location. I humbly suggest using www.lmgtfy.com to chastise the lazy bums while remaining helpful. ^_^

~TJ
Re: [whatwg] Link rot is not dangerous
On Mon, May 18, 2009 at 7:26 AM, Henri Sivonen hsivo...@iki.fi wrote:

> On May 18, 2009, at 14:45, Dan Brickley wrote:
>> Since there is useful information to know about FOAF properties and terms from its schema and human-oriented docs, it would be a shame if people ignored that. Since domain names can be lost, it would also be a shame if directly de-referencing URIs to the schema was the only way people could find that info. Fortunately, neither is the case.
>
> I wasn't talking about people but about apps dereferencing NS URIs to enable their functionality.

Specifically, people can use a search engine to find information about foaf. I know that typing foaf into my browser's address bar and clicking on the first likely link is *way* faster than digging into a document with a foaf namespace declared, finding the URL, and copy/pasting that into the location bar.

There are always decent search terms around to help people find the information at least as easily, and certainly more reliably, than an embedded URL. The "just use a search engine" position has been brought up by Ian with respect to multiple cases in the overall discussion as well. For humans, search engines are just more reliable and easier to use than a URI (at least, a URI in a non-clickable context).

~TJ
Re: [whatwg] Link rot is not dangerous
On May 15, 2009, at 19:20, Manu Sporny wrote:

> There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location.

The "flawed conclusion" flows out of Follow Your Nose advocacy, and is not flawed if one takes Follow Your Nose seriously. It seems to me that the positions that RDF applications should Follow Their Nose and that link rot is not dangerous (to RDF) are contradictory positions. That link rot hasn't been a practical problem to the Semantic Web community suggests that applications don't really Follow Their Nose in practice.

Can anyone point me to a deployed end-user application that uses RDF internally and Follows Its Nose?

(For clarity: I'm not saying that link rot is dangerous to RDF apps. I'm saying that taking the position that it is not dangerous contradicts Follow Your Nose advocacy. I think Follow Your Nose is impractical on the Web scale and is alien to naming schemes used in technologies that have been successfully deployed on the Web scale [e.g. HTML, CSS, JavaScript, DOM and Unicode].)

> - RDFa parsers can be given an override list of legacy vocabularies that will be loaded from disk (from a cached copy).

"Cache" means that you can still go find the original and the cache is just nearer.

> If a cached copy of the vocabulary cannot be found, it can be re-created from scratch if necessary.

Do any end-user applications that use RDF internally provide a UI for installing local re-creations?

On May 15, 2009, at 20:25, Shelley Powers wrote:

> Also don't lose sight that this is really no more serious an issue than, say, a company originating com.sun.* being purchased by another company, named com.oracle.*. And you can't say, "Well, that's not the same", because it is.

It's not the same. A Java classloader doesn't Follow Its Nose. A classloader will find classes in my classpath even if there weren't a server at sun.com. Likewise, http://sun.com/foo RDF predicates would continue to work in applications that don't Follow Their Nose even if the server at sun.com disappeared.

However, if the com.sun.* classes were renamed to com.oracle.* and the com.sun.* copies withdrawn in a new release of a library, other classes that have been compiled against com.sun.* classes would cease to load. This is analogous to applications programmed to recognize http://web.resource.org/cc/* predicates not recognizing http://creativecommons.org/ns#* predicates. (You can't Follow Your Nose from the former to the latter, BTW.)

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
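A sketch of the kind of non-Follow-Your-Nose consumer Henri describes (Python; the field names and structure are hypothetical). It keeps working if web.resource.org vanishes, but a rename to the creativecommons.org URI is invisible to it:

    # Predicates are opaque keys here; nothing is ever dereferenced.
    KNOWN_PREDICATES = {
        "http://web.resource.org/cc/license": "license",
        # A renamed predicate such as http://creativecommons.org/ns#license
        # would need a new entry here; the app cannot discover it itself.
    }

    def handle_triple(subject, predicate, obj, record):
        field = KNOWN_PREDICATES.get(predicate)
        if field is not None:
            record[field] = obj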
Re: [whatwg] Link rot is not dangerous
On 18/5/09 10:34, Henri Sivonen wrote:

> On May 15, 2009, at 19:20, Manu Sporny wrote:
>> There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location.
>
> The "flawed conclusion" flows out of Follow Your Nose advocacy, and is not flawed if one takes Follow Your Nose seriously. It seems to me that the positions that RDF applications should Follow Their Nose and that link rot is not dangerous (to RDF) are contradictory positions.

That's a strong claim. There is certainly a balance to be found between taking advantage of de-referencable URIs and relying on their de-referencability. De-referencing is a privilege, not a right, after all.

If I lost control of xmlns.com tomorrow, and it became un-rescuably owned by offshore spam-virus-malware pirates, that doesn't change history. For nine years, the FOAF documentation has lived there, and we can use URIs to ask other services about what they saw during that period: http://web.archive.org/web/*/http://xmlns.com/foaf/0.1/

Since there is useful information to know about FOAF properties and terms from its schema and human-oriented docs, it would be a shame if people ignored that. Since domain names can be lost, it would also be a shame if directly de-referencing URIs to the schema was the only way people could find that info. Fortunately, neither is the case.

> That link rot hasn't been a practical problem to the Semantic Web community suggests that applications don't really Follow Their Nose in practice. Can anyone point me to a deployed end-user application that uses RDF internally and Follows Its Nose?

The search site sindice.com does this: "Yes, Sindice dereferences URIs it finds in RDF instance data, including class and property URIs. It performs OWL reasoning using the retrieved information, mostly to infer additional triples based on subclass and subproperty relationships. Doing this helps us to increase recall in queries." (from Richard Cyganiak, who I asked offlist for confirmation)

Whether you consider sindice.com end-user facing or not, I don't know. I put it in roughly the same category as Google's Social Graph API. But it's a non-trivial implementation that aggregates and integrates a lot of data.

BTW, here's another use case for identifying properties and classes by URI: we can decentralise the translation of their labels into other languages. Here are some Korean descriptions of FOAF, for example: http://svn.foaf-project.org/foaftown/foaf18n/foaf-kr.rdf

cheers,
Dan
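For what it's worth, the Internet Archive nowadays exposes an availability API that an application could use as a fallback when a namespace URI stops resolving. A minimal sketch in Python; whether any RDF consumer actually does this is another question, as Henri asks in the next message:

    import json
    from urllib.parse import quote
    from urllib.request import urlopen

    def wayback_snapshot(uri):
        """Return the URL of the closest archived copy of a URI, if any."""
        api = "https://archive.org/wayback/available?url=" + quote(uri, safe="")
        with urlopen(api) as resp:
            data = json.load(resp)
        closest = data.get("archived_snapshots", {}).get("closest")
        return closest["url"] if closest else None

    print(wayback_snapshot("http://xmlns.com/foaf/0.1/"))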
Re: [whatwg] Link rot is not dangerous
On May 18, 2009, at 14:45, Dan Brickley wrote:

> On 18/5/09 10:34, Henri Sivonen wrote:
>> It seems to me that the positions that RDF applications should Follow Their Nose and that link rot is not dangerous (to RDF) are contradictory positions.
>
> That's a strong claim. There is certainly a balance to be found between taking advantage of de-referencable URIs and relying on their de-referencability. De-referencing is a privilege, not a right, after all.

If there's value in apps dereferencing namespace URIs, those URIs going undereferencable leads to loss of value. Hence, link rot would cause loss of value, i.e. be 'dangerous', by breaking something.

> If I lost control of xmlns.com tomorrow, and it became un-rescuably owned by offshore spam-virus-malware pirates, that doesn't change history. For nine years, the FOAF documentation has lived there, and we can use URIs to ask other services about what they saw during that period: http://web.archive.org/web/*/http://xmlns.com/foaf/0.1/

Do any RDF consumer apps that dereference namespace URIs actually fall back on web.archive.org?

If I'm a FOAF author, what recourse do I have if URI dereferencing-based functionality breaks in some apps due to xmlns.com going unavailable, when other apps have hard-coded xmlns.com URIs, so that if I simply changed my predicates I'd break existing apps? At least authors who rely on Y!/AOL/Google serving JS libraries can start using a copy of any JS library on another CDN without changing how the script runs.

> Since there is useful information to know about FOAF properties and terms from its schema and human-oriented docs, it would be a shame if people ignored that. Since domain names can be lost, it would also be a shame if directly de-referencing URIs to the schema was the only way people could find that info. Fortunately, neither is the case.

I wasn't talking about people but about apps dereferencing NS URIs to enable their functionality.

>> That link rot hasn't been a practical problem to the Semantic Web community suggests that applications don't really Follow Their Nose in practice. Can anyone point me to a deployed end-user application that uses RDF internally and Follows Its Nose?
>
> The search site sindice.com does this:

Thanks.

> Whether you consider sindice.com end-user facing or not, I don't know.

I wouldn't characterize it as an end-user app. It exposes terms like "RDF" and "triples" and shows qnames to the user.

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
[whatwg] Link rot is not dangerous
Geoffrey Sneddon, Fri May 15 14:27:03 PDT 2009:

> On 15 May 2009, at 18:25, Shelley Powers wrote:
>> One of the very first uses of RDF, in RSS 1.0, for feeds, is still in existence, still viable. You don't have to take my word, check it out yourselves: http://purl.org/rss/1.0/
>
> Who actually treats RSS 1.0 as RDF? Every major feed reader just uses a generic XML parser for it (quite frequently a non-namespace-aware one) and just totally ignores any RDF-ness of it.

What does it mean to "treat as RDF"? An RSS 1.0 feed is essentially a stream of items that has been lifted from the page(s) and placed in an RDF/XML feed. When I read e.g. http://www.w3.org/2000/08/w3c-synd/home.rss in Safari, I can sort the news items according to date, source, title. Which means - I think - that Safari sees the feed as machine-readable. It is certainly possible to do more, I guess, and Safari does the same for non-RDF feeds, but still. And search engines should have the same opportunities w.r.t. creating indexes based on RSS 1.0 as on RDFa. (Though here, perhaps, the fact that search engines prefer to help us locate HTML pages rather than feeds gets in the way.)

--
leif halvard silli
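Leif's sorting example is easy to reproduce with a real RDF toolkit. A sketch using Python's rdflib, assuming the feed (like many RSS 1.0 feeds) uses the Dublin Core date module:

    from rdflib import Graph, Namespace, RDF

    RSS = Namespace("http://purl.org/rss/1.0/")
    DC = Namespace("http://purl.org/dc/elements/1.1/")

    g = Graph()
    g.parse("http://www.w3.org/2000/08/w3c-synd/home.rss", format="xml")

    # Each rss:item is an ordinary RDF resource; sorting news items by
    # date is just sorting triples.
    items = [(g.value(s, DC.date), g.value(s, RSS.title))
             for s in g.subjects(RDF.type, RSS.item)]
    for date, title in sorted(items, key=lambda pair: str(pair[0])):
        print(date, title)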
Re: [whatwg] Link rot is not dangerous (was: Re: Annotating structured data that HTML has no semantics for)
On 15 May 2009, at 17:20, Manu Sporny wrote:

> The argument that link rot would cause massive damage to the semantic web is just not true. Even if there is minor damage caused, it is fairly easy to recover from it, as outlined above.

I was talking about this recently somewhere (can't remember where). The RDF model is different from {key:value} models in that it has a third component - a subject. This means that while a description for http://xmlns.com/foaf/0.1/Person (which I'll refer to as 'foaf:Person' from now on, for brevity) can be found at the URL foaf:Person, it's also possible for descriptions of foaf:Person to be found elsewhere. While the description for foaf:Person at foaf:Person is clearly much easier to find than other descriptions of foaf:Person, under the RDF model they are all afforded equal weight.

If foaf:Person disappeared tomorrow, and even if I couldn't find an alternative source for that definition, the URI would still not be useless. I'd still know, say, that Toby Inkster is a foaf:Person, and Manu Sporny is a foaf:Person, and from that I'd be able to conclude that they're the same sort of thing in some way. Given enough instance data like that, I might even be able to analyse the instance data, looking at what all the instances of foaf:Person had in common, and rediscover the original definition of foaf:Person.

The ability to dereference an RDF class or property to discover more about it is very useful. A data format without that ability is all the poorer for not having it. But when that dereferencing fails, all is not lost. So when, in use cases, RDF fans talk about it being 'essential' to be able to follow their noses to definitions of terms, what is meant is that it is essential that a mechanism exists to enable this technique; it is not essential that the definitions are always found.

--
Toby A Inkster
mailto:m...@tobyinkster.co.uk
http://tobyinkster.co.uk
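Toby's "rediscover the definition from instance data" idea can be sketched mechanically (Python with rdflib; the input file is hypothetical): even if foaf:Person no longer dereferences, its instances still cluster, and tallying their predicates hints at what the class meant:

    from collections import Counter
    from rdflib import Graph, URIRef, RDF

    PERSON = URIRef("http://xmlns.com/foaf/0.1/Person")

    g = Graph()
    g.parse("harvested_instance_data.ttl")  # hypothetical local dump

    # Profile the predicates used on things typed as foaf:Person.
    profile = Counter()
    for person in g.subjects(RDF.type, PERSON):
        for predicate in g.predicates(subject=person):
            profile[predicate] += 1

    print(profile.most_common(10))  # e.g. foaf:name, foaf:knows, ...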
Re: [whatwg] Link rot is not dangerous
On May 15, 2009, at 11:08 PM, Leif Halvard Silli wrote:

> Geoffrey Sneddon, Fri May 15 14:27:03 PDT 2009:
>> On 15 May 2009, at 18:25, Shelley Powers wrote:
>>> One of the very first uses of RDF, in RSS 1.0, for feeds, is still in existence, still viable. You don't have to take my word, check it out yourselves: http://purl.org/rss/1.0/
>>
>> Who actually treats RSS 1.0 as RDF? Every major feed reader just uses a generic XML parser for it (quite frequently a non-namespace-aware one) and just totally ignores any RDF-ness of it.
>
> What does it mean to "treat as RDF"? An RSS 1.0 feed is essentially a stream of items that has been lifted from the page(s) and placed in an RDF/XML feed. When I read e.g. http://www.w3.org/2000/08/w3c-synd/home.rss in Safari, I can sort the news items according to date, source, title. Which means - I think - that Safari sees the feed as machine-readable. It is certainly possible to do more, I guess, and Safari does the same for non-RDF feeds, but still. And search engines should have the same opportunities w.r.t. creating indexes based on RSS 1.0 as on RDFa. (Though here, perhaps, the fact that search engines prefer to help us locate HTML pages rather than feeds gets in the way.)

Safari's underlying feed parsing code completely ignores the RDF nature of RSS 1.0. It parses it the same way as an RSS 2.0 or Atom feed, which is to say parsing as XML (possibly broken XML in the case of RSS variants) and then examining the parsed XML in a completely ad-hoc fashion.

Regards,
Maciej
Re: [whatwg] Link rot is not dangerous
Philip Taylor wrote:

> The source data is the list of common RDF namespace URIs at http://ebiquity.umbc.edu/resource/html/id/196/Most-common-RDF-namespaces from three years ago. Out of those 284:
>
> * 56 are 404s. (Of those, 37 end with '#', so that URI itself really ought to exist. In the other cases, it'd be possible that only the prefix+suffix URIs are meant to exist. Some of the cases are just typos, but I'm not sure how many.)
> * 2 are Forbidden. (Of those, 1 looks like a typo.)
> * 2 are Bad Gateway.
> * 22 could not connect to the server. (Of those, 2 weren't http:// URIs, and 1 was a typo. The others represent 13 different domains.)

While this analysis is interesting, looking at the 56 which 404, it doesn't seem like a massive loss to me. Some of them are clearly typos (e.g. DOAP and RSS syndication, which are both on the HTTP 200 and HTTP 3xx lists in their correct forms). In many cases I think you'll find that it's not that the link has rotted with time, but that there was *never* a file at the other end. Even the ones which are genuinely lost are probably only used by a handful of people.

The *really* commonly used URIs - RDF, RDFS, OWL, FOAF, Dublin Core (1.1 and Terms), RSS (1.0, plus commonly used modules), SKOS, SIOC, dbpedia, geo, Geonames, vCard and iCalendar - all seem to have been pretty stable so far. Judging the stability of RDF URIs by looking at the 284 most common namespace URIs is akin to judging the provision of light rail in British cities by looking at the UK's 284 most populated areas - the results would actually be more helpful if you restricted yourself to a smaller sample.

Lastly, the RDF model tends to be very resilient against loss of information anyway. Generally, data tends to be structured such that if a collection of triples is true, any subset is also true. So if the meaning of certain triples within a document is lost because of link rot, the document as a whole will probably still be useful.

--
Toby Inkster
m...@tobyinkster.co.uk
Re: [whatwg] Link rot is not dangerous
2009/5/16 Laurens Holst laurens.nos...@grauw.nl:

> Tab Atkins Jr. schreef:
>> Once you remove discovery as a strong requirement, then you remove the need for large urls, and that removes the need for CURIEs, or any other form of prefixing. You still want to uniquify your identifiers to avoid accidental clashes, but that's not that hard, nor is it absolutely necessary. The system can be robust and usable even with a bit of potential ambiguity if small authors design their private vocabs badly. As a bonus, everything gets simpler. Essentially it devolves into something relatively close to Ian's microdata proposal, perhaps with datatype added in (though I do question how necessary that is, given a half-intelligent parser can recognize things as numbers or dates).
>
> Ho, ho, you’re making a big leap there! By me explaining that dereferencible URIs are not needed to make RDF work on a core level, which makes RDF robust, do not jump to the conclusion that it is of no benefit! URIs are there for the benefit of linking, and help discoverability a lot (just like HTML hyperlinks do). Spidering the semantic web in a follow-your-nose style is effective. Incidentally, if an ontology disappears from its original address, this kind of spidering will likely lead you to a copy thereof stored elsewhere. For example on a different spider which has the triples cached.

You had just stated in the previous email, however, that few (if any) major consumers of RDFa *use* what is located on the far end of the URI. If they're not even paying attention to it, where is the value in it?

I don't really understand the 'discoverability' argument here, at least in the context of it being similar to HTML hyperlinks. Hyperlinks are useful for people because they make it simple to navigate to a new page. You just click and it works, no need to copypasta the address into a new browser window. I'm also not sure how a rotted link helps you compare vocabularies with other spiders, which in a hypothetical world you are communicating with (at this point we're *far* into theory, not practice). Any uniquifier would allow you to compare things in the same way, no?

> You are now only considering the ontologies, that is, types and properties. You’re forgetting (or ignoring) that in RDF, objects are also named with URIs so that data at other locations can refer to it. You know, that ‘web of linked data’ people refer to, core principle of RDF. No ‘simple’ scheme based on what Ian proposed can provide a sufficient level of uniqueness for that. URIs are the best and most natural fit for use as web-scale identifiers.

Define 'sufficient', as used here. I believe that this is an area where absolute uniqueness is not a requirement. Worst case, you get a little bit of data pollution with weird triples being produced by badly-written pages. Perhaps your browser offers to add an event to your calendar when no event shows up on the page, or a fraction of a search engine's microdata collection is spurious. Neither of these are big deals.

That being said, I agree that URIs provide a very convenient source of uniqueness. Ian's microdata allows them to be used either in normal form or in reverse-domain form; either way provides the necessary uniqueness.

> And then there is of course also the thing that there is already an existing framework, which has already been here for a long time, has had a lot of clever people work on it and is gaining in popularity, and here we have ‘HTML5’ wanting to reinvent the wheel and making an entirely new framework ‘just for them’. You’d think that of all places, in a standards body people would be compelled to adopt existing standards :).

There are compelling reasons to make any proposal *compatible* with RDF at the least. Ian's microdata does this, though not perfectly/completely.

I've said in another thread that I dislike *all* of the inline microdata proposals. RDFa sucks, Ian's microdata sucks, they all suck. They force structure completely inline, which solves what I feel is a minority issue (carrying microdata while copypasting source code) while introducing several larger downsides (carrying possibly *incorrect* microdata while copypasting source, duplication of meta structure when there is a regular page structure that can obviate this, etc.). It's the exact same problems that inline event handlers or inline @style attributes have. I think Ian is trying to limit the suckiness by at least making it as simple as possible to write. It's probably half as difficult or less to write properly, while solving 90% or more of the cases that RDFa does. This is an effort that I'm in favor of.

I won't be using RDF in my pages at all unless I know that I can use something like RDF-EASE or CRDF; they allow me to just write my page as normal, then specify what the page's data means in a separate file. Plus, honestly, CRDF's inline syntax seems just as expressive as microdata and RDFa, while
Re: [whatwg] Link rot is not dangerous
On 16 May 2009, at 07:08, Leif Halvard Silli wrote:

> Geoffrey Sneddon, Fri May 15 14:27:03 PDT 2009:
>> On 15 May 2009, at 18:25, Shelley Powers wrote:
>>> One of the very first uses of RDF, in RSS 1.0, for feeds, is still in existence, still viable. You don't have to take my word, check it out yourselves: http://purl.org/rss/1.0/
>>
>> Who actually treats RSS 1.0 as RDF? Every major feed reader just uses a generic XML parser for it (quite frequently a non-namespace-aware one) and just totally ignores any RDF-ness of it.
>
> What does it mean to "treat as RDF"? An RSS 1.0 feed is essentially a stream of items that has been lifted from the page(s) and placed in an RDF/XML feed. When I read e.g. http://www.w3.org/2000/08/w3c-synd/home.rss in Safari, I can sort the news items according to date, source, title. Which means - I think - that Safari sees the feed as machine-readable. It is certainly possible to do more, I guess, and Safari does the same for non-RDF feeds, but still. And search engines should have the same opportunities w.r.t. creating indexes based on RSS 1.0 as on RDFa. (Though here, perhaps, the fact that search engines prefer to help us locate HTML pages rather than feeds gets in the way.)

I mean using an RDF processor, and treating it as an RDF graph. Everything just creates from an XML stream (or object model) a bunch of items with a certain title, date, and description, and acts on that (and parses it out in a format-specific manner, so it creates the same sort of item for, e.g., Atom) — it doesn't actually use an RDF graph for it. If you can find any widely used software that actually treats it as an RDF graph, I'd be interested to know.

--
Geoffrey Sneddon
http://gsnedders.com/
http://simplepie.org/
Re: [whatwg] Link rot is not dangerous
Tab Atkins Jr. schreef:

>> Ho, ho, you’re making a big leap there! By me explaining that dereferencible URIs are not needed to make RDF work on a core level, which makes RDF robust, do not jump to the conclusion that it is of no benefit! URIs are there for the benefit of linking, and help discoverability a lot (just like HTML hyperlinks do). Spidering the semantic web in a follow-your-nose style is effective. Incidentally, if an ontology disappears from its original address, this kind of spidering will likely lead you to a copy thereof stored elsewhere. For example on a different spider which has the triples cached.
>
> You had just stated in the previous email, however, that few (if any) major consumers of RDFa *use* what is located on the far end of the URI. If they're not even paying attention to it, where is the value in it?

I said that the ontologies were not used by many RDF consumers. This is because they can be computationally expensive, especially for large data sets, not because they are useless.

I think the clearest way I can put this is by comparison: your argument is like arguing against XML or JSON schemas, concluding that because they are externally referenced and not used by most XML or JSON applications, they are useless, and in fact that XML and JSON themselves are useless. This is clearly false; removing a reference to a schema from a document, or a document not having a schema, does not make the document itself useless, nor the document format it is expressed in. Although RDF Schema and OWL are definitely part of the ‘RDF ecosystem’, they are built on top of the base RDF framework and are not in themselves required for RDF to function. However, the schema does provide a useful description of the document structures and has the ability to express certain semantics, and is thus a worthy technology in its own right.

> I don't really understand the 'discoverability' argument here, at least in the context of it being similar to HTML hyperlinks. Hyperlinks are useful for people because they make it simple to navigate to a new page. You just click and it works, no need to copypasta the address into a new browser window.

By what means the user dereferences the link is not relevant. What matters is that a URI is there, identifying a unique location on the World Wide Web, and thus contributing to the web of linked documents that we call the World Wide Web. Without links and URIs, there would be no ‘web’. There would be a big set of networked yet isolated computers that all live in their own walled garden. Links provide discoverability of data provided elsewhere, by indicating a location. Users can find other documents because of this. Search engines like Google can spider the web based on this. The Web of Linked Data is Tim Berners-Lee’s vision of a WWW for data.

> I'm also not sure how a rotted link helps you compare vocabularies with other spiders, which in a hypothetical world you are communicating with (at this point we're *far* into theory, not practice). Any uniquifier would allow you to compare things in the same way, no?

Just a simple rdfs:seeAlso statement referencing it in one single place will allow a spider to ‘follow its own nose’ and find the triples of the ontology in the republished location. This republication can be anywhere: a new ontology location, or a copy cached by another spider that republishes the triples it harvests on the web (such as archive.org [1]).

I agree we’re getting far into the theory-not-practice realm, which is why Shelley is right in saying that in practice vocabularies are served from a location that is well cared for, e.g. using services like purl to provide permanent URLs, or having a solid organisational backing, and Philip Taylor’s list [2] does not do much to discredit this.

[Side note: To point out some flaws in Philip’s list, many of the sites in his ‘404’ and ‘not responding’ lists are experimental URLs. Additionally, the list fails to list usage frequency. Finally, it does not (and can not, obviously) list whether there was any RDF Schema at those locations in the first place. Because, as I explained before, I can make up the following RDF triple right here on the spot, and there would be nothing wrong with it:

    _:a rdf:type <http://grauw.nl/rdf#Game>

The type referenced in this triple’s object has no ontology at that location. The fact that it is a type is inferred by it being referenced through rdf:type, and that is enough. There is no requirement that this type resolves into a document containing RDF Schema triples. A creative example of this on the list is “java:java.util.Date”.]

>> You are now only considering the ontologies, that is, types and properties. You’re forgetting (or ignoring) that in RDF, objects are also named with URIs so that data at other locations can refer to it. You know, that ‘web of linked data’ people refer to, core principle of RDF. No ‘simple’
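A minimal sketch of the rdfs:seeAlso indirection Laurens mentions above, using Python's rdflib (all URIs here are made up for illustration):

    from rdflib import Graph, RDFS, URIRef

    data = """
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    # A single statement pointing from the (possibly dead) ontology URI
    # to a republished copy elsewhere.
    <http://example.org/old/vocab#> rdfs:seeAlso <http://mirror.example.net/vocab-copy.rdf> .
    """

    g = Graph()
    g.parse(data=data, format="turtle")
    vocab = URIRef("http://example.org/old/vocab#")
    for alternative in g.objects(vocab, RDFS.seeAlso):
        print("a spider can follow its nose to:", alternative)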
[whatwg] Link rot is not dangerous (was: Re: Annotating structured data that HTML has no semantics for)
Kristof Zelechovski wrote:

> Therefore, link rot is a bigger problem for CURIE prefixes than for links.

There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location. This has also led to a false requirement that all vocabularies should be centralized.

Here's the fear: If a vocabulary document disappears for any reason, then the meaning of the vocabulary is lost and all triples depending on the lost vocabulary become useless.

That fear ignores the fact that we have a highly available document store available to us (the Web). Not only that, but these vocabularies will be cached (at Google, at Yahoo, at The Wayback Machine, etc.). IF a vocabulary document disappears - which is highly unlikely for popular vocabularies; imagine FOAF disappearing overnight - then there are alternative mechanisms to extract meaning from the triples that will be left on the web. Here are just two of the possible solutions to the problem outlined:

- The vocabulary is restored at another URL using a cached copy of the vocabulary. The site owner of the original vocabulary either re-uses the vocabulary, or re-directs the vocabulary page to another domain (somebody that will ensure the vocabulary continues to be provided - somebody like the W3C).

- RDFa parsers can be given an override list of legacy vocabularies that will be loaded from disk (from a cached copy). If a cached copy of the vocabulary cannot be found, it can be re-created from scratch if necessary. (See the sketch after this message.)

The argument that link rot would cause massive damage to the semantic web is just not true. Even if there is minor damage caused, it is fairly easy to recover from it, as outlined above.

-- manu

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
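Manu's second mechanism is simple to picture. A sketch of an override map for a parser, in Python with rdflib (the file names are hypothetical):

    from rdflib import Graph

    # Legacy vocabularies pinned to cached copies on disk.
    VOCAB_OVERRIDES = {
        "http://xmlns.com/foaf/0.1/": "cache/foaf.rdf",
        "http://purl.org/dc/terms/": "cache/dcterms.rdf",
    }

    def load_vocabulary(uri):
        g = Graph()
        source = VOCAB_OVERRIDES.get(uri)
        if source is not None:
            g.parse(source)   # local cached copy; immune to link rot
        else:
            g.parse(uri)      # fall back to dereferencing the live URI
        return g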
Re: [whatwg] Link rot is not dangerous
On 15/5/09 18:20, Manu Sporny wrote:

> Kristof Zelechovski wrote:
>> Therefore, link rot is a bigger problem for CURIE prefixes than for links.
>
> There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location. This has also led to a false requirement that all vocabularies should be centralized. [...] The argument that link rot would cause massive damage to the semantic web is just not true. Even if there is minor damage caused, it is fairly easy to recover from it, as outlined above.

A few other points:

1. It's for the community of vocabulary-creators to help each other out w.r.t. hosting/publishing these: I just nudged a friend to put another 5 years on the DNS rental for a popular namespace. I think we should put a bit more structure around these kinds of habits, so that popular namespaces won't drop off the Web through accident.

2. Digitally signing the schemas will become part of the story, I'm sure. While it's a bit fiddly, there are advantages to having other mechanisms beyond URI de-referencing for knowing where a schema came from.

3. Parties worried about external dependencies when using namespaces can always indirect through their own namespace, whose schema document can declare subclass/subproperty relations to other URIs (a sketch of this follows below).

cheers,
Dan
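Dan's point 3 in practice - a minimal sketch (Python with rdflib; the example.org namespace is hypothetical) of publishing your own term and tying it to the external vocabulary with a subproperty declaration, so your data stays interpretable even if the external namespace rots:

    from rdflib import Graph

    my_schema = """
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .

    # Our own term, declared in a namespace we control, mapped onto FOAF.
    <http://example.org/ns#fullName> rdfs:subPropertyOf foaf:name .
    """

    g = Graph()
    g.parse(data=my_schema, format="turtle")
    print(len(g))  # 1 triple: the indirection itself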
Re: [whatwg] Link rot is not dangerous (was: Re: Annotating structured data that HTML has no semantics for)
I understand that there are ways to recover resources that disappear from the Web; however, the postulated advantage of RDFa, "you can go see what it means", simply does not hold. The recovery mechanism, Web search/cache, would be as good for CURIE URLs as for domain prefixes.

Creating a redirect is not always possible, and the built-in redirect dictionary (CURIE catalog?) smells of a central repository. This is no better than public entity identifiers in XML.

Serving the vocabulary from one's own domain is not always possible, e.g. in the case of reader-contributed content, and only guarantees that the vocabulary will be alive while it is supported by the domain owner. (WHATWG wants HTML documents to be readable 1000 years from now.) It is not always practical either, as it could confuse URL-based tools that do not retrieve the resources referenced.

All this does not imply, of course, that RDFa is no good. It is only intended to demonstrate that the postulated advantage of the CURIE lookup is wishful thinking.

Best regards,
Chris
Re: [whatwg] Link rot is not dangerous
Dan Brickley wrote:

> On 15/5/09 18:20, Manu Sporny wrote:
>> Kristof Zelechovski wrote:
>>> Therefore, link rot is a bigger problem for CURIE prefixes than for links.
>>
>> There have been a number of people now that have gone to great lengths to outline how awful link rot is for CURIEs and the semantic web in general. This is a flawed conclusion, based on the assumption that there must be a single vocabulary document in existence, for all time, at one location. [...] The argument that link rot would cause massive damage to the semantic web is just not true. Even if there is minor damage caused, it is fairly easy to recover from it, as outlined above.
>
> A few other points:
>
> 1. It's for the community of vocabulary-creators to help each other out w.r.t. hosting/publishing these: I just nudged a friend to put another 5 years on the DNS rental for a popular namespace. I think we should put a bit more structure around these kinds of habits, so that popular namespaces won't drop off the Web through accident.
>
> 2. Digitally signing the schemas will become part of the story, I'm sure. While it's a bit fiddly, there are advantages to having other mechanisms beyond URI de-referencing for knowing where a schema came from.
>
> 3. Parties worried about external dependencies when using namespaces can always indirect through their own namespace, whose schema document can declare subclass/subproperty relations to other URIs.
>
> cheers,
> Dan

The most important point to take from all of this, though, is that link rot within the RDF world is an extremely rare and unlikely occurrence. I've been working with RDF for close to a decade, and link rot has never been an issue. One of the very first uses of RDF, in RSS 1.0, for feeds, is still in existence, still viable. You don't have to take my word, check it out yourselves: http://purl.org/rss/1.0/

Even if - and I want to strongly emphasize "if" - link rot does occur, both Manu and Dan have demonstrated multiple ways of ensuring that no meaning is lost, and nothing is broken. However, I hope that people are open enough to take away from their discussions that they are trying to treat this concern respectfully, and trying to demonstrate that there's more than one solution. Not that this forms a proof that "Oh my god, if we use RDF, we're doomed!"

Also don't lose sight that this is really no more serious an issue than, say, a company originating com.sun.* being purchased by another company, named com.oracle.*. And you can't say, "Well, that's not the same", because it is.

The only safe bet is to designate some central authority and give them power over every possible name. Then we run the massive risk of this system failing (and this applies to microdata's reverse DNS as well as RDF's URIs), or it being taken over by an entity that sees such a data store as a way to make a great profit. We also defeat the very principle on which semantic data on the web abides, and that's true whether you support microdata or RDF.

Shelley
Re: [whatwg] Link rot is not dangerous
Classes in com.sun.* are reserved for Java implementation details and should not be used by the general public. CURIE URLs are intended for general use. So I can say, "Well, it is not the same", because it is not.

Cheers,
Chris
Re: [whatwg] Link rot is not dangerous
Kristof Zelechovski wrote:

> I understand that there are ways to recover resources that disappear from the Web; however, the postulated advantage of RDFa, "you can go see what it means", simply does not hold.

This is a strawman argument - more below...

> All this does not imply, of course, that RDFa is no good. It is only intended to demonstrate that the postulated advantage of the CURIE lookup is wishful thinking.

That train of logic seems to falsely conclude that if something does not hold true 100% of the time, then it cannot be counted as an advantage. Example:

"Since the postulated advantage of RAID-5 is that a disk array is unlikely to fail due to a single disk failure, and since it is possible for more than one disk to fail before a recovery is complete, one cannot call running a disk array in RAID-5 mode an advantage over not running RAID at all (because failure is possible)."

or

"Since the postulated advantage of CURIEs is that you can go see what it means, and it is possible for a CURIE-defined URL to be unavailable, one cannot call it an advantage because it may fail."

There are two flaws in the premises and reasoning above, for the CURIE case:

- It is assumed that for something to be called an 'advantage' it must hold true 100% of the time.
- It is assumed that most proponents of RDFa believe that "you can go see what it means" holds at all times - one would have to be very deluded to believe that.

> The recovery mechanism, Web search/cache, would be as good for CURIE URLs as for domain prefixes. Creating a redirect is not always possible, and the built-in redirect dictionary (CURIE catalog?) smells of a central repository.

Why does having a file sitting on your local machine that lists alternate vocabulary files for CURIEs smell of a central repository? Perhaps you're assuming that the file would be managed by a single entity? If so, it wouldn't need to be, and that was not what I was proposing.

> Serving the vocabulary from one's own domain is not always possible, e.g. in the case of reader-contributed content,

This isn't clear - could you please clarify what you mean by "reader-contributed content"?

> and only guarantees that the vocabulary will be alive while it is supported by the domain owner.

This case and its solution were already covered previously. Again - if the domain owner disappears, the domain disappears, or the domain owner doesn't want to cooperate for any reason, one could easily set up an alternate URL and instruct the RDFa processor to re-direct any discovered CURIEs that match the old vocabulary to the new (referenceable) vocabulary.

> (WHATWG wants HTML documents to be readable 1000 years from now.)

Is that really a requirement? What about external CSS files that disappear? External Javascript files that disappear? External SVG files that disappear? All those have something to do with the document's human/machine readability. Why is HTML5 not susceptible to link rot in the same way that RDFa is susceptible to link rot? Also, why 1000 years? That seems a bit arbitrary. =P

> It is not always practical either, as it could confuse URL-based tools that do not retrieve the resources referenced.

Could you give an example of this that wouldn't be a bug in the dereferencing application? How could a non-dereference-able URL confuse URL-based tools?

-- manu

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
[whatwg] Link rot is not dangerous
Tab Atkins Jr. wrote:

> Reversed domains aren't *meant* to link to anything. They shouldn't be parsed at all. They're a uniquifier so that multiple vocabularies can use the same terms without clashing or ambiguity. The Microdata proposal also allows normal URLs, but they are similarly nothing more than a uniquifier.
>
> CURIEs, at least theoretically, *rely* on the prefix lookup. After all, how else can you tell that a given relation is really the same as, say, foaf:name? If the domain isn't available, the data will be parsed incorrectly. That's why link rot is an issue.

Where in the CURIE spec does it state or imply that, if a domain isn't available, the resulting parsed data will be invalid?

-- manu

--
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Re: [whatwg] Link rot is not dangerous
Serving the RDFa vocabulary from one's own domain is not always possible, e.g. when a reader of a Web site is encouraged to post a comment to the page she reads and her comment contains semantic annotations.

The probability of a URL becoming unavailable is much greater than that of both mirrored drives wearing out at the same time. (Data mirroring does not claim to protect from fire, water, high voltage, magnetic storms, earthquakes and the like; it only protects you from natural wear.) The probability of ultimately losing data stored in one copy is 1; the probability of a URL going down is close to 1. So RAID works in most cases; CURIE URLs do not (ultimately) work in most cases.

Disappearing CSS is not a problem for HTML because CSS does not affect the meaning of the page. Disappearing scripts are a problem for HTML, but they are not a problem for HTML *data*. In other words, script-generated content is not guaranteed to survive, and there is nothing we can do about that except issue a warning. Such content cannot be HTML-validated either. In general, scripts are best used (and intended) for behavior, not for creating content. External SVG files do not describe existing content; they *are* (embedded) content. If an HTML file disappears, it becomes unreadable as well, but that problem obviously cannot be solved from within HTML :-)

"HTML should be readable 1000 years from now" was an attempt to visualize the intention of persistence. It should not be understood as "best before", of course.

If the author chooses to create a redirect to a well-known vocabulary using a dependent vocabulary stored at his own site in order to prevent link rot, tools that recognize vocabulary URLs without reading the corresponding resources will be unable to recognize the author's intent; and for the tools that do read them, the original vocabulary will still be unavailable. So this method causes more problems than it solves.

Cheers,
Chris
Re: [whatwg] Link rot is not dangerous
Kristof Zelechovski wrote:

> Classes in com.sun.* are reserved for Java implementation details and should not be used by the general public. CURIE URLs are intended for general use. So I can say, "Well, it is not the same", because it is not.

But we're not dealing with Java anymore. We're dealing with using reversed DNS concatenated with some kind of default URI, to create some kind of bastardized URL, which actually is valid, though incredibly painful to see, and can be implied to actually take one to a web address. You don't have to take my word for it - check out Philip's testing demo for microdata. You get triples with the following:

    http://www.w3.org/1999/xhtml/custom#com.damowmow.cat

http://philip.html5.org/demos/microdata/demo.html#output_ntriples

Not only do you face problems with link rot, you also face a significant amount of confusion, as people look at that and go, "What the hell is that?" Oh, and you can say, "Well, but we don't _mean_ anything by it" - but what does that have to do with anything? People don't go running to the spec every time they see something. They look at this thing and think, "Oh, a link. I wonder where it goes." You go ahead and try it, and imagine for a moment the confusion when it goes absolutely nowhere. Except that I imagine the W3C folks are getting a little annoyed with the HTML WG now, for allowing this type of thing in, generating a whole bunch of 404 errors for the web master(s).

But hey, you've given me another idea. I think I'll create my own vocabulary items, with the reversed DNS http://www.w3.org/1999/xhtml/custom#com.sun.*. No, maybe http://www.w3.org/1999/xhtml/custom#com.opera.*. Nah, how about http://www.w3.org/1999/xhtml/custom#com.microsoft.*. Yeah, that's cool. And there is no mechanism in place to prevent this, because unlike regular URIs, where the domain is actually controlled by a specific entity, you've created the world-famous W3C fudge pot. Anything goes.

I can't wait for the lawsuits on this one. You think that cybersquatting is an issue on the web, or Facebook, or Twitter - wait until you see people use com.microsoft.*. Then there's the vocabulary that was created by foobar.com, that people think, "Hey, cool, I'll use that... whatever it is." After all, if you want to play with the RDF kids, your vocabularies have to be usable by other people. But Foobar takes a dive in the dot-com pool, and foobar.com gets taken over by a porn establishment. Yeah, I can't wait for people to explain that one to the boss. Just because it doesn't link won't mean it won't end up on Twitter as a big, huge joke.

If you want to find something to criticize, I think it's important to realize that hey, folks, you've just stepped over the line, and you're now in the Zone of Decentralization. Whatever impacts us, babes, impacts all of you. Because if you look at Philip's example, you're going to see the same set of vocabulary URIs we're using for RDF right now, as microdata uses our stuff, too. Including the links that are all trembling on the edge of self-implosion.

So the point of all of this is moot. But it was fun. Really fun. Have a great weekend.

Shelley
Re: [whatwg] Link rot is not dangerous
On Fri, May 15, 2009 at 6:25 PM, Shelley Powers shell...@burningbird.net wrote:

> The most important point to take from all of this, though, is that link rot within the RDF world is an extremely rare and unlikely occurrence.

That seems to be untrue in practice - see http://philip.html5.org/data/rdf-namespace-status.txt

The source data is the list of common RDF namespace URIs at http://ebiquity.umbc.edu/resource/html/id/196/Most-common-RDF-namespaces from three years ago. Out of those 284:

* 56 are 404s. (Of those, 37 end with '#', so that URI itself really ought to exist. In the other cases, it'd be possible that only the prefix+suffix URIs are meant to exist. Some of the cases are just typos, but I'm not sure how many.)
* 2 are Forbidden. (Of those, 1 looks like a typo.)
* 2 are Bad Gateway.
* 22 could not connect to the server. (Of those, 2 weren't http:// URIs, and 1 was a typo. The others represent 13 different domains.)

(For the URIs which returned Redirect responses, I didn't check what happens when you request the URI it redirected to, so there may be more failures.)

Over a quarter of the most common namespace URIs don't resolve successfully today, and most of those look like they should have resolved when they were originally used, so link rot seems to be common.

(Major vocabularies like RSS and FOAF are likely to exist for a long time, but they're the easiest cases to handle - we could just pre-define the prefixes rss: and foaf: and have a centralised database mapping them onto schemas/documentation/etc. It seems to me that URIs are most valuable to let any tiny group make one for their rarely-used vocabulary, and be guaranteed no name collisions without needing to communicate with a centralised registry to ensure uniqueness; but it's those cases that are most vulnerable to link rot, and in practice the links appear to fail quite often.)

(I'm not arguing that link rot is dangerous - just that the numbers indicate it's a common situation rather than an extremely rare exception.)

--
Philip Taylor
exc...@gmail.com
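Philip's survey is easy to approximate. A rough sketch in Python of categorising namespace URIs the way his list does (note that urlopen follows redirects automatically, which his check deliberately did not):

    import urllib.error
    import urllib.request

    def status_of(uri):
        """Categorise a namespace URI roughly the way the survey does."""
        try:
            with urllib.request.urlopen(uri, timeout=10) as resp:
                return "OK %d" % resp.status
        except urllib.error.HTTPError as e:
            return "HTTP %d" % e.code        # 404, 403, 502, ...
        except urllib.error.URLError:
            return "could not connect"

    for ns in ["http://xmlns.com/foaf/0.1/",
               "http://purl.org/dc/elements/1.1/"]:
        print(status_of(ns), ns)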
Re: [whatwg] Link rot is not dangerous
On Fri, May 15, 2009 at 1:32 PM, Manu Sporny mspo...@digitalbazaar.com wrote:

> Tab Atkins Jr. wrote:
>> Reversed domains aren't *meant* to link to anything. They shouldn't be parsed at all. They're a uniquifier so that multiple vocabularies can use the same terms without clashing or ambiguity. The Microdata proposal also allows normal URLs, but they are similarly nothing more than a uniquifier.
>>
>> CURIEs, at least theoretically, *rely* on the prefix lookup. After all, how else can you tell that a given relation is really the same as, say, foaf:name? If the domain isn't available, the data will be parsed incorrectly. That's why link rot is an issue.
>
> Where in the CURIE spec does it state or imply that, if a domain isn't available, the resulting parsed data will be invalid?

Assume a page that uses both foaf and another vocab that subclasses many foaf properties. Given working lookups for both, the RDF parser can determine that two entries with different properties are really 'the same', and hopefully act on that knowledge. If the second vocab 404s, that information is lost. The parser will then treat any use of that second vocab completely separately from the foaf, losing valuable semantic information.

(Please correct any misunderstandings I may be operating under; I'm not sure how competent parsers currently are, and thus how much they'd actually use a working subclassed relation.)

~TJ
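Tab's scenario can be made concrete with a toy one-step subproperty inference (Python with rdflib; the second vocabulary is hypothetical). The schema graph below is exactly what a 404 would deprive the parser of:

    from rdflib import Graph, Literal, Namespace, RDFS

    FOAF = Namespace("http://xmlns.com/foaf/0.1/")
    EX = Namespace("http://example.org/vocab#")  # the second vocab

    data = Graph()
    data.add((EX.alice, EX.fullName, Literal("Alice")))

    # Normally obtained by dereferencing the vocab URI; if that 404s,
    # this graph stays empty and the inference below never happens.
    schema = Graph()
    schema.add((EX.fullName, RDFS.subPropertyOf, FOAF.name))

    # Naive one-step RDFS subproperty inference.
    for prop, _, super_prop in schema.triples((None, RDFS.subPropertyOf, None)):
        for s, _, o in data.triples((None, prop, None)):
            data.add((s, super_prop, o))

    print((EX.alice, FOAF.name, Literal("Alice")) in data)  # True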
Re: [whatwg] Link rot is not dangerous
Philip Taylor wrote:

> On Fri, May 15, 2009 at 6:25 PM, Shelley Powers shell...@burningbird.net wrote:
>> The most important point to take from all of this, though, is that link rot within the RDF world is an extremely rare and unlikely occurrence.
>
> That seems to be untrue in practice - see http://philip.html5.org/data/rdf-namespace-status.txt
>
> [...]
>
> (I'm not arguing that link rot is dangerous - just that the numbers indicate it's a common situation rather than an extremely rare exception.)

Philip, I don't think the occurrence of link rot causing problems in the RDF world is all that common, but thanks for looking up this data. Actually, I will probably quote your info in my next writing at my weblog.

I'd like to be dropped from any additional emails in this thread. After all, I have it on good authority that I'm not open for rational discussion. So I'll leave this type of thing to you guys.

Thanks,
Shelley
Re: [whatwg] Link rot is not dangerous
The problem of cybersquatting on oblique domains is, I believe, described and addressed in the tag URI scheme definition [RFC 4151], which I think is something rather similar to the constructs used for HTML microdata. I think that document is relevant not only to this discussion but to the whole concept.

IMHO,
Chris
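For reference, a tag URI per RFC 4151 pins a name to whoever held a domain (or email address) on a given date, which is what sidesteps the domain-takeover problem Kristof mentions. A minimal sketch (all values invented):

    from datetime import date

    def mint_tag_uri(authority, minted, specific):
        """Build a tag: URI as defined by RFC 4151."""
        return "tag:%s,%s:%s" % (authority, minted.isoformat(), specific)

    print(mint_tag_uri("example.com", date(2009, 5, 15), "vocab/Person"))
    # -> tag:example.com,2009-05-15:vocab/Person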