Re: PURLs don't matter, at least in the LOD world
On Fri, Feb 17, 2012 at 7:02 PM, David Wood da...@3roundstones.com wrote:
> Given what I personally know of the state of US Government agencies, I'll take your bet whether the Web services of the Library of Congress or OCLC last longer :) You might look back at the tortured history of id.loc.gov before we agree to a figure.

At least w/ the tortured history of id.loc.gov and lcsh.info I was able to permanently redirect lcsh.info to the appropriate places on id.loc.gov when lcsh.info was slated for retirement. That way anybody who was scrubbing their links (notably the search engines more than the semantic web community) would have updated their links.

I'm with Hugh: putting all your identifier eggs in the basket of purl.org (or any 3rd party service) isn't an excuse for not thoughtfully managing your URL namespaces and DNS. Perhaps that's tilting at windmills, but so be it. In my opinion more still needs to be done to educate people about how web architecture actually works instead of getting them to invest in niche software solutions, maintained by a handful of people with consulting contracts on the line.

//Ed
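The permanent redirect Ed describes can be sketched roughly as follows. This is only an illustration: the two host names come from the message above, but the `RETIRED_HOSTS` table, the `permanent_redirect` function and the assumption that paths carry over unchanged are hypothetical, and the real mapping to "the appropriate places" on id.loc.gov was presumably more involved.

```python
from urllib.parse import urlsplit

# Hypothetical table: retired host -> base URL of its successor.
# The identity mapping of paths below is an assumption made for illustration.
RETIRED_HOSTS = {"lcsh.info": "http://id.loc.gov"}


def permanent_redirect(url):
    """Return (status, location) for a URL on a retired host, else None."""
    parts = urlsplit(url)
    new_base = RETIRED_HOSTS.get(parts.hostname)
    if new_base is None:
        return None  # host is not retired, serve it normally
    location = new_base + parts.path
    if parts.query:
        location += "?" + parts.query
    # 301 tells clients and crawlers that the move is permanent,
    # so they can update their stored links.
    return 301, location


print(permanent_redirect("http://lcsh.info/some/path"))
```

The point of answering with 301 rather than a temporary redirect is the one made above: anybody scrubbing their links gets a signal to rewrite them.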
Re: PURLs don't matter, at least in the LOD world
On 2/19/12 8:21 AM, Ed Summers wrote:
> On Fri, Feb 17, 2012 at 7:02 PM, David Wood da...@3roundstones.com wrote:
>> Given what I personally know of the state of US Government agencies, I'll take your bet whether the Web services of the Library of Congress or OCLC lasts longer :) You might look back at the tortured history of id.loc.gov before we agree to a figure.
>
> At least w/ the tortured history of id.loc.gov and lcsh.info I was able to permanently redirect lcsh.info to the appropriate places on id.loc.gov when lcsh.info was slated for retirement. That way anybody who was scrubbing their links (notably the search engines more than the semantic web community) would have updated their links.
>
> I'm with Hugh, putting all your identifier eggs in the basket of purl.org (or any 3rd party service) isn't an excuse for not thoughtfully managing your URL namespaces and DNS. Perhaps that's tilting at windmills, but so be it. In my opinion more still needs to be done to educate people about how web architecture actually works instead of getting them to invest in niche software solutions, maintained by a handful of people with consulting contracts on the line.
>
> //Ed

+1

--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Re: PURLs don't matter, at least in the LOD world
On 17/02/12 21:08, Kingsley Idehen wrote:
> On 2/17/12 2:18 PM, David Booth wrote:
>> On Fri, 2012-02-17 at 18:48, Hugh Glaser wrote:
>>> [ . . . ]
>>> What happens if I have http://purl.org/dbpedia/Tokyo, which is set to go to http://dbpedia.org/resource/Tokyo? I have (a), (b) and (c) as before. Now if dbpedia.org goes Phut!, we are in exactly the same situation - (b) gets lost.
>>
>> No, the idea is that the administrator for http://purl.org/dbpedia/ updates the redirect, to point to whatever new site is hosting the dbpedia data, so the http://purl.org/dbpedia/Tokyo still works.
>
> David,
> But any admin that oversees a DNS server can do the same thing. What's special about purl in this context?

Precisely that they don't require an admin with power over the DNS registration :)

To me the PURL design pattern is about delegation of authority and it's an important pattern. Two specific use cases at different extremes:

(1) An individual is creating a small vocabulary that they would like to see used widely but doesn't have a nice brand-neutral stable domain of their own to use for the purpose. This one has already been covered in the discussion.

(2) I'm a big organization, say the UK Government. I want to use a particular domain (well, a set of subdomains) for publishing my data, say *.data.gov.uk. The domain choice is important - it has credibility and promises long term stability. Yet I want to decentralize the publication itself: I want different departments and agencies to publish data and identifiers within the subdomains. The subdomains are supposed to be organization-neutral, yet the people doing the publication will be based in specific organizations. The PURL design pattern (though not necessarily the specific PURL implementation) is an excellent way to manage the delegation that makes that possible.

So my summary answer to Hugh is - they are much more important to the publisher than to the consumer.

Dave
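As a rough illustration of the delegation pattern Dave describes, here is a toy in-memory registry in which each path prefix has its own maintainers, who can retarget that prefix without touching DNS. Everything in it (the `DelegatedRedirects` class, its method names, the maintainer "alice" and the mirror.example host) is hypothetical and is not how purl.org itself is implemented.

```python
class DelegatedRedirects:
    """Toy registry: each path prefix is managed by its own maintainers."""

    def __init__(self):
        self._maintainers = {}  # prefix -> set of maintainer names
        self._targets = {}      # prefix -> target base URL

    def delegate(self, prefix, maintainer, target):
        """Done once by the registry operator: hand a prefix to a maintainer."""
        self._maintainers.setdefault(prefix, set()).add(maintainer)
        self._targets[prefix] = target

    def retarget(self, prefix, who, new_target):
        """Done by the maintainer whenever the data moves; no DNS access needed."""
        if who not in self._maintainers.get(prefix, set()):
            raise PermissionError(f"{who} does not maintain {prefix}")
        self._targets[prefix] = new_target

    def resolve(self, path):
        """Return the redirect target for a path, or None if no prefix matches."""
        matches = [p for p in self._targets if path.startswith(p)]
        if not matches:
            return None
        prefix = max(matches, key=len)  # longest matching prefix wins
        return self._targets[prefix] + path[len(prefix):]


registry = DelegatedRedirects()
registry.delegate("/dbpedia/", "alice", "http://dbpedia.org/resource/")
print(registry.resolve("/dbpedia/Tokyo"))
registry.retarget("/dbpedia/", "alice", "http://mirror.example/resource/")
print(registry.resolve("/dbpedia/Tokyo"))
```

The identifier seen by consumers (the path under the registry) never changes; only the maintainer of the prefix, not the holder of the domain registration, has to act when the data moves.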
Re: PURLs don't matter, at least in the LOD world
A quick related question - does anyone know the status of purl.oclc.org? There was a point in time where the service suggested that this new hostname was going to be the new proper host for purl.org URLs. I hope they have abandoned this idea, as one sure way to affect URL longevity is to include an organisational brand in it ;)

Ben

On Feb 18, 2012 1:02 PM, Dave Reynolds dave.e.reyno...@gmail.com wrote:
> [ . . . ]
RE: PURLs don't matter, at least in the LOD world
Ben,

purl.oclc.org is a DNS alias for purl.org and has been since the beginning. There are several others. These domain names work the same from an HTTP protocol POV, but if you're using them as identifiers in RDF don't assume they are interchangeable.

Jeff

From: Ben O'Steen [mailto:bost...@gmail.com]
Sent: Saturday, February 18, 2012 9:19 AM
To: Dave Reynolds
Cc: public-lod@w3.org
Subject: Re: PURLs don't matter, at least in the LOD world

> [ . . . ]
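Jeff's caution about host aliases can be shown with a small sketch. RDF identifiers are compared as strings, so two URLs that reach the same server are still two different identifiers. The data here is made up for illustration: the purl.org/dbpedia/Tokyo URI is the hypothetical one used elsewhere in this thread, and `names_for` is a helper invented for this example.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# A tiny in-memory set of (subject, predicate, object) triples.
triples = {
    ("http://purl.org/dbpedia/Tokyo", FOAF_NAME, "Tokyo"),
}


def names_for(subject):
    """Return the foaf:name values recorded for exactly this subject URI."""
    return sorted(o for s, p, o in triples if s == subject and p == FOAF_NAME)


# Same path, aliased host: equivalent over HTTP, distinct as RDF identifiers.
print(names_for("http://purl.org/dbpedia/Tokyo"))
print(names_for("http://purl.oclc.org/dbpedia/Tokyo"))
```

Data published under one host name is simply not found when queried under the other, unless someone also asserts that the two identifiers denote the same thing.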
Re: PURLs don't matter, at least in the LOD world
On 2/18/12 7:57 AM, Dave Reynolds wrote:
> [ . . . ]
> The PURL design pattern (though not necessarily the specific PURL implementation) is an excellent way to manage the delegation that makes that possible.
>
> So my summary answer to Hugh is - they are much more important to the publisher than to the consumer.
>
> Dave

Dave,

Don't publishers need to have admin access en route to exploiting the delegation services at purl.org? By this I mean: we are moving from DNS admin to purl.org service admin, per account. At some point in the Linked Data publishing value chain we always hit the matter of admin-level privileges :-)

Ultimately, I believe this issue is one resolved in the Read-Write Web realm, where folks control their own data spaces and use those data spaces as launchpads for their Linked Data publishing -- ditto the declaration of identity claims. Thus, instead of depending on a single delegation service like purl.org, we end up with a federation of individually controlled Linked Data spaces.

I've put out a number of demos that showcase declaration of verifiable identity claims via blog posts, tweets, simple HTML docs etc. These identity claims are held in profile documents that are really conduits to Linked Data spaces. Each of these spaces is endowed with its own proxy/wrapper URI capability, which enables the kind of Linked Data graph portability that folks ultimately expect of a federated Web.

Anyway, I agree that purl.org serves a purpose (for sure) on the publishing side. At the same time, I remain unconvinced about the longevity and uniqueness of its value with regards to Linked Data.

--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
PURLs don't matter, at least in the LOD world
(Sorry if there is a paper/discussion on this that I have missed somewhere. And I may have some of this wrong, as I have essentially not used PURLs.)

M Scott Marshall and others' comments have prompted me to put pen to paper and ask what the list thinks on this.

It has long puzzled me why people seem to think that PURLs (and Handles, etc.) solve some actual problem. I am leaving aside the question of whether a PURL actually adds extra fragility, since it depends on purl.org continuing to exist. (Personally I would bet the Library of Congress will last longer than purl.org, but I would have to wait too long to collect on the bet to make it worthwhile.)

In the Linked Data world, at least, what does a PURL give protection from?

Let's say I have http://dbpedia.org/resource/Tokyo. I can:
a) Use the URI without any URI resolution at all, and it is really useful to do so (as commented, foaf:name is used a lot, and it does not depend on anything being at the other end to resolve to);
b) Resolve it to find out what DBpedia thinks it means (returned as RDF);
c) Use it as an ID for another source to find out what that other source thinks it means.

Now let's say dbpedia.org goes Phut! What I lose is facility (b).

What happens if I have http://purl.org/dbpedia/Tokyo, which is set to go to http://dbpedia.org/resource/Tokyo? I have (a), (b) and (c) as before. Now if dbpedia.org goes Phut!, we are in exactly the same situation - (b) gets lost.

Both of these situations can be fixed by persuading someone (the registrar for dbpedia.org or the purl.org organisers respectively) to allow someone else to take over dbpedia.org or purl.org/dbpedia respectively. But once dbpedia.org goes Phut!, you get a dead link whatever you do, until someone takes it over. Not much to be gained for the overheads of having the purl?

I can see that in the Web of Text, a URI that has gone 404 is rather painful. And I know that people who have curated data find dying links painful, and seem to find Handles etc some sort of comfort for their concerns, even though they don't necessarily solve the perceived problem, in my view. But in the Web of Data, given a good guess at somewhere else (such as the LoC, or even the Virtuoso endpoint or sameAs.org), I stand a good chance of finding a skos:*Match or even an owl:sameAs that will get me back on track again.

Is there something I am missing about PURLs?

Best
Hugh

--
Hugh Glaser,
Web and Internet Science
Electronics and Computer Science,
University of Southampton,
Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
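The recovery route Hugh mentions, finding an owl:sameAs link in some other source once the original host is gone, might look roughly like this. It is a sketch over an in-memory list of triples: the `equivalents` helper and the example.org URI are hypothetical, and a real lookup would query a service or endpoint rather than a local list.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def equivalents(uri, triples):
    """Return URIs linked to `uri` by owl:sameAs in either direction (one hop)."""
    found = set()
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            continue
        if s == uri:
            found.add(o)
        elif o == uri:
            found.add(s)
    return sorted(found)


# Hypothetical statements held by a third-party source that is still online.
third_party = [
    ("http://example.org/place/tokyo", OWL_SAME_AS,
     "http://dbpedia.org/resource/Tokyo"),
    ("http://example.org/place/tokyo",
     "http://xmlns.com/foaf/0.1/name", "Tokyo"),
]

print(equivalents("http://dbpedia.org/resource/Tokyo", third_party))
```

This is facility (c) from the list above doing the work: the dead URI is still usable as a key into somebody else's data, which can then supply a live equivalent to resolve instead.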
Re: PURLs don't matter, at least in the LOD world
On Fri, 2012-02-17 at 18:48, Hugh Glaser wrote:
> [ . . . ]
> What happens if I have http://purl.org/dbpedia/Tokyo, which is set to go to http://dbpedia.org/resource/Tokyo? I have (a), (b) and (c) as before. Now if dbpedia.org goes Phut!, we are in exactly the same situation - (b) gets lost.

No, the idea is that the administrator for http://purl.org/dbpedia/ updates the redirect, to point to whatever new site is hosting the dbpedia data, so the http://purl.org/dbpedia/Tokyo still works.

--
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily reflect those of his employer.
RE: PURLs don't matter, at least in the LOD world
Hugh,

I commonly use PURLs when I'm modeling RDF vocabularies as described here:

http://www.w3.org/TR/swbp-vocab-pub/#purls

This allows me to prototype the vocabulary on my workstation without concern for where it ultimately ends up. Any instance data I generate along the way will remain unaffected since I've used PURL as the vocabulary namespace.

Jeff

-----Original Message-----
From: Hugh Glaser [mailto:h...@ecs.soton.ac.uk]
Sent: Friday, February 17, 2012 1:48 PM
To: public-lod@w3.org
Subject: PURLs don't matter, at least in the LOD world

> [ . . . ]
Re: PURLs don't matter, at least in the LOD world
On 17 Feb 2012, at 19:18, David Booth wrote:
> On Fri, 2012-02-17 at 18:48, Hugh Glaser wrote:
>> [ . . . ]
>> What happens if I have http://purl.org/dbpedia/Tokyo, which is set to go to http://dbpedia.org/resource/Tokyo? I have (a), (b) and (c) as before. Now if dbpedia.org goes Phut!, we are in exactly the same situation - (b) gets lost.
>
> No, the idea is that the administrator for http://purl.org/dbpedia/ updates the redirect, to point to whatever new site is hosting the dbpedia data, so the http://purl.org/dbpedia/Tokyo still works.

Thanks David. But someone has to persuade the administrator to do that.

If the owner of http://purl.org/dbpedia/ and dbpedia.org is co-operative, then all is good. But then it is a similar challenge to provide a redirect at dbpedia.org or re-assign the dbpedia.org domain, so not much gain there. And this is the situation that big organisations will usually be in, in relation to their own URIs.

If the owner is not co-operative, then I am guessing (sorry, I can't find it in the documentation) that it is quite a challenge to get the administrator to re-assign a PURL domain to someone else. But of course it is quite a challenge to get a domain registrar to move a domain registration - I am sure it is actually harder, so some gain.

But overall, I still can't really see the effort is worth the candle, certainly for big organisations that have expectation of longevity.

Best
Hugh

> --
> David Booth, Ph.D.
> http://dbooth.org/
>
> Opinions expressed herein are those of the author and do not necessarily reflect those of his employer.

--
Hugh Glaser,
Web and Internet Science
Electronics and Computer Science,
University of Southampton,
Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
Re: PURLs don't matter, at least in the LOD world
Hi Jeff,

On 17 Feb 2012, at 19:24, Young,Jeff (OR) wrote:
> Hugh,
>
> I commonly use PURLs when I'm modeling RDF vocabularies as described here:
>
> http://www.w3.org/TR/swbp-vocab-pub/#purls

(Yes, I was avoiding commenting on the 302 problem. :-) )

> This allows me to prototype the vocabulary on my workstation without concern for where it ultimately ends up. Any instance data I generate along the way will remain unaffected since I've used PURL as the vocabulary namespace.

Ah yes, thanks. I remember someone saying they did this. I can see the advantage where one doesn't have a target domain in mind during development.

Best
Hugh

> Jeff
>
> -----Original Message-----
> From: Hugh Glaser [mailto:h...@ecs.soton.ac.uk]
> Sent: Friday, February 17, 2012 1:48 PM
> To: public-lod@w3.org
> Subject: PURLs don't matter, at least in the LOD world
>
> [ . . . ]

--
Hugh Glaser,
Web and Internet Science
Electronics and Computer Science,
University of Southampton,
Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
Re: PURLs don't matter, at least in the LOD world
Hi Hugh,

I can only speak for one case, some PURLs I have maintained for the last 5 years - the Personally Identifiable Information Namespace http://purl.org/pii/terms/#

There are 16 terms. The use for the terms is in discovery, as a penultimate node to rdf:nil in Lists and Collections.

Imagine you have searched a dataset repository catalog for Assets (e.g. ADMS). The list is finite, but the subject you were searching for is not. The end of the list is not rdf:nil, it is a subject token PII Term. rdf:nil has the meaning that you have searched the entire universe. Public Social Networks and Repositories may not contain searchable PII, but as businesses they all have customer lists which can and do contain PII - but you are not able to search those. The point is often blurred, because search engines return thousands of links.

--Gannon

From: Hugh Glaser h...@ecs.soton.ac.uk
To: public-lod@w3.org public-lod@w3.org
Sent: Friday, February 17, 2012 12:48 PM
Subject: PURLs don't matter, at least in the LOD world

> [ . . . ]
Re: PURLs don't matter, at least in the LOD world
On 2/17/12 1:48 PM, Hugh Glaser wrote:
> [ . . . ]
> But in the Web of Data, given a good guess at somewhere else (such as the LoC, or even the Virtuoso endpoint or sameAs.org), I stand a good chance of finding a skos:*Match or even an owl:sameAs that will get me back on track again.
>
> Is there something I am missing about PURLs?
>
> Best
> Hugh

+1

Nice value proposition articulation re. Web of Linked Data :-)

--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Re: PURLs don't matter, at least in the LOD world
On 2/17/12 2:18 PM, David Booth wrote:

On Fri, 2012-02-17 at 18:48 +, Hugh Glaser wrote:
[ . . . ]
What happens if I have http://purl.org/dbpedia/Tokyo, which is set to go to http://dbpedia.org/resource/Tokyo? I have (a), (b) and (c) as before. Now if dbpedia.org goes Phut!, we are in exactly the same situation - (b) gets lost.

No, the idea is that the administrator for http://purl.org/dbpedia/ updates the redirect to point to whatever new site is hosting the dbpedia data, so that http://purl.org/dbpedia/Tokyo still works.

David,

But any admin that oversees a DNS server can do the same thing. What's special about PURLs in this context? Remember, DBpedia URIs are live, so the most important issues are:
1. domain ownership
2. DNS admin re. IP address mapping
3. reconstitution of the linked data sets to which the DBpedia URIs resolve.

--
Regards,
Kingsley Idehen
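The two remedies being compared in this exchange can be put side by side in a small model. This is a sketch under stated assumptions, not how purl.org or DNS is actually implemented: resolution is reduced to a redirect table plus a set of hosts that still answer, and the mirror.example.org host is hypothetical.

```python
from urllib.parse import urlsplit


def resolve(uri, redirects, live_hosts, max_hops=5):
    """Follow redirects hop by hop; return the final URI, or None for a dead link."""
    for _ in range(max_hops):
        if urlsplit(uri).netloc not in live_hosts:
            return None  # nothing answers at this host any more
        if uri not in redirects:
            return uri  # this host serves the resource itself
        uri = redirects[uri]
    return None  # too many hops; treat as unresolvable


PURL = "http://purl.org/dbpedia/Tokyo"  # the example PURL used in the thread
DIRECT = "http://dbpedia.org/resource/Tokyo"
MIRROR = "http://mirror.example.org/resource/Tokyo"  # hypothetical new host

# dbpedia.org goes Phut!: the PURL and the direct URI are both dead links.
redirects = {PURL: DIRECT}
live = {"purl.org", "mirror.example.org"}
before_fix = (resolve(PURL, redirects, live), resolve(DIRECT, redirects, live))

# David Booth's remedy: the PURL administrator repoints the redirect.
redirects[PURL] = MIRROR
after_purl_fix = (resolve(PURL, redirects, live), resolve(DIRECT, redirects, live))

# Kingsley's remedy: whoever controls the dbpedia.org domain maps it to a machine
# hosting the reconstituted data, modelled here as dbpedia.org answering again.
redirects_dns = {PURL: DIRECT}
live_dns = {"purl.org", "dbpedia.org"}
after_dns_fix = (
    resolve(PURL, redirects_dns, live_dns),
    resolve(DIRECT, redirects_dns, live_dns),
)
```

In this model the PURL update repairs only links that go through the PURL, while the DNS remap repairs both kinds of link but requires control of the domain, which is where the questions of domain ownership and co-operation raised in the thread come in.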
Re: PURLs don't matter, at least in the LOD world
On Fri, Feb 17, 2012 at 8:51 PM, Hugh Glaser h...@ecs.soton.ac.uk wrote:

On 17 Feb 2012, at 19:18, David Booth wrote:

On Fri, 2012-02-17 at 18:48 +, Hugh Glaser wrote:
[ . . . ]
What happens if I have http://purl.org/dbpedia/Tokyo, which is set to go to http://dbpedia.org/resource/Tokyo? I have (a), (b) and (c) as before. Now if dbpedia.org goes Phut!, we are in exactly the same situation - (b) gets lost.

No, the idea is that the administrator for http://purl.org/dbpedia/ updates the redirect, to point to whatever new site is hosting the dbpedia data, so the http://purl.org/dbpedia/Tokyo still works.

Thanks David. But someone has to persuade the administrator to do that.

If the owner of http://purl.org/dbpedia/ and dbpedia.org is co-operative, then all is good. But then it is a similar challenge to provide a redirect at dbpedia.org or re-assign the dbpedia.org domain, so not much gain there. And this is the situation that big organisations will usually be in, in relation to their own URIs.

If the owner is not co-operative, then I am guessing (sorry, I can't find it in the documentation) that it is quite a challenge to get the administrator to re-assign a PURL domain to someone else. But of course it is quite a challenge to get a domain registrar to move a domain registration - I am sure it is actually harder, so some gain.

Hugh - you are responding to a scenario that I wasn't proposing. Mea culpa. I'm afraid that I never stated the use of PURLs that I had in mind. Let me try to briefly explain the motivation of SharedNames, keeping in mind that it's late on a Friday night here in NL and SharedNames was established in 2008 by a number of people. See http://sharedname.org/ for more info.

Oh dear. Don't know if I'm up for explaining SharedNames in a nutshell. Ok, here goes..

SharedNames is based on the notion of federated PURL servers under control of the URI owners.

Once upon a time..

In the land of Bio-identifiers, there are many valuable database resources (see the annual survey of Nucleic Acids Research where they count 1800) and LOD-savvy bio(medical)informatics practitioners want to refer unambiguously to the information records with URIs - preferably using the same identifiers for common types of records about genes, proteins, compounds, etc. to save a load of mapping/translation. We faced a few problems:

1) Key database maintainers were not aware of the wonders of linked data. That would explain why you wouldn't want to wait to see what kind of URI policies came out of institutions with varying levels of W3C expertise and motivations.

2) URIs were consistently being used as advertisements or name branding (and still are unfortunately - in some cases politicising data ! but that's another story). Understandably so. However, it is also understandable when things change drastically that the domain owner has other priorities than keeping your identifiers valid.

Strategy:

From http://sharedname.org/page/Project_overview:
Control of shared URIs should be in the hands of those who depend on them. This is the best way to ensure that the URIs serve the community in the ways listed above. With few exceptions, biodata providers have not had the resources to create either RDF renderings or URIs for their data. We want well-formed and well-behaved URIs and we want them now.

So, we suggested creating a vendor-neutral namespace of URIs backed up by a federation of people/organizations/servers. So, we set out to create a *federation* of organizations with governance for a *federation* of PURL servers. In such a federation, if any member stepped out or a server failed, it wouldn't disable the other servers and the organization would continue. Federated PURLz software would be required so we set about outlining the requirements and contracted the software. The software is in beta testing although I don't know if it is actively being tested now.

But overall, I still can't really see the effort is worth the candle, certainly for big organisations that have expectation of longevity.

I hope that LOC is the epitome of longevity but we've all seen very large organisations and institutions and even countries both change their names and ownership, as well as disappear from all levels of society. Would examples help? Sun swallowed by Oracle. Genentech swallowed by Roche. There are many corporate examples as well as governmental examples. Some example changes from the bio domain: Locuslink to Entrez Gene, Swissprot to Uniprot, ..

-Scott
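The federation idea Scott describes, where one member stepping out or one server failing does not disable the rest, might look something like the sketch below. It is an illustration only and makes no claim about how the federated PURLz software works: the PurlServer class, the member names, the /gene/1017 path and the example.org target are all hypothetical, and each member is assumed to hold a full copy of the same redirect table.

```python
class PurlServer:
    """A hypothetical federation member holding a copy of the redirect table."""

    def __init__(self, name, table, up=True):
        self.name = name
        self.table = dict(table)
        self.up = up

    def lookup(self, path):
        """Return the target for `path`, or None; raise if the server is down."""
        if not self.up:
            raise ConnectionError(f"{self.name} is not responding")
        return self.table.get(path)


def federated_lookup(path, servers):
    """Ask each member in turn and return the first answer; None if nobody has one."""
    for server in servers:
        try:
            target = server.lookup(path)
        except ConnectionError:
            continue  # this member has failed or left; try the next one
        if target is not None:
            return target
    return None


# Hypothetical shared table replicated to every member.
TABLE = {"/gene/1017": "http://example.org/records/gene/1017"}

members = [
    PurlServer("member-a", TABLE, up=False),  # failed or withdrawn member
    PurlServer("member-b", TABLE),
    PurlServer("member-c", TABLE),
]

resolved = federated_lookup("/gene/1017", members)
```

The sketch leaves out the hard parts Scott mentions, namely governance and keeping the members' tables in sync; it only shows why losing one member need not break resolution.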
Re: PURLs don't matter, at least in the LOD world
Many thanks Scott.

I hope you realised that I didn't want to imply you were describing any particular scenario with respect to PURL - it was more your comments sparked something off for me. And thank you for your late-night efforts to describe the SharedNames scenario! It is very helpful; I had tried to get my head round it from the site, you give quite a lot more of the motivation.

Of course, all this sort of stuff interests me, as I have my own way of doing these or at least strongly related things (sameas.org, which happens itself to only have Linked Data URIs, but other sameas.org sub-sites have arbitrary identifiers).

I think the important thing is that the solutions need to be situated in the application context - a strong societal motivation should be ridden hard to get the benefit from it, and that seems what is happening in this case. I would need to spend quite a long time trying to follow the concerns and why the proposed solution helps better than any other, but late on a Friday I won't try that. :-)

Thanks again.
Hugh

On 17 Feb 2012, at 22:59, M. Scott Marshall wrote:
[ . . . ]
Re: PURLs don't matter, at least in the LOD world
Hi Hugh,

There are several aspects to PURLs that I think are relevant to LOD. Some of them are:

- PURLs allow a general Web user to curate the location of a persistent identifier without needing administrative access to a DNS server, an Apache server or other non-user-oriented technology. For many people, this is a big deal.
- PURLs allow for the implementation of httpRange-14 (303) redirection without the need for administrator-level access to technology.
- Partial PURLs allow for the assignment of bulk persistent identifiers to classes of data (e.g. data set or database aggregation) with a minimum of administrative overhead.
- PURL Federation (currently in Beta, see [1]) allows for long-term persistence to be offered for identifiers in the face of changing hosting providers.

We are also in the process of making Callimachus [2] into a PURL server specifically designed for the LOD community. That will allow PURLs to fit more naturally with Linked Data in its various forms.

PURLs have historically been used by the library community (e.g. OCLC, a number of universities and the US Government Printing Office, which uses them to manage persistent Web addresses for US Government documents regardless of physical location). However, their use by LOD developers seems to be mostly (but not always) for persistence of vocabularies. Given that vocabulary developers have often hosted their vocabularies on fungible Web hosting providers but LOD applications and users often hard-code vocabulary URLs into their offerings, this use of PURLs seems particularly appropriate to me.

Given what I personally know of the state of US Government agencies, I'll take your bet whether the Web services of the Library of Congress or OCLC lasts longer :) You might look back at the tortured history of id.loc.gov before we agree to a figure.

Regards,
Dave

--
David Wood, Ph.D.
3 Round Stones
http://3roundstones.com
Cell: +1 540 538 9137

[1] http://purlz.org
[2] http://callimachusproject.org

On Feb 17, 2012, at 13:48, Hugh Glaser wrote:
[ . . . ]
--
Hugh Glaser, Web and Internet Science
Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
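Two of Dave's points above, partial PURLs and 303 redirection, can be illustrated together. The sketch below assumes that a partial PURL is a registered path prefix and that whatever follows the prefix is appended to the target, so one registration covers a whole data set; the prefix table and the function name are hypothetical, and this is not the PURLz implementation.

```python
# Hypothetical registrations: one prefix per data set, not one entry per record.
PARTIALS = {
    "/dbpedia/": "http://dbpedia.org/resource/",
}


def resolve_partial(path, partials):
    """Return (status, location) for `path` using the longest registered prefix.

    A match yields a 303 See Other redirect whose target is the registered
    base URL plus the unmatched remainder of the path; no match yields 404.
    """
    best = None
    for prefix in partials:
        if path.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    if best is None:
        return (404, None)
    return (303, partials[best] + path[len(best):])


tokyo = resolve_partial("/dbpedia/Tokyo", PARTIALS)
unknown = resolve_partial("/elsewhere/Tokyo", PARTIALS)
```

Moving the whole data set to a new host then means changing the single "/dbpedia/" entry, which is the low administrative overhead Dave refers to.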