Re: Datatypes with no (cool) URI
Sent from my portable device.

On Apr 3, 2012, at 16:58, Phil Archer ph...@w3.org wrote:

Again, thanks everyone for the quick and useful responses. @Gannon, @Andy - you are right that the issue of sex/gender is far from straightforward (they're not even the same thing, I've learned!). However, I need to offer 'something' even if it's not ideal and then work on the longer term. @Sarven - SDMX looks very useful indeed, hadn't seen that they cover gender - great. But it doesn't answer the more general point (I was using sex/gender as an example - there are other terms for which the value space should be a controlled vocabulary that doesn't necessarily have a URI).

Here's my plan of action:

Short term: the limitation here is that all I'm chartered/empowered to do is to define the terms (actually I'm planning to use schema:gender). I am not, and I don't believe the EU (current project paymasters) or the GLD WG/W3C more generally is, in a position to set up some sort of de-referencing system.

Well, actually you are. The world loses when RDF terms can't be looked up to yield useful information, or when things supporting URIs rot, so a W3C policy is, if in doubt, to allow a group at or loosely affiliated with W3C to set up a persistent supporting document. The emphasis for me is on the machine-readable bit -- it should of course point to online human-readable documentation where it can, but also carry as many tips for machines as possible. We could look at the idea of setting up, for example, a w3.org/ns/iso/5006 space, even to hold machine-readable info about frozen stuff ISO has not learned to support as linked data yet. If we do a few, future ISO standards might get the message and be supported by ISO. Note the namespace does NOT imply W3C rec track, or any process. That is the point: as the process and status change, the URI will not. So people won't have to recode.

(It would obviously be nice, from the bootstrap point of view, to have stuff usable for automatically building a UI, like a regexp for valid strings in the lexical space. That is another interesting thread ...)

Even setting up PURLs means that we're in effect condoning a value space (and I have at least 3 on my radar for just this term alone - Gannon pointed to some useful info from LoC which might make 4, plus SDMX makes 5). So I'm going to have to fudge it for now and say 'provide an identifier' and may leave it at that. I'd like to offer more guidance but it may not be sensible to do so (and btw. these vocabularies have to work in XML as well as RDF).

Longer term... I think I'll drop a line to Norman Paskin at the DOI Foundation...

Phil.

On 03/04/2012 16:22, John Erickson wrote:

Gannon raises a valid point, BUT it is important to remember that ISO is a *publisher* and DOI is fundamentally a publishing industry thing. So while they might not be inclined to support Cool URIs for their own sake, they might be DOI adopters for the sake of The Bottom Line...

On Tue, Apr 3, 2012 at 11:19 AM, Gannon Dick gannon_d...@yahoo.com wrote:

There are just some things outside of the Web's bailiwick, and the properties of people in that class. The problem is that you are never sure if you are naming the property or rudely calling the property holder names. ISO declines to play, the LOC declines differently http://id.loc.gov/authorities/subjects/sh91003756 and simple classes don't exist. I think you've hit a limit, not on Cool URIs necessarily, but maybe on philosophy.

From: John Erickson olyerick...@gmail.com To: David Booth da...@dbooth.org Cc: Phil Archer ph...@w3.org; public-lod@w3.org public-lod@w3.org Sent: Tuesday, April 3, 2012 9:53 AM Subject: Re: Datatypes with no (cool) URI

On Tue, Apr 3, 2012 at 10:38 AM, David Booth da...@dbooth.org wrote:

On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote:

[ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice?) Would a URN be appropriate?

It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly?

David's good point raises an even bigger point: why isn't ISO minting DOIs for specs? Or, at least, why can't ISO manage a DOI-equivalent space that would rein in bogusly-long URIs, make them more manageable, and perhaps more functional, e.g. CrossRef's Linked Data-savvy DOI proxy http://bit.ly/HcStYl

-- John S. Erickson, Ph.D. Director, Web Science Operations Tetherless World Constellation (RPI) http://tw.rpi.edu olyerick...@gmail.com Twitter
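The aside above about "a regexp for valid strings in the lexical space" can be sketched roughly as follows. This is only an illustration: the dictionary layout, its key names and the example.org namespace URI are assumptions, not anything W3C or ISO publishes. The four codes are the ISO 5218 values quoted elsewhere in this thread.

```python
import re

# Hypothetical machine-readable description of a code-list datatype.
# The URI and the key names are assumptions made for illustration only.
ISO_5218 = {
    "uri": "http://example.org/ns/iso/5218",
    "label": "ISO/IEC 5218:2004",
    "pattern": r"[0129]",  # lexical space: exactly one of the four codes
    "labels": {
        "0": "not known",
        "1": "male",
        "2": "female",
        "9": "not applicable",
    },
}


def is_valid_lexical_form(datatype: dict, value: str) -> bool:
    """Return True if `value` matches the datatype's whole-string pattern."""
    return re.fullmatch(datatype["pattern"], value) is not None


if __name__ == "__main__":
    for candidate in ("1", "3", "12"):
        print(candidate, is_valid_lexical_form(ISO_5218, candidate))
```

A form builder could read the same description to offer the four labels as a drop-down and reject anything else, which is the "bootstrap" use mentioned above.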
Re: Datatypes with no (cool) URI
(apologies if this is a re-post, I don't think it made it through y'day)

Hi

On Tue, Apr 3, 2012 at 6:29 PM, Dave Reynolds dave.e.reyno...@gmail.com wrote:

On 03/04/12 16:38, Sarven Capadisli wrote:

On 12-04-03 02:33 PM, Phil Archer wrote:

I'm hoping for a bit of advice and rather than talk in the usual generic terms I'll use the actual example I'm working on. I want to define the best way to record a person's sex (this is related to the W3C GLD WG's forthcoming spec on describing a Person [1]). To encourage interoperability, we want people to use a controlled vocabulary and there are several that cover this topic. ...

Perhaps I'm looking at your problem the wrong way, but have you looked at the SDMX Concepts: http://purl.org/linked-data/sdmx/2009/code#sex -Sarven

I was going to suggest that :)

+1. A custom datatype doesn't seem correct in this case. Treating gender as a category/classification captures both the essence that there's more than one category and that people may differ in how they would assign classifications.

I wrote a bit about Custom Datatypes here: http://patterns.dataincubator.org/book/custom-datatype.html

This use case aside, there ought to be more information to guide people towards how to do this correctly. See also: http://www.w3.org/TR/swbp-xsch-datatypes/

Cheers, L.
Re: Datatypes with no (cool) URI
Phil,

Reading Leigh's mail and his reference to the XML Schema datatypes and RDF document: I wonder whether a possible way forward would not be to define your own datatypes as derived datatypes from good-old xsd datatypes, but using the OWL 2 facilities: http://www.w3.org/TR/owl2-syntax/#Data_Ranges

My understanding is that you would need datatypes with a very restricted set of possible values; these can be described using these OWL 2 features. The advantage is that you can then mint the URIs you want for those and, with a bit of luck, some OWL environment can handle them (which is probably not the case if you use those ISO datatypes in RDF, for example). Of course, as Leigh said, you can also define those datatypes in XML Schema, but I would not expect OWL reasoners to handle those.

B.t.w., by OWL reasoner I do not necessarily mean something very complex. My overly simple (and inefficient :-) OWL RL environment: http://www.ivan-herman.net/Misc/2008/owlrl/ also handles some of the simpler cases...

Just an idea

Ivan

On Apr 4, 2012, at 10:30 , Leigh Dodds wrote:

(apologies if this is a re-post, I don't think it made it through y'day)

Hi

On Tue, Apr 3, 2012 at 6:29 PM, Dave Reynolds dave.e.reyno...@gmail.com wrote:

On 03/04/12 16:38, Sarven Capadisli wrote:

On 12-04-03 02:33 PM, Phil Archer wrote:

I'm hoping for a bit of advice and rather than talk in the usual generic terms I'll use the actual example I'm working on. I want to define the best way to record a person's sex (this is related to the W3C GLD WG's forthcoming spec on describing a Person [1]). To encourage interoperability, we want people to use a controlled vocabulary and there are several that cover this topic. ...

Perhaps I'm looking at your problem the wrong way, but have you looked at the SDMX Concepts: http://purl.org/linked-data/sdmx/2009/code#sex -Sarven

I was going to suggest that :)

+1. A custom datatype doesn't seem correct in this case. Treating gender as a category/classification captures both the essence that there's more than one category and that people may differ in how they would assign classifications.

I wrote a bit about Custom Datatypes here: http://patterns.dataincubator.org/book/custom-datatype.html

This use case aside, there ought to be more information to guide people towards how to do this correctly. See also: http://www.w3.org/TR/swbp-xsch-datatypes/

Cheers, L.

Ivan Herman, W3C Semantic Web Activity Lead Home: http://www.w3.org/People/Ivan/ mobile: +31-641044153 FOAF: http://www.ivan-herman.net/foaf.rdf
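To make Ivan's suggestion concrete, here is a rough sketch of what an enumerated datatype declared with an OWL 2 data range might look like, written as a small Python helper that prints Turtle. The `ex:` prefix, the `ex:iso5218` name and the helper itself are hypothetical; the Turtle shape follows the OWL 2 mapping of a datatype definition to `owl:equivalentClass` and of an enumeration of literals to `owl:oneOf`. The helper assumes the values contain no quotes or backslashes.

```python
from typing import Iterable

PREFIXES = (
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix owl:  <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix ex:   <http://example.org/datatypes/> .\n"  # hypothetical namespace
)


def enumerated_datatype_turtle(datatype_curie: str, values: Iterable[str]) -> str:
    """Build Turtle declaring a datatype equivalent to a fixed list of strings.

    Assumes each value is a plain string without quotes or backslashes.
    """
    literals = " ".join(f'"{value}"' for value in values)
    return (
        f"{PREFIXES}\n"
        f"{datatype_curie} a rdfs:Datatype ;\n"
        f"    owl:equivalentClass [ a rdfs:Datatype ; owl:oneOf ( {literals} ) ] .\n"
    )


if __name__ == "__main__":
    # The four ISO 5218 codes quoted in this thread.
    print(enumerated_datatype_turtle("ex:iso5218", ["0", "1", "2", "9"]))
```

Whether a given OWL tool actually checks literals against such a definition is a separate question; the sketch only shows how the URI and the value list can be tied together.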
Re: Datatypes with no (cool) URI
Thanks Ivan and thank you, Leigh. What I like about Leigh's suggestion is that it gives a way to associate a string like ISO/IEC 5218:2004 with a URI and I can show that as a generalised guidance note without necessarily saying use one of these controlled vocabularies that we don't control and that you may not like. So I think the way is fairly clear: If there is a suitable controlled vocabulary (and in the particular use case I'm referring to there is - SDMX) - then use it; If you can construct a suitable datatype URI then use that (The HL7 terms have OIDs which can be given as a stable URI from a look up service) If you can't do these things - and you really can't sensibly with PDFs on a portal, perhaps behind a paywall, then Leigh's method is the way to go. However... as always, we should look for other instances where this has been done so we don't invent lots of URIs for the same datatype and then have to fall back on loads of owl:sameAs assertions. OWL data ranges look nice but it's not the kind of thing most public administrations will want to get into. Incidentally, I did contact Norman Paskin at DOI who sent me a positive reply. DOIs for ISO standards are not ruled out and it has been discussed, especially in the context of CrossRef, but, as ever, it's complicated. Phil. On 04/04/2012 13:13, Ivan Herman wrote: Phil, Reading Leigh's mail and his reference to the XML Schema datatypes and RDF document: I wonder whether a possible way forward would not be to define your own datatypes as derived datatypes from good-old xsd datatypes, but using the OWL 2 facilities: http://www.w3.org/TR/owl2-syntax/#Data_Ranges My understanding is that you would need datatypes with a very restricted set of possible values; these can be described using these OWL 2 features. 
The advantage is that you can then mint the URI-s you want for those and, with a bit of luck, some OWL environment can handle them (which is probably not the case if you use those ISO datatypes in RDF, for example). Of course, as Leigh said, you can also define those datatypes in XML Schema, but I would not expect OWL reasoners to handle those. B.t.w., by OWL reasoner I do not necessarily mean something very complex. My overly simple (and inefficient:-) OWL RL environment: http://www.ivan-herman.net/Misc/2008/owlrl/ also handle some of the simpler cases... Just an idea Ivan On Apr 4, 2012, at 10:30 , Leigh Dodds wrote: (apologies if this is a re-post, I don't think it made it through y'day) Hi On Tue, Apr 3, 2012 at 6:29 PM, Dave Reynoldsdave.e.reyno...@gmail.com wrote: On 03/04/12 16:38, Sarven Capadisli wrote: On 12-04-03 02:33 PM, Phil Archer wrote: I'm hoping for a bit of advice and rather than talk in the usual generic terms I'll use the actual example I'm working on. I want to define the best way to record a person's sex (this is related to the W3C GLD WG's forthcoming spec on describing a Person [1]). To encourage interoperability, we want people to use a controlled vocabulary and there are several that cover this topic. ... Perhaps I'm looking at your problem the wrong way, but have you looked at the SDMX Concepts: http://purl.org/linked-data/sdmx/2009/code#sex -Sarven I was going to suggest that :) +1. A custom datatype doesn't seem correct in this case. Treating gender as a category/classification captures both the essence that there's more than one category that people may differ in how they would assign classifications. I wrote a bit about Custom Datatypes here: http://patterns.dataincubator.org/book/custom-datatype.html This use case aside, there ought to be more information to guide people towards how to do this correctly. See also: http://www.w3.org/TR/swbp-xsch-datatypes/ Cheers, L. 
Ivan Herman, W3C Semantic Web Activity Lead Home: http://www.w3.org/People/Ivan/ mobile: +31-641044153 FOAF: http://www.ivan-herman.net/foaf.rdf -- Phil Archer W3C eGovernment http://www.w3.org/egov/ http://philarcher.org +44 (0)7887 767755 @philarcher1
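Phil's three-step guidance in this message (use a controlled vocabulary if one exists, otherwise a stable datatype URI, otherwise Leigh's custom-datatype method) can be read as a simple fallback chain. The following is a minimal sketch of that reading; the function name, the strategy labels and the example URIs passed to it are all illustrative assumptions.

```python
from typing import Optional, Tuple


def choose_value_strategy(
    vocabulary_uri: Optional[str],
    stable_datatype_uri: Optional[str],
) -> Tuple[str, Optional[str]]:
    """Pick how to express a coded value, following the order given in the message.

    1. A suitable controlled vocabulary with URIs exists: use it.
    2. A stable datatype URI can be constructed: use that.
    3. Otherwise fall back to a custom datatype (Leigh's pattern),
       for which a URI still has to be minted by the publisher.
    """
    if vocabulary_uri:
        return ("controlled-vocabulary", vocabulary_uri)
    if stable_datatype_uri:
        return ("datatype-uri", stable_datatype_uri)
    return ("custom-datatype", None)


if __name__ == "__main__":
    sdmx_sex = "http://purl.org/linked-data/sdmx/2009/code#sex"
    print(choose_value_strategy(sdmx_sex, None))
    print(choose_value_strategy(None, None))
```

Before taking the third branch, it is worth searching for an existing URI for the same datatype, as the message warns, to avoid a later pile of owl:sameAs assertions.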
Re: Datatypes with no (cool) URI
2012/4/4 Phil Archer ph...@w3.org

If you can construct a suitable datatype URI then use that (The HL7 terms have OIDs which can be given as a stable URI from a look up service) ... Incidentally, I did contact Norman Paskin at DOI who sent me a positive reply. DOIs for ISO standards are not ruled out and it has been discussed, especially in the context of CrossRef, but, as ever, it's complicated.

Phil: Are you referring to ISO 21090? http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=35646

If so: (cool) URIs for ISO 21090 data types such as CD (Coded DataTypes) would be very useful. See Jim McCusker's and my comments on one of many blog posts on HL7 data types by Grahame Grieve, an expert in HL7 and OpenEHR: http://www.healthintersections.com.au/?p=381

FYI: An OWL representation of ISO 21090 is part of an effort to transform the UML model for the Biomedical Research Integrated Domain Group (BRIDG) http://www.bridgmodel.org/.
Re: Datatypes with no (cool) URI
On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote: [ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice? Would a URN be appropriate? It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly? -- David Booth, Ph.D. http://dbooth.org/ Opinions expressed herein are those of the author and do not necessarily reflect those of his employer.
Re: Datatypes with no (cool) URI
Hi David, Yes, one could use URL shorteners and that's probably the only sane way to go but it's still not ideal because: 1. Both Bitly and Tinyurl come with no guarantee of service (and a lot of tracking) - Google's goo.gl is all wrapped up with their services too - not the kind of thing public administrations will be happy about using. Yves Lafon's http://kwz.me is a pure shortener with no tracking of any kind but it's a one man project so, again, it won't be 'good enough' for public sector data. 2. Neither a shortened URL nor the long form tell a human reader a lot whereas something (non-standard I know) like urn:iso/iec:5218:2004 tells you that it's an ISO standard that a human can look up. The ISO catalogue URLs point to Web pages or PDFs available from those Web pages so you still need to be a human to get the information. The danger would be that a machine would look up the datatype URI and expect to get data back, not ISO's paywall :-) So, not ideal, but still the best (practical) solution? On 03/04/2012 15:38, David Booth wrote: On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote: [ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice? Would a URN be appropriate? It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly? -- Phil Archer W3C eGovernment http://www.w3.org/egov/ http://philarcher.org +44 (0)7887 767755 @philarcher1
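The human-readable identifier Phil describes in point 2 can be derived mechanically from the standard's designation. A minimal sketch, with the caveat he gives himself: the result is a non-standard, unregistered URN-like string, and the function name is hypothetical.

```python
def informal_urn(designation: str) -> str:
    """Turn a designation such as 'ISO/IEC 5218:2004' into 'urn:iso/iec:5218:2004'.

    The output is the informal, non-standard form used in this thread,
    not a registered URN namespace.
    """
    publisher, number = designation.strip().split(" ", 1)
    return f"urn:{publisher.lower()}:{number}"


if __name__ == "__main__":
    print(informal_urn("ISO/IEC 5218:2004"))
```

A string like this tells a human which standard to look up, but it gives a machine nothing to dereference, which is the trade-off under discussion.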
Re: Datatypes with no (cool) URI
On 03/04/2012 15:53, John Erickson wrote: On Tue, Apr 3, 2012 at 10:38 AM, David Boothda...@dbooth.org wrote: On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote: [ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice? Would a URN be appropriate? It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly? David's good point raises an even bigger point: why isn't ISO minting DOI's for specs? What shall we do? Start a petition? Go on a march through Geneva? (it's nice there this time of year). Or, at least, why can't ISO manage a DOI-equivalent space that would rein-in bogusly-long URIs, make them more manageable, and perhaps more functional e.g. CrossRef's Linked Data-savvy DOI proxy http://bit.ly/HcStYl Yep, that would do the job certainly. Hmmm... unless Crossref could mint URIs out of, say, ISO/IEC 5218:2004 ?? I'm sure it could but is the demand sufficient and would ISO allow it? -- Phil Archer W3C eGovernment http://www.w3.org/egov/ http://philarcher.org +44 (0)7887 767755 @philarcher1
RE: Datatypes with no (cool) URI
I am a researcher working on some Demographic Social Simulation Models. In the simple models, I distinguish people classed male at birth and people classed female at birth; gender ambiguity, reassignment (sex change) and gender reclassification are not modelled. In more complicated models these things might be modelled, and if I were modelling that, I would consider storing a list of changes and having more classes, or somehow quantifying maleness and femaleness. The point I am making here is that the assignment of gender (or sex, depending on what word you prefer) could be time dependent.

In an attempt to make my data storage and retrieval work better I implemented two main data stores for people: those classed female at birth; those classed male at birth. In my models, even if current gender were re-assigned, data for that individual would still be stored in the same data store. I suspect that in ambiguous cases in reality what is done in terms of gender classification might be different for different countries.

BTW: gender ambiguity was topical in the mainstream media in the Autumn in the UK [1]. It is not as uncommon as you might think... So, gender is a fuzzy thing. Maybe we all belong to male and female classes to a degree, and for most of us this distinction is binary. In terms of encoding, in my implementations I've used 0 for female and 1 for male as I find that easy to remember and computationally it makes sense.

Andy

[1] http://www.bbc.co.uk/news/health-14459843

From: Phil Archer [ph...@w3.org] Sent: 03 April 2012 14:33 To: public-lod@w3.org Subject: Datatypes with no (cool) URI

I'm hoping for a bit of advice and rather than talk in the usual generic terms I'll use the actual example I'm working on. I want to define the best way to record a person's sex (this is related to the W3C GLD WG's forthcoming spec on describing a Person [1]). To encourage interoperability, we want people to use a controlled vocabulary and there are several that cover this topic.

ISO 5218 has:
0 = not known;
1 = male;
2 = female;
9 = not applicable.

and Eurostat offers
F = female
M = male
OTH = other
UNK = unknown
NAP = not applicable

IMO, the spec should not dictate which one to use (there are others too of course). What I *do* want to do though is to encourage publishers to state which vocabulary they're using. Sounds like a job for a datatype - and for that you need a URI for the vocabulary. Something like:

schema:gender "1"^^<http://iso.org/5218/> .

Except I made that iso.org URI up. The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better.

Does the datatype URI have to resolve to anything (in theory no, but in practice?) Would a URN be appropriate? Given that the identifier for the ISO standard is ISO/IEC 5218:2004, how about urn:iso/iec:5218:2004? For Eurostat, the internal identifier for the vocabulary is SCL - Sex (standard code list), so would urn:eurostat:scl:sex be appropriate?

Anyone done anything like this in the real world? All advice gratefully received.

Thank you

Phil.

[1] https://dvcs.w3.org/hg/gld/raw-file/default/people/index.html

-- Phil Archer W3C eGovernment http://www.w3.org/egov/ http://philarcher.org @philarcher1
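The idea in Phil's original message, a coded value tagged with a datatype URI that names the vocabulary, can be sketched as below. The datatype URI is the one he says he made up; the ISO-to-Eurostat correspondence is an assumption based only on the matching labels quoted above (Eurostat's OTH has no ISO 5218 counterpart), not an official mapping.

```python
# ISO 5218 and Eurostat code lists as quoted in the message above.
ISO_5218 = {"0": "not known", "1": "male", "2": "female", "9": "not applicable"}
EUROSTAT_SCL_SEX = {
    "F": "female",
    "M": "male",
    "OTH": "other",
    "UNK": "unknown",
    "NAP": "not applicable",
}

# Assumed correspondence, inferred from the labels only.
ISO_TO_EUROSTAT = {"0": "UNK", "1": "M", "2": "F", "9": "NAP"}


def typed_literal(value: str, datatype_uri: str) -> str:
    """Serialise a value as an N-Triples-style typed literal.

    Assumes `value` contains no quotes or backslashes needing escapes.
    """
    return f'"{value}"^^<{datatype_uri}>'


if __name__ == "__main__":
    # 'http://iso.org/5218/' is the invented URI from the message, not a real one.
    print("schema:gender", typed_literal("1", "http://iso.org/5218/"), ".")
    print(ISO_TO_EUROSTAT["2"], "=", EUROSTAT_SCL_SEX[ISO_TO_EUROSTAT["2"]])
```

The point of the datatype is visible here: the bare value "1" is ambiguous across code lists, while the typed literal says which list it belongs to.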
Re: Datatypes with no (cool) URI
The ideal thing would be if ISO, Eurostat produced concise resolvable URIs of course. But while we wait for them to do that, why doesn't W3C mint and support URIs for the most commonly used code lists, that resolve to relevant documentation and/or links to the definitive documents from ISO etc. Cheers Bill On 3 Apr 2012, at 15:58, Phil Archer wrote: Hi David, Yes, one could use URL shorteners and that's probably the only sane way to go but it's still not ideal because: 1. Both Bitly and Tinyurl come with no guarantee of service (and a lot of tracking) - Google's goo.gl is all wrapped up with their services too - not the kind of thing public administrations will be happy about using. Yves Lafon's http://kwz.me is a pure shortener with no tracking of any kind but it's a one man project so, again, it won't be 'good enough' for public sector data. 2. Neither a shortened URL nor the long form tell a human reader a lot whereas something (non-standard I know) like urn:iso/iec:5218:2004 tells you that it's an ISO standard that a human can look up. The ISO catalogue URLs point to Web pages or PDFs available from those Web pages so you still need to be a human to get the information. The danger would be that a machine would look up the datatype URI and expect to get data back, not ISO's paywall :-) So, not ideal, but still the best (practical) solution? On 03/04/2012 15:38, David Booth wrote: On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote: [ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice? Would a URN be appropriate? It's helpful to be able to click on the URI to figure out what exactly was meant. 
How about just using a URI shortener, such as tinyurl.com or bit.ly? -- Phil Archer W3C eGovernment http://www.w3.org/egov/ http://philarcher.org +44 (0)7887 767755 @philarcher1
Re: Datatypes with no (cool) URI
Okay, then maybe a PURL would help? purl.org now supports partial redirects: http://purl.org/docs/faq.html#toc1.9 That may not quite work with your ISO URIs though. Personally, I don't think you should worry too much about a machine expecting to be able to dereference the datatype URI to get data back. I would expect most datatype URIs would lead to human-oriented information, though that could gradually change. David On Tue, 2012-04-03 at 15:58 +0100, Phil Archer wrote: Hi David, Yes, one could use URL shorteners and that's probably the only sane way to go but it's still not ideal because: 1. Both Bitly and Tinyurl come with no guarantee of service (and a lot of tracking) - Google's goo.gl is all wrapped up with their services too - not the kind of thing public administrations will be happy about using. Yves Lafon's http://kwz.me is a pure shortener with no tracking of any kind but it's a one man project so, again, it won't be 'good enough' for public sector data. 2. Neither a shortened URL nor the long form tell a human reader a lot whereas something (non-standard I know) like urn:iso/iec:5218:2004 tells you that it's an ISO standard that a human can look up. The ISO catalogue URLs point to Web pages or PDFs available from those Web pages so you still need to be a human to get the information. The danger would be that a machine would look up the datatype URI and expect to get data back, not ISO's paywall :-) So, not ideal, but still the best (practical) solution? On 03/04/2012 15:38, David Booth wrote: On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote: [ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice? Would a URN be appropriate? 
It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly? -- David Booth, Ph.D. http://dbooth.org/ Opinions expressed herein are those of the author and do not necessarily reflect those of his employer.
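David's caveat about partial redirects can be illustrated with a small sketch of how such a redirect behaves: the registered prefix is matched and the unmatched remainder of the request is appended to the target. The `/iso/` PURL path and the function are hypothetical; the target is the ISO catalogue URL pattern quoted in this thread.

```python
from typing import Optional


def resolve_partial(requested_path: str, registered_prefix: str, target_base: str) -> Optional[str]:
    """Append the unmatched tail of the request to the redirect target.

    Returns None when the request does not fall under the registered prefix.
    """
    if not requested_path.startswith(registered_prefix):
        return None
    return target_base + requested_path[len(registered_prefix):]


if __name__ == "__main__":
    # Hypothetical PURL prefix; the target is the catalogue URL from the thread.
    target = "http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber="
    print(resolve_partial("/iso/36266", "/iso/", target))
    # The tail has to be ISO's catalogue number (36266), not the standard
    # number (5218), which is one reason this "may not quite work".
    print(resolve_partial("/iso/5218", "/iso/", target))
```

Because the catalogue is keyed by an internal number and not by the standard's designation, a readable PURL such as one ending in 5218 would need a per-standard lookup and could not rely on a single partial redirect.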
Re: Datatypes with no (cool) URI
There are just some things outside of the Web's bailiwick, and the properties of people in that class. The problem is that you are never sure if you are naming the property or rudely calling the property holder names. ISO declines to play, the LOC declines differently http://id.loc.gov/authorities/subjects/sh91003756 and simple classes don't exist. I think you've hit a limit, not on Cool URIs necessarily, but maybe on philosophy.

From: John Erickson olyerick...@gmail.com To: David Booth da...@dbooth.org Cc: Phil Archer ph...@w3.org; public-lod@w3.org public-lod@w3.org Sent: Tuesday, April 3, 2012 9:53 AM Subject: Re: Datatypes with no (cool) URI

On Tue, Apr 3, 2012 at 10:38 AM, David Booth da...@dbooth.org wrote:

On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote:

[ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice?) Would a URN be appropriate?

It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly?

David's good point raises an even bigger point: why isn't ISO minting DOIs for specs? Or, at least, why can't ISO manage a DOI-equivalent space that would rein in bogusly-long URIs, make them more manageable, and perhaps more functional, e.g. CrossRef's Linked Data-savvy DOI proxy http://bit.ly/HcStYl

-- John S. Erickson, Ph.D. Director, Web Science Operations Tetherless World Constellation (RPI) http://tw.rpi.edu olyerick...@gmail.com Twitter Skype: olyerickson
Re: Datatypes with no (cool) URI
Gannon raises a valid point, BUT it is important to remember that ISO is a *publisher* and DOI is fundamentally a publishing industry thing. So while they might not be inclined to support Cool URIs for their own sake, they might be DOI adopters for the sake of The Bottom Line... On Tue, Apr 3, 2012 at 11:19 AM, Gannon Dick gannon_d...@yahoo.com wrote: There are just some things outside of the Web's bailiwick, and the properties of people in that class. The problem is that you are never sure if you are naming the property on rudely calling the property holder names. ISO declines to play, the LOC declines differently http://id.loc.gov/authorities/subjects/sh91003756 and simple classes don't exist. I think you've hit a limit, not on Cool Uri's necessarily, but maybe on philosophy. From: John Erickson olyerick...@gmail.com To: David Booth da...@dbooth.org Cc: Phil Archer ph...@w3.org; public-lod@w3.org public-lod@w3.org Sent: Tuesday, April 3, 2012 9:53 AM Subject: Re: Datatypes with no (cool) URI On Tue, Apr 3, 2012 at 10:38 AM, David Booth da...@dbooth.org wrote: On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote: [ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice? Would a URN be appropriate? It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly? David's good point raises an even bigger point: why isn't ISO minting DOI's for specs? Or, at least, why can't ISO manage a DOI-equivalent space that would rein-in bogusly-long URIs, make them more manageable, and perhaps more functional e.g. CrossRef's Linked Data-savvy DOI proxy http://bit.ly/HcStYl -- John S. 
Erickson, Ph.D. Director, Web Science Operations Tetherless World Constellation (RPI) http://tw.rpi.edu olyerick...@gmail.com Twitter Skype: olyerickson -- John S. Erickson, Ph.D. Director, Web Science Operations Tetherless World Constellation (RPI) http://tw.rpi.edu olyerick...@gmail.com Twitter Skype: olyerickson
Re: Datatypes with no (cool) URI
So David's solution (using PURLs) provides a bit of transparency and manageability, but it has the disadvantage of having no official status. Maybe (probably) I'm missing something here?

On Tue, Apr 3, 2012 at 11:19 AM, David Booth da...@dbooth.org wrote:

Okay, then maybe a PURL would help? purl.org now supports partial redirects: http://purl.org/docs/faq.html#toc1.9 That may not quite work with your ISO URIs though. Personally, I don't think you should worry too much about a machine expecting to be able to dereference the datatype URI to get data back. I would expect most datatype URIs would lead to human-oriented information, though that could gradually change.

David

On Tue, 2012-04-03 at 15:58 +0100, Phil Archer wrote:

Hi David, Yes, one could use URL shorteners and that's probably the only sane way to go but it's still not ideal because:

1. Both Bitly and Tinyurl come with no guarantee of service (and a lot of tracking) - Google's goo.gl is all wrapped up with their services too - not the kind of thing public administrations will be happy about using. Yves Lafon's http://kwz.me is a pure shortener with no tracking of any kind but it's a one man project so, again, it won't be 'good enough' for public sector data.

2. Neither a shortened URL nor the long form tell a human reader a lot whereas something (non-standard I know) like urn:iso/iec:5218:2004 tells you that it's an ISO standard that a human can look up. The ISO catalogue URLs point to Web pages or PDFs available from those Web pages so you still need to be a human to get the information. The danger would be that a machine would look up the datatype URI and expect to get data back, not ISO's paywall :-)

So, not ideal, but still the best (practical) solution?

On 03/04/2012 15:38, David Booth wrote:

On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote:

[ . . . 
] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice? Would a URN be appropriate? It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly? -- David Booth, Ph.D. http://dbooth.org/ Opinions expressed herein are those of the author and do not necessarily reflect those of his employer. -- John S. Erickson, Ph.D. Director, Web Science Operations Tetherless World Constellation (RPI) http://tw.rpi.edu olyerick...@gmail.com Twitter Skype: olyerickson
Re: Datatypes with no (cool) URI
As Bill suggests: If you use a URI from an authoritative source that serves the terms, you don't have to wait for ISO to start doing it themselves. This has been done to some extent in several efforts in the bio area, in chronological order: http://bio2rdf.org/ A framework of federated PURLs was set up at http://sharednames.org and the latest and greatest: (Bio2RDF supports) http://identifiers.org In the above schemes, an example URI would be shorter and served by a third party. See http://identifiers.org/examples -Scott

On Tue, Apr 3, 2012 at 5:10 PM, Bill Roberts b...@swirrl.com wrote:

The ideal thing would be if ISO, Eurostat produced concise resolvable URIs of course. But while we wait for them to do that, why doesn't W3C mint and support URIs for the most commonly used code lists, that resolve to relevant documentation and/or links to the definitive documents from ISO etc. Cheers Bill

On 3 Apr 2012, at 15:58, Phil Archer wrote: [ . . . ]
Re: Datatypes with no (cool) URI
If it helps, gender would be a subclass of http://purl.org/pii/terms/marks If you show me where to point, I'll create the PURLs. It may take me an hour to remember my passwords :o)

From: John Erickson olyerick...@gmail.com To: David Booth da...@dbooth.org Cc: Phil Archer ph...@w3.org; public-lod@w3.org Sent: Tuesday, April 3, 2012 10:26 AM Subject: Re: Datatypes with no (cool) URI

So David's solution (using PURLs) provides a bit of transparency and manageability, but it has the disadvantage of having no official status. Maybe (probably) I'm missing something here? [ . . . ]
Re: Datatypes with no (cool) URI
On 12-04-03 02:33 PM, Phil Archer wrote:

I'm hoping for a bit of advice and rather than talk in the usual generic terms I'll use the actual example I'm working on. I want to define the best way to record a person's sex (this is related to the W3C GLD WG's forthcoming spec on describing a Person [1]). To encourage interoperability, we want people to use a controlled vocabulary and there are several that cover this topic.

ISO 5218 has:

0 = not known;
1 = male;
2 = female;
9 = not applicable.

and Eurostat offers

F = female
M = male
OTH = other
UNK = unknown
NAP = not applicable

IMO, the spec should not dictate which one to use (there are others too of course). What I *do* want to do though is to encourage publishers to state which vocabulary they're using. Sounds like a job for a datatype - and for that you need a URI for the vocabulary. Something like:

schema:gender "1"^^<http://iso.org/5218/> .

Except I made that iso.org URI up. The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better.

Does the datatype URI have to resolve to anything (in theory no, but in practice?) Would a URN be appropriate? Given that the identifier for the ISO standard is ISO/IEC 5218:2004 how about urn:iso/iec:5218:2004? For Eurostat, the internal identifier for the vocabulary is SCL - Sex (standard code list) so would urn:eurostat:scl:sex be appropriate?

Anyone done anything like this in the real world? All advice gratefully received. Thank you Phil.

[1] https://dvcs.w3.org/hg/gld/raw-file/default/people/index.html

Perhaps I'm looking at your problem the wrong way, but have you looked at the SDMX Concepts: http://purl.org/linked-data/sdmx/2009/code#sex -Sarven
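To make the "state which vocabulary you're using" idea concrete, here is a small sketch that builds a typed literal only when the code belongs to the named code list. The two code lists are the ones quoted in Phil's message; the datatype identifiers are the unregistered, non-standard URNs floated in this thread, and the function names and the example.org subject are purely illustrative.

```python
# Code lists as quoted in the message above.
ISO_5218 = {"0": "not known", "1": "male", "2": "female", "9": "not applicable"}
EUROSTAT_SEX = {
    "F": "female",
    "M": "male",
    "OTH": "other",
    "UNK": "unknown",
    "NAP": "not applicable",
}

# Assumption: the URNs suggested in the thread, used here as datatype
# identifiers. Neither is registered or official.
CODE_LISTS = {
    "urn:iso/iec:5218:2004": ISO_5218,
    "urn:eurostat:scl:sex": EUROSTAT_SEX,
}


def typed_literal(code: str, datatype: str) -> str:
    """Return a Turtle/N-Triples typed literal, checking the code first."""
    if datatype not in CODE_LISTS:
        raise ValueError(f"unknown code list: {datatype}")
    if code not in CODE_LISTS[datatype]:
        raise ValueError(f"{code!r} is not a code in {datatype}")
    return f'"{code}"^^<{datatype}>'


def gender_triple(subject: str, code: str, datatype: str) -> str:
    """Return one N-Triples statement using schema.org's gender property."""
    return f"<{subject}> <http://schema.org/gender> {typed_literal(code, datatype)} ."


if __name__ == "__main__":
    print(gender_triple("http://example.org/person/1", "1", "urn:iso/iec:5218:2004"))
    print(gender_triple("http://example.org/person/2", "F", "urn:eurostat:scl:sex"))
```

The point of the datatype is visible in the check: "1" is meaningful under ISO 5218 but is rejected under the Eurostat list, so a consumer that knows the datatype knows how to read the value.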
Re: Datatypes with no (cool) URI
Again, thanks everyone for the quick and useful responses.

@Gannon, @Andy - you are right that the issue of sex/gender is far from straightforward (they're not even the same thing I've learned!) However, I need to offer 'something' even if it's not ideal and then work on the longer term.

@Sarven - SDMX looks very useful indeed, hadn't seen that they cover gender - great. But it doesn't answer the more general point (I was using sex/gender as an example - there are other terms for which the value space should be a controlled vocabulary that doesn't necessarily have a URI).

Here's my plan of action:

Short term: the limitation here is that all I'm chartered/empowered to do is to define the terms (actually I'm planning to use schema:gender). I am not, and I don't believe the EU (current project paymasters) or the GLD WG/W3C more generally is, in a position to set up some sort of de-referencing system. Even setting up Purls means that we're in effect condoning a value space (and I have at least 3 on my radar for just this term alone - Gannon pointed to some useful info from LoC which might make 4, plus SDMX makes 5). So I'm going to have to fudge it for now and say 'provide an identifier' and may leave it at that. I'd like to offer more guidance but it may not be sensible to do so (and btw. these vocabularies have to work in XML as well as RDF).

Longer term... I think I'll drop a line to Norman Paskin at the DOI Foundation...

Phil.

On 03/04/2012 16:22, John Erickson wrote:

Gannon raises a valid point, BUT it is important to remember that ISO is a *publisher* and DOI is fundamentally a publishing industry thing. So while they might not be inclined to support Cool URIs for their own sake, they might be DOI adopters for the sake of The Bottom Line...

On Tue, Apr 3, 2012 at 11:19 AM, Gannon Dick gannon_d...@yahoo.com wrote:

There are just some things outside of the Web's bailiwick, and the properties of people are in that class. The problem is that you are never sure if you are naming the property or rudely calling the property holder names. ISO declines to play, the LOC declines differently http://id.loc.gov/authorities/subjects/sh91003756 and simple classes don't exist. I think you've hit a limit, not on Cool Uri's necessarily, but maybe on philosophy.

From: John Erickson olyerick...@gmail.com To: David Booth da...@dbooth.org Cc: Phil Archer ph...@w3.org; public-lod@w3.org Sent: Tuesday, April 3, 2012 9:53 AM Subject: Re: Datatypes with no (cool) URI

On Tue, Apr 3, 2012 at 10:38 AM, David Booth da...@dbooth.org wrote:

On Tue, 2012-04-03 at 14:33 +0100, Phil Archer wrote: [ . . . ] The actual URI for it is http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36266 (or rather, that's the page about the spec but that's a side issue for now). That URI is just horrible and certainly not a 'cool URI'. The Eurostat one is no better. Does the datatype URI have to resolve to anything (in theory no, but in practice?) Would a URN be appropriate?

It's helpful to be able to click on the URI to figure out what exactly was meant. How about just using a URI shortener, such as tinyurl.com or bit.ly?

David's good point raises an even bigger point: why isn't ISO minting DOI's for specs? Or, at least, why can't ISO manage a DOI-equivalent space that would rein-in bogusly-long URIs, make them more manageable, and perhaps more functional e.g. CrossRef's Linked Data-savvy DOI proxy http://bit.ly/HcStYl

-- John S. Erickson, Ph.D. Director, Web Science Operations Tetherless World Constellation (RPI) http://tw.rpi.edu olyerick...@gmail.com Twitter Skype: olyerickson

-- Phil Archer W3C eGovernment http://www.w3.org/egov/ http://philarcher.org +44 (0)7887 767755 @philarcher1
Re: Datatypes with no (cool) URI
Not so fast, young man ... The general point is indeed a minefield. See also: http://www.rustprivacy.org/2011/cnpii.xml (Common Names of Personally Identifiable Information)

I think, but I doubt anyone else in the universe does, that you can fix the problem by looking at RDF Lists a bit differently. In particular, rdf:nil / should be a fallback to generality, not empty of truth. This is the Sherlock Holmes Method ... ... when you have eliminated the impossible, whatever remains, however improbable, must be the truth.

That said, good plan. --Gannon

From: Phil Archer ph...@w3.org To: public-lod@w3.org Sent: Tuesday, April 3, 2012 10:58 AM Subject: Re: Datatypes with no (cool) URI

Again, thanks everyone for the quick and useful responses. [ . . . ]
Re: Datatypes with no (cool) URI
oops. http://www.rustprivacy.org/2011/pii/cnpii.xml

From: Gannon Dick gannon_d...@yahoo.com To: Phil Archer ph...@w3.org; public-lod@w3.org Sent: Tuesday, April 3, 2012 11:46 AM Subject: Re: Datatypes with no (cool) URI

Not so fast, young man ... The general point is indeed a minefield. see also: http://www.rustprivacy.org/2011/cnpii.xml (Common Names of Personally Identifiable Information) [ . . . ]
Re: Datatypes with no (cool) URI
On 03/04/12 16:38, Sarven Capadisli wrote:

On 12-04-03 02:33 PM, Phil Archer wrote: [ . . . ]

Perhaps I'm looking at your problem the wrong way, but have you looked at the SDMX Concepts: http://purl.org/linked-data/sdmx/2009/code#sex -Sarven

I was going to suggest that :)

Actually looking at that I see that I've failed to datatype the skos:notation entries in those code lists. There should probably be a http://purl.org/linked-data/sdmx/2009/code#sexDT datatype to go with the notation on those skos:Concepts.

Phil, if that's important to you then raise it as an issue on the tracker [1] and, if no one objects, then I can get it fixed.

Dave

[1] http://code.google.com/p/publishing-statistical-data/issues/list
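As a sketch of what Dave's fix might look like, the snippet below emits Turtle for a code-list concept whose skos:notation carries the proposed datatype. The sexDT URI is the one proposed in his message and may not exist in the published vocabulary; the concept local names sex-F and sex-M, the labels, and the helper name are my assumptions for illustration.

```python
SDMX_CODE = "http://purl.org/linked-data/sdmx/2009/code#"
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."

# Datatype proposed in the message above; not confirmed to be published.
SEX_DT = SDMX_CODE + "sexDT"


def concept_turtle(local_name: str, notation: str, label: str) -> str:
    """Return a Turtle description of one concept with a typed notation."""
    return (
        f"<{SDMX_CODE}{local_name}> a skos:Concept ;\n"
        f'    skos:prefLabel "{label}"@en ;\n'
        f'    skos:notation "{notation}"^^<{SEX_DT}> .'
    )


if __name__ == "__main__":
    print(SKOS_PREFIX)
    # Local names and labels below are assumed, not taken from the thread.
    for local_name, notation, label in [("sex-F", "F", "Female"), ("sex-M", "M", "Male")]:
        print()
        print(concept_turtle(local_name, notation, label))
```

With a typed notation, the same datatype URI could then be reused on plain literals in data (as in Phil's original example), tying the bare code back to the concept scheme.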
Re: Datatypes with no (cool) URI
For a comprehensive overview of gender vs sex and other challenges in representing the reality underlying demographic data see this paper http://ceur-ws.org/Vol-833/paper20.pdf describing the Ontology of Medically Related Social Entities (OMRSE) http://code.google.com/p/omrse/ . The URI for the Female Gender = http://purl.obolibrary.org/obo/OMRSE_0009 (subclass of Gender Role and Human Social Role)

Regards @kerfors

2012/4/3 Andy Turner a.g.d.tur...@leeds.ac.uk

I am a researcher working on some Demographic Social Simulation Models. In the simple models, I distinguish people classed male at birth and people classed female at birth and gender ambiguity, reassignment (sex change) and gender reclassification are not modelled. In more complicated models these things might be modelled and if I were modelling that, I would consider storing a list of changes and have more classes or somehow quantify maleness and femaleness. The point I am making here is that the assignment of gender (or sex depending on what word you prefer) could be time dependent.

In an attempt to make my data storage and retrieval work better I implemented two main data stores for people: those classed female at birth; those classed male at birth. In my models, even if current gender were re-assigned data for that individual would still be stored in the same data store. I suspect that in ambiguous cases in reality what is done in terms of gender classification might be different for different countries.

BTW: gender ambiguity was topical in the mainstream media in the Autumn in the UK [1]. It is not as uncommon as you might think... So, gender is a fuzzy thing. Maybe we all belong to male and female classes to a degree and for most of us this distinction is binary. In terms of encoding, in my implementations I've used 0 for female and 1 for male as I find that easy to remember and computationally it makes sense.

Andy

[1] http://www.bbc.co.uk/news/health-14459843

From: Phil Archer [ph...@w3.org] Sent: 03 April 2012 14:33 To: public-lod@w3.org Subject: Datatypes with no (cool) URI

I'm hoping for a bit of advice and rather than talk in the usual generic terms I'll use the actual example I'm working on. [ . . . ]
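Andy's remark about "storing a list of changes" can be sketched as a small data structure: the class recorded at birth stays fixed (and decides which data store the person lives in), while dated changes are kept alongside it and consulted for any given day. The class and method names are hypothetical and this is not Andy's actual implementation; only the 0 = female, 1 = male encoding comes from his message.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple

# Encoding described in the message above.
FEMALE, MALE = 0, 1


@dataclass
class Person:
    birth_date: date
    class_at_birth: int
    # Dated reclassifications, kept sorted by date.
    changes: List[Tuple[date, int]] = field(default_factory=list)

    def record_change(self, when: date, code: int) -> None:
        """Add a reclassification that takes effect on `when`."""
        self.changes.append((when, code))
        self.changes.sort()

    def classification_on(self, when: date) -> int:
        """Return the classification in force on `when`."""
        if when < self.birth_date:
            raise ValueError("date is before birth")
        change_dates = [d for d, _ in self.changes]
        index = bisect_right(change_dates, when)
        if index == 0:
            return self.class_at_birth
        return self.changes[index - 1][1]


if __name__ == "__main__":
    person = Person(birth_date=date(1980, 1, 1), class_at_birth=MALE)
    person.record_change(date(2005, 6, 1), FEMALE)
    print(person.classification_on(date(2000, 1, 1)))
    print(person.classification_on(date(2010, 1, 1)))
    # The person stays in the "classed male at birth" store either way.
    print(person.class_at_birth)
```

A richer model would need more than two codes (ISO 5218's "not known" and "not applicable", for instance); the two-value encoding here is kept only to match the message.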