Re: Southampton Pub data as linked open data

Richard Cyganiak Mon, 28 Jul 2008 07:25:19 -0700


Chris,


I'll try to answer some of the questions.

On 28 Jul 2008, at 13:18, Chris Wallace wrote:

Thanks John for this resource - It inspires me to help my students to do a similar data collection exercise in Bristol!
A few things puzzle me though, probably as a newcomer to this field. I'm in the process of RDFing our faculty data so these issues are taxing me too.
1) The resource URI e.g. http://www.johngoodwin.me.uk/pubs/id/pub1
is not humanly readable. Is this considered to be a problem? For example, DBPedia would, I think, be less valuable with system-generated resource ids, even though natural resource ids require a mechanism for disambiguation.

Human-readable unique identifiers are nice, but they are the exception. It's true that DBpedia would be less valuable without the human-readable IDs, but DBpedia piggybacks on Wikipedia's identifier scheme, which is maintained by an army of volunteers. At the end of the day, uniqueness is more important than human readability. If the unique identifiers in your original data source are not human-readable, and you don't have the resources to curate a new identifier scheme, then using a numeric scheme is better than not publishing the data at all...
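
To illustrate the numeric scheme, here is a minimal sketch in Python. The base URI is taken from the example above; the function name mint_uri and the sample rows are hypothetical, and I am assuming the source data has a stable positive integer key per record.

```python
# Base taken from the example URI quoted above.
BASE = "http://www.johngoodwin.me.uk/pubs/id/"

def mint_uri(record_id: int, prefix: str = "pub") -> str:
    """Build a resource URI from a source record's numeric key (hypothetical helper)."""
    if record_id < 1:
        raise ValueError("record ids are assumed to be positive integers")
    return f"{BASE}{prefix}{record_id}"

# Hypothetical source rows: (numeric key, name). Two pubs share a name,
# yet each still gets its own URI because the key, not the name, is used.
rows = [(1, "Alexandra, The"), (2, "Alexandra, The")]
uris = [mint_uri(key) for key, _ in rows]
```

The point of the sketch is only that uniqueness comes for free from the source key, whereas a name-based scheme would need a disambiguation step for the two rows above.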

2) The pub name has been reformatted to catalogue order, but pub names are proper nouns and I'd be laughed at if I asked the way to "Alexandra, The". Perhaps both forms could be included, with a different tag for the catalogue format if it is not computable from the natural name.

I don't see why pub names are different from movie names, artist names, or book names, all of which can often be found reformatted in this way.
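
On whether one form is computable from the other: for the simple case of a leading article it probably is. A rough sketch in Python, assuming only "The", "A" and "An" are ever moved to the end; the function names are made up for illustration, and names that contain commas for other reasons are left untouched.

```python
# Articles assumed to be the only words moved in catalogue order.
ARTICLES = {"the", "a", "an"}

def to_natural(catalogue_name: str) -> str:
    """Turn 'Alexandra, The' into 'The Alexandra'; leave other names unchanged."""
    head, sep, tail = catalogue_name.rpartition(",")
    if sep and tail.strip().lower() in ARTICLES:
        return f"{tail.strip()} {head.strip()}"
    return catalogue_name

def to_catalogue(natural_name: str) -> str:
    """Turn 'The Alexandra' into 'Alexandra, The'; leave other names unchanged."""
    first, sep, rest = natural_name.partition(" ")
    if sep and first.lower() in ARTICLES:
        return f"{rest.strip()}, {first}"
    return natural_name
```

If the real data has cases this does not cover, publishing both forms explicitly would be the safer option.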

3) Why have both rdfs:label and pub:name since they seem to have the same content?

Generic RDF tools (which do not know about the pub vocabulary) often use rdfs:label for display/headline purposes. So if your domain-specific vocabulary has its own naming property, it might be a good idea to add both. In an ideal world, John would declare pub:name a subproperty of rdfs:label, and the tools would infer the rdfs:label value... But most clients don't do that yet.
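
The inference I have in mind is the RDFS subproperty rule: if p is a subproperty of q and (s p o) holds, then (s q o) holds too. A minimal sketch in plain Python, with triples as tuples of strings. The prefixed names, the pubs:pub1 identifier and the subPropertyOf declaration are assumptions for illustration; John's data does not necessarily contain them, and this handles only direct declarations, not chains of subproperties.

```python
RDFS_LABEL = "rdfs:label"
RDFS_SUBPROPERTY_OF = "rdfs:subPropertyOf"

def infer_subproperty_triples(triples):
    """Add (s q o) for every (s p o) where (p rdfs:subPropertyOf q) is declared directly."""
    # Collect the declared super-properties of each property.
    supers = {}
    for s, p, o in triples:
        if p == RDFS_SUBPROPERTY_OF:
            supers.setdefault(s, set()).add(o)
    # Copy each statement up to the super-properties of its predicate.
    inferred = set(triples)
    for s, p, o in triples:
        for q in supers.get(p, ()):
            inferred.add((s, q, o))
    return inferred

data = {
    # The declaration John could add to the vocabulary (assumed, not in the data).
    ("pub:name", RDFS_SUBPROPERTY_OF, RDFS_LABEL),
    # Illustrative instance data using shorthand names.
    ("pubs:pub1", "pub:name", "Alexandra, The"),
}
closed = infer_subproperty_triples(data)
```

A client that ran something like this would see the rdfs:label value without it being published explicitly. Until clients do, stating both triples in the data is the pragmatic choice.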

4) I feel uncomfortable with the non-uniform representation of the address - partly with domain-specific tags pub:street and pub:postcode, partly with a company-specific (and non-humanly decipherable) URI. I know that this is a can of worms e.g. http://xml.coverpages.org/namesAndAddresses.html#eccma and I can’t find a suitable address vocabulary but this mixture doesn’t look very satisfactory.


If only we could finally agree on *one* vCard-in-RDF vocabulary...

5) pub:dateSurveyed: isn’t this just the date at which the description was authored (if not when it was entered into this format) i.e. dc:date

dc:date could mean many things: when the pub was surveyed, when the RDF document was published, when the pub was opened... Using pub:dateSurveyed makes the meaning clear to the user of the data.


Best,
Richard

6) Generally, these seem such general properties of any place that I'm surprised that any local vocabulary is needed at all, given that no data is actually domain specific (like a list of beers served).
This case study seems a great example of the issues in vocabulary and resource reuse. It would be interesting to compare the different solutions which different analysts would use to represent this data. Perhaps something like it would be a good exercise for the Oxford VoCamp?
Chris


Chris Wallace
Senior Lecturer
Department of Information Science and Digital Media
University of the West of England, Bristol
