On 2 Apr 2008, at 23:18, Chris Sizemore wrote:
1) the licensing seems too restrictive for the purposes of this
community, but has anyone taken the downloadable imdb data and tried
to RDF-ize it? thoughts?
http://www.imdb.com/interfaces
http://uk.imdb.com/help/show_leaf?usedatasoftware
http://glinden.blogspot.com/2008/03/using-imdb-data-for-netflix-prize.html
http://radar.oreilly.com/archives/2006/05/imdb-api.html
I did a partial RDFization of the IMDB data together with Bastian
Quilitz for a distributed querying demo (we meshed IMDB data with
local movie showtimes). Might dig out the code if someone needs it.
2) switching focus a bit, could we/should we be using imdb URIs as
identifiers for Movies, TV Programmes, and TV Programme Episodes,
and (certain) people? i think we should, so, from the best LOD
practice (given that imdb haven't yet pulled a dbpedia and provided
concept/data URIs in addition to their document URLs), shouldn't i
use:
http://www.imdb.com/title/tt0088846/#thing (to represent the gilliam
film Brazil in BBC RDF...)
From a SemWeb POV this is pretty useless since the URI doesn't
resolve to RDF data. Identifiers on the Web are only as good as the
data they point to. IMDB URIs point to high-quality web pages, but not
to data.
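
To make "doesn't resolve to RDF data" concrete, here is a rough Python sketch of the check a Linked Data client would make: ask for RDF through the Accept header, then look at the Content-Type that comes back. The media type list and the helper names build_rdf_request and is_rdf_content_type are my own assumptions for illustration, not a standard API. The sketch builds the request but does not send it.

```python
from urllib.parse import urldefrag
from urllib.request import Request

# Assumed set of RDF media types; not exhaustive.
RDF_MEDIA_TYPES = {
    "application/rdf+xml",
    "text/turtle",
    "text/n3",
    "application/n-triples",
}


def build_rdf_request(uri: str) -> Request:
    """Build (but do not send) a GET request that asks for RDF."""
    # The fragment is never sent over HTTP, so "#thing" is stripped here
    # and the request goes to the plain document URL.
    document_url, _fragment = urldefrag(uri)
    accept = ", ".join(sorted(RDF_MEDIA_TYPES))
    return Request(document_url, headers={"Accept": accept})


def is_rdf_content_type(content_type: str) -> bool:
    """Return True if a Content-Type header value names an RDF media type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in RDF_MEDIA_TYPES
```

Sending the request with urllib.request.urlopen and passing the response's Content-Type header to is_rdf_content_type would tell you whether you got data or just a web page.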
Also, don't squat other people's URI space. IMDB hasn't endorsed the
#thing URIs, and simply creating new URIs inside someone else's URI
space is considered a violation of the Web's social contract.
3) what if i published a site that publicly made available RDF such
as:
http://www.imdb.com/name/nm0000187/#thing owl:sameAs
http://musicbrainz.org/artist/79239441-bfd5-4981-a70c-55c3f15c1287.html#thing
or
http://www.imdb.com/name/nm0000187/#thing owl:sameAs http://zitgist.org/79239441-bfd5-4981-a70c-55c3f15c1287
(or whatever it is)
Better make your own identifiers. Implementation-wise it might make
sense (or not) to re-use their internal IDs (0088846) in your own
URIs, so you could have http://yourdomain/movies/0088846#thing .
It's still a good idea to include a link to the IMDB page in your
data, e.g. using the foaf:page or foaf:isPrimaryTopicOf property,
which can be used to link together things (e.g. movies, people) and
web pages about them.
So you could have (in N3 syntax):
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

<http://yourdomain/people/0000187#thing>
    a foaf:Person;
    owl:sameAs <http://zitgist.org/79239441-bfd5-4981-a70c-55c3f15c1287>;
    owl:sameAs <http://dbpedia.org/resource/Madonna_%28entertainer%29>;
    foaf:isPrimaryTopicOf <http://www.imdb.com/name/nm0000187/>;
    .
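
If you wanted to generate descriptions like the N3 above in bulk, something along these lines might do. It is only a sketch: "http://yourdomain" is the same placeholder as above, the movies/people path layout and the function names mint_uri and describe are assumptions of mine, and prefix declarations are left out of the output.

```python
import re
from typing import Iterable

# Placeholder base for our own URI space (hypothetical).
BASE = "http://yourdomain"

# Assumed mapping from IMDB page kind to a path segment in our URI space.
KINDS = {"title/tt": "movies", "name/nm": "people"}

IMDB_PAGE = re.compile(r"^http://www\.imdb\.com/(title/tt|name/nm)(\d+)/?$")


def mint_uri(imdb_page_url: str, base: str = BASE) -> str:
    """Mint our own URI for the thing an IMDB page is about."""
    match = IMDB_PAGE.match(imdb_page_url)
    if match is None:
        raise ValueError(f"not a recognised IMDB page URL: {imdb_page_url}")
    kind, number = match.groups()
    # Re-use IMDB's internal ID, but inside our own URI space.
    return f"{base}/{KINDS[kind]}/{number}#thing"


def describe(imdb_page_url: str, rdf_type: str, same_as: Iterable[str]) -> str:
    """Serialise one resource description as an N3 snippet."""
    lines = [f"<{mint_uri(imdb_page_url)}>", f"    a {rdf_type};"]
    lines += [f"    owl:sameAs <{uri}>;" for uri in same_as]
    # Link back to the IMDB page so humans can see what the URI identifies.
    lines.append(f"    foaf:isPrimaryTopicOf <{imdb_page_url}>;")
    lines.append("    .")
    return "\n".join(lines)
```

For example, describe("http://www.imdb.com/name/nm0000187/", "foaf:Person", [...]) would describe the person under our own people/0000187#thing URI and point back at the IMDB page.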
in other words, a set of RDF making equivalency statements about
people from imdb across to other datasets like musicbrainz?
The problem is that people in IMDB don't have URIs, and only IMDB is
in a position to create them. IMDB only has URIs for web pages, so the
best you can do is say something about the IMDB pages, e.g. what their
topic is.
would this community find that useful?
I think it would be useful.
in other words, given the imdb licensing realities, are imdb URIs
useful as identifiers even if we can't use the related data?
IMDB URIs are useful because they resolve to high-quality, human-readable
web pages. This is valuable because it's a very good way of making clear
what our own LOD URIs identify.
The way I describe it above, I wouldn't call it "using IMDB URIs as
identifiers", but rather "annotating IMDB pages with links into the
Semantic Web".
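
Seen from the page side, that annotation is just a statement whose subject is the IMDB page. A small Python sketch, assuming you already have your own identifier for the thing: the function name annotate_page and the example URIs are hypothetical, and foaf:primaryTopic is the inverse of foaf:isPrimaryTopicOf.

```python
FOAF_PRIMARY_TOPIC = "http://xmlns.com/foaf/0.1/primaryTopic"


def annotate_page(page_url: str, thing_uri: str) -> str:
    """Return one N-Triples line saying the page's primary topic is the thing."""
    # The subject is the IMDB *page*; the object is our own identifier.
    return f"<{page_url}> <{FOAF_PRIMARY_TOPIC}> <{thing_uri}> ."


# Hypothetical pairing of an IMDB page with a URI in our own space.
print(annotate_page(
    "http://www.imdb.com/title/tt0088846/",
    "http://yourdomain/movies/0088846#thing",
))
```

Nothing here claims an identifier inside IMDB's URI space; it only says something about a document they already publish.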
are URIs useful in LOD on their own?
As I said, a URI is only as good as what it resolves to. IMDB URIs are
part of the old document Web, and only IMDB themselves can upgrade
them to the Semantic Web, because they control what comes back when
you request the URI.
Best,
Richard
sorry for the ramble, but had a lot of imdb on my mind...
all the best--
--chris sizemore
http://www.bbc.co.uk