Re: [ol-tech] a few notes on rdf views

Lee Passey Tue, 08 Jun 2010 16:06:57 -0700

On 6/8/2010 2:59 PM, Erik Hetzner wrote:
> Hi Lee -
>
> At Tue, 08 Jun 2010 14:16:44 -0600,
> Lee Passey wrote:
>>
>> On 6/8/2010 10:20 AM, Ross Singer wrote:
>>
>>> I think it's important to note here, that in RDF, you are -not-
>>> confined to one schema.
>>
>> No. That RDF was designed as a schema aggregator, and as a way to avoid
>> DTD constraints, is a given. The important issues are:
>
> I am afraid I don’t think this a very good way of thinking about RDF.


I'm certain it's not ideologically pure, but I think it's very 
practical. The W3C states that the motivation for creating RDF was "to 
represent information in a minimally constraining, flexible way." In 
information processing there is a natural and inevitable tension between 
constraints and flexibility. Human beings (and presumably really good 
AI) are very good at deriving meaning from ambiguity. Computer 
algorithms, not so much. So if what I want is a way to represent 
information about the relationships between web resources, and present 
the relationship data to a human to sort out, flexibility is good. If 
what I want to do is data mining, flexibility is bad.

I tend to be much more interested in data mining and automated data 
processing than in just presenting another pretty web page to the world; 
constraints work for me.

>> 1. Having aggregated as many different schemas as you need, does the
>> resultant set completely express /all/ the data held by OL for any given
>> record? and,
>
> If you want absolute fidelity to the underlying record, there is
> always the JSON output, no? e.g.,
> http://openlibrary.org/authors/OL31800A.json

Maybe not. If you look at the web documentation, OL claims that the JSON 
API "is deprecated now. This is retained only for backward compatibility 
and RESTful API should be used instead of this." Again according to the 
web documentation, the RESTful API is equivalent to the RDF interface. 
My understanding of the word "deprecated" is that it is a warning 
against use in the future so that it may be phased out. If OL is going 
to phase out the JSON API, then whatever replaces it should be a 
complete representation of the underlying data object (which is, in 
fact, just a stored record of the JSON text object), at least for data 
mining purposes.

I always use the JSON API because I'm assured of getting all the data. 
If OL said, "whoops, we really aren't deprecating the JSON API, and it 
will always be available" then I would cease to care about the RDF 
representation, as it would no longer be of any interest to me.
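
To make the data mining case concrete, here is a rough sketch in Python of 
what a consumer of the JSON output might do. It assumes only that a URL 
like the one Erik quoted returns a single JSON object; the function names 
and the injectable opener are my own illustration, not part of any OL API, 
and I make no assumption about which fields a record contains.

```python
import json
from urllib.request import urlopen


def fetch_record(url, opener=urlopen):
    """Fetch a URL and decode its body as one JSON object.

    `opener` is a hypothetical hook so the function can be exercised
    without touching the network; it defaults to urllib's urlopen.
    """
    with opener(url) as resp:
        return json.load(resp)


def field_names(record):
    """Return every top-level key in the record, sorted."""
    return sorted(record)


if __name__ == "__main__":
    # URL taken from the message quoted above.
    rec = fetch_record("http://openlibrary.org/authors/OL31800A.json")
    for key in field_names(rec):
        print(key, "=", rec[key])
```

The point is that the consumer walks whatever keys the record has, 
rather than a subset that some schema happened to have a term for.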

[snip]

> Using FOAF does not preclude using other schemas, even to describe the
> same URI, even schemas that overlap in their use.
>
> That we say that http://openlibrary.org/authors/OL31800A is a
> foaf:Person and provide FOAF data for them, does not preclude also
> using the official or unofficial FRBR vocabularies, RDA vocabularies,
> the bio vocabulary, etc. additionally.

And in my mind, this is the biggest problem with RDF. If I'm writing an 
application to derive biographical data from an RDF feed, an infinite 
number of alternatives makes it useless. As the Pointed Man in the 
Pointless Forest said, "a point in every direction is as good as no 
point at all." A controlled vocabulary (and by controlled I mean 
limited, constricted and constrained) is critical to automated data 
processing.

In the end, I don't care if an author's name is represented by:

<rdf:Description rdf:about="http://openlibrary.org/authors/OL20188A">
     <rdf:value>Edith Wharton</rdf:value>
</rdf:Description>

or by

<foaf:Person>
      <foaf:name>Edith Wharton</foaf:name>
</foaf:Person>

or by

<dcterms:Agent>
     <dcterms:Name>Edith Wharton</dcterms:Name>
</dcterms:Agent>

or (preferably):

<ol:Authors>
   <ol:Author type="creator">
     <ol:Name>Edith Wharton</ol:Name>
   </ol:Author>
</ol:Authors>

But it should only be represented by one of these, not by all. If I need 
it transformed into a different vocabulary, that's what XSLT is for. In 
all probability FOAF is good enough for whatever consumers of OL 
data emerge. But it shouldn't be selected simply because it's the 
newest craze, and it certainly shouldn't be selected with the idea that 
if it's not good enough OL will just add a new, parallel XML tree. At 
some point, somebody needs to say, "This far shalt thou go, and no farther."
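
As a rough illustration of how cheap that transformation is once there is 
exactly one source vocabulary, here is a sketch that rewrites the 
foaf:Person fragment above into the ol:Authors form. It uses Python's 
ElementTree rather than XSLT, purely to keep the example short. The FOAF 
namespace URI is the real one; the "ol" namespace URI is made up for the 
example, since OL has not published one that I know of.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"
OL = "http://example.org/ol#"  # hypothetical namespace, for illustration only


def foaf_to_ol(foaf_xml):
    """Rewrite a foaf:Person fragment as an ol:Authors element tree."""
    person = ET.fromstring(foaf_xml)
    authors = ET.Element(f"{{{OL}}}Authors")
    # Each foaf:name becomes one ol:Author/ol:Name pair.
    for name in person.iter(f"{{{FOAF}}}name"):
        author = ET.SubElement(authors, f"{{{OL}}}Author", {"type": "creator"})
        ET.SubElement(author, f"{{{OL}}}Name").text = name.text
    return authors


# Serialize with the "ol" prefix instead of an auto-generated one.
ET.register_namespace("ol", OL)

sample = (
    f'<foaf:Person xmlns:foaf="{FOAF}">'
    "<foaf:name>Edith Wharton</foaf:name>"
    "</foaf:Person>"
)
result = foaf_to_ol(sample)
print(ET.tostring(result, encoding="unicode"))
```

One input vocabulary means one function like this per target. Several 
parallel input vocabulary trees mean the consumer has to guess which one 
is authoritative before it can even start.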
