I agree entirely with your analysis, Lee, but am still confused about your 
objection to assigning a Work a title!

Doesn't one, in some circumstances and applications, need to present 
a Work to users as a result on a screen?  How is it to be labelled? Why is it 
inappropriate for the Work entity to contain a title (or, as you quite rightly 
say, titles: there is no reason to insist on only ONE) that can be 
used by the application, when convenient/useful to do so, to label the Work 
group?  Seems useful to me, and even if not useful, certainly not any kind of 
intrinsic violation of the conceptual model.   If it makes sense for your 
application, there is no reason not to do it. (In legacy library data, the 
'uniform title', or some portion of it, sometimes serves as a "work title", 
although other times is used for other things, making it very confusing to 
figure out how to use the data. At least if you use something in a semantically 
consistent way as a work title, you're still within the bounds of the FRBR 
model). 

I agree entirely (this was really the point of mine that started this thread) 
that any assertion that is true of every (not just existing but every 
_possible_) Expression of a Work (and, to say the same thing another way, of 
every Manifestation of any Expression of that Work) is best modelled as a 
property of the Work. Some systems (like our legacy MARC-based systems) may 
instead model it as a duplicate identical assertion made on every single 
Manifestation. This is semantically the same thing and it kind of sort of works, 
but it is not very tidy/maintainable and it makes your data much harder to use 
for many use cases (of course, since it's semantically the same thing, it could 
always be transformed to the former).   But really, the suggestion of mine that 
began this part of the thread was that there is absolutely no need for some 
extra entity representing "the whole thing": the Work entity already does that, 
as a necessary consequence of the FRBR model, and using the Work entity for 
this is a perfectly consistent, maintainable, clear, and reusable way to model 
such assertions, with no downsides I can think of. 

I also agree that something like an author is BEST modelled as a relationship 
to an Author entity, for the reasons you say. Some simple or constrained 
systems may not be able to do that, though, and may instead just slap an 
author name string on a Work object as a property instead of, as we say in the 
library world, 'controlling' the author.  Even if you're doing that out of 
necessity or desperation, if you otherwise try to align your 
modelling with FRBR, you're still going to get benefits in interoperability 
with other people who have FRBR-ish data, among other benefits. The FRBR model 
isn't a straitjacket or an all-or-nothing thing; it's a conceptual framework. 

This probably has nothing to do with OL anymore, but it's an interesting 
conversation. 

________________________________________
From: [email protected] [[email protected]] On Behalf Of 
Lee Passey [[email protected]]
Sent: Wednesday, November 17, 2010 7:04 PM
To: Open Library -- technical discussion
Subject: Re: [ol-tech] New treatment of frbr:manifestation in work RDF

On Mon, November 15, 2010 5:34 pm, Jonathan Rochkind wrote:

> I mostly agree with Lee's general analysis. Except I'd note: Just
> because the FRBR document doesn't give a Work an author or title,
> doesn't mean we can't or shouldn't.

Let me try to distill much of this reply into two assertions that I think
we are in complete agreement on:

1. Every work has at least one "creator" and possibly multiple
"contributors." (Even if we don't know the identity of the creator, we
still know that s/he must have existed.)

2. Every assertion that can be said to be true for every expression of a
work is, and should be, a property of the Work and not one or more of its
expressions.

Now every bibliographic system worth its salt will capture and preserve
the foregoing information; the only question is /how/ it is captured and
preserved.

The FRBR specification defines 10 entities grouped into three categories.
These three categories can generally be considered as those entities
dealing with creative works (group 1), those entities dealing with authors
and creators (group 2) and those entities dealing with subject matters
(group 3).

All entities have attributes (properties) generally manifested as
name/value pairs, where the "name" is the name of the attribute in the
entity definition (e.g. 'Title') and the "value" is the data associated
with the attribute for a given instance of the entity (e.g. 'The
Adventures of Tom Sawyer'). Entities may not be the attribute (property)
values of other entities, but an entity may have an attribute (property)
which is a relationship to another entity.

This distinction between attribute values and entity relationships is
somewhat artificial, but I believe it is based on the valid notion that an
entity is a complex object which itself possesses a collection of
attributes. It is at least inefficient to expect a "Work" entity to
maintain all the properties of every "Person" entity that contributed to
its creation, and it is highly error-prone for multiple "Work" entities to
/all/ attempt to maintain the same data repetitively.

I don't think it is inappropriate for a "Work" entity to maintain the
identities of those entities involved in its creation, but recording an
author's name is an inadequate way of doing so. A much
better way is to assign a Universally Unique IDentifier to each author,
and store that as a "CreatedBy" attribute on the work.
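
As a rough illustration of the difference, here is a minimal Python sketch.
The class names, the field names (person_id, created_by) and the work IDs are
made up for the example; they are not FRBR or Open Library terms. Two works
store only the creator's identifier, so a correction to the person record is
made once and seen by both.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Person:
    name: str
    # Hypothetical identifier field; any stable unique ID scheme would do.
    person_id: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class Work:
    work_id: str
    # Identifiers of Person entities, not name strings.
    created_by: List[str]


def creator_names(work: Work, registry: Dict[str, Person]) -> List[str]:
    """Resolve a work's creator identifiers against the person registry."""
    return [registry[pid].name for pid in work.created_by]


people: Dict[str, Person] = {}
clemens = Person(name="Samuel Clemens")
people[clemens.person_id] = clemens

# Illustrative work IDs only.
work_a = Work(work_id="work-1", created_by=[clemens.person_id])
work_b = Work(work_id="work-2", created_by=[clemens.person_id])

# The name is corrected in exactly one place ...
people[clemens.person_id].name = "Samuel Langhorne Clemens"

# ... and every work that references the identifier sees the change.
names_a = creator_names(work_a, people)
names_b = creator_names(work_b, people)
```

Had each work stored the name string itself, the same correction would have
to be repeated on every work record.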

Some may consider it to be a semantically trivial distinction to say that
"a work contains an author" as opposed to saying "a work contains an
author identification," but I consider it a highly important distinction
if you wish to maintain the proper perspective between entities.

> I think the FRBR document probably _should_ have allowed such
> attributes. It won't be a _transcribed_ title or author, because a
> Work is an abstract thing, there is nothing to transcribe. It might not
> fit into _library_ workflow to assign a title or author to a Work.
>
> But a Work still has a creator, and still can have an assigned (not
> transcribed) title labelling what the work is. If it's not in the
> official documented FRBR list of attributes, oh well, we can add it
> ourselves anyway if we need it -- to me it seems adding extra attributes
> to entities still used largely as FRBR intended is fine, it won't make
> your data incompatible with anyone else's FRBR data.

Actually, the FRBR specification provides that a "Work" entity /may have/
a "Title" property, much to my dismay. I disagree with this schema for a
number of reasons. In the first instance, a work need not have a title of
any kind; there are many, many untitled works in existence. In the second
instance, a work may, over the course of its virtually unlimited
life-span, have many different titles, no one of which can be considered
authoritative; I am occasionally struck when watching "The Daily Show with
Jon Stewart" to hear an author state, "that's not the title I gave it, the
publisher insisted on that one." In the third instance, a title cannot
hope to provide a unique and unambiguous reference to a specific work.

I suspect that it is this last objection that you are referring to when
you make the distinction between an "assigned" title and a "transcribed"
title. Clearly, a Work needs a unique identifier if for no other reason
than to facilitate the creation of relationship attributes. The identifier
may be Universally Unique (e.g. ISTC:A02200900000A88F or OL:OL53919W) or
Locally Unique (e.g. record number 24419 in my database), but it must have
some assurance of uniqueness to be useful ("Tom Sawyer" just won't cut
it).

I believe that what you call an "assigned title" is what I call a "unique
identifier." I try to avoid the word "Title" because of its semantic
baggage. I'd bet that when you use the word "Title," everyone reading
hears "transcribed title."

And if I've misread you, I'm sure you'll let me know ;-).

So, the way I've modeled my own database is something like this:

I have tables for Actors, Events, Works, Expressions, and Manifestations,
each of which has a unique ID property. The Expressions table has a
Foreign Key constraint on the Works table, so an Expression record cannot
exist without referencing a specific Work record. There is no reciprocal
column in the Work table, thus limiting me to a one-to-many relationship
between Works and Expressions.

My Work table has but two columns: an auto-generated Identity column,
which is guaranteed to be locally unique, and a "notes" field designed to
hold unqualified free-form text relating to a work as a whole in all its
iterations.

The Event table captures date/time and location information, but also has
a subject-verb-object type of function. In the context of the current
discussion, I use it to relate a work to an author via a "creation" event.
In the case of Mark Twain's Tom Sawyer, a record may indicate "during
[some period of time] at [some location] the subject [LUID of Samuel
Clemens] did [CREATE verb] the [LUID of Tom Sawyer]."
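
For anyone who wants to poke at this, the layout described above can be
approximated with Python's built-in sqlite3 module. This is only a sketch
under assumptions: the column names are invented for the example, the
Manifestations table is left out, and the Event "object" is restricted to
Works for brevity even though events can point at other entities too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE works (
    id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- locally unique ID
    notes TEXT                                -- free-form text about the work
);
CREATE TABLE expressions (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    work_id INTEGER NOT NULL REFERENCES works(id)  -- one Work, many Expressions
);
CREATE TABLE actors (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT
);
CREATE TABLE events (
    id               INTEGER PRIMARY KEY AUTOINCREMENT,
    period           TEXT,  -- left NULL below: not known in this example
    location         TEXT,
    subject_actor_id INTEGER NOT NULL REFERENCES actors(id),
    verb             TEXT NOT NULL,
    object_work_id   INTEGER NOT NULL REFERENCES works(id)
);
""")

work_id = conn.execute(
    "INSERT INTO works (notes) VALUES (?)",
    ("Notes on the work as a whole",),
).lastrowid
actor_id = conn.execute(
    "INSERT INTO actors (name) VALUES (?)", ("Samuel Clemens",)
).lastrowid
conn.execute("INSERT INTO expressions (work_id) VALUES (?)", (work_id,))

# The "creation" event: subject [actor] did [CREATE] the [work].
conn.execute(
    "INSERT INTO events (subject_actor_id, verb, object_work_id) "
    "VALUES (?, 'CREATE', ?)",
    (actor_id, work_id),
)

# An Expression cannot exist without an existing Work record.
try:
    conn.execute("INSERT INTO expressions (work_id) VALUES (?)", (9999,))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Authorship is recovered through the event, not from a column on works.
creators = [
    row[0]
    for row in conn.execute(
        "SELECT a.name FROM events e "
        "JOIN actors a ON a.id = e.subject_actor_id "
        "WHERE e.verb = 'CREATE' AND e.object_work_id = ?",
        (work_id,),
    )
]
```

Note that the works table carries nothing but its ID and notes; the author
is reachable only by joining through the events table.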

(As pointed out by the FRBR spec, relations among entities tend not to be
limited to certain other entities. An Actor (in my parlance, FRBR Group 2
entities) can be associated with Works, Expressions, Manifestations and
even Items in different roles, and their numbers may increase at each
level.)

> I would also add, Lee, I think it's totally consistent with your and my
> analysis to in fact give attributes to Works.  Consider, as you say:
>
> * "Every assertion|attribute value|property value of or about a work is
> also a valid assertion|attribute value|property value of or about the
> Expression that expresses the work. "
>
> Indeed. So if there's something you want to say which is _inherently_
> true about all Expressions (and their Manifestations and THEIR Items) of
> a Work, the proper place to say it is about the Work.  Then it is true
> of all past and future EMI of the Work, just as you intended.

Absolutely. One of the best examples of this, I think, is the kind of
free-form text recommendation you regularly see on library-oriented social
networking sites: "I liked this book, so you will too." Rarely is this
kind of comment directed at a particular edition or printing; it's
intended to be a comment on the work as a whole, and every derivation
thereof, and should therefore be stored and transmitted as part of that
entity.

Most discussion (dissension?) regarding the proper assignment of
attributes arises when trying to answer the question "does the property I
have identified as being possessed by the instance of a work I now hold in
my hand belong conceptually at the Item level, the Manifestation level,
the Expression level, or the Work level." When the answer to that question
ends up being "wrong" (defined as "not the way I would have done it") I
think it is usually as a result of not being able to see the trees for the
forest.

My own heuristic is to ask, "Is this assertion true (every
attribute|property has the same value) of every associated (abstract)
instance of this Entity?" If so, the property probably belongs at the next
higher level; move it and ask again at that level. It is by asking this
question (and by removing relationships to other entities that are modeled
outside of the entity itself) that I have arrived at the conclusion that a
unique ID (assigned title?) is pretty much all you need in a work record.
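
The mechanical part of that heuristic can be sketched in a few lines of
Python. The function name and the sample records are hypothetical, and the
sketch can only inspect instances that have actually been recorded; whether
a property holds for every _possible_ instance remains a judgment call.

```python
from typing import Dict, List, Tuple


def promote_shared(
    children: List[Dict[str, str]]
) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Split child records into (properties shared by all, what remains).

    A property is promoted only if every child has it with the same value.
    """
    if not children:
        return {}, []
    shared = {
        key: value
        for key, value in children[0].items()
        if all(key in child and child[key] == value for child in children)
    }
    remaining = [
        {k: v for k, v in child.items() if k not in shared}
        for child in children
    ]
    return shared, remaining


# Hypothetical lower-level records that each repeat the same comment.
records = [
    {"comment": "I liked this book, so you will too.", "format": "hardcover"},
    {"comment": "I liked this book, so you will too.", "format": "paperback"},
]

shared, remaining = promote_shared(records)
```

Here the comment moves up one level and the format stays where it is; the
same function can then be applied again to the records at the next level.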
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to 
[email protected]