Re: [ol-tech] Yet Another FRBR Schema (was Bulk Download and Request) - Part 3

Jonathan Rochkind Wed, 15 Dec 2010 15:39:20 -0800

I agree with the principles you lay down in general, agree with your liking of 
FRBR, and agree we should not necessarily be bound by what libraries have 
always done, especially when working in a domain where compatibility with 
library legacy data is not important (up to OL whether OL is such a domain, but 
it's reasonable to decide it is).

But I don't think it's true that an association (or other collective body) 
cannot author a document. 

If a document is put out in the name of an association (or other collective 
body), has no individual names on it, and nobody knows who wrote it --- does it 
not make sense to treat it as authored by the collective body?  I suggest that 
most users would in fact consider it authored by the collective body. Sure, you 
COULD, in theory, try to do some sleuthing to figure out the (say) PR flacks or 
congressional staffers (another good example -- is not a Congressional bill 
authored by the U.S. Congress, a collective body?) who actually authored it. 
Maybe you'll even manage to figure it out.  Is it worth it to meet any actual 
user needs?  Even if you do figure it out -- sure, it might be useful to _add_ 
the names of the actual real humans who authored it as additional information. 
But you _still_ wouldn't want to _remove_ the collective body as an 
author/creator; that would be a confusing disservice to the user -- who might, 
for instance, be unsure whether your record represented the same thing they 
had in their hands, which mentioned a collective body as an author and didn't 
mention the individuals you tracked down. 

Another example would be a musical album by a musical band (or a recording from 
a symphonic orchestra). Sure, you COULD track down the names of the individuals 
in the band (or the 120 or however many performers in a symphonic orchestra!). 
And it wouldn't _hurt_ to list them too, if someone actually wanted to spend 
the time looking them up.  But most people are going to think of that work as 
_by_ The Beatles or The Chicago Symphony Orchestra; it does them no service to 
argue otherwise on the grounds that only individual people (separately or in 
concert) can actually author anything. 

ALL models are just approximations of reality.  You pick the approximation of 
reality that works for what you're doing -- which in part is creating systems 
that hopefully match most of your users' own internal mental models. (Which I 
still insist, for most Americans anyway, most definitely includes the notion 
that works can be authored by collective bodies).  The map is not the 
territory, the menu is not the meal. 
________________________________________
From: [email protected] [[email protected]] On Behalf Of 
Lee Passey [[email protected]]
Sent: Wednesday, December 15, 2010 6:08 PM
To: Open Library -- technical discussion
Subject: Re: [ol-tech] Yet Another FRBR Schema (was Bulk Download and Request) 
- Part 3

On Tue, December 14, 2010 1:34 pm, Karen Coyle wrote:

> Quoting Lee Passey <[email protected]>:
>
>> But even if you limit the definition of the word "agent" to be
>> synonymous with "actor" (e.g. one who acts), it really can't
>> encompass those entities who are "NotAPerson."  The 9/11
>> Commission did not write the "9/11 Commission Report";
>> one or more individual members of the Commission, or their staff,
>> were the true creators. The commission as a whole is /responsible/
>> for the document, but it did not act to /create/ it.
>
> You can take that view, but the library cataloging view is that the
> corporate entity is the creator. So libraries would have no problem
> using Agent for corporate bodies. I realize that this is a stretch,
> but I've gotten used to it.

Ahh, now we get to the real philosophical underpinnings of FRBR.

Back in my halcyon days as a programmer, I would often get asked to "take this
manual/paper system and automate it." The expectation, unfortunately, was
usually to build an electronic clone of the manual system, when usually the
/right/ solution was to create a new system, more attuned to the electronic
environment, that solved the users' problems in novel, and typically more
efficient ways.

I am still amused (in a snarky, derisive kind of way) by people who talk about
making e-book cookbooks. In fact, the reason most people use cookbooks is as a
way to store and retrieve recipes. In other words, what they really want is a
recipe database; we are only used to cookbooks because 20 years ago a book was
the only practical way to build a dedicated database system that could be used
in a home.

These days, even though I own a whole bookcase full of cookbooks, when I need
a recipe the first place I go is http://epicurious.com. Epicurious now has a
really fantastic app for smartphones to search for recipes. As the Internet
generation grows, the cookbook (at least the cookbooks that are recipe
collections as opposed to those that tell a story like "The French Laundry
Cookbook"), will probably be the first class of paper books to go "out of
print."

I /really/ like the FRBR model. It seems to me that what the IFLA did in this
case was to take a big step back from the model of writing information on 3x5
cards which were then stored in drawers, and said: "What are the various
abstractions of instances of works of literature, and how do they relate to
each other? Can we build an Automated Data Processing (ADP) infrastructure
that represents these relationships and provides traditional catalog access
/without regard to how this catalog access has been implemented in the past/?"

So, when dealing with the FRBR model, I think it is very important /not/ to
take "the library cataloging view." If we all know that an association cannot
author a document then there is no reason we should continue to refer to that
association as an 'author'. That is an artifact of the old days when we had a
paper form with a line labeled 'Author,' and everything had to be shoe-horned
into the form's parameters.

As we move forward with building FRBR databases I think it is important that
we choose language which is most precise and which conforms to commonly
understood meanings for common words, regardless of past jargon or terms of
art.

For me, the most accurate term for FRBR Group 2 Entities is still "responsible
entity." Additionally, my educational background is not in library science,
but in the law, and I am quite certain that the definition of "Corporate Body"
has no connotation of "any association of individuals regardless of legal
status," and cannot be construed to encompass "any and all entities that are
not persons." "Corporate Body" may be a term of art in Library Science, but
there is no justification for perpetuating those kinds of legacy inaccuracies
moving forward.

I note that the FRBR term "Corporate Body" has been translated to
'collectivité' in the "Spécifications fonctionnelles des notices
bibliographiques" (Traduction française de "Functional requirements for
bibliographic records : final report"). I /like/ that term, so I think that
going forward I will use the term "collective" for those responsible entities
which are not persons.

>>> The FRBR/FRAD "name" is a display form for human use.
>> I'm not convinced of that. The FRBR Final Report says that the
>> Entity name "is the word, character, or group of words and/or characters
>> by which the [Entity] is known....
>
> <snip>
>
> The reason why the name isn't useful as an identifier is that it can change.

No, it can't.

If it could, it wouldn't serve the FRBR purpose of being a "uniform heading
for purposes of consistency in naming and referencing the [Entity]." A FRBR
Entity::name is not the same thing as a display name, although conceivably the
same value could be used as both (leading perhaps to confusion and namespace
collisions).

In my view, a FRBR Entity may have many appellations, and that list of
appellations is clearly mutable. If one of those appellations is chosen as the
FRBR::name, however, that decision must never change or referencing the Entity
would become inconsistent, in violation of the specification.

I suspect this is why in FRAD 'Name' was removed as an attribute for FRBR
entities, and two /new/ entities, Name and Identifier were added. According to
the Final Report, "Entities in the bibliographic universe (such as those
identified in the Functional Requirements for Bibliographic Records) are known
by names and/or identifiers."

Note the use of the phrase "and/or." According to FRAD, a "responsible Entity"
need not have a Name, and presumably need not have an identifier either
(although the lack of both would certainly make it difficult to refer to the
underlying Entity).

Names and Identifiers are related to Entities via an "appellation of"
relationship (Names) or an "assigned to" relationship (Identifiers). If an
identifier is used, it "may be assigned to only one specific instance of a
bibliographic entity;" presumably a 'Name' may be an appellation of any number
of Entities.

This is precisely the approach I took when designing my database schema.
Entities are assigned identifiers which are unique, exclusive, persistent and
immutable. Names are collected in a "names" table. Names are related to
Entities via the "entities_names" table, which ties a specific name_idn to a
specific entity_idn.
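
The three tables described above can be sketched roughly as follows, using
Python's built-in sqlite3 module. This is only a minimal illustration, not the
actual schema: the table names and the name_idn/entity_idn columns come from
the description, while the "name" column, the constraints, and the sample rows
(including the made-up "John Smith" entries) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # reject links to unknown rows

conn.executescript("""
-- One row per Entity; the identifier is the primary key and is never reused.
CREATE TABLE entities (
    entity_idn INTEGER PRIMARY KEY
);
-- One row per distinct appellation.
CREATE TABLE names (
    name_idn INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE
);
-- Link table: ties a specific name_idn to a specific entity_idn.
CREATE TABLE entities_names (
    entity_idn INTEGER NOT NULL REFERENCES entities(entity_idn),
    name_idn   INTEGER NOT NULL REFERENCES names(name_idn),
    PRIMARY KEY (entity_idn, name_idn)
);
""")

# Sample rows (assumed): entities 1 and 2 share a name, entity 3 is a collective.
conn.executemany("INSERT INTO entities (entity_idn) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO names (name_idn, name) VALUES (?, ?)",
    [(1, "John Smith"), (2, "J. Smith"), (3, "The Beatles")],
)
conn.executemany(
    "INSERT INTO entities_names (entity_idn, name_idn) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (3, 3)],
)


def names_of(entity_idn):
    """All appellations linked to one entity, sorted by name."""
    rows = conn.execute(
        "SELECT n.name FROM names AS n "
        "JOIN entities_names AS en ON en.name_idn = n.name_idn "
        "WHERE en.entity_idn = ? ORDER BY n.name",
        (entity_idn,),
    )
    return [row[0] for row in rows]


def entities_named(name):
    """All entity identifiers that carry the given appellation."""
    rows = conn.execute(
        "SELECT en.entity_idn FROM entities_names AS en "
        "JOIN names AS n ON n.name_idn = en.name_idn "
        "WHERE n.name = ? ORDER BY en.entity_idn",
        (name,),
    )
    return [row[0] for row in rows]
```

With these sample rows, an entity can carry several names and a single name
can be an appellation of several entities, while each entity_idn stays unique.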

> The name is a display form that, should the cataloging rules decide, could
> be replaced with another string. There are rules and reasons why this
> happens, but it is not a persistent identifier.

A display name can certainly be changed or replaced, which is a good reason
why it should not be chosen as a persistent identifier. But persistent
identifiers are a requirement of an entity/relation data model, and so far I
have seen no reference to any official specification that suggests that there
is any equivalence between a display name and a FRBR:Name/FRAD Identifier.

[snip]

> Would it make a difference to you if, instead of re-direction, the previous
> identifiers were included in the record itself?

No. For purposes of a relational database I need an identifier that is unique
and exclusive. For example, given the above-referenced schema, I can do
"SELECT name FROM names, entities_names WHERE names.name_idn =
entities_names.name_idn AND entity_idn = [some entity id]" and get a list of
all the names associated with a specific entity. Conversely, I can "SELECT
entity_idn FROM entities_names WHERE name_idn = [some name id]" and get a list
of all entities which share a common name. If I had multiple identifiers for
an entity I could not be assured that these queries would return a complete
set, especially if there continued to be records that referred to different
identifiers in the set.
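
To make the completeness concern concrete, here is a small sketch using
Python's built-in sqlite3 module. It assumes the link table is called
entities_names, as in the schema description, and the identifiers 100 and 200
are hypothetical: they stand for one real-world person who has ended up
recorded under two entity ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE names (
    name_idn INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
CREATE TABLE entities_names (
    entity_idn INTEGER NOT NULL,
    name_idn   INTEGER NOT NULL,
    PRIMARY KEY (entity_idn, name_idn)
);
INSERT INTO names VALUES (1, 'Mark Twain'), (2, 'Samuel Clemens');
-- Hypothetical bad data: the same person recorded under two identifiers.
INSERT INTO entities_names VALUES (100, 1), (200, 2);
""")


def names_for(entity_idn):
    """Names linked to one entity identifier (explicit join condition)."""
    rows = conn.execute(
        "SELECT n.name FROM names AS n "
        "JOIN entities_names AS en ON en.name_idn = n.name_idn "
        "WHERE en.entity_idn = ? ORDER BY n.name",
        (entity_idn,),
    )
    return [row[0] for row in rows]


def entities_for(name_idn):
    """Entity identifiers that share one name."""
    rows = conn.execute(
        "SELECT entity_idn FROM entities_names "
        "WHERE name_idn = ? ORDER BY entity_idn",
        (name_idn,),
    )
    return [row[0] for row in rows]


# While the person is split across two identifiers, the lookup is incomplete.
split_result = names_for(100)

# Collapse the records onto a single identifier and the lookup is complete.
conn.execute("UPDATE entities_names SET entity_idn = 100 WHERE entity_idn = 200")
merged_result = names_for(100)

print(split_result)   # only the name linked to id 100
print(merged_result)  # both names
```

The first lookup silently misses a name because part of the data hangs off
the second identifier; once everything refers to one identifier, the same
query returns the full set.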

Note that this does not mean that I think that OL's redirection mechanism is a
bad thing, particularly in the context of OL's data sets; indeed, I think it's
a very /good/ thing in that context. And I do preserve it as an alternate name
for an Entity so I can refer back to an originating record in the OL data set.
It's simply not something that can be used effectively to relate records in a
relational database.

_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to 
[email protected]