Re: [RDA-L] Fictitious beings as pseudonyms (was: Dr. Snoopy)

2011-04-28 Thread Stephen Hearn
I think this is covered by LCRI 22.2B, Multiple
Headings--Contemporaries, point 5:

If different names appear in different editions of the same work,
choose for all editions of the same work the name that predominates in
the editions of the same work.  If, however, a change in the person's
bibliographic identification from an older name to a newer name that
seems to be stable has taken place, choose that name for all editions.
 In case of doubt on any point, choose the latest name used for all
editions.

RDA says something similar at 6.27.1.7:

If the identity used most frequently cannot be readily determined,
construct the authorized access point representing the work using the
authorized access point representing the identity appearing in the
most recent resource embodying the work followed by the preferred
title for the work.

I'd consider that the books originally published only with the
Rampling name but now appearing with the Rice name given top billing
as well would fall under either of these rules, and that one could
establish a uniform title for all editions of a previously Rampling
title under the Rice heading.  The problematic bit here is that the
rule calls for this to be done title by title. We have to wait for all
the Rampling titles to be reissued under the Rice name before we can
merge Rampling into Rice. If there's a lesser novel that never gets
republished, the rule does not support changing its entry on the basis
of a larger trend to use Rice over Rampling, resulting in a split of
the preferred access points for titles which arguably ought to share a
single form of name entry.
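
A minimal sketch of the title-by-title choice described above, under simplifying assumptions: "readily determined" is modeled as a strict plurality of editions, the LCRI's "stable newer name" test is left out, and the titles and dates are hypothetical placeholders rather than real publication data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Edition:
    year: int
    name_on_resource: str  # identity named on this edition


def choose_identity(editions):
    """Pick one identity for all editions of a single work."""
    counts = Counter(e.name_on_resource for e in editions).most_common()
    # Assumption: "readily determined" means a strict plurality of editions.
    if len(counts) == 1 or counts[0][1] > counts[1][1]:
        return counts[0][0]
    # Otherwise fall back to the identity on the most recent edition.
    return max(editions, key=lambda e: e.year).name_on_resource


def choose_per_work(works):
    """Apply the choice work by work, never across the whole output."""
    return {title: choose_identity(eds) for title, eds in works.items()}


# Hypothetical titles and dates, for illustration only.
works = {
    "Reissued title": [Edition(1985, "Rampling"), Edition(2004, "Rice")],
    "Never-reissued title": [Edition(1986, "Rampling")],
}
choices = choose_per_work(works)
```

Because the function only ever sees the editions of one work, the never-reissued title stays under Rampling while the reissued one moves to Rice, which is the split in access points noted above.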

Stephen

On Thu, Apr 28, 2011 at 10:10 AM, J. McRee Elrod m...@slc.bc.ca wrote:

 Still waiting for an answer to Anne Rice writing as Anne Rampling in
 RDA.


   __       __   J. McRee (Mac) Elrod (m...@slc.bc.ca)
  {__  |   /     Special Libraries Cataloguing   HTTP://www.slc.bc.ca/
  ___} |__ \__




-- 
Stephen Hearn, Metadata Strategist
Technical Services, University Libraries
University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN 55455
Ph: 612-625-2328
Fx: 612-625-3428


Re: [RDA-L] Sneaky Pie and Rita Mae Brown

2011-04-28 Thread Stephen Hearn

 On Wed, 27 Apr 2011, Deborah Tomaras wrote:

 Here's the thing, though. Snoopy doesn't have the profession of author,
 because as we all know, he didn't really write the book. He is a fictitious
 dog, lacking in digits and English language necessary to put out the work
 he authored (even in the cartoons, he never speaks). So I don't believe
 we can, or should, apply the same rules and standards to him that we do to
 real, live, preferably human authors.

 And yes--I would have one heading for both Superman and Clark Kent. And it
 would be a subject heading, not a personal name heading. That's where I
 believe fictitious characters belong, and where most users would expect to
 find them. As in my Spiderman example before, I don't think it would
 benefit anyone, cataloger or user, to have to constantly revise and sift
 through the changeable natures/personae/call-them-what-you-will of
 fictitious characters. Because they aren't real, so aren't bound by rules
 of reality, to attempt to impose reality upon them seems to me wrong and
 not useful.

 Deborah Tomaras, NACO Coordinator
 Librarian II
 Western European Languages Team
 New York Public Library
 Library Services Center
 31-11 Thomson Ave.
 Long Island City, N.Y. 11101
 (917) 229-9561
 dtoma...@nypl.org

 Disclaimer: Alas, my ideas are merely my own, and not indicative of New
 York Public Library policy.



  From:       Peter Schouten pschou...@ingressus.nl
  To:         RDA-L@LISTSERV.LAC-BAC.GC.CA
  Date:       04/27/2011 12:51 PM
  Subject:    Re: [RDA-L] Dr. Snoopy
  Sent by:    Resource Description and Access / Resource Description
              and Access RDA-L@LISTSERV.LAC-BAC.GC.CA

  Unless one assumes that Dr. Snoopy is somehow different from plain Snoopy,
  and would advocate a series of maybe linked authorities for each differing
  guise of a character. Mr. Schouten, for example, claims that: even
  fictional characters are entitled to their own Personae. But I would argue
  against this route for multiple reasons. Fictitious characters cannot truly
  have professions, so they aren't really different persons despite the
  guise;

 But in this example, the publication presents Dr. Snoopy as the author,
 which causes the fictional character to have the profession of author.

 Would you also have 1 heading for both Clark Kent and Superman?


 Peter Schouten



Re: [RDA-L] Dr. Snoopy

2011-04-27 Thread Stephen Hearn
One point of having authority records is to recognize that entities
can have a coherent presence--an identity--that goes beyond what is
found on one book. In the case of Snoopy, that identity is primarily
iconic--we recognize his various images as Snoopy, regardless of what
he's sometimes wearing. Knowing that, when I encounter Dr. Snoopy, I
see it as Snoopy--Snoopy in one of his many personae, but primarily as
Snoopy. The fact that Snoopy's existence is primarily visual and that
he remains recognizable as Snoopy across so many personae says to me
that these personae are not equivalent to pseudonyms, which tend to
hide the fact that two authorial names are the same. They're more like
different forms of the same iconographic identity.  That being the
case, I'd establish the authorial Snoopy as just Snoopy, and give
Dr. Snoopy, Joe Cool, etc. as 400s, if they ever turn up as
authors.

The question is, what level of granularity is most appropriate for
collocation and will best match users' expectations and needs. I find
it dubious that most users would prefer to have to track down all the
Snoopy personae under their individual names when they're looking for
stuff by or about Snoopy. We have only one subject heading for him.
Would a new subject heading be needed to catalog a poster of Joe Cool,
or would Snoopy (Fictitious character) still apply?

Three minor notes: it's Schulz, not Schultz. I don't think anyone
would argue that the book by Dr. Snoopy should also have Schulz as an
access point. And let's not forget spirits, who can also be authors
under AACR2 (e.g., Seth (Spirit)).

Stephen

On Wed, Apr 27, 2011 at 12:50 PM, Laurence Creider
lcrei...@lib.nmsu.edu wrote:
 John,

 What you say is well thought, and made me realize that I should have been
 clearer in saying that I consider Dr. Snoopy to be a form of a name and not
 a different name until proven otherwise, particularly given the presumed
 character depicted in the illustrations and the name of the illustrator.
  Joe Cool presents a different case, of course.

 As you say, RDA may need some revision here, as AACR2 certainly did for some
 of its unintended consequences.

 Larry

 --
 Laurence S. Creider
 Special Collections Librarian
 New Mexico State University
 Las Cruces, NM  88003
 Work: 575-646-7227
 Fax: 575-646-7477
 lcrei...@lib.nmsu.edu

 On Wed, 27 Apr 2011, John Attig wrote:


 On 4/27/2011 11:40 AM, Laurence Creider wrote:
      The point of my comment yesterday was that there was no proof that Dr.
      Snoopy was in fact a different person from Snoopy.  The existence of a
      title means nothing.  Sometimes I use my Dr. or Professor, sometimes I
      do not.


 As the JSC was reviewing the drafts of the section of RDA that dealt with
 multiple identities or personae, it struck me that a literal reading of RDA
 would suggest that the simple use of different names (but not different
 forms of the same name or changes of name) was sufficient evidence of the
 intent to establish a separate bibliographic identity.  If that is true,
 then Larry's point above is not relevant: you don't need proof that Dr.
 Snoopy is a different person, you only need evidence of the use of a
 distinct name -- and a decision that this is a different name rather than a
 different form of the same name (which I suppose one could argue).






Re: [RDA-L] Linked files

2011-04-25 Thread Stephen Hearn
Another fundamental rule of identifiers is that what is identified
should not change significantly. That generally holds true in LC
authority practice, but not in the case of undifferentiated personal
name authorities. By rule and standard procedure, an LCCN for an
authority of this kind can refer uniquely to one person now, and a
different person later, and yet another person later still.

This is another reason why a system which restricts the data elements
that can be used to make a unique heading to the point that making a
unique heading is not always possible is problematic. Good identifiers
can always be made unique to correspond to a unique entity. LC/NACO
personal name headings can't, which is another reason why they're not
good identifiers under the current rules.
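
As a rough illustration of the point about identifiers, here is a toy authority file keyed by an opaque ID instead of by the heading text. The class, the ID scheme, and the sample notes are all hypothetical; nothing here reflects how LCCNs are actually assigned.

```python
import itertools


class AuthorityFile:
    """Toy authority file keyed by an opaque ID, not by the heading text."""

    def __init__(self):
        self._next = itertools.count(1)
        self.records = {}  # id -> {"heading": str, "note": str}

    def add(self, heading, note):
        rec_id = f"x{next(self._next):06d}"  # hypothetical ID scheme
        self.records[rec_id] = {"heading": heading, "note": note}
        return rec_id

    def rename(self, rec_id, new_heading):
        # The display form changes; the identifier does not.
        self.records[rec_id]["heading"] = new_heading


auth = AuthorityFile()
a = auth.add("Smith, John", "author of a 1990 thesis")
b = auth.add("Smith, John", "illustrator, active 2005")

# Bibliographic records link by ID, so identical headings do not collide.
bib_links = {"bib-1": a, "bib-2": b}

# Later, one person acquires a distinguishing date; only the display changes.
auth.rename(a, "Smith, John, 1960-")
```

Each ID keeps pointing at the same person before and after the rename, and two records can share an identical heading without being forced into one record.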

Stephen

On Mon, Apr 25, 2011 at 11:20 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 I am talking about our library-community database as the database [someone]
 is linking to.

 If we're always changing our identifiers (considering our authority 1xx
 preferred display forms to be identifiers), that makes it very hard for
 anyone to link to things in our database.

 Even just for our own database with their internal links, always changing
 the effective 'identifiers' (auth 1xx) makes our own housekeeping much more
 expensive for ourselves.

 Again, this is because of using the very same string (auth 1xx) as both a
 functional identifier and a functional preferred display term.  A
 practice that is highly discouraged in actual contemporary software/metadata
 engineering, although it worked fine 100 years ago.

 Seriously, it is a fundamental idea in identifier management, decades old,
 that you should not change your identifiers, and for this reason you should
 not use strings you will be displaying to users as identifiers. One way this
 idea is expressed, for instance, is that you should not use a 'natural key'
 as a 'primary key' in a relational database. You can google on those terms
 if you want. In the sense that an rdbms pk serves as a kind of identifier,
 that is just one expression of the fundamental guideline not to change your
 identifiers, and thus not to use things you might want to change as
 identifiers.

 I am seriously not sure why you are arguing this, James.  This is a pretty
 fundamental concept of data design accepted by every single contemporary era
 data/database/metadata designer. This is probably my last post in this
 thread, this is getting frustrating to me.  Perhaps it's my fault in not
 being able to explain this concept adequately, in which case I don't think I
 can personally do any better than I've done.  Otherwise, I am not sure why
 you are insisting on arguing with a basic principle accepted by everyone
 else doing computer-era data/database/metadata design -- which has been
 proven in practice to be a really good principle. It's not a controversial
 principle.  At all.  Anywhere except among library catalogers, apparently.

 Jonathan

 On 4/25/2011 12:12 PM, James Weinheimer wrote:

 On 04/25/2011 05:56 PM, Jonathan Rochkind wrote:
 snip

 If you maintain the preferred display form as your _identifier_, then
 whenever the preferred display form changes, all those links will need to be
 changed.

 This is why contemporary computer-era identifier practice does NOT use
 preferred display form as an identifier. Because preferred display forms
 change, but identifiers ought not to. The identifier should be a
 _persistent_ link into your database for the identified record.

 /snip

 So long as the link from your database links unambiguously to the resource
 you want to link to, that is all that matters. There are different ways of
 allowing that. This function is most efficiently handled by the database you
 are linking into, instead of the single database expecting everybody in the
 world to change their own databases to add their URIs. For example, I could
 add a link for the NAF form of Leo Tolstoy to dbpedia to interoperate with
 it. If they had a special search for exact NAF form, like in the VIAF, it
 would definitely be unambiguous.

 My point is: this is something that is achievable. Probably through a
 relatively simple API, it could be implemented in every catalog pretty
 easily. There is just no hope that each catalog will add URIs within any
 reasonable amount of time.

 Certainly, if we were creating things from scratch, we could redo
 everything that would be better for us (there is no doubt in my mind that
 future information specialists/catalogers 80 years from now will be
 complaining about whatever we make), but you must play the cards you are
 dealt and be creative with what you have.  Perhaps it wouldn't be perfect,
 or maybe it would, I don't know, but in any case, it would be vastly better
 than what we have now and people could start discovering and using our
 records in new ways.






Re: [RDA-L] Linked files

2011-04-25 Thread Stephen Hearn
Actually it doesn't remain the same. The current rules say that
identities can and should move on and off of an undifferentiated
personal name authority (UndiffPNA). When an UndiffPNA is reduced to
representing a single identity again, it is recoded as unique
(UniqPNA), until another person with the same name gets added to it and
it becomes an UndiffPNA again, and so on--all under the same LCCN. So,
over time, the rules will require that a single LCNAF authority record
represent a string of unique persons:

UniqPNA    Smith, John (1)
             Smith, John (2) appears, and cannot be given a unique heading
UndiffPNA  Smith, John (1) and Smith, John (2)
             Smith, John (1) acquires a distinguishing bit of data and
             is given a separate, new record.
UniqPNA    Smith, John (2)
             Smith, John (3) appears, and cannot be given a unique heading
UndiffPNA  Smith, John (2) and Smith, John (3)
             Smith, John (2) acquires a distinguishing bit of data and
             is given a separate, new record.
UniqPNA    Smith, John (3)
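
The sequence above can be traced with a small simulation. This is only a sketch: the record class, the event format, and the LCCN value are invented for illustration, and the only detail taken from the rules is that 008/32 is coded a for a unique record and b for an undifferentiated one.

```python
from dataclasses import dataclass, field


@dataclass
class NameAuthority:
    lccn: str                      # hypothetical identifier value
    heading: str
    persons: list = field(default_factory=list)

    @property
    def code_008_32(self):
        # 'a' = unique (differentiated), 'b' = undifferentiated
        return "a" if len(self.persons) == 1 else "b"


def trace(record, events):
    """Apply add/remove events; log the coding and referents after each."""
    history = []
    for action, person in events:
        if action == "add":
            record.persons.append(person)
        else:  # "remove": the person gets a separate, new record
            record.persons.remove(person)
        history.append((record.code_008_32, tuple(record.persons)))
    return history


rec = NameAuthority("n 00000000", "Smith, John", ["Smith, John (1)"])
history = trace(rec, [
    ("add", "Smith, John (2)"),
    ("remove", "Smith, John (1)"),
    ("add", "Smith, John (3)"),
    ("remove", "Smith, John (2)"),
])
```

The LCCN never changes, yet the record is coded unique at the start for person (1) and unique at the end for person (3), so the same identifier has pointed uniquely at different people over time.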

I agree with Jonathan that persons are slippery, UndiffPNAs are
pretty useless, and that they should never revert to UniqPNAs; but the
rules instruct us otherwise (specifically, LC's Descriptive Cataloging
Manual, Section Z1, 008/32, which NACO follows: "When an
undifferentiated personal name authority record is being revised to
delete all but one name, change value b to a.").

Stephen




On Mon, Apr 25, 2011 at 11:55 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 I'd interpret it differently, I'd say that an undifferentiated name
 authority always refers to the same thing -- a sort of fake person that
 isn't really a known person at all. But this remains the same, it's just the
 way  it is.



Re: [RDA-L] Linked files

2011-04-25 Thread Stephen Hearn
I've been trying to identify the linchpins in our documentation that
hold the sorry UndiffPNA practice together. One is the DCM instruction
cited earlier. Another is the revised NACO Heading Comparison Rules
which forbid identical 100s. All AACR2 says is that identical
headings should be used in bib records when heading forms can't be
distinguished. It does not require that a single authority record be
created for persons with undifferentiated headings, and as John and
Diane point out, there's no need to do so. If differentiation is
managed elsewhere, the 100s (or more precisely, the encoded heading
texts) could be identical. The heading comparison rules could take
into account additional data not meant for display in the heading
text, like a difference between LCCNs.

There's more about managing undifferentiated names in RDA than there
ever was in AACR2. RDA instructs that the Undiff indicator must be
used when the core elements are not sufficient to distinguish two
names (e.g., RDA 8.3); but there may be room to argue that multiple
PNAs could carry the Undiff indicator to acknowledge that their 100s
are undifferentiated, without requiring that all the persons who share
an undifferentiated heading also share a single authority record.
Maybe that could be done with an LC Policy Statement. The point of the
008/32=b code would be to warn systems not to do automatic matching on
certain records' heading text strings, which is the practical value it
has now. VIAF and other smart systems avoid matches involving
UndiffPNAs. However, if the relationship between an authority record's
ID and the person it represents were fixed and consistent, then
systems using the LCCN identifier (or some synonymous ID) to match
between bib headings and authority records could safely link to an
UndiffPNA and thereby inherit any later changes to that person's
heading or authority record.
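
A sketch of the matching behavior suggested here, assuming each person has a separate authority record with a fixed ID and that records sharing an undifferentiated heading each carry the 008/32=b flag. The function name, the dictionary layout, and the sample IDs are hypothetical.

```python
def link_heading(bib_heading, bib_auth_id, authorities):
    """Return the authority ID a bib heading should link to, or None.

    authorities: dict of id -> {"heading": str, "undiff": bool},
    where undiff=True stands in for 008/32 = b.
    """
    # Prefer an explicit identifier carried with the bib heading.
    if bib_auth_id is not None and bib_auth_id in authorities:
        return bib_auth_id
    # Fall back to text matching, but never onto a flagged record,
    # and never when the heading text is shared by several records.
    candidates = [
        auth_id for auth_id, rec in authorities.items()
        if rec["heading"] == bib_heading and not rec["undiff"]
    ]
    return candidates[0] if len(candidates) == 1 else None


authorities = {
    "id-1": {"heading": "Smith, John", "undiff": True},
    "id-2": {"heading": "Smith, John", "undiff": True},
    "id-3": {"heading": "Smith, Jane, 1950-", "undiff": False},
}
by_text = link_heading("Smith, John", None, authorities)
by_id = link_heading("Smith, John", "id-2", authorities)
unique_text = link_heading("Smith, Jane, 1950-", None, authorities)
```

Text matching refuses to guess between the two John Smiths, while a link by ID lands on exactly one of them and would follow that record through any later heading change.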

My guess is there are other rules that I haven't spotted yet, but
these three--DCM Z1 008/32, NACO Heading Comparison, and
RDA/LCPS--would need to change to correct the current practice.

Stephen

On Mon, Apr 25, 2011 at 1:23 PM, Diane I. Hillmann d...@cornell.edu wrote:
  Just to point out a few things here:

 If we were not making the text of the name serve double duty, we would be
 providing an identifier to every newly established name, and the description
 would provide information on where that name appeared (a title page, for
 instance), which would thereby provide a distinction between it and another
 authority description based on a different resource, where the name that
 displayed was the same.  In this new world, there would NEVER be a need for
 an UndiffPNA (thanks, Stephen, for the unpronounceable shortened name for
 this!).  If we ultimately discovered that this John Smith really was the
 same as THAT John Smith, we could associate them, BUT NOT HAVE TO CHANGE THE
 IDENTIFIERS.

 Consider the amount of sheer human grunt work we could avoid (not to mention
 the actual bucks), with absolutely no loss of quality control, by moving
 on from our traditional practices.  And why can't we convince people that
 this is better, cheaper, and much more sensible?  ARRRGGH.

 Diane

 On 4/25/11 1:42 PM, Stephen Hearn wrote:

 Actually it doesn't remain the same. The current rules say that
 identities can and should move on and off of an undifferentiated
 personal name authority (UndiffPNA). When an UndiffPNA is reduced to
 representing a single identity again, it is recoded as unique
 (UniqPNA), until another person with the same name gets added to it and
 it becomes an UndiffPNA again, and so on--all under the same LCCN. So,
 over time, the rules will require that a single LCNAF authority record
 represent a string of unique persons:

 UniqPNA Smith, John (1)
             Smith, John (2) appears, and cannot be given a unique heading
 UndiffPNA Smith, John (1) and Smith, John (2)
             Smith, John (1) acquires a distinguishing bit of data and
 is given a separate, new record.
 UniqPNA Smith, John (2)
             Smith, John (3) appears, and cannot be given a unique heading
 UndiffPNA Smith, John (2) and Smith, John (3)
             Smith, John (2) acquires a distinguishing bit of data and
 is given a separate, new record.
 UniqPNA  Smith, John (3)

 I agree with Jonathan that persons are slippery, UndiffPNAs are
 pretty useless, and that they should never revert to UniqPNAs; but the
 rules instruct us otherwise (specifically, LC's Descriptive Cataloging
 Manual, Section Z1, 008/32, which NACO follows: When an
 undifferentiated personal name authority record is being revised to
 delete all but one name, change value b to a. ).

 Stephen




 On Mon, Apr 25, 2011 at 11:55 AM, Jonathan Rochkind rochk...@jhu.edu
  wrote:

 I'd interpret it differently, I'd say that an undifferentiated name
 authority always refers to the same thing -- a sort of fake person that
 isn't really a known person at all. But this remains the same, it's just
 the
 way

Re: [RDA-L] Linked files

2011-04-25 Thread Stephen Hearn
But submit to whom? I think PCC oversaw the last revision of the NACO
Heading Comparison rules (formerly NACO normalization). LC manages the
DCM, which is closer to being an internal document than the LCRIs have
been, and less open to community input.  (DCM's instructions on using
pairs of 670s in the 670/General section would also need changing.)
Does JSC need to rule on what RDA means to say about undifferentiated
names before LC can make a policy statement about them?

Regardless, this goes nowhere without LC and changes to the DCM. I've
tried to make the case with leaders there and have met with a counter
that presumes that undifferentiated authorities are used when the
persons, not their headings, can't be distinguished, which really
misunderstands how UndiffPNAs are structured and used. The DCM 670
instructions already make it clear that the persons on an UndiffPNA
are being distinguished from one another through the device of paired
670 fields. It's very frustrating.

Stephen

On Mon, Apr 25, 2011 at 2:57 PM, Mary Mastraccio ma...@marcive.com wrote:
 My guess is there are other rules that I haven't spotted yet,
 but these three--DCM Z1 008/32, NACO Heading Comparison, and
 RDA/LCPS--would need to change to correct the current practice.

 The desire to have the UndiffPNA practice/records changed has been expressed
 repeatedly over the years. It seems to me that someone needs to step forward
 to officially submit such a proposal. Can PCC, or similar group, be persuaded
 to promote this change?

 Mary L. Mastraccio, MLS
 Cataloging  Authorities Librarian
 MARCIVE, Inc.
 San Antonio Texas 78265
 1-800-531-7678
 ma...@marcive.com
 www.marcive.com






Re: [RDA-L] RDA and MARC (was Linked data)

2011-02-11 Thread Stephen Hearn
Yes and no. On the one hand, music catalogers have been much more
diligent about using uniform titles for works. On the other hand, in
terms of recorded performances, all of what they deal with could be
considered expressions. As expressions, their access points are not
differentiated, e.g., by performer, when multiple works appear
together in a recording, and the association between an individual
performer and a particular expression often requires human
interpretation.

What's needed is a tiered or multi-record structure. We could embed
lots of micro-records in a record for a recording in order to
associate the significant entities and attributes with each of the
objects it contains (imagine a record for a recording, only much
longer and more redundant with other records). We could divide the
work among multiple records, putting one set of data on a FRBR work
record, another on an expression record, and linking both of those
directly or indirectly to a manifestation record for a compilation.
But we still haven't figured out how we'll build such records or such
links. The discussion at Midwinter MARBI about enabling a label for
work and expression records without real consideration of how to
encode their contents was indicative of how far we have to go on this.
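
One way to picture the second option, dividing the data among linked records, is the sketch below. The record types, field names, IDs, and sample data are all assumptions made for illustration; they do not correspond to any MARC or RDA encoding.

```python
from dataclasses import dataclass, field


@dataclass
class Work:
    work_id: str
    preferred_title: str
    creator: str


@dataclass
class Expression:
    expr_id: str
    work_id: str                 # link up to the work record
    performers: list = field(default_factory=list)


@dataclass
class Manifestation:
    manif_id: str
    title: str
    expression_ids: list = field(default_factory=list)  # links to contents


def performers_by_work(manifestation, expressions, works):
    """Follow the links to pair each work on a recording with its performers."""
    result = {}
    for expr_id in manifestation.expression_ids:
        expr = expressions[expr_id]
        result[works[expr.work_id].preferred_title] = expr.performers
    return result


# Hypothetical compilation with two works and different performers.
works = {
    "w1": Work("w1", "Quartets, strings, no. 1", "Composer A"),
    "w2": Work("w2", "Sonatas, piano, no. 2", "Composer B"),
}
expressions = {
    "e1": Expression("e1", "w1", ["Quartet X"]),
    "e2": Expression("e2", "w2", ["Pianist Y"]),
}
disc = Manifestation("m1", "Chamber music compilation", ["e1", "e2"])
pairs = performers_by_work(disc, expressions, works)
```

With the performer attached to the expression record, the association between a performer and a particular piece no longer depends on a human reading of the whole record.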

Stephen

On Fri, Feb 11, 2011 at 11:52 AM, Karen Coyle li...@kcoyle.net wrote:
 Quoting Weinheimer Jim j.weinhei...@aur.edu:


 But I wonder if what you point out is a genuine problem, especially in an
 RDA/FRBR universe. The user tasks are to find, identify, yadda -- works,
 expressions, manifestations, and *items*. Not sub-items.


 Jim, I think you're at the wrong end of the WEMI continuum -- what this
 record lacks is better access to *Works* contained in the
 manifestation/item. Items are the physical items, the thing you have in
 hand. The added entries in this record represent persons and works.

 The fact that music cataloging has used constructed titles for all works
 (Quartets, strings, no. 1) puts them way ahead of other cataloging
 communities, in particular book cataloging. In music, you almost always have
 a Work-level representation for every work in the manifestation. In book
 cataloging we not only lack controlled names for most works (e.g. no uniform
 titles), but there is less emphasis on adding an entry for every work in the
 manifestation. (And there's great confusion as to what multiple works mean
 for expressions.)

 It's fairly common to find a book record that *should* have a uniform title
 but does not.

  http://lccn.loc.gov/46003912 (and to be clear, LC is relatively diligent
 about UTs compared to many other libraries)

 In addition, it seems to be unclear whether the titles in added entries in
 book records for other books consist of uniform titles, or what to do in
 cases when the library has decided not to display or shelve books under
 uniform titles that its users may not recognize (Voina i Mir).  You cannot
 assume that the lack of 240 means that the 245 $a$b entry is *also* the UT,
 and you cannot assume that a 7xx$t has been created in proper UT form.

 I think that defining, grasping, and coding of Work titles (as they are
 called in FRBR and RDA) is going to be a huge challenge... but mainly in
 those areas where we haven't done a good job of this in the past. I'm
 beginning to think that music cataloging could lead the way because there is
 greater clarity about multiple works in a manifestation than we have in book
 cataloging. I'd be interested to hear if there are other cataloging
 subsets that have handled this well -- law? maps? serials?

 kc



 --
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 ph: 1-510-540-7596
 m: 1-510-435-8234
 skype: kcoylenet






Re: [RDA-L] Linked data

2011-01-19 Thread Stephen Hearn
It's a good article, but also a bit disingenuous. Much more was being
asked for than just a displayable title, as the author's
dissatisfaction with the initial results makes clear. It would help to
have the full list of expectations stated up front, to make clear that
what is being asked for is itself fairly complex:

A displayable title
A recognizable title (Selections is not enough, but neither is
Symphony no. 3)
A title which represents the full contents of the object
A title which makes clear the semantic relationships of its several elements

That's a taller order than what the wording of the article suggests.
To demand beyond that that any algorithm proposed for extracting such
a complex piece of data from MARC should be reliable across the vast
sea of catalog records with all their acknowledged variability is just
silly.

MARC is complex, the cataloging rules are complex, and the objects
they seek to represent are, in many, many cases, complex. Simple
approaches to any of these won't work, unless the bar for what's
expected is set very low.
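
To make the gap between "a displayable title" and the fuller list concrete, here is a deliberately naive extractor. It assumes a 245 field is available as a list of (subfield code, value) pairs with ISBD punctuation left in place; the sample fields are invented, and the function is a low-bar sketch, not a recommended algorithm.

```python
def naive_title(subfields):
    """Return $a of a 245 with trailing ISBD punctuation stripped."""
    for code, value in subfields:
        if code == "a":
            # Also strips a final period, which can damage abbreviations.
            return value.rstrip(" /:;=.")
    return ""


# Invented examples, for illustration only.
simple = [("a", "A plain title /"), ("c", "A. Author.")]
no_collective = [
    ("a", "First work ;"),
    ("b", "Second work /"),
    ("c", "A. Author."),
]

simple_title = naive_title(simple)
partial_title = naive_title(no_collective)
```

The simple case yields a displayable title, but for the record with no collective title the result names only the first work, so it fails the "represents the full contents" expectation even though the code did exactly what was asked.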

Stephen

On Wed, Jan 19, 2011 at 12:03 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
 Again, as someone who knows cataloging rules, if there's an algorithm you
 can give me that will let me extract the individual elements (actual
 transcribed title vs analytical titles vs parallel titles vs statement of
 responsibility) reliably from correct AACR2 MARC, please let me know what it
 is.

 I am fairly certain there is no such algorithm that is reliable.

 I guess you could say that there's no reason to _expect_ that you should be
 able to get those elements out of a data record.  But most developers,
 library or not, will consider bibliographic data that you can't reliably
 extract the title of the item (a pretty basic attribute, just about the most
 basic attribute there is) from to be pretty low-value data.  They won't
 change their opinion if you show them the record serialized in MarcXML
 instead of ISO Marc21.

 All that you get by being an expert in the data is the knowledge that you
 _can't_ really reliably algorithmically extract the transcribed title alone
 from any arbitrary 245 of Marc/AACR2.   It'll work for the basic cases, but
 once you start putting in parallel titles, analytics, and parallel titles of
 analytical titles, it's a big mess -- and such complicated cases (which are
 rare in general but common in some domains like music records) are also the
 ones where the cataloger is most likely to have gotten the punctuation not
 EXACTLY right, making it even more hopeless, even if the programmer did want
 to write an incredibly complicated algorithm that tried to take into account
 the combination of ISBD punctuation with marc subfields.

 Yes, many of these issues have been known from the beginning and dealt with
 in various ways. That doesn't make the data easily useable by developers,
 whether you put in MarcXML or not. Those various ways, if we're talking
 about software trying to extract elements from bib records,  are expensive
 (in developer time) and fragile (they still won't work all the time) hacks.

 On 1/19/2011 12:52 PM, Weinheimer Jim wrote:

 Jonathan Rochkind wrote:

 Concerning:   One example of this can be found reported in this
 article: http://journal.code4lib.org/articles/3832
 snip
 Okay, what would someone who knows library metadata do to get a
 displayable title out of records in an arbitrary corpus of MARC data?
 There's an easy answer that only those who know library metadata
 (apparently unlike people like Thomale or me who have been working with
 it for years) can provide?  I have my doubts.
 /snip

 I agree that this is an excellent article that everyone should read, but I
 wrote a comment myself there (no. 7) discussing how this article illustrates
 how important it is to know cataloging rules and/or to work closely with
 experienced catalogers when building something like this. It also shows how
 many programmers concentrate on certain parts of a record and tend to ignore
 the overall view, while catalogers concentrate on whole records.

 In this case, the parsing is *always* done manually by the cataloger, who
 is directed to make title added entries, along with uniform titles,
 including the authors--that is, so long as the cataloger is competent and
 following the rules. So, it is always a mistake to concentrate only on a
 single field since a record must be considered in its entirety.  It
 would be unrealistic for systems people to know these intricacies, but it
 just shows how important it is that they work closely with catalogers.

 Therefore, it's not *necessarily* arbitrary. Many of these issues have
 been known since the very beginnings and have been dealt with in various
 ways.

 James L. Weinheimer  j.weinhei...@aur.edu
 Director of Library and Information Services
 The American University of Rome
 Rome, Italy
 First Thus: http://catalogingmatters.blogspot.com/





Re: [RDA-L] RDA Questions

2010-10-08 Thread Stephen Hearn
No set of rules can ultimately determine the form of a name heading, because
that will inevitably depend on the information available to the person
creating the authority record. Two people with different information can
create different headings for the same person  following the same rule set.
Authority files are the only way to determine a shared form of name heading.

That being the case, I'm more concerned about not having authority control
split between an RDA file and an AACR2 file. We'll be better served by a
convention which treats established AACR2 headings as RDA compatible and
treats RDA headings as AACR2 compatible, thereby enabling a single
authority record and, more importantly, a single authority file to be used
for both AACR2 and RDA bib records. We already have much more aggressive
conventions when it comes to updating authorities than we used to, so we
could expect a steady migration of headings and authority records in the
direction of RDA. The advantages of authorizing against a single file rather
than starting up a second parallel file during this transition would be
considerable.

I can understand the reasons for using 7XX authority links during the trial
period, since it's less intrusive on the relationship of AACR2 established
headings and bibs; but it's not going to be the best transition strategy
when real implementation begins.

Stephen

On Fri, Oct 8, 2010 at 11:05 AM, Mike Tribby
mike.tri...@quality-books.com wrote:

 If we are using authority records from LC, anything other than following
 their lead would not make sense.

 Again, not every cataloging agency follows LC's lead. This kind of option
 in the RDA rules serves no purpose that I can see. And haven't we heard
 plenty in the recent past about not relying on LC for everything? If the
 rules allow deviations in practice, deviations are sure to occur. Moreover,
 LC does _not_ prepare every authority record in the shared authority file.


 Mike Tribby
 Senior Cataloger
 Quality Books Inc.
 The Best of America's Independent Presses

 mailto:mike.tri...@quality-books.com


 -Original Message-
 From: Resource Description and Access / Resource Description and Access
 [mailto:rd...@listserv.lac-bac.gc.ca] On Behalf Of Brenda Parris Parker
 Sent: Friday, October 08, 2010 10:59 AM
 To: RDA-L@LISTSERV.LAC-BAC.GC.CA
 Subject: Re: [RDA-L] RDA Questions

 If we are using authority records from LC, anything other than following
 their lead would not make sense.

 Brenda

 Resource Description and Access / Resource Description and Access
RDA-L@LISTSERV.LAC-BAC.GC.CA writes:
 Which is why I suggest that we follow LC's lead in this matter.
 
 Robert L. Maxwell
 Head, Special Collections and Formats Catalog Dept.
 6728 Harold B. Lee Library
 Brigham Young University
 Provo, UT 84602
 (801)422-5568
 
 -Original Message-
 From: Resource Description and Access / Resource Description and Access
 [mailto:rd...@listserv.lac-bac.gc.ca] On Behalf Of Mike Tribby
 Sent: Friday, October 08, 2010 9:42 AM
 To: RDA-L@LISTSERV.LAC-BAC.GC.CA
 Subject: Re: [RDA-L] RDA Questions
 
 How does open-ended instruction on just how to note birth and death
 dates achieve the interchangeability and all-important granularity that
 RDA is purported to advance? If I record Lee Perry as Perry, Lee,
 1936- and another cataloging agency records him as Perry, Lee, b.
 1936 how does that achieve anything other than confusion?
 
 If confusion is our goal, I'm all for it at this point. But somehow I
 doubt that's what we're pursuing here.
 



 Brenda Parris Parker
 Technical Services/Reference Librarian
 Brewer Library
 Calhoun Community College
 Decatur, AL





-- 
Stephen Hearn, Metadata Strategist
Technical Services, University Libraries
University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN 55455
Ph: 612-625-2328
Fx: 612-625-3428


Re: [RDA-L] Interesting conversations about RDA and FRBR ...

2010-09-15 Thread Stephen Hearn
On the point about reinventing--it's worth noting that Classical Archives
succeeds by being more rigorous, more uniform, and more extensive in its use
of what librarians would call uniform title data--form terms,
instrumentation, etc.--than librarians are. So maybe it doesn't matter
whether the user community understands uniform titles, as long as uniform
title data is presented to them in a useful way. Given that catalog records
have made this harder by burying many uniform titles in 245 fields, not
providing a uniform set of identifying information, and treating uniform
titles generally as optional data, the struggles systems have had with
providing inclusive, well-ordered index displays are understandable. RDA's
extensions of the MARC21 authority format could help with this, if they're
uniformly applied. A big if ...

On the point about generating descriptions of higher-level objects from
attributes common to sets of manifestation records--often the higher-level
object whose attributes we'd like to capture is only partially represented
and deeply enmeshed in attribute statements about the manifestation.
Manifestations are far from being transparent containers of works and
expressions. Deriving work and expression descriptions from them is a good
idea, but it will be a lot harder than it looks. More broadly, though,
letting the recognition that some sets of objects contain a common
work/expression drive the choice of which work/expressions to establish
makes a lot of sense.

Stephen

On Wed, Sep 15, 2010 at 10:05 AM, Mike Tribby mike.tri...@quality-books.com
 wrote:

 How would the OPAC know to display only English-language books if you
 don't tell it beforehand, whether FRBR catalog or otherwise?

 If the search one initiated were on title spelled in English or on the
 title (spelled in English) in a keyword search? Perhaps the title in English
 as a keyword search would bring up the Danish version, too, though.




 Mike Tribby
 Senior Cataloger
 Quality Books Inc.
 The Best of America's Independent Presses

 mailto:mike.tri...@quality-books.com


 -Original Message-
 From: Resource Description and Access / Resource Description and Access
 [mailto:rd...@listserv.lac-bac.gc.ca] On Behalf Of Mark Ehlert
 Sent: Wednesday, September 15, 2010 10:02 AM
 To: RDA-L@LISTSERV.LAC-BAC.GC.CA
 Subject: Re: [RDA-L] Interesting conversations about RDA and FRBR ...

 J. McRee Elrod m...@slc.bc.ca wrote:
  Won't FRBR result in even more unwanted item records being displayed?
  Will one be able to turn off FRBR display in OPACs?  I don't *need* to
  see the record for the Danish original of the murder mystery I want to
  read!

 How would the OPAC know to display only English-language books if you don't
 tell it beforehand, whether FRBR catalog or otherwise?

 --
 Mark K. Ehlert
 Coordinator, Bibliographic Technical Services (BATS) Unit
 Minitex, University of Minnesota
 15 Andersen Library
 222 21st Avenue South
 Minneapolis, MN 55455-0439
 Phone: 612-624-0805
 http://www.minitex.umn.edu/





-- 
Stephen Hearn, Metadata Strategist
Technical Services, University Libraries
University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN 55455
Ph: 612-625-2328
Fx: 612-625-3428


Re: [RDA-L] Question about RDA relationships (App. J)

2010-03-05 Thread Stephen Hearn
The web statements would presumably be derived from a large set of 
records, not from an individual record. The bib record for Sturges'  
Magnificent 7 if constructed the same way as the Kurosawa record would 
inferentially provide the data needed to create the statement 
establishing his connection. The logic would break down, though, when 
the bib record describes multiple objects. A MARC bib record describing 
a 3-DVD set of the three King Kong movies would have a hard time 
associating individual directors with their movies in a way that 
machines could process. As Adam observed, an authority record focused on 
a particular work or expression in a way that many bib records are not 
would provide a better basis for establishing these kinds of relationships.


Stephen

Adam L. Schiff wrote:

In today's record, we would code this somewhat like:

100 $a Kurosawa, Akira $e director
245 $a Shichinin no samurai
246 $a Seven Samurai
500 $a Adapted as The Magnificent 7
730 $a Magnificent 7


Well I would change your 100 to a 700 to make this more like what we 
do in a bibliographic record.  But there's no reason all of this could 
not be in an authority record instead:


130  $a Shichinin no samurai
430  $a Seven samurai
500  $a Kurosawa, Akira $e director
530  $w r $i adapted as $a Magnificent 7
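
As a rough illustration of why a work-focused authority record is easier for a machine to process, the fields above can be read as a small set of relationship statements. This is only a sketch: the list-of-pairs record layout and the statements() helper are assumptions made for the example, not part of MARC or of any library system, and the "variant title" label is likewise invented here.

```python
# Assumed toy layout: a record is a list of (tag, {subfield code: value}) pairs.
# The field values are the ones from the authority record example above.
authority_record = [
    ("130", {"a": "Shichinin no samurai"}),
    ("430", {"a": "Seven samurai"}),
    ("500", {"a": "Kurosawa, Akira", "e": "director"}),
    ("530", {"w": "r", "i": "adapted as", "a": "Magnificent 7"}),
]


def statements(record):
    """Return (work, relationship, target) triples for one work record."""
    # The 130 field names the work that every other field is about.
    work = next(sf["a"] for tag, sf in record if tag == "130")
    triples = []
    for tag, sf in record:
        if tag == "430":
            triples.append((work, "variant title", sf["a"]))
        elif tag == "500" and "e" in sf:
            # $e carries the relationship term for a related person.
            triples.append((work, sf["e"], sf["a"]))
        elif tag == "530" and "i" in sf:
            # $i carries the relationship term for a related work.
            triples.append((work, sf["i"], sf["a"]))
    return triples


for triple in statements(authority_record):
    print(triple)
```

Because the record describes exactly one work, every statement has an unambiguous subject. A bib record for a three-film set would not give this sketch a single work to attach each director to.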


**
* Adam L. Schiff
* Principal Cataloger
* University of Washington Libraries
* Box 352900
* Seattle, WA 98195-2900
* (206) 543-8409
* (206) 685-8782 fax
* asch...@u.washington.edu
**


--
Stephen Hearn
Authority Control Coordinator/Head, Database Management Section
Technical Services, University Libraries, University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN  55455
Ph: 612-625-2328 / Fax: 612-625-3428


Re: [RDA-L] Systems v Cataloging was: RDA and granularity

2010-02-02 Thread Stephen Hearn
I appreciate John Myers's example. This conversation has itself been 
lacking in granularity. But one of the things it points out is that 
we're not necessarily talking about MARC. The spotty use of uniform 
titles for translations is a result of cataloging policy decisions, not 
a limit imposed by MARC. On the other hand, there are limits in MARC. At 
MARBI there has been talk of the need for a level of demarcation below 
the MARC subfield. There's often talk of the exhaustion of the definable 
value set for some MARC fixed fields and some fields' subfield codes. 
The gerrymandering of MARC to express RDA elements has been strange to 
watch, and it's clear that many would be happier doing this development 
in a more flexible coding environment with a wider support base.


My understanding of Karen Coyle's original point was that  the MARC data 
format is not congruent with the rules used to formulate data for it. 
AACR2 covers some of what goes into MARC; subject rules cover some more; 
DCM covers some more; and rules guidance for the correct coding and 
interpretation of MARC fixed field data can be hard to find. RDA may 
define a more granular and extensive element set and standard values for 
some elements, but since it's format-independent, there will still be 
need for adjunct documentation to explain how those elements are mapped 
into whatever format is used to express them. Add to that the ongoing 
division of cataloging expertise into different domains, and a certain 
amount of complexity and incongruity in our cataloging rule sets and 
their relationship to format seems inevitable.


In any case, given how complex our rules and format environments are, 
statements about the failure of  MARC or AACR2 or RDA are very 
hard to interpret. More specificity about what in particular is failing 
will help us understand and analyze the problems better and come up with 
better solutions. Granularity and complexity--gotta love 'em if you're 
in this game.


Stephen

Myers, John F. wrote:

At the risk of showing my ignorance on the topic, it's not so much
getting info from MARC -- as Daniel's quip indicates there are a few
thousand ILSes doing fine with it.  The issue is making the information
actionable by other machines.  I might add that not all of the
shortcomings are MARC's fault but also of the cataloging codes that are
used to populate a MARC record. 


As an example, consider the FRBR expression entity.  A significant
aspect in textual works between expressions is translation.  We do have
a 240 field to record that, but since the application of the rules for
Uniform titles were left to the discretion of the cataloging agency,
indication of an expression for a translation can also appear in a
translation note recorded in tag 500, sometimes in conjunction with the
240 but oftentimes alone (as several thousand records in my catalog will
attest).  Now, if this data were consistently recorded in the 240 (both
with respect to the format and to the application of use of the 240),
then machine FRBR-ization of these records for translations would be
relatively simple.  


In the present circumstances however, with the existing mix of
treatments, it is much, much more complicated.  Having attempted to
duplicate solo translation notes into corresponding 240 tags, I have
learned there are no simple solutions.  I have to manually copy,
interpret, and edit the data due to the free-text nature of the note.  I
could employ a few programming tricks to simplify some of the tasks, but
I would still need to review so much of the resulting edits that my
current manual method is no less efficient than a (partially)
programmed approach.  My manual approach is on the edge of practicality
(although I'm not sure which side), when I start with a file of about 4k
culled from a database of about 300k.  Any larger sets would likely be
unfeasible.
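
The kind of partial programming described above might look like the following sketch, which only flags records for human review and does not attempt to build the 240. Everything here is an assumption for illustration: the records are toy (tag, text) lists with invented titles, and the note pattern covers only two wordings, whereas real 500 notes are free text and vary far more.

```python
import re

# Hypothetical pattern: matches notes such as "Translation of: ..." or
# "Translated from the German". Real notes will need many more variants.
TRANSLATION_NOTE = re.compile(
    r"\btranslat(?:ion|ed)\s+(?:of|from)\b", re.IGNORECASE
)


def needs_review(record):
    """True if a record has a translation-like 500 note but no 240 field."""
    has_240 = any(tag == "240" for tag, _ in record)
    has_note = any(
        tag == "500" and TRANSLATION_NOTE.search(text) is not None
        for tag, text in record
    )
    return has_note and not has_240


# Invented sample records, each a list of (tag, field text) pairs.
records = [
    [("245", "Example title"), ("500", "Translation of: Beispieltitel.")],
    [("240", "Beispieltitel. English"), ("245", "Example title"),
     ("500", "Translation of: Beispieltitel.")],
    [("245", "A history of translation"), ("500", "Includes index.")],
]

flagged = [record for record in records if needs_review(record)]
print(len(flagged), "record(s) flagged for manual review")
```

The sketch narrows the file to candidates, but the step of interpreting each note and formulating the uniform title remains manual, which is the cost described above.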

Doubtless others will have more cogent examples but I hope this gives a
hint as to the problem.

John F. Myers, Catalog Librarian
Schaffer Library, Union College
807 Union St.
Schenectady NY 12308

518-388-6623
mye...@union.edu


-Original Message-
From: Resource Description and Access / Resource Description and Access
[mailto:rd...@listserv.lac-bac.gc.ca] On Behalf Of Frances, Melodie

Can anyone explain WHY it's so hard to get info from MARC? 
  


--
Stephen Hearn
Authority Control Coordinator/Head, Database Management Section
Technical Services, University Libraries, University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN  55455
Ph: 612-625-2328 / Fax: 612-625-3428


Re: [RDA-L] RDA and granularity

2010-01-28 Thread Stephen Hearn
I thought that $n and $p are in 245 because they're defined as uniform 
title elements, and 245 is unfortunately considered to be both the 
descriptive title and the uniform title when coincidence allows. One 
value of $n and $p subfielding in uniform titles is that you can 
authorize headings hierarchically and provide references appropriate to 
different levels. Without $n and $p, the 245 couldn't contain the 
authorized form of some uniform titles. Of course, this has all proved 
very problematic. The transcribed title and the uniform title should be 
separate data structures. In the former, where apparently the main use 
of subfielding would be to control machine-supplied punctuation and 
display labeling to data which is determined by transcription rules, the 
mix of needed subfields would be different from that needed for uniform 
titles, which for example have no parallel title component needing its 
own subfield.
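
To make the point about machine-supplied punctuation concrete, here is a minimal sketch that builds a display string from 245 subfield codes alone. The display_title() helper and the sample values are invented for illustration, and the punctuation table is a simplified reading of common ISBD practice rather than a complete implementation.

```python
def display_title(subfields):
    """Join 245 subfields, supplying punctuation from the subfield codes.

    subfields is a list of (code, value) pairs whose values carry no
    trailing ISBD punctuation of their own.
    """
    parts = []
    previous = None
    for code, value in subfields:
        if previous is None:
            prefix = ""
        elif code == "n":
            prefix = ". "          # number of part/section
        elif code == "p":
            # name of part: comma after a $n, otherwise a full stop
            prefix = ", " if previous == "n" else ". "
        elif code == "b":
            # Assumption: $b is other title information. A parallel title
            # would need " = ", which the subfield code cannot tell us.
            prefix = " : "
        elif code == "c":
            prefix = " / "         # statement of responsibility
        else:
            prefix = " "
        parts.append(prefix + value)
        previous = code
    return "".join(parts)


print(display_title([("a", "Faust"), ("n", "Part one"), ("c", "Goethe")]))
```

The $b branch shows the limit: one subfield code stands for several ISBD elements, so the code alone cannot choose the right punctuation in every case.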


Stephen

Kevin M. Randall wrote:

John Attig wrote:

  

Subfields $n and $p are an example of this.  I
would hate to lose these distinctions; specifically, they relate to
ISBD punctuation specifications and -- as noted in MARC 2010-DP01 --
this content designation does allow ISBD punctuation to be supplied
for display rather than encoded in the data (although the 245 field
cannot support all the possible ISBD punctuation conventions).



John, thanks a lot (NOT!) for explaining this.  I was getting all ready to
push for making $n and $p obsolete.  The thing is, apart from their
usefulness in identifying the ISBD elements, I can't think of any use for
them in our systems.  If anything, they are an impediment!  It is very
common for index and OPAC displays to suppress $n and $p data.  Even OCLC
does it!!  Does anybody know what good the subfield codes do us?

Kevin M. Randall
Principal Serials Cataloger
Bibliographic Services Dept.
Northwestern University Library
1970 Campus Drive
Evanston, IL  60208-2300
email: k...@northwestern.edu
phone: (847) 491-2939
fax:   (847) 491-4345
  


--
Stephen Hearn
Authority Control Coordinator/Head, Database Management Section
Technical Services, University Libraries, University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN  55455
Ph: 612-625-2328 / Fax: 612-625-3428


Re: [RDA-L] (Online) qualifier for series

2009-07-22 Thread Stephen Hearn
In the case of multi-part monographs, LCNAF has cases of authorized 
access points for what I take to be manifestation-level entities, e.g.


Tolkien, J. R. R. (John Ronald Reuel), 1892-1973. Lord of the rings 
(Silver anniversary edition) [LCCN  n 42024986]


which is a controlled heading for a particular edition from a particular 
publisher. Will these be accommodated in RDA? Or will things like 
publisher, edition, and year of an edition's first publication be 
considered attributes of expression-level entities for multi-part 
monographs?


Stephen

Adam L. Schiff wrote:
RDA has both authorized access point for work and authorized access 
point for expression.  There are no rules at present for authorized 
access points for specific manifestations or items.


Adam

^^
Adam L. Schiff
Principal Cataloger
University of Washington Libraries
Box 352900
Seattle, WA 98195-2900
(206) 543-8409
(206) 685-8782 fax
asch...@u.washington.edu
http://faculty.washington.edu/~aschiff
~~

On Wed, 22 Jul 2009, Karen Coyle wrote:

RDA doesn't define a uniform title, but instead (well, I think of 
it as instead) has title of the work. I think this will be an 
improvement, in part because every Work should have a title, whereas 
uniform titles were the exception rather than the rule. Oftentimes 
the title of the work will be the same as the title proper, which 
is associated with the manifestation. There doesn't, however, seem to 
be a specific title for the expression. Maybe someone here could 
clarify this for us.


kc

Jonathan Rochkind wrote:

hal Cain wrote:


Just what is the uniform title intended to do here?  To serve as a
one-line identifier for what's being catalogued; to provide a linking
point for the work content; or to provide a linking point for the
expression embodied?



This is a really important point.  In my reading of our history, the 
uniform title has traditionally been intended to do _several_ 
things, things that sometimes work at cross-purposes. Many of these 
things haven't really been specified, so much as they are 
tradition -- and in the current environment, often applied 
mechanistically without thinking about intent.


We need to become clear on what uniform title is supposed to do -- 
and I believe, once we have that clarity, it will also be clear that 
uniform title alone can't do all the things it's been implicitly 
depended upon (or hoped for?) to do.  We need instead one mechanism 
for collocating works, another for collocating expressions, another 
to serve as user-presentable display label (supporting doing this in 
multiple languages!), another to say what language an expression is 
in, and another to do... whatever it is that music catalogers do 
with uniform title (there are probably half a dozen different things 
just in music cataloging practice, none of which I understand!)


Jonathan



Until we have that clear (and RDA discussions have failed to make that
clear to me -- perhaps on account of my inattention, but I can usually
follow clear exposition) we'll go on making ad-hoc and conflicting
decisions.

FWIW I don't think the application of FRBR categories provides us with
the tools to make the distinctions people are talking about here --
they're not subtle enough, at least not within the framework of the
MARC21 bibliographic format.  And the success will depend on the display
created, a matter which RDA chose not to address, but crucial to the
outcome.

Hal Cain
Dalton McCaughey Library
Parkville, Victoria, Australia
h...@dml.vic.edu.au







--
---
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234




--
Stephen Hearn
Authority Control Coordinator/Head, Database Management Section
Technical Services, University Libraries, University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN  55455
Ph: 612-625-2328 / Fax: 612-625-3428


[RDA-L] [Fwd: Re: [RDA-L] FW: [RDA-L] Slave to the title page?]

2009-01-02 Thread Stephen Hearn

Actually, I think there are more factors involved than just powerful
technology and limited imaginations. Consider organizational
structures--the relationships which national library CIP programs are
based on are not between an author and a cataloger, but between
publishing companies and a national library.  If every individual
website creator could voluntarily demand CIP cataloging, that would be a
major change to the CIP program, not just a new application of a
tried-and-true model. It would likely overwhelm LC's ability to uphold
its side of the bargain, since the allocation of limited human resources
is another factor that the powerful technology and limited
imaginations equation ignores.  One way around this is the distribution
of CIP creation to a host of other providers, as Mac suggests--but
does this really have the same value, given that these alternate CIP
sources presumably cannot supply an LCCN or other national library
record identifier for their data?

Maybe instead of an extension of the CIP program, we need to imagine
something new--a program that would distribute unique LCCNs without
cataloging to content providers or operations like Quality Books and
SLC. The LCCNs could then be embedded in the content providers'
productions with or without accompanying metadata (following certain
guidelines, of course). The LCCN identifier and the document itself, in
whatever form, would then become the kernel of information from which
various kinds of surrogate records could be developed and authorized at
different levels. That might actually be affordable at an organizational
level, at least until we run out of LCCNs.

Stephen

 Original Message 
Subject:Re: [RDA-L] FW: [RDA-L] Slave to the title page?
Date:   Thu, 1 Jan 2009 18:21:27 +0100
From:   Weinheimer Jim j.weinhei...@aur.edu
Reply-To:   j.weinhei...@aur.edu
To: RDA-L@INFOSERV.NLC-BNC.CA



J. McRee (Mac) Elrod wrote:

 Properly approached, and shown that included bibliographic data would
 increase hits, website creators might well welcome such a feature.

 Some publishers who fall outside LC's cataloguing in publication
 program pay Quality Books $50 for CIP for inclusion in their
 publications, because they have found it increases sales.  Some
 Canadian publishers purchase CIP from us (at less cost because we do
 not establish the related authorities as does QB).

 Imbedded bibliographic data in websites could be thought of as CIP.
 It's not a new or novel concept.  It would be best if website creators
 could be included in the LC and LAC CIP programs as are text
 publishers.

No, it's not a new idea at all--that's one of its greatest advantages.
It's simply a new application of a tried-and-true model, plus there
would be a division of labor based on the most efficient workers: the
initial record made by catalogers (with input from the creator), updates
to the description by the creator, updates to headings by the cataloger,
while everything remains under the watch of the selectors. If someone
else wants the record, they could just take it from the embedded
metadata. I am sure there could be numerous variations on this, but the
main thing is to increase the number of people working to create and
primarily, maintain the metadata.

Many catalogers would see this as a loss of control of the record, and
it would be since untrained people could make many mistakes, but nobody
can convince me that a record created by an experienced cataloger that
becomes outdated, where the title no longer describes anything that
exists and a URL that points into the 404 Not Found Twilight Zone is
good for anything except to confuse everyone and provide bad publicity
for our field.

MARC should change in this scenario as well. First, to XML and then to
allow some freedom for the creators, perhaps an area for some keywords
of their choice, some special URLs for them, and other possible fields
reserved for their use.

And yes, for static digital resources, AACR2 has proven itself to be
adequate. I think a lot can be done today that would help everyone
concerned, from the selectors and catalogers, to the creators and
researchers. The technology is so powerful today that we are only
limited by our imaginations.

Jim Weinheimer

--
Stephen Hearn
Authority Control Coordinator/Head, Database Management Section
Technical Services, University Libraries, University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN  55455
Ph: 612-625-2328 / Fax: 612-625-3428


Re: [RDA-L] [Fwd: Re: [RDA-L] FW: [RDA-L] Slave to the title page?]

2009-01-02 Thread Stephen Hearn

I agree that pushing out cataloging doesn't result in consistent data
records, but that's not really what I was suggesting. My suggestion was
that it might be possible to push out the assigning of unique
identifiers to be used in description and access records, if the process
of doing so could be well automated and the agency doing so had the
necessary credibility. If all I got from a website's metadata was the
LCCN that the creator had received and assigned to it, that could be
used to aggregate all the available efforts to describe or catalog the
website in more formal ways. The level and authority of any record found
would be reflected in the record; the LCCN would promise nothing except
uniqueness. Guidelines would be needed only to ensure uniqueness--to
advise against using the same one for different productions, etc. Kind
of a URI, only more universal.

Of course, experience teaches that no voluntary system is perfect. There
are publishers which re-use their ISBNs. But the threshold for success
in assigning a unique identifier correctly is a bit lower than that for
creating good cataloging, so the rate of success would hopefully be higher.
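
A small sketch of the aggregation idea, under stated assumptions: the identifiers, sources, and titles below are all invented, and comparing titles is only a crude stand-in for whatever check would really detect an identifier being reused for different productions.

```python
from collections import defaultdict

# Hypothetical descriptions gathered from different sources.
descriptions = [
    {"id": "example-0001", "source": "creator metadata", "title": "Example site"},
    {"id": "example-0001", "source": "library catalog", "title": "Example site"},
    {"id": "example-0002", "source": "creator metadata", "title": "First production"},
    {"id": "example-0002", "source": "vendor record", "title": "Unrelated production"},
]


def group_by_identifier(items):
    """Collect every available description under its shared identifier."""
    groups = defaultdict(list)
    for item in items:
        groups[item["id"]].append(item)
    return dict(groups)


def possible_reuse(groups):
    """Identifiers whose descriptions disagree on title: review candidates."""
    return sorted(
        ident
        for ident, items in groups.items()
        if len({item["title"].casefold() for item in items}) > 1
    )


groups = group_by_identifier(descriptions)
print(possible_reuse(groups))
```

The identifier promises nothing about the quality of each description; it only lets the descriptions be brought together and lets disagreements surface for review.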

Stephen

Karen Coyle wrote:

Weinheimer Jim wrote:


The biggest problem, which is even more important now than before is:
why would a website creator or outside, for-profit publisher want to
cooperate at all if this record is placed in some stinky, old library
catalog? Huge problems are easy to point to.



Just to note on the idea of pushing out the creation of cataloging to
the creator, that was the original impetus behind Dublin Core
   http://dublincore.org/about/history/

and it has failed, even though it promised to make web searching more
accurate (not put data into library catalogs). Creators aren't
interested, especially as long as their work can be found, without that
effort, through search engines. You can argue all day about how much
better things would be if we had metadata for the title and the creator
and the current date, but we've been there, done that, to no avail. It
is possible to extract some metadata from web documents, and it's
possible that Google may make use of some of the html coding in its
indexing. But I am convinced that we're going to have to get along
without much human cooperation.

kc

--
---
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234



--
Stephen Hearn
Authority Control Coordinator/Head, Database Management Section
Technical Services, University Libraries, University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN  55455
Ph: 612-625-2328 / Fax: 612-625-3428


Re: [RDA-L] RDA records coding and systems

2008-11-13 Thread Stephen Hearn

On the technical side I can imagine a set of interoperating systems that
would let one do a truncated search to find Tolstoy's record in a
database of identity records. Once that was selected, all the data would
be sufficiently marked up so that the URL and the appropriate text
string for your catalog would be automatically captured and dropped into
the data slot you started from. No copying and pasting at all.
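
The lookup described above could be sketched roughly as follows. The name forms and the URI are the ones quoted from Jim's example further down this message, with subfield codes replaced by commas (an assumption about display form); the in-memory IDENTITIES list, normalize(), and search() are hypothetical stand-ins for a real identity service.

```python
import unicodedata

# One identity record with the forms preferred by three authority files.
IDENTITIES = [
    {
        "uri": "http://orlabs.oclc.org/viaf/LC|n+79068416",
        "forms": {
            "NAF": "Tolstoy, Leo, graf, 1828-1910",
            "DNB": "Tolstoj, Lev N., 1828-1910",
            "BNF": "Tolstoj, Lev Nikolaevi\u010d, 1828-1910",
        },
    },
]


def normalize(text):
    """Lowercase, strip diacritics, and reduce punctuation to single spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    kept = "".join(
        ch if ch.isalnum() else " "
        for ch in decomposed
        if not unicodedata.combining(ch)
    )
    return " ".join(kept.lower().split())


def search(prefix, catalog):
    """Truncated search over all forms; return (uri, form for this catalog)."""
    key = normalize(prefix)
    hits = []
    for identity in IDENTITIES:
        forms = identity["forms"]
        if any(normalize(form).startswith(key) for form in forms.values()):
            hits.append((identity["uri"], forms[catalog]))
    return hits


print(search("tolstoj lev n", "NAF"))
```

A search keyed in one catalog's form returns the identifier together with the text string another catalog prefers, so nothing has to be copied and pasted by hand.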

Finding an organizational base for this is much harder. Maintaining
files of authorized terms and unique identifiers requires sustained,
skilled, long term effort. Doing so on the scale we'd like--one reliable
set of identity records (or records which link such records, like the
VIAF) for the whole world--is a huge undertaking. Some organization will
need to get paid, and have the authority to designate and manage data at
an international level. If we don't trust the existing major player
organizations to do this, and we don't have a realistic route to
consensus to create and support some new entity for this role, finding
an organizational base to host and maintain this kind of service becomes
a much more significant challenge than the technical piece.

Stephen

J. McRee Elrod wrote:

Jim said:



An example would be a book by Leo Tolstoy who has the form in the
NAF: Tolstoy, Leo, $c graf, $d 1828-1910 but in the DNB it is:
Tolstoj, Lev N. $d 1828-1910 and in the BNF, it's: Tolstoj $b Lev
Nikolaevič $f 1828-1910 all reflecting cultural needs and
respective coding. If all of these things could be handled with a
URI, such as: http://orlabs.oclc.org/viaf/LC|n+79068416 ...



But in what way is cutting and pasting that url easier than cutting
and pasting the Tolstoy entry?  If keying, the url is much more
difficult to key (unlike Utlas ASNs).  The easiest way to use the
system was to cut and paste the text, and have the system substitute
the pointer.  Although one could key, to use your example, either
tolstoy leo 1828, tolstoj lev nicolaevic 1828, or tolstoj lev n
1828, and get the pointer.

But as I keep saying, using a Web url puts you at the mercy of the
Web, and in the case of the sample you give, a commercial entity.
This is not a method which could be used by a great number of
libraries in many parts of the world.  If OCLC based, I suspect it
would be denied commercial entities other than OCLC, such as SLC.


   __   __   J. McRee (Mac) Elrod ([EMAIL PROTECTED])
  {__  |   / Special Libraries Cataloguing   HTTP://www.slc.bc.ca/
  ___} |__ \__


*So the rumours at coffee break had it.



--
Stephen Hearn
Authority Control Coordinator/Head, Database Management Section
Technical Services, University Libraries, University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN  55455
Ph: 612-625-2328 / Fax: 612-625-3428


Re: [RDA-L] libraries, society and RDA

2008-11-10 Thread Stephen Hearn

To me the hard part is ensuring consistency, first of terminology, but
more fundamentally of granularity and categorization. The great virtue
of MARC/AACR/LCSH cataloging is that it is as consistent as it is across
many catalogs and institutions and disciplines. That's not a natural
development. The natural tendency of thinking communities is to divide
and redivide and to use language, categories, and levels of conceptual
granularity to draw distinctions between one community and another. The
uniformity achieved by library cataloging (and it's by no means
perfect--just way better than average) comes from open standards and
professional discipline driven by an economic necessity to cooperate and
interoperate across all the divisions that most intellectual communities
depend on to define themselves. Diane's right that we can make data and
terminology standards much easier to access, but I don't see what's
going to compel diverse communities to use these standards consistently.
And like Shawne, I don't see a path yet to contribute to a bibliographic
description commons.

As for catalogs--when I use my libraries' catalogs (and I do--the Twin
Cities library systems all recognize me as a user), it's not mainly
bibliographic information that I'm looking for. I want to know what each
collection includes, and whether a copy is available, and how long the
wait for one will be, whether the titles I'm waiting for have arrived at
my branch--and all of these are kinds of information that Google can't
give me. To slight this as an inventory function is a serious
mistake. I'm happy to see the bib information and access points finding
wider use in open web environments, but that won't begin to give me what
my libraries' catalogs do.

Stephen

Miksa, Shawne wrote:

I like fussing.

This idea of hoarding and hiding is difficult to understand as it makes it sound as if librarians, and 
especially those who catalog, are cave dwellers who can't speak. I would also ask you to not generalize all cataloging courses as 
traditional. We've been incorporating non-traditional ideas into our courses for quite some time now--although I 
don't use those two terms in my syllabi. Fundamentally, no matter the environment, we create a representation or 
surrogate of an information object for use within a system (define that as you like).

You write: Bibliographic data available freely on the web can be combined and 
presented in different ways, available to those who might want to try new aggregations 
and methods of discovery and presentation.

In your view, where does that bibliographic data originate? Who puts it into a shape or form so 
that it is available for the web? Or does it shape itself?  I was recently contacted by 
a library looking for a fresh new student to help catalog some original materials that have no 
previous records/bib data in any system. Once those descriptions are made--formed into some sort of 
representation--then that data can be shaped into anything we want and made available through any 
system, web-based or not. When I try to understand your argument I don't see that part of it--I 
just see something miraculous happening and then all of a sudden things are on the web. 
Are you suggesting that we set our representations afloat--like paper boats in a stream?

My epiphany wore off--I've lost sight of your point of view.

**
Shawne D. Miksa, Ph.D.
Assistant Professor
Department of Library and Information Sciences
College of Information, Library Science, and Technology
University of North Texas
email: [EMAIL PROTECTED]
http://courses.unt.edu/smiksa/index.htm
office 940-565-3560 fax 940-565-3101
**



--
Stephen Hearn
Authority Control Coordinator/Head, Database Management Section
Technical Services, University Libraries, University of Minnesota
160 Wilson Library
309 19th Avenue South
Minneapolis, MN  55455
Ph: 612-625-2328 / Fax: 612-625-3428


Re: [RDA-L] [s.n.] used by Amazon; not confusing after all?

2008-06-10 Thread Stephen Hearn

The presence of s.n. in an Amazon record is a small, weak hook to
hang anything on; but looking at people's use of other tools can be
informative.

The one that's on my mind lately is Wikipedia. Among the principles
that Wikipedia has adopted are:

Unique entry--there's one article on Capital punishment, found under
that heading--not multiple takes on this topic, as one would find
by sifting through multiple web pages after searching the term in
Google (and skipping over the Wikipedia link, which of course came up first).

Authority--Wikipedia editors are ever ready to determine what the
preferred term of entry should be, to correct errors, provide cross
references, etc.

SEE references--search Death penalty in Wikipedia, and you get
referred to Capital punishment.

SEE ALSO references--articles on complex topics have lots of
information in sidebars showing relationships with other topics and
aspects. Even brief articles include hot-linked terms to related
Wikipedia articles, which serve the same purpose.

If searchers are much happier sorting through multiple results than
finding one, happier in an environment of competing claims than of
one governed by some form of authority, offended by any attempt to
redirect their search from their preferred term to the one used in a
resource, and would rather see personal links to my favorite sites
than clear, authoritative indications of available information on
related topics--all long-standing features of the way library
catalogs serve searchers' needs--then why is Wikipedia so popular?

Wikipedia's acceptance of community input is obviously also very
important to its success, and in that regard it's an instructive
model for us, too. But would people like our catalogs better if they
were really modeled on Google searches on the open web? Wikipedia's
success in just that environment suggests not.

Stephen

At 12:48 PM 6/10/2008, you wrote:

The book in question is available *via* Amazon, but not from Amazon. In
other words, this is one of those third-party books, and in that case
Amazon obviously gets the data from the third party (a bookseller), not
the publisher. The third-party data is often of very poor quality. It
should be considered a Good Thing if these independent booksellers use
WorldCat or LoC data (and what's in Amazon looks very much like the LoC
record http://lccn.loc.gov/2007277697). Those who don't often present
very incomplete records.

kc

Mike Tribby wrote:

My guess would be that the metadata Amazon received for this book
was library metadata rather than publisher metadata (since the
latter would have identified the publisher).  I would NOT assume
from this that Amazon thought S.N. was anything other than a publisher name.

Maybe, except that the book in question has only 6 holdings in
WorldCat (certainly not a definitive guide to how many libraries
actually hold the item), so there wouldn't be that many libraries
able to contribute information in the first place. I think it's far
more likely that the information in Amazon that didn't come from
the publisher came from reviews-- and the publisher's employees
likely know where the publisher is located. The record is an LC
record, but judging by the LCCN was not a CIP record, and I'm not
sure how much information LC routinely communicates to Amazon.
Also, as I said in my original posting to Autocat, I was browsing
small press materials on Amazon and saw S.n. used on other
materials, too.  The book, BTW, is a collection of recipes and
humor from Iowa: Midwest Corn Fusion: A Collection of Recipes &
Humor, ISBN: 9780977833900.

Speaking of assumptions, isn't it a little nihilistic to think that
nobody but catalogers knows what S.n. means?



Mike Tribby
Senior Cataloger
Quality Books Inc.
The Best of America's Independent Presses

mailto:[EMAIL PROTECTED]





--
---
Karen Coyle / Digital Library Consultant
[EMAIL PROTECTED] http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234




Stephen Hearn
Authority Control Coord./Database Mgmt. Section Head
Technical Services Dept.
University of Minnesota
160 Wilson Library   Voice: 612-625-2328
309 19th Avenue South  Fax: 612-625-3428
Minneapolis, MN 55455  E-mail: [EMAIL PROTECTED]


Re: [RDA-L] [s.n.] used by Amazon; not confusing after all?

2008-06-10 Thread Stephen Hearn

I read several lists, and I may have gotten this one crossed with
another; but I have seen it argued in the last few weeks and without
counter that preferred headings and cross references are evidence of
librarians' arrogance, and offensive to users who prefer their own
terms. And of course, there have been countless calls for library
catalogs to be more like Google. So it's interesting to me that
evidently, people's first choice when searching Google is Wikipedia,
which is so unlike Google in the ways that it organizes information access.

Stephen

At 04:37 PM 6/10/2008, you wrote:

Stephen Hearn wrote:


If searchers are much happier sorting through multiple results than
finding one, happier in an environment of competing claims than of
one governed by some form of authority, offended by any attempt to
redirect their search from their preferred term to the one used in a
resource, and would rather see personal links to my favorite sites
than clear, authoritative indications of available information on
related topics--all long-standing features of the way library
catalogs serve searchers' needs--then why is Wikipedia so popular?


Wait, who said they were happier with those things? Nobody I've seen on
this list, or really anywhere else.

I think this is a straw man.

But of course you are right: users are not happier with those things.
The various kinds of collocation and relationship assigning that
catalogers, Wikipedia, and many other information organization projects
perform in varying ways are of course useful services when done
effectively.  Who is it that says otherwise?

Jonathan






--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu



Stephen Hearn
Authority Control Coord./Database Mgmt. Section Head
Technical Services Dept.
University of Minnesota
160 Wilson Library   Voice: 612-625-2328
309 19th Avenue South  Fax: 612-625-3428
Minneapolis, MN 55455  E-mail: [EMAIL PROTECTED]


Re: determining FRBR relationships

2007-12-05 Thread Stephen Hearn


By way of analogy then--Karen's approach to works would be similar to
the U.S. application profile for dealing with format and material
type in Z39.50. There, a complex interdependency of multiple fixed
field values is used to determine that item X is AV material, or is a
DVD, or whatever. But in my limited experience, systems which
actually want to interoperate will translate these interdependencies
into simple, encoded declarations, so that other systems searching
via Z39.50 can do a single search on an index of known values rather
than having to get Boolean with all the discrete fixed field values
that lie behind them. I assume that's for reasons of processing
efficiency. I'm not sure how this would scale with the vastly larger
number of discrete work entities; and if the same record is supposed
to interoperate with multiple different application profiles for what
a work is, ...
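
The translation described above might look something like the following sketch. It assumes the usual MARC 21 positions (Leader/06 for type of record, Leader/07 for bibliographic level, 007/00 for category of material, 007/04 for videorecording format); the function names and the output codes such as "DVD" and "BOOK" are invented for illustration and are not taken from any actual Z39.50 profile.

```python
from collections import defaultdict

def derive_format(leader: str, field_007: str = "") -> str:
    """Collapse interdependent fixed-field values into one simple code.

    The returned codes are hypothetical labels, not profile-defined values.
    """
    record_type = leader[6]   # Leader/06: type of record
    bib_level = leader[7]     # Leader/07: bibliographic level
    if record_type == "g" and field_007[:1] == "v":
        # Projected medium plus a videorecording 007; 007/04 'v' means DVD.
        return "DVD" if field_007[4:5] == "v" else "VIDEO"
    if record_type == "a" and bib_level == "s":
        return "SERIAL"
    if record_type == "a" and bib_level == "m":
        return "BOOK"
    return "OTHER"

def build_format_index(records):
    """Index record IDs under the derived code so one lookup replaces a Boolean search."""
    index = defaultdict(list)
    for record_id, leader, field_007 in records:
        index[derive_format(leader, field_007)].append(record_id)
    return dict(index)

# Made-up sample records: (record ID, leader, 007)
sample = [
    ("rec1", "00000ngm a2200000 a 4500", "vd cvaizq"),  # videodisc, DVD
    ("rec2", "00000ngm a2200000 a 4500", "vf cbahos"),  # videocassette, VHS
    ("rec3", "00000nam a2200000 a 4500", ""),           # printed monograph
]
format_index = build_format_index(sample)
```

A remote search for DVDs then becomes a single lookup of format_index["DVD"] rather than a Boolean combination of Leader and 007 conditions, which is the efficiency gain I am guessing at.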


Alternatively, each community that decides it needs one could have a
registry of defined work entities, and the access in the
institution's record could be a statement of the relationship between
the work's registered ID and the object being described. The elements
needed to meet the community's definition of a work would be
contained in its work entity record. There could also be the option
of deriving from the work entity record a standard form for
searching, sorting, and display, as well as elements of the work's
definition that might be useful as access points in the institution's record.
And the work entity record could also define its relationship to
other entity records, etc.
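
As a rough sketch of the registry idea, and nothing more than that, the pieces could be modeled as below. Every class, field, and identifier here (WorkEntity, WorkRegistry, InstitutionRecord, the "reg:work/0001" ID, the sample values) is hypothetical and made up for illustration; no existing registry or schema is implied.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class WorkEntity:
    """A community's registered definition of one work (hypothetical shape)."""
    work_id: str
    standard_form: str  # form used for searching, sorting, and display
    defining_elements: Dict[str, str] = field(default_factory=dict)
    related_works: Dict[str, str] = field(default_factory=dict)  # other work ID -> relationship

@dataclass
class InstitutionRecord:
    """A local description that only states its relationship to a registered work."""
    record_id: str
    description: str
    work_id: str
    relationship: str

class WorkRegistry:
    """One community's registry of defined work entities."""
    def __init__(self) -> None:
        self._works: Dict[str, WorkEntity] = {}

    def register(self, work: WorkEntity) -> None:
        self._works[work.work_id] = work

    def lookup(self, work_id: str) -> WorkEntity:
        return self._works[work_id]

def derived_access_points(record: InstitutionRecord, registry: WorkRegistry) -> List[str]:
    """Derive access points for a local record from the registered work entity."""
    work = registry.lookup(record.work_id)
    return [work.standard_form] + list(work.defining_elements.values())

# Made-up example data
registry = WorkRegistry()
registry.register(WorkEntity(
    work_id="reg:work/0001",
    standard_form="Example author. Example title",
    defining_elements={"creator": "Example author", "form": "Novel"},
))
local = InstitutionRecord("bib42", "1st ed., 300 p.", "reg:work/0001", "embodies")
points = derived_access_points(local, registry)
```

The local record carries only the work's registered ID and the stated relationship; everything that defines the work lives in the registry entry, so a community with a different definition of "work" would simply keep a different registry.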


In any case, it's great to see this discussion progressing to the
management of multiple definitions of what the Group 1 entities are.
That I can deal with, even enjoy dealing with. But the notion that
somehow we can do good searching and collocation without
acknowledging differences in the definition of the FRBR entities, or
by allowing the same entity name to have multiple definitions (like
an undifferentiated personal name authority record) just drives me batty.


Stephen


At 12:29 PM 12/5/2007, you wrote:

David M Pimentel wrote:



It strikes me that one way to address this situation might be to focus
not only on the nature of the relationships between (FRBR) entities, but
also on who makes (and hence values) particular relationships.


Absolutely! I think this would be handled well through application
profiles. The Dublin Core folks have been working on this in the
bibliographic realm. There are already examples of application profiles,
such as those for Z39.50 and for the OpenURL. Done well, you should be
able to make decisions about the meaning of a relationship based on how
it is defined in the particular application profile. What this means is
that different communities can have their own peculiar (and some will be
peculiar) sets of relationships and definitions without being
constrained by the needs of others. (Which I think is what has tripped
up the library and archive communities in the past -- that we didn't
have a way to express different views using the same data elements.)

Here's one article on application profiles. There are probably many more
and perhaps better ones that others can contribute. I think I'll start a
page on APs on the futurelib wiki.

   http://www.ariadne.ac.uk/issue25/app-profiles/

kc

--
---
Karen Coyle / Digital Library Consultant
[EMAIL PROTECTED] http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234





Stephen Hearn
Authority Control Coord./Database Mgmt. Section Head
Technical Services Dept.
University of Minnesota
160 Wilson Library   Voice: 612-625-2328
309 19th Avenue South  Fax: 612-625-3428
Minneapolis, MN 55455  E-mail: [EMAIL PROTECTED]


Re: Wrong model--entity relationship?

2007-07-03 Thread Stephen Hearn
 just a theory.
Think of what we risk if that theory is wrong.

kc




Martha M. Yee

[EMAIL PROTECTED] mailto:[EMAIL PROTECTED]



Sara Shatford Layne

[EMAIL PROTECTED]


--
---
Karen Coyle / Digital Library Consultant
[EMAIL PROTECTED] http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234





Stephen Hearn
Authority Control Coord./Database Mgmt. Section Head
Technical Services Dept.
University of Minnesota
160 Wilson Library   Voice: 612-625-2328
309 19th Avenue South  Fax: 612-625-3428
Minneapolis, MN 55455  E-mail: [EMAIL PROTECTED]


scanned images vs transcription

2007-02-21 Thread Stephen Hearn


Elizabeth O'Keefe has doubts about the reliability of scanned images
as substitutes for t.p. transcription. Greta de Groat would welcome
scanned images of video credits as a substitute for transcription.
Some time back on this list, Barbara Tillett mentioned the
availability of scanned images as one of the factors making
transcription less obligatory.


How important is this piece of the discussion? Are there differing
visions of how the use of scanned images would be implemented or of
the reliability of scanned images in general that could account for
the varying comfort levels? Will RDA have any more specifics on using
scanned images as part of the description? If scanned images are a
factor, how does that change what other kinds of data would be
recorded as part of a textual description?


Stephen




Stephen Hearn
Authority Control Coord./Database Mgmt. Section Head
Technical Services Dept.
University of Minnesota
160 Wilson Library   Voice: 612-625-2328
309 19th Avenue South  Fax: 612-625-3428
Minneapolis, MN 55455  E-mail: [EMAIL PROTECTED]


Re: Controlled Access Point as Textual Identifier

2007-02-14 Thread Stephen Hearn


Responding to Jonathan Rochkind's analysis:


You're right that controlled headings are basically textual identifiers,
and that we often get muddled in our analyses when we forget that many
elements of the catalog record have multiple uses and purposes, not just
one. A couple of your other terms bother me. By "foreign," do you mean
"related" in more common parlance? And when you say that the bib record
1XX/245 establishes the textual identifier for the entity it represents
(if that's what you meant), bear in mind that establishing is something
usually done by authority records. Where no authority record exists (e.g.,
as for many serial titles), the bib record de facto establishes the
textual identifier; but as soon as an authority appears, that establishing
function of the bib record's data is eclipsed. And there are
work/expression identifiers (e.g., Tolstoy ... Short stories. English.
Selections) which will not be established by any bib record because no
manifestation will use them.


Note also that these textual identifiers are unlike many other identifiers
because they are hierarchically structured. A single string can reference
multiple contexts. The example above will contextualize its bib record with
works by Tolstoy, short story collections by Tolstoy, short story
collections in English by Tolstoy, ... The term identifier usually
signifies a unique value that designates the object, with uniqueness being
the prime purpose of the access point; whereas controlled access points
tend to operate more like a verbal classification system, with useful
collocation being their prime purpose. If there's nothing to collocate
with, there's arguably no reason to declare a work/expression level identifier.
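
To make the hierarchical point concrete, here is a small sketch that lists the collocation contexts a single structured heading places its record in. The function name and the ". " separator are assumptions for illustration, and the name component is kept as the bare "Tolstoy" from the example above rather than a full established form.

```python
from typing import List

def collocation_contexts(components: List[str]) -> List[str]:
    """Return every cumulative prefix of a structured heading, broadest first."""
    contexts = []
    for end in range(1, len(components) + 1):
        contexts.append(". ".join(components[:end]))
    return contexts

# Components of the example heading, already split (as subfields would be).
heading = ["Tolstoy", "Short stories", "English", "Selections"]
contexts = collocation_contexts(heading)
# contexts[0]  is "Tolstoy"
# contexts[-1] is "Tolstoy. Short stories. English. Selections"
```

An opaque unique identifier would yield only the last of these; the structured heading yields all four, which is what lets it behave like a verbal classification.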


Which brings us to one of the main problems with the FRBR analysis. FRBR
wants to address objects from the work level down. In fact, however
important they may be for the process of creation, the work and expression
levels are meaningless for the description of most bibliographic objects.
The few objects that get cited, re-edited, and reworked often enough to
warrant work/expression analysis and designation to support useful
collocation are a small minority. As such, they are a bad basis for a
general model of bibliographic description. We'd get a sounder model if we
stood FRBR on its head, and assumed that the item level is primary and
universal, and that some items belong to a class of manifestations, and
some manifestations belong to a class of expressions, and some expressions
belong to a class of works--but that the essential contents of only a few
items actually transcend to the numinal realm of bibliographic works in
any meaningful way.
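
One way to picture the inverted model is as a chain of optional links running upward from the item. This is only a sketch of the idea as I have stated it; the class and field names are invented and do not come from FRBR itself.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Work:
    title: str

@dataclass
class Expression:
    label: str
    work: Optional[Work] = None  # only some expressions belong to a work

@dataclass
class Manifestation:
    label: str
    expression: Optional[Expression] = None  # only some belong to an expression

@dataclass
class Item:
    barcode: str  # the item is the one level every object has
    manifestation: Optional[Manifestation] = None

def levels_present(item: Item) -> List[str]:
    """Walk upward from the item and report which levels are actually declared."""
    levels = ["item"]
    manifestation = item.manifestation
    if manifestation is None:
        return levels
    levels.append("manifestation")
    expression = manifestation.expression
    if expression is None:
        return levels
    levels.append("expression")
    if expression.work is not None:
        levels.append("work")
    return levels

# A unique local object with nothing to collocate with
pamphlet = Item(barcode="0001")

# A much-reworked text that warrants the full chain
classic = Item(
    barcode="0002",
    manifestation=Manifestation(
        "Example edition",
        Expression("English translation", Work("Example work")),
    ),
)
```

Here levels_present(pamphlet) returns only the item level, while levels_present(classic) returns all four; the upper levels are declared only when there is something to collocate.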


Stephen



At 11:36 AM 2/14/2007, you wrote:

access points as 'textual identifiers' ??





Stephen Hearn
Authority Control Coord./Database Mgmt. Section Head
Technical Services Dept.
University of Minnesota
160 Wilson Library   Voice: 612-625-2328
309 19th Avenue South  Fax: 612-625-3428
Minneapolis, MN 55455  E-mail: [EMAIL PROTECTED]