Jonathan talks about the question of "meaning", which I think is one of
the key issues that we face in trying to formalize library cataloging
for machine processing.

"The trick is figuring out how our overall systems
(and I don't just mean software, I mean the whole endeavor) is going to
work in an environment where sometimes you have more meaning encoded,
and sometimes less."

But this doesn't apply only to the FRBR Group I levels; it applies to
all of the fields that library data encodes. Let me give an example:
Pagination. I'll use RDA as the rules in this example, but I believe it
also applies to AACR2.

In RDA, there is a field called "extent" that is defined as:

  "Extent reflects the number of units and/or subunits making up a
resource." (3.4.0.1.1)

That's fine as meaning goes. Here are examples from RDA Chapter 3:

  327 pages
  1 sculpture
  2 portfolios ([18] leaves; [24] leaves)

These are fairly clear: number + unit. Then you get:

  viii, 278 pages

This could be interpreted as:
  viii pages
  278 pages

But it has another meaning, which is that it is also conveying (I could
say primarily conveying) the PAGINATION, that is, how the pages are
numbered. Pagination is important for distinguishing editions, so this
is good information, but it isn't the same as the number of pages in the
item -- especially not to a computer.
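
To make the two readings concrete, here is a rough sketch in Python of what a program has to do to recover the numbering sequences from a statement like "viii, 278 pages". The function names and the dictionary keys are my own invention for illustration, not anything defined in RDA, and the sketch only handles simple unbracketed sequences followed by "pages" or "leaves".

```python
import re

# Values of the roman numeral letters, used by the hypothetical helper below.
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(text):
    """Convert a roman numeral such as 'viii' or 'xiv' to an integer."""
    total = 0
    previous = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for letter in reversed(text.lower()):
        value = ROMAN_VALUES[letter]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def parse_pagination(extent):
    """Split an extent statement into its numbered sequences.

    Assumption: the statement is a comma-separated list of roman or
    arabic numbers followed by 'pages' or 'leaves'. Anything else
    (brackets, '1 sculpture', nested subunits) raises ValueError.
    """
    match = re.fullmatch(r"(.+?)\s+(pages|leaves)", extent.strip())
    if match is None:
        raise ValueError("not a simple pagination statement: %r" % extent)
    body, unit = match.groups()
    sequences = []
    for part in body.split(","):
        token = part.strip()
        if token.isdigit():
            sequences.append(
                {"last_number": int(token), "numbering": "arabic", "unit": unit}
            )
        elif re.fullmatch(r"[ivxlcdm]+", token, re.IGNORECASE):
            sequences.append(
                {"last_number": roman_to_int(token), "numbering": "roman", "unit": unit}
            )
        else:
            raise ValueError("unrecognized sequence: %r" % token)
    return sequences


sequences = parse_pagination("viii, 278 pages")
numbered_pages = sum(seq["last_number"] for seq in sequences)
```

Here `sequences` holds two entries (roman to 8, arabic to 278) and `numbered_pages` comes out as 286. That sum is only the count of numbered pages; it says nothing about unnumbered ones, which is exactly the gap between pagination and a true number of pages.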

Why does this matter?

It matters because pagination has a different meaning from a page count
-- a difference that humans can see, but computers cannot. And even for
humans the meaning is somewhat ambiguous, since only numbered pages are
included.

Because in the past our data was designed as text to be read by humans,
we could rely on humans to make inferences about the data, which meant
that there was tolerance for this kind of mixing of meanings. But if,
for example, we want to be able to match up ONIX records and library
records for the same item (and there are good reasons to do that both
for acquisitions purposes and user service purposes), then this mixture
of meanings makes it hard to compare the ONIX number of pages (which is
literally the number of pages in the book -- that's after all what they
pay the printer for) with the library pagination. (Believe me it's hard
-- I've been working on this kind of match.) In essence, if we want to
include number + unit AND pagination in the same record, we should
distinguish between them.
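
One way to picture that separation is a record with two differently named slots. This is only a sketch: the class name, the field names, and the comparison function are hypothetical and do not come from RDA, MARC, or ONIX.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtentStatement:
    # Number + unit: how many units make up the resource.
    unit_count: Optional[int] = None
    unit: Optional[str] = None
    # Pagination: how the pages are numbered, kept as transcribed.
    pagination: Optional[str] = None


def same_page_count(record: ExtentStatement, publisher_page_count: int) -> Optional[bool]:
    """Compare a publisher's page count with a library record.

    Returns None when the library record carries no explicit page
    count, rather than guessing one from the pagination string.
    """
    if record.unit != "pages" or record.unit_count is None:
        return None
    return record.unit_count == publisher_page_count


pagination_only = ExtentStatement(pagination="viii, 278 pages")
with_count = ExtentStatement(unit_count=288, unit="pages", pagination="viii, 278 pages")
```

With this layout, `same_page_count(pagination_only, 288)` returns None (nothing to compare), while `same_page_count(with_count, 288)` returns True. The figure 288 is made up for the example. The point is that the program never has to infer which meaning a value carries.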

And let's not go into a rant against the publishers. The same
distinction is often made in abstracting and indexing data, where an
article can have both page numbers ("43-47") and number of pages ("5").
The latter is used in ILL to estimate copying costs.
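
The arithmetic linking the two values is simple when the range is fully written out, as this small Python sketch shows. The function name is hypothetical, and it assumes a continuous range with both numbers given in full (so an abbreviated range like "143-7" is rejected rather than guessed at).

```python
def pages_in_range(page_range):
    """Return the number of pages covered by a range such as '43-47'."""
    first, last = (int(part) for part in page_range.split("-"))
    if last < first:
        raise ValueError("last page precedes first page: %r" % page_range)
    # Both end pages are included, hence the + 1.
    return last - first + 1


article_pages = pages_in_range("43-47")
```

Here `article_pages` is 5, matching the example above. Going the other way is impossible: "5" alone cannot tell you where the article starts, which is why the two are kept as separate pieces of data.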

So if we want our data to play well in the world of bibliographic
information, we have to pay close attention to meaning. We have our
habits (as I believe I showed when I asked about title case, which drew
many responses, much of it speculation), but those do not serve us well
if we can't turn them into unambiguous data definitions. I think it is a
shame that RDA is carrying forward some of our habits without thinking
more about meaning.

kc

p.s. And before someone asks, no, I am not suggesting that libraries
should code both # of pages and pagination, nor that publishers should
do the same thing. I am suggesting that you don't put both in the same
field, nor give them the same name. This isn't just true for us, BTW;
it is true for all data.


--
-----------------------------------
Karen Coyle / Digital Library Consultant
[EMAIL PROTECTED] http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------
