Jonathan talks about the question of "meaning", which I think is one of the key issues that we face in trying to formalize library cataloging for machine processing.

"The trick is figuring out how our overall systems (and I don't just mean software, I mean the whole endeavor) is going to work in an environment where sometimes you have more meaning encoded, and sometimes less."

But this doesn't just apply to the FRBR Group I levels, it applies to all of the fields that library data encodes.

Let me give an example: Pagination. I'll use RDA as the rules in this example, but I believe it also applies to AACR2.

In RDA, there is a field called "extent" that is defined as:

"Extent reflects the number of units and/or subunits making up a resource." (3.4.0.1.1)

That's fine as meaning goes. Here are examples from RDA Chapter 3:

  327 pages
  1 sculpture
  2 portfolios ([18] leaves; [24] leaves)

These are fairly clear: number + unit. Then you get:

  viii, 278 pages

This could be interpreted as:

  viii pages
  278 pages

But it has another meaning, which is that it is also conveying (I could say primarily conveying) the PAGINATION, that is, how the pages are numbered. Pagination is important for distinguishing editions, so this is good information, but it isn't the same as the number of pages in the item -- especially not to a computer.

Why does this matter? It matters because it has a different meaning -- a meaning that humans can distinguish, but computers cannot. And even for humans the meaning is somewhat ambiguous -- since only numbered pages are included. Because in the past our data was designed as text to be read by humans, we could rely on humans to make inferences about the data, which meant that there was tolerance for this kind of mixing of meanings.
But if, for example, we want to be able to match up ONIX records and library records for the same item (and there are good reasons to do that both for acquisitions purposes and user service purposes), then this mixture of meanings makes it hard to compare the ONIX number of pages (which is literally the number of pages in the book -- that's after all what they pay the printer for) with the library pagination. (Believe me it's hard -- I've been working on this kind of match.) In essence, if we want to include number + unit AND pagination in the same record, we should distinguish between them.

And let's not go into a rant against the publishers. A similar distinction is often made in abstracting and indexing data, where an article can have both page numbers ("43-47") and number of pages ("5"). The latter is used in ILL to estimate copying costs.

So if we want our data to play well in the world of bibliographic information, we have to pay close attention to meaning. We have our habits (as I believe I showed when I asked about title case, which drew many responses, though much of it was speculation), but those do not serve us well if we can't turn them into unambiguous data definitions. I think it is a shame that RDA is carrying forward some of our habits without thinking more about meaning.

kc

p.s. And before someone asks, no, I am not suggesting that libraries should code both # of pages and pagination, nor that publishers should do the same thing. I am suggesting that you don't put both in the same field, nor give them the same name. This isn't just us, BTW; it is true for all data.

--
-----------------------------------
Karen Coyle / Digital Library Consultant
[EMAIL PROTECTED] http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------