Quoting Jonathan Rochkind <rochk...@jhu.edu>:


In the second case, we have an abbreviation _within_ narrative text. This is harder: does _every_ occurrence of "ed." mean "edition"? No, sometimes it means "editor". Sometimes it means "editions" (with an s). It gets increasingly complicated for software to try to guess what it should be expanded to -- sometimes (rarely) it's really the word "ed" at the end of a sentence. (Okay, maybe not with "ed", but with some other ones.) And the benefit of encoding it this way is... unclear to me.
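
To make the ambiguity concrete, here is a minimal sketch of a context-free expander. The candidate table and the function name naive_expand are hypothetical, made up for illustration; the point is only that a lookup with no context has to pick one expansion and will sometimes pick the wrong one.

```python
import re

# Hypothetical candidate table: one abbreviation, several possible expansions.
CANDIDATES = {"ed.": ["edition", "editor", "editions"]}

def naive_expand(text, table=CANDIDATES):
    """Replace each abbreviation with its first candidate, ignoring context."""
    for abbr, options in table.items():
        # Only match the abbreviation when it is not the tail of a longer word.
        pattern = r"(?<!\w)" + re.escape(abbr)
        text = re.sub(pattern, options[0], text)
    return text

print(naive_expand("2nd ed."))          # 2nd edition
print(naive_expand("ed. by J. Smith"))  # edition by J. Smith (should be "editor")
```

Choosing correctly in the second case would take knowledge of the surrounding words, which is exactly the guessing described above.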

And the frustrating thing for me is that RDA defines a number of controlled or semi-controlled lists of terms (like colour and extent) that are used in text fields, and therefore cannot be easily manipulated for machine purposes. So you can have something like:

1 computer disc (5 image files)

or

on side 2 of 1 videodisc

and *some* terms in those strings are from a controlled vocabulary. Embedding controlled vocabulary in a text string makes for consistency for human readers, but from a machine processing point of view the result is essentially an inert alphanumeric string.
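
As a rough sketch of the difference, compare digging the term out of the string with reading it from a separate field. Everything named here is an assumption for illustration: CARRIER_TERMS holds only the two terms from the examples above (not the actual RDA list), and the Extent record is a hypothetical structure, not anything RDA or MARC defines.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative subset only: the two terms from the examples above.
CARRIER_TERMS = ("computer disc", "videodisc")

@dataclass
class Extent:
    """Hypothetical structured form: the controlled term has its own field."""
    count: int
    carrier: str
    note: Optional[str] = None

def find_carrier(extent_text):
    """Guess the controlled term in a free-text extent by substring search."""
    for term in CARRIER_TERMS:
        if term in extent_text:
            return term
    return None  # unknown or variant wording: nothing reliable to act on

# Free text: software has to search the string and hope the wording matches.
print(find_carrier("1 computer disc (5 image files)"))
print(find_carrier("on side 2 of 1 videodisc"))

# Structured: the term is simply there to be read.
structured = Extent(count=1, carrier="computer disc", note="5 image files")
print(structured.carrier)
```

The substring search only works when the wording matches a known term exactly, and it says nothing about what the rest of the string ("on side 2 of", "5 image files") means.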

kc

p.s. Here's a simple list of all of the controlled lists in RDA:
   http://kcoyle.net/rda/rda_vocablist.txt
It doesn't include the actual terms in each list, but maybe I'll get around to adding those.

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
