Following up on some discussion on IRC yesterday (dbs/drycona), we looked at
how many of our records have bad 008 data. Most of the bad data traces back to
one previous ILS, but we also found lots of errors across all of our
tcn_sources. We'll explore what automated fix-ups are possible.
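
For anyone wanting to run a similar count, here is a rough sketch of the kind
of tally involved. It assumes you have already pulled (tcn_source, 008) pairs
out of your records by whatever means; the function name and the sample source
labels are made up for illustration, and the only check is the 40-character
length that MARC 21 bibliographic 008 fields are supposed to have.

```python
from collections import Counter

EXPECTED_LENGTH = 40  # MARC 21 bibliographic 008 covers positions 00-39


def tally_bad_008(rows):
    """Count records per source whose 008 is missing or not 40 characters.

    rows is an iterable of (tcn_source, field_008) pairs; field_008 may be
    None when a record has no 008 at all. (Hypothetical helper.)
    """
    bad = Counter()
    for tcn_source, field_008 in rows:
        if field_008 is None or len(field_008) != EXPECTED_LENGTH:
            bad[tcn_source] += 1
    return bad


# Made-up sample data, not real results.
rows = [
    ("legacy-ils", "short"),
    ("legacy-ils", None),
    ("other", " " * 40),   # right length, so not counted
    ("other", "x" * 39),   # one character short
]
print(dict(tally_bad_008(rows)))  # {'legacy-ils': 2, 'other': 1}
```

A length check only catches the crudest problems, but it is cheap and gives a
first idea of which sources need attention.
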
Has anybody given any thought as to how we might tame some of the madness of
MARC 008's "space as data" (aka "character position") and related data
integrity challenges?
What I'm curious about is whether there are any practical approaches that could
be taken within the MARC editor to minimize coding errors in the 008, and/or to
highlight or validate these errors when using the MARC editor.
For example:
* When viewing an empty fixed field element (say "Date1"), there's no
immediately visible way that I can see to tell whether 4 positions are already
reserved as 'empty data', or whether it holds 3 spaces or none. Would it be
helpful to have better identification of "space as data" in the fixed field
section? If so, how do we do it without cluttering things up? E.g. should empty
spaces be styled with a different background colour, or is there a way of
highlighting empty character positions as well as errors through some other
visual technique?
* Right now, the 008 is editable in two places (in the fixed field editor box,
but also in the 008 line item). It's hard to know how many errors are
introduced through the awkwardness of direct edits to the 008 row, but I wonder
if protecting the 008 row would help us any (some ILSs basically hide the 008
in favour of 'guided' or input box data entry, etc.).
* or maybe just adding some simplified 008 validation (say against char_length)
so that upon saving a record you're given an error message ("Invalid 008 entry"
or whatever) with an option to ignore errors, save and continue?
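
To make the first and last ideas a bit more concrete, here is a minimal sketch
(not a proposal for how the editor should actually implement it). It assumes
the standard MARC 21 layout for the positions common to all material types,
uses the usual fixed field grid labels, and shows blanks as "#" the way the LoC
documentation does. The function names are invented for illustration, and
positions 18-34 are skipped because they vary by material type.

```python
EXPECTED_LENGTH = 40  # positions 00-39
BLANK_MARK = "#"      # display stand-in for a blank; an assumption, pick any

# (label, start, end) with end exclusive; material-specific 18-34 omitted.
ELEMENTS = [
    ("Entered", 0, 6),
    ("DtSt", 6, 7),
    ("Date1", 7, 11),
    ("Date2", 11, 15),
    ("Ctry", 15, 18),
    ("Lang", 35, 38),
    ("MRec", 38, 39),
    ("Srce", 39, 40),
]


def validate_008(value):
    """Return a list of problem messages; an empty list means the length is OK."""
    problems = []
    if len(value) != EXPECTED_LENGTH:
        problems.append(
            f"Invalid 008 entry: expected {EXPECTED_LENGTH} characters, "
            f"found {len(value)}"
        )
    return problems


def describe_008(value):
    """Map each label to its slice of the 008, with blanks made visible."""
    return {
        label: value[start:end].replace(" ", BLANK_MARK)
        for label, start, end in ELEMENTS
    }


# Built piece by piece so the 40 positions are easy to verify by eye.
sample = "110301" + "s" + "2010" + " " * 4 + "onc" + " " * 17 + "eng" + " " + "d"
print(validate_008(sample))           # []
print(describe_008(sample)["Date2"])  # ####
print(validate_008(sample[:-1]))      # one "Invalid 008 entry" message
```

On save, a non-empty list from validate_008 could drive the "ignore errors,
save and continue" prompt, while the describe_008 output is the sort of thing
the fixed field boxes could display so that 4 blanks and no data at all look
different.
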
Thanks,
George Duimovich
NRCan Library / Bibliothèque de RNCan