Following up on some discussion on IRC yesterday (dbs/drycona), we looked at 
how many of our records have bad 008 data. We found a lot of it: most came 
from one previous ILS, but there were also plenty of errors across all of our 
tcn_sources.  We'll explore what automated fix-ups are possible. 

Has anybody given any thought as to how we might tame some of the madness of 
MARC 008's "space as data" (aka "character position") and related data 
integrity challenges?  

What I'm curious about is whether there are any practical approaches that 
could be taken within the MARC Editor to minimize coding errors in the 008 
and/or to highlight or validate these errors while editing. 

For example:

* When viewing an empty fixed field element (say "Date1"), there's no 
immediate visual way that I can see to tell whether 4 positions are already 
reserved as 'empty data', or whether it's 3 spaces, or none. Would it be 
helpful to have better identification of "space as data" in the fixed field 
section? If so, how do we do it without cluttering things up? E.g. should 
empty spaces be styled with a different background colour, or is there some 
other visual technique for highlighting empty character positions as well as 
errors? 

* Right now, the 008 is editable in two places (in the fixed field editor box, 
and also in the 008 line item). It's hard to know how many errors are 
introduced through the awkwardness of direct edits to the 008 row, but I 
wonder whether protecting the 008 row would help us at all (some ILSes 
basically hide the 008 in favour of 'guided' or input box data entry, etc.).

* Or maybe just add some simplified 008 validation (say against char_length) 
so that on saving a record you're given an error message ("Invalid 008 entry" 
or whatever), with an option to ignore the errors, save and continue?
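
To make the last idea (and the "visible blanks" idea) a bit more concrete, 
here is a rough sketch in Python. It is not Evergreen code: the names 
validate_008 and show_blanks are made up for illustration, and the only rules 
it assumes are that a bibliographic 008 is 40 positions long (00-39), that 
00-05 is a six-digit date, and that Date1 (07-10) and Date2 (11-14) hold 
digits, blanks, "u" or the fill character "|". A real check would need the 
full per-material-type rules.

```python
# Illustrative sketch only; validate_008 and show_blanks are hypothetical names.
EXPECTED_LEN = 40          # bibliographic 008 has character positions 00-39
DIGITS = "0123456789"
DATE_CHARS = DIGITS + " u|"  # digit, blank, unknown, fill (assumed rule set)


def validate_008(value):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(value) != EXPECTED_LEN:
        problems.append(f"length is {len(value)}, expected {EXPECTED_LEN}")
        return problems  # positions can't be trusted once the length is off
    if not all(c in DIGITS for c in value[0:6]):
        problems.append("positions 00-05 (date entered) should be six digits")
    for name, start in (("Date1", 7), ("Date2", 11)):
        chunk = value[start:start + 4]
        if not all(c in DATE_CHARS for c in chunk):
            problems.append(f"{name} has unexpected characters: {chunk!r}")
    return problems


def show_blanks(value, marker="\u00b7"):
    """Replace each blank with a visible marker so empty positions can be counted."""
    return value.replace(" ", marker)


# Build a 40-character example piece by piece to keep the positions clear.
example = "850101" + "s" + "1985" + " " * 4 + "onc" + " " * 17 + "eng" + " " + "d"
print(validate_008(example))        # no problems expected for this example
print(validate_008(example[:-3]))   # truncated value: length problem reported
print(show_blanks(example[6:15]))   # date type, Date1 and Date2, blanks visible
```

On save, the editor could show whatever validate_008 returns and still offer 
"ignore and save". Something like show_blanks (or the same idea done with 
styling rather than a substitute character) would let a cataloguer see at a 
glance that an empty Date2 really is four blanks and not three.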

Thanks,

George Duimovich
NRCan Library / Bibliothèque de RNCan


