Hi Ed,

> I think this is a great idea. At first I was thinking that it would be
> nice to be able to pass your normalize() function a MARC::Record object,
> which would magically normalize all the relevant fields (like a good
> cataloger). This could be a subclass MARC::Record::NACO which adds a new
> method normalize(), or if Andy was willing could be added to the
> MARC::Record core.
>
> However, the docs [1] seem to say that it is only possible to determine
> how a field should normalize in the context of the collection of records
> that it is a part of...and that MARC::Record has no way of determining
> this, so perhaps this idea is not on target?

Okay, I think you're right that subclassing MARC::Record isn't going to cut
the mustard, since MARC::Batch would still not pick it up (thus it isn't
exactly a drop-in replacement, which would be ideal).

> If you would like to contribute your NACO normalization function to cpan
> (as I definitely think you should), and my reading of the lc docs are
> correct, then I would recommend you add a Text::NACO module. The
> Normalize part is a bit redundant because all the modules in Text do some
> kind of normalization. The package could export a function normalize() on
> demand, which you then pass a string, and get back the NACO normalized
> version. You could also add it to the Biblio namespace as Biblio::NACO, or
> MARC::NACO, but that's really your call as the module author :) The main
> thing is to get it up there somewhere.

What I'm now envisioning is a module, still called MARC::Record::NACO,
which is not a subclass, but would export two functions on demand,
normalize() and compare().

---

* normalize()

inputs: either a MARC::Record object or a string. This should probably
accept an arbitrary number of inputs so you can do

    my @normrecs = normalize( @records );

rather than

    my @normrecs;
    foreach my $rec ( @records ) {
        push @normrecs, normalize( $rec );
    }

But you still could if you wanted to.

Given a M::R object it would do as the rules state [1] for the appropriate
fields in the record. Returns a M::R object.

Given a string, it would apply the string normalization rules. Returns a
string.

* compare()

inputs: either two M::R objects or two strings.

Given two M::R objects, both are normalize()'ed. It would return false (or
should it be true?) if, based on the rules [1], some field in $a matches
some field in $b.

Given two strings, both are again normalize()'ed and a simple "cmp" is
performed.

---

It sucks that given different inputs the results returned are a bit
inconsistent.
However, there's no way to say that $a > $b for a M::R (is there? :). One
might want to be able to sort normalized strings, so it makes sense that
compare()'ing two strings does a "cmp".

How's that sound?

-Brian Cassidy ( [EMAIL PROTECTED] )

[1] http://lcweb.loc.gov/catdir/pcc/naco/normrule.html
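
For illustration only, here is a rough sketch of how the string half of the proposed interface might behave. Everything in it is an assumption: the helper names _normalize_string() and _normalize_one() are hypothetical, and the "rules" are a deliberately simplified stand-in (uppercase, punctuation to blanks, collapsed whitespace), not the actual NACO rules in [1]. MARC::Record handling is left out entirely.

```perl
use strict;
use warnings;
use Scalar::Util qw( blessed );

# Hypothetical stand-in for the string rules: uppercase, turn
# punctuation into blanks, collapse runs of whitespace, trim the ends.
# The real NACO rules [1] are considerably more detailed.
sub _normalize_string {
    my $string = shift;
    $string = uc $string;
    $string =~ s/[^A-Z0-9\s]/ /g;
    $string =~ s/\s+/ /g;
    $string =~ s/^ | $//g;
    return $string;
}

# Dispatch on the kind of input. Objects (e.g. a MARC::Record) are
# rejected here because record handling is not part of this sketch.
sub _normalize_one {
    my $input = shift;
    if ( blessed( $input ) ) {
        die "record handling is left out of this sketch\n";
    }
    return _normalize_string( $input );
}

# Accepts any number of inputs and returns one result per input
# (or just the first result when called in scalar context).
sub normalize {
    my @normalized = map { _normalize_one( $_ ) } @_;
    return wantarray ? @normalized : $normalized[ 0 ];
}

# String form only: normalize both sides, then a plain "cmp".
sub compare {
    my ( $left, $right ) = @_;
    return normalize( $left ) cmp normalize( $right );
}
```

With something along these lines, sort { compare( $a, $b ) } @strings would order strings by their normalized form, which is the case for having the string version of compare() return a "cmp" result.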