This is related to my previous post (9/17/2015) about deleting 035 fields after RDA-ification. Jon Gorman solved that one for me by pointing out that I probably had a problem with my perl libraries.
But now, instead of creating the record from the database and writing it back to the database, I am reading from a file exported from my database, which is UTF-8, and I have an encoding problem again. Specifically, it's the blasted copyright symbol again.

As stored in the database, the copyright symbol is encoded as C2 A9, which, if I read the tables correctly, is the correct UTF-8 encoding for copyright. But when I read the record from a file and write it back to a file after deleting the problematic 035, the encoding for the copyright symbol has been turned into A9. This "transformation" happens both when running the Perl program on my PC and on the Unix server.

Interestingly, complicated Unicode seems to be okay. I took a record with Hebrew vernacular characters and edited it using my program, then ran the source record and target record through xxd. I then diffed the two dumps; diff showed no difference. But for the record that has the copyright symbol, the before and after differ: the copyright symbol is munged by stripping the C2.

Here's the program. If anybody can tell me what I'm doing wrong I'd really appreciate it.

----------------------------------------------------------------------------------------------------------

    use strict;
    use warnings;
    use MARC::Record;
    use MARC::Batch;

    my $infile='4788022.bib';
    my $batch = MARC::Batch->new('USMARC',"$infile");
    my $outfile='4788022.edited.bib';
    open(OUTPUT, ">$outfile");

    while (my $record = $batch->next) {
        my $f001 = $record->field('001');
        my $bib_id = $f001->as_string();
        my @a035 = $record->field('035');
        foreach my $f035 (@a035) {
            if (my $f035a = $f035->subfield('a')) {
                if ($f035a eq $bib_id) {
                    $record->delete_field($f035);
                }
            }
        }
        print OUTPUT $record->as_usmarc();
    }

Anne L. Highsmith
Director, Consortia Systems
TAMU Libraries
5000 TAMU
College Station, TX 77843-5000
979 862 4234
hism...@tamu.edu
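
For reference, the two byte sequences described in this post can be reproduced with a minimal sketch like the one below. It is not part of the program above and does not touch MARC records. It assumes only Perl's core Encode module, and hex_bytes is a hypothetical helper name introduced for illustration. It shows that U+00A9 serializes as C2 A9 in UTF-8 and as the single byte A9 in ISO-8859-1 (Latin-1), which matches the "stripped C2" seen in the xxd output.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated uppercase hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# U+00A9 COPYRIGHT SIGN as a Perl character string (one character).
my $copyright = "\x{A9}";

# Serialize the same character under two different encodings.
my $utf8   = hex_bytes(encode('UTF-8',      $copyright));
my $latin1 = hex_bytes(encode('ISO-8859-1', $copyright));

print "UTF-8:      $utf8\n";    # prints "UTF-8:      C2 A9"
print "ISO-8859-1: $latin1\n";  # prints "ISO-8859-1: A9"
```

Comparing these against the xxd dump of the edited file is one way to confirm which encoding the copyright symbol ended up in.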