Eric-- I'm with Leif. The output you got looks like utf-8 displayed on a terminal that doesn't support it. Whether you need to fix the terminal display is another matter--I've never felt compelled to do so.
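If you'd rather not depend on the terminal at all, one option is to look at the bytes themselves. Below is a minimal sketch using only the core Encode module. The helper names is_valid_utf8 and hex_bytes are made up for illustration, and the sample strings are hard-coded rather than read from und.marc.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true (1) if $bytes is well-formed UTF-8, else 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its argument when a CHECK is given
    my $ok = eval { Encode::decode( 'UTF-8', $copy, Encode::FB_CROAK() ); 1 };
    return $ok ? 1 : 0;
}

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

my $utf8    = "confr\xc3\xa9rie";                # e-acute as the UTF-8 bytes c3 a9
my $latin1  = "confr\xe9rie";                    # e-acute as the single Latin-1 byte e9
my $doubled = Encode::encode( 'UTF-8', $utf8 );  # UTF-8 bytes encoded a second time

for my $sample ( [ 'utf-8', $utf8 ], [ 'latin-1', $latin1 ], [ 'doubled', $doubled ] ) {
    my ( $label, $bytes ) = @$sample;
    printf "%-8s valid=%d bytes=%s\n", $label, is_valid_utf8($bytes), hex_bytes($bytes);
}
```

Note that doubly encoded text still passes the validity check, so the hex dump is the more telling part: a correctly encoded e-acute is c3 a9, while a doubly encoded one is c3 83 c2 a9.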
Anyway, I think you can now sign yourself "Eric Did-it-right-the-first-time Morgan!"

Mike

> -----Original Message-----
> From: Leif Andersson [mailto:leif.anders...@sub.su.se]
> Sent: Tuesday, March 26, 2013 5:57 PM
> To: Eric Lease Morgan; perl4lib@perl.org
> Subject: Re: reading and writing of utf-8 with marc::batch
>
> Hi Eric,
>
> my first guess would be your terminal is not utf8.
> If you comment out
>   #binmode( STDOUT, ":utf8" );
> and that does the trick, then you can start looking for how to change
> your terminal settings.
> (And that can sometimes be a rather frustrating task, I'm afraid)
>
> /Leif Andersson
> Stockholm UL
> ________________________________________
> From: Eric Lease Morgan [emor...@nd.edu]
> Sent: 26 March 2013 21:22
> To: perl4lib@perl.org
> Subject: reading and writing of utf-8 with marc::batch
>
> For the life of me I can't figure out how to do reading and writing of
> UTF-8 with MARC::Batch.
>
> I have a UTF-8 encoded file of MARC records. Dumping the records and
> grepping for a particular string illustrates the validity:
>
>   $ marcdump und.marc | grep Sainte-Face
>   und.marc
>   1000 records
>   2000 records
>   3000 records
>   4000 records
>   5000 records
>   6000 records
>   7000 records
>   8000 records
>   9000 records
>   10000 records
>   11000 records
>   12000 records
>   245 00 _aAnnales de l'Archiconfrérie de la Sainte-Face
>   610 20 _aArchiconfrérie de la Sainte-Face
>   13000 records
>   $
>
> I then run a Perl script that simply reads each record and dumps it to
> STDOUT. Notice how I define both my input and output as UTF-8:
>
>   #!/shared/perl/current/bin/perl
>
>   # configure
>   use constant MARC => './und.marc';
>
>   # require
>   use strict;
>   use MARC::Batch;
>
>   # initialize
>   binmode ( MARC, ":utf8" );
>   my $batch = MARC::Batch->new( 'USMARC', MARC );
>   $batch->strict_off;
>   $batch->warnings_off;
>   binmode( STDOUT, ":utf8" );
>
>   # read & write
>   while ( my $marc = $batch->next ) { print $marc->as_usmarc }
>
>   # done
>   exit;
>
> But my output is munged:
>
>   $ ./marc.pl > und.mrc
>   $ marcdump und.mrc | grep Sainte-Face
>   und.mrc
>   1000 records
>   2000 records
>   3000 records
>   4000 records
>   5000 records
>   6000 records
>   7000 records
>   8000 records
>   9000 records
>   10000 records
>   11000 records
>   12000 records
>   245 00 _aAnnales de l'Archiconfrérie de la Sainte-Face
>   610 _aArchiconfrérie de la Sainte-Face
>   13000 records
>   $
>
> What am I doing wrong!?
>
> --
> Eric Lease Morgan
> University of Notre Dame
>
> 574/631-8604
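
For what it's worth, here is a rough sketch of why commenting out that binmode line is a useful test. It assumes (and I haven't checked this against any particular MARC::Record release) that the string handed to print is already a string of UTF-8 encoded bytes; if so, a :utf8 layer on the output handle encodes those bytes a second time. The helpers write_through and hex_bytes are made-up names, and the in-memory filehandle merely stands in for STDOUT.

```perl
use strict;
use warnings;

# Hypothetical helper: print $bytes through the given PerlIO layer into an
# in-memory file and return the bytes that would have been written.
sub write_through {
    my ( $layer, $bytes ) = @_;
    my $buffer = '';
    open( my $fh, ">$layer", \$buffer ) or die "cannot open in-memory file: $!";
    print {$fh} $bytes;
    close($fh) or die "cannot close in-memory file: $!";
    return $buffer;
}

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

# Assumption: the data is already UTF-8 encoded, so e-acute is the bytes c3 a9.
my $encoded = "\xc3\xa9";

print 'raw:  ', hex_bytes( write_through( ':raw',  $encoded ) ), "\n";    # c3 a9
print 'utf8: ', hex_bytes( write_through( ':utf8', $encoded ) ), "\n";    # c3 83 c2 a9
```

If the bytes in und.mrc look like the second line, the output layer is the culprit; if they look like the first, the file is fine and only the display is off.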