Eric--

I'm with Leif. The output you got looks like UTF-8 displayed on a terminal that
doesn't support it. Whether you need to fix the terminal display is another
matter; I've never felt compelled to do so.
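
A rough way to tell a display problem from a damaged file is to look at the raw bytes instead of trusting what the terminal draws. The sketch below assumes a POSIX shell with od, tr, sed, and iconv available. hex_bytes is a hypothetical helper name, not part of any MARC tool, and the sample string is just a fragment of the title from this thread.

```shell
#!/bin/sh

# Hypothetical helper: print the bytes read from stdin as space-separated hex.
hex_bytes() {
    od -An -tx1 | tr -d '\n' | sed 's/^ *//; s/  */ /g; s/ *$//'
}

# "é" correctly encoded as UTF-8 is the two bytes c3 a9.
printf 'confr\303\251rie' | hex_bytes
echo

# If those bytes were mistaken for Latin-1 and encoded to UTF-8 a second time,
# the data itself would be damaged and "é" would become c3 83 c2 a9.
printf 'confr\303\251rie' | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes
echo

# The character set the current terminal session claims to use.
locale charmap
```

If the accented letters in the file come out as c3 a9, the file is fine and only the display is off. Piping a line of grep output through hex_bytes in place of the printf sample would be one way to check that on real data.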

Anyway, I think you can now sign yourself "Eric Did-it-right-the-first-time 
Morgan!"

Mike

> -----Original Message-----
> From: Leif Andersson [mailto:leif.anders...@sub.su.se]
> Sent: Tuesday, March 26, 2013 5:57 PM
> To: Eric Lease Morgan; perl4lib@perl.org
> Subject: Re: reading and writing of utf-8 with marc::batch
> 
> Hi Eric,
> 
> my first guess would be your terminal is not utf8.
> If you comment out
> #binmode( STDOUT, ":utf8" );
> and that does the trick, then you can start looking for how to change
> your terminal settings.
> (And that can sometimes be a rather frustrating task, I'm afraid)
> 
> /Leif Andersson
> Stockholm UL
> ________________________________________
> From: Eric Lease Morgan [emor...@nd.edu]
> Sent: March 26, 2013 21:22
> To: perl4lib@perl.org
> Subject: reading and writing of utf-8 with marc::batch
> 
> For the life of me I can't figure out how to do reading and writing of
> UTF-8 with MARC::Batch.
> 
> I have a UTF-8 encoded file of MARC records. Dumping the records and
> grepping for a particular string illustrates the validity:
> 
>   $ marcdump und.marc | grep Sainte-Face
>   und.marc
>   1000 records
>   2000 records
>   3000 records
>   4000 records
>   5000 records
>   6000 records
>   7000 records
>   8000 records
>   9000 records
>   10000 records
>   11000 records
>   12000 records
>   245 00 _aAnnales de l'Archiconfrérie de la Sainte-Face
>   610 20 _aArchiconfrérie de la Sainte-Face
>   13000 records
>   $
> 
> I then run a Perl script that simply reads each record and dumps it to
> STDOUT. Notice how I define both my input and output as UTF-8:
> 
>   #!/shared/perl/current/bin/perl
> 
>   # configure
>   use constant MARC => './und.marc';
> 
>   # require
>   use strict;
>   use MARC::Batch;
> 
>   # initialize
>   binmode ( MARC, ":utf8" );
>   my $batch = MARC::Batch->new( 'USMARC', MARC );
>   $batch->strict_off;
>   $batch->warnings_off;
>   binmode( STDOUT, ":utf8" );
> 
>   # read & write
>   while ( my $marc = $batch->next ) { print $marc->as_usmarc }
> 
>   # done
>   exit;
> 
> But my output is munged:
> 
>   $ ./marc.pl > und.mrc
>   $ marcdump und.mrc | grep Sainte-Face
>   und.mrc
>   1000 records
>   2000 records
>   3000 records
>   4000 records
>   5000 records
>   6000 records
>   7000 records
>   8000 records
>   9000 records
>   10000 records
>   11000 records
>   12000 records
>   245 00 _aAnnales de l'Archiconfrérie de la Sainte-Face
>   610    _aArchiconfrérie de la Sainte-Face
>   13000 records
>   $
> 
> What am I doing wrong!?
> 
> --
> Eric Lease Morgan
> University of Notre Dame
> 
> 574/631-8604