Eric,

Have you tried checking how MARC::Batch views the encoding?
e.g.

    # read & write
    while ( my $marc = $batch->next ) {
        print $marc->encoding();
        print $marc->as_usmarc;
    }

It is supposed to pick up the encoding from position 09 of the leader, but I am not sure this is totally reliable. If you know this is definitely a UTF-8 file you can manually set the encoding (but you shouldn't have to), e.g.

    # read & write
    while ( my $marc = $batch->next ) {
        $marc->encoding('UTF-8');
        print $marc->as_usmarc;
    }

regards
Alan

--
Alan Brown
Library Systems Liaison Officer
Bury Library Service
Resource Services
Textile Hall
Manchester Rd
Bury BL9 0DG
0161 253 5877
http://www.bury.gov.uk/libraries
http://library.bury.gov.uk

-----Original Message-----
From: Eric Lease Morgan [mailto:emor...@nd.edu]
Sent: 26 March 2013 20:22
To: perl4lib@perl.org
Subject: reading and writing of utf-8 with marc::batch

For the life of me I can't figure out how to do reading and writing of UTF-8 with MARC::Batch.

I have a UTF-8 encoded file of MARC records. Dumping the records and grepping for a particular string illustrates the validity:

    $ marcdump und.marc | grep Sainte-Face
    und.marc
    1000 records
    2000 records
    3000 records
    4000 records
    5000 records
    6000 records
    7000 records
    8000 records
    9000 records
    10000 records
    11000 records
    12000 records
    245 00 _aAnnales de l'Archiconfrérie de la Sainte-Face
    610 20 _aArchiconfrérie de la Sainte-Face
    13000 records
    $

I then run a Perl script that simply reads each record and dumps it to STDOUT.
Notice how I define both my input and output as UTF-8:

    #!/shared/perl/current/bin/perl

    # configure
    use constant MARC => './und.marc';

    # require
    use strict;
    use MARC::Batch;

    # initialize
    binmode ( MARC, ":utf8" );
    my $batch = MARC::Batch->new( 'USMARC', MARC );
    $batch->strict_off;
    $batch->warnings_off;
    binmode( STDOUT, ":utf8" );

    # read & write
    while ( my $marc = $batch->next ) { print $marc->as_usmarc }

    # done
    exit;

But my output is munged:

    $ ./marc.pl > und.mrc
    $ marcdump und.mrc | grep Sainte-Face
    und.mrc
    1000 records
    2000 records
    3000 records
    4000 records
    5000 records
    6000 records
    7000 records
    8000 records
    9000 records
    10000 records
    11000 records
    12000 records
    245 00 _aAnnales de l'Archiconfrérie de la Sainte-Face
    610 _aArchiconfrérie de la Sainte-Face
    13000 records
    $

What am I doing wrong!?

--
Eric Lease Morgan
University of Notre Dame
574/631-8604
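The reply's point about position 09 of the leader can also be checked independently of MARC::Batch. The following is only a rough sketch in core Perl, not anything taken from the thread: the helper names leader_coding_scheme and describe_coding_scheme are made up for illustration, and it assumes the file is plain MARC 21 transmission format, where each record ends with the 0x1D record terminator and Leader/09 is 'a' for UCS/Unicode or a blank for MARC-8. It reads the file as raw bytes, so no I/O layer gets a chance to re-encode anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the Leader/09 byte of one raw MARC 21 record,
# or undef if the record is too short to contain a 24-byte leader.
sub leader_coding_scheme {
    my ($raw) = @_;
    return undef if !defined $raw || length($raw) < 24;
    return substr( $raw, 9, 1 );
}

# Hypothetical helper: turn the Leader/09 byte into a readable label.
sub describe_coding_scheme {
    my ($code) = @_;
    return 'unknown'     if !defined $code;
    return 'UCS/Unicode' if $code eq 'a';
    return 'MARC-8'      if $code eq ' ';
    return "unrecognised ($code)";
}

# When a file name is given, tally the coding scheme declared by each record.
my $file = shift @ARGV;
if ( defined $file ) {
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    local $/ = "\x1D";    # MARC 21 record terminator
    my %count;
    while ( my $raw = <$fh> ) {
        $count{ describe_coding_scheme( leader_coding_scheme($raw) ) }++;
    }
    close $fh;
    printf "%-24s %d\n", $_, $count{$_} for sort keys %count;
}
```

Run against the input file (for example `perl leader09.pl und.marc`, where leader09.pl is just a placeholder name for the script), it prints how many records declare each scheme. If records holding UTF-8 data turn out to declare MARC-8, that would be one reason to set the encoding by hand as suggested in the reply.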