Re: Opening & writing to UTF-8 files; copyright symbol again -- solution
On Fri, Nov 13, 2015 at 10:05:01PM +, Highsmith, Anne L wrote:
> I should probably say, "apparent solution" 'cause character set issues never
> seem to end.
>
> However, combining Jon Gorman's recommendation with some Googling, I get:
>
> my $outfile='4788022.edited.bib';
> open (my $output_marc, '>', $outfile) or die "Couldn't open file $!" ;
> binmode($output_marc, ':utf8');
>

You can set the correct encoding succinctly on opening files e.g.

open my $fh, '>:encoding(UTF-8)', $outfile

Hope that helps
C.

--
Colin Campbell
Chief Software Engineer, PTFS Europe Limited
Content Management and Library Solutions
+44 (0) 800 756 6803 (phone)
+44 (0) 7759 633626 (mobile)
colin.campb...@ptfs-europe.com
skype: colin_campbell2
http://www.ptfs-europe.com
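A minimal sketch of the one-step open suggested above, using only core Perl. The file name and the sample text are made up for illustration; the read-back with the :raw layer is just one way to check that the copyright symbol reached the file as the two UTF-8 bytes C2 A9.

```perl
use strict;
use warnings;

my $outfile = 'example.utf8.txt';    # hypothetical file name

# Attach the encoding layer in the open call itself, so no separate
# binmode() call is needed.
open my $fh, '>:encoding(UTF-8)', $outfile
    or die "Couldn't open $outfile: $!";
print {$fh} "Copyright \x{00A9} 2015\n";
close $fh or die "Couldn't close $outfile: $!";

# Read the raw bytes back to see what was actually written.
open my $in, '<:raw', $outfile
    or die "Couldn't open $outfile: $!";
my $bytes = do { local $/; <$in> };
close $in;

printf "%d bytes written\n", length $bytes;
```

Unlike the bare :utf8 layer, :encoding(UTF-8) also checks that the data is valid UTF-8, which is why it is usually the safer choice.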
Re: printing UTF-8 encoded MARC records with as_usmarc
On Tue, Jul 31, 2012 at 09:25:55AM -0400, Smith,Devon wrote:
> I just recently came across this presentation which lays out pretty much all
> the issues with Unicode in perl, and makes some recommendations for best
> practices. You may find some general insight into the whole situation by
> going over it.

In the course of preparing the latest edition of the Camel book, Tom Christiansen created a Perl Unicode Cookbook; see
http://www.perl.com/pub/2012/04/perlunicook-standard-preamble.html

It's available in a few different places on the web.

C.

--
Colin Campbell
Re: Turning MARC record object back into a string
On Mon, Mar 12, 2012 at 01:49:36PM +, Anne Highsmith wrote:
> Can anybody think of any reason why I shouldn't just use
> my $new_marc = $bib_rec->as_usmarc();
> And if you can, can you suggest an alternative way to correctly turn the
> object into a string?

No, that's the way the interface should work. I think your use of sprintf is mistaken. sprintf takes a format string and a list of variables and formats the variables according to the format string. Use of sprintf and printf is often a danger sign if they turn up in perl code.

What your code was doing was taking the string returned by the ->as_usmarc method and using it as the format string for an empty list of variables. As soon as a '%' cropped up in a record, the fact that the list was empty became a problem.

Your only potential problems with the MARC string arise if you want to pass it to some interface that will get upset at MARC's embedded control characters or at the character set used in the record. But I suspect that's not a problem, because sprintf would not have changed that.

Colin

--
Colin Campbell
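The failure described above can be reproduced without MARC::Record at all. In this sketch the string is a made-up stand-in for whatever as_usmarc() returns; the only point is that it contains a '%' sequence that sprintf takes for a conversion.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the string returned by as_usmarc().
my $marc_string = "Sales rose 10%s this year";

# Mistaken: the data itself is used as the format string, so "%s" is
# treated as a conversion with no argument to fill it.
my $mangled = do {
    no warnings;    # silence the "Missing argument" warning for the demo
    sprintf $marc_string;
};

# Safe: no formatting is needed, so keep the string exactly as returned.
my $intact = $marc_string;

print "mangled: $mangled\n";
print "intact:  $intact\n";
```

If a format really is wanted, passing the data as an argument, as in sprintf('%s', $marc_string), keeps any '%' in the record from being interpreted.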
Re: Bug in MARC::Record ?
On 05/01/11 10:21, Paul Poulain wrote:
> Hello,
> replace
> use Carp qw(croak);
> by
> use Carp qw(croak carp);
> and things are going much better.

More simply just

use Carp;

croak and carp (and confess) are exported by default.

Colin

--
Colin Campbell
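A short sketch of the plain import in use. The check_tag routine and its rules are hypothetical and exist only to give carp and croak something to report; they are not taken from MARC::Record.

```perl
use strict;
use warnings;
use Carp;    # carp, croak and confess are all exported by default

# Hypothetical validator, for illustration only.
sub check_tag {
    my ($tag) = @_;
    croak "Invalid tag '$tag'" unless $tag =~ /\A[0-9A-Za-z]{3}\z/;
    carp "Tag '$tag' is a control field" if $tag =~ /\A00/;
    return 1;
}

check_tag('245');                        # passes silently
my $ok = eval { check_tag('24'); 1 };    # too short, so croak dies
print "croak raised: $@" unless $ok;
```

Listing the names explicitly, as in qw(croak carp), also works; it only becomes a problem when a name that the code calls is left out of the list.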
Re: Moose based Perl library for MARC records
On 10/11/10 11:59, Bill Birthisel wrote:
> 3. There is not really much of a problem (for users) with long names. It
> appears MooseX:: is currently in common use on CPAN - so I would recommend
> MooseX::MARC. That appears to be the clearest choice and the one that fits
> the current naming patterns most closely.

MooseX is wrong. MooseX is the namespace for extensions to Moose, such as new Types. This module should be in the MARC namespace. Something like MARC::NG is called for.

Looking forward to using this. Frédéric, good idea.

Cheers
Colin

--
Colin Campbell
Re: MARC Records, XML, and encoding
Edward Summers wrote:
> On May 18, 2006, at 6:48 AM, Joshua Ferraro wrote:
>> Anyway, if anyone can shed some light on this I'd be grateful.
>
> I believe the data loss you are seeing is due to your source records--not to
> do with character translation.

Just a quick look, but I think in many cases the actual record length is 2 more than the length stated in the leader.

Cheers
Colin
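One way to look for that kind of mismatch is to compare leader positions 00-04 with the real length of the raw record. This is only a sketch: the function name is invented, the sample records are dummy data rather than valid MARC, and it assumes the record is held as a byte string (not decoded characters), since the leader counts bytes.

```perl
use strict;
use warnings;

# Difference between the actual length of a raw record and the length
# stated in leader positions 00-04. Zero means the two agree.
sub length_discrepancy {
    my ($raw) = @_;
    my $stated = substr $raw, 0, 5;
    die "Leader length is not numeric\n" unless $stated =~ /\A[0-9]{5}\z/;
    return length($raw) - $stated;
}

# Dummy payload standing in for the rest of a record, ending in the
# record terminator (0x1D).
my $body  = ('x' x 30) . "\x1D";
my $good  = sprintf('%05d', 5 + length($body)) . $body;
my $short = sprintf('%05d', 5 + length($body) - 2) . $body;

printf "good:  off by %d\n", length_discrepancy($good);
printf "short: off by %d\n", length_discrepancy($short);
```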
Re: Corrupt MARC records
Ron Davies wrote:
> Has anybody ever seen a MARC record where the order of the field data wasn't
> the same as that of the entries in the directory? I'm not questioning the
> logic of reading a record using the field lengths and offsets, just
> wondering if anybody had ever seen this occur in the wild. I never have.

I have, although I can't recall where it came from (it was some years ago). The problem was exacerbated because the program reading it assumed the directory and field sequence matched and was not flagging any errors. It was some time later that users spotted that some records were odd, and it took a while to trace it back to this cause.

Colin

--
Colin Campbell
Software Development Consultant
Sirsi Ltd
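Reading by the directory's lengths and offsets, as mentioned in the question, copes with this case. The sketch below uses core Perl only and a hand-built test record in which the 245 data is stored before the 001 data while the directory lists 001 first. The function name and the sample values are invented, and the record is assumed to be a byte string.

```perl
use strict;
use warnings;

my $FT = "\x1E";    # field terminator
my $RT = "\x1D";    # record terminator

# Return [tag, data] pairs in directory order, locating each field by
# the length and starting offset recorded in its directory entry.
sub fields_by_directory {
    my ($raw) = @_;
    my $base      = substr($raw, 12, 5) + 0;          # base address of data
    my $directory = substr($raw, 24, $base - 24 - 1); # without its terminator
    my @fields;
    while ($directory =~ /\G(.{3})(\d{4})(\d{5})/gs) {
        my ($tag, $len, $start) = ($1, $2, $3);
        my $data = substr($raw, $base + $start, $len);
        $data =~ s/\Q$FT\E\z//;                       # drop trailing terminator
        push @fields, [$tag, $data];
    }
    return @fields;
}

# Test record: data area holds 245 first, then 001.
my $f245      = "10\x1FaTitle" . $FT;
my $f001      = "id42" . $FT;
my $data      = $f245 . $f001;
my $directory = '001' . sprintf('%04d%05d', length($f001), length($f245))
              . '245' . sprintf('%04d%05d', length($f245), 0)
              . $FT;
my $base   = 24 + length($directory);
my $leader = sprintf('%05dnam a22%05d   4500', $base + length($data) + 1, $base);
my $record = $leader . $directory . $data . $RT;

printf "%s => %s\n", $_->[0], $_->[1] for fields_by_directory($record);
```

A reader that instead splits the data area on field terminators and pairs the pieces with directory entries in sequence would attach the 245 data to tag 001 here, which is the silent failure described above.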
Re: [patch] Accept # as Blank Indicator
On Tue, Nov 18, 2003 at 10:50:22PM -0600, Chuck Bearden wrote:
> On Tue, Nov 18, 2003 at 07:50:39PM -0600, Ed Summers wrote:
>> On Tue, Nov 18, 2003 at 08:11:39PM -0500, Morbus Iff wrote:
>>> MARC::Field->new('100','1','', a=>'Logan, Robert K.', d=>'1939-'),
>>> MARC::Field->new('100','1','#', a=>'Logan, Robert K.', d=>'1939-'),
>>
>> I don't like this. The # is used simply as a typographical convention in
>> LC's online docs. It has nothing to do with the actual content found in
>> MARC records.
>
> I think Ed is right. As I recall, OCLC used to use an underscore for blank
> indicator positions, but now they seem to be using the doodad represented
> in this image:

I'd second that. I've seen software that generated serious problems through confusing the typographical conventions with the data. Also, the fact that the space is enclosed in quotes is doing the same thing for the human reader that LOC's hash is doing (showing that it is a space, not a null string).

Colin

--
Colin Campbell
Technical Services Consultant
Sirsi Ltd
Re: Manually created records
On Wed, Oct 15, 2003 at 05:41:54PM +0300, Christoffer Landtman wrote:
> The character at position 24, which according to Ashley should be a field
> terminator instead of blank, did apparently not affect Zebra. I do not know
> if this is good or bad (Zebra too sloppy or Ashley too strict...=) but for
> the time being, it seems to work.

The first field terminator should be after the directory (which follows the label).

C.

--
Colin Campbell
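For anyone assembling records by hand, the layout can be sketched as follows: a 24-character label (leader), then the directory, then the first field terminator, then the field data, then the record terminator. The build_record helper, the fixed leader values ('nam a22', '4500') and the sample fields are assumptions chosen for illustration, and byte strings are assumed throughout.

```perl
use strict;
use warnings;

my $FT = "\x1E";    # field terminator
my $RT = "\x1D";    # record terminator

# Assemble a raw record from [tag, data] pairs.
sub build_record {
    my (@fields) = @_;
    my ($directory, $data) = ('', '');
    for my $field (@fields) {
        my ($tag, $content) = @$field;
        $content .= $FT;                      # every field ends with FT
        $directory .= $tag . sprintf('%04d%05d', length($content), length($data));
        $data .= $content;
    }
    $directory .= $FT;    # the first field terminator closes the directory
    my $base  = 24 + length($directory);      # base address of data
    my $total = $base + length($data) + 1;    # +1 for the record terminator
    my $leader = sprintf('%05dnam a22%05d   4500', $total, $base);
    return $leader . $directory . $data . $RT;
}

my $raw = build_record(['001', 'id42'], ['245', "10\x1FaTitle"]);

printf "record length:           %d\n", length $raw;
printf "position 24 starts with: %s\n", substr($raw, 24, 3);
printf "first field terminator:  offset %d\n", index($raw, $FT);
```

In a record built this way, position 24 holds the first character of the first directory entry, and the first field terminator sits one position before the base address of data.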