Re: Opening & writing to UTF-8 files; copyright symbol again -- solution

2015-11-16 Thread Colin Campbell
On Fri, Nov 13, 2015 at 10:05:01PM +, Highsmith, Anne L wrote:
> I should probably say, "apparent solution" 'cause character set issues never 
> seem to end.
> 
> However, combining Jon Gorman's recommendation with some Googling, I get:
> 
> my $outfile='4788022.edited.bib';
> open (my $output_marc, '>', $outfile) or die "Couldn't open file $!" ;
> binmode($output_marc, ':utf8');
> 
You can set the correct encoding succinctly when opening the file, e.g.
    open my $fh, '>:encoding(UTF-8)', $outfile
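
As a minimal sketch of that one-step form (the scratch file and the sample text are made up for illustration), the layer can go in the mode argument for both writing and reading:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; any output path would do.
my ($tmp_fh, $outfile) = tempfile(UNLINK => 1);
close $tmp_fh;

my $text = "Copyright \x{00A9} 2015";    # U+00A9 is the copyright sign

# Write: the :encoding(UTF-8) layer encodes characters to bytes on output.
open my $out, '>:encoding(UTF-8)', $outfile
    or die "Couldn't open $outfile for writing: $!";
print {$out} $text, "\n";
close $out or die "Couldn't close $outfile: $!";

# Read back through the same layer so the bytes are decoded again.
open my $in, '<:encoding(UTF-8)', $outfile
    or die "Couldn't open $outfile for reading: $!";
chomp(my $round_trip = <$in>);
close $in;

print "round trip ok\n" if $round_trip eq $text;
```

Unlike binmode with ':utf8', the ':encoding(UTF-8)' layer also checks that the data is valid UTF-8, which is one reason to prefer it.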

Hope that helps
C.

-- 
Colin Campbell
Chief Software Engineer,
PTFS Europe Limited
Content Management and Library Solutions
+44 (0) 800 756 6803 (phone)
+44 (0) 7759 633626  (mobile)
colin.campb...@ptfs-europe.com
skype: colin_campbell2

http://www.ptfs-europe.com


Re: printing UTF-8 encoded MARC records with as_usmarc

2012-08-01 Thread Colin Campbell
On Tue, Jul 31, 2012 at 09:25:55AM -0400, Smith,Devon wrote:
> I just recently came across this presentation which lays out pretty much all 
> the issues with Unicode in perl, and makes some recommendations for best 
> practices. You may find some general insight into the whole situation by 
> going over it.

In the course of preparing the latest edition of the Camel book, Tom
Christiansen created a Perl Unicode Cookbook; see
http://www.perl.com/pub/2012/04/perlunicook-standard-preamble.html

It's available in a few different places on the web.

C.



Re: Turning MARC record object back into a string

2012-03-14 Thread Colin Campbell
On Mon, Mar 12, 2012 at 01:49:36PM +, Anne Highsmith wrote:
 
> Can anybody think of any reason why I shouldn't just use my $new_marc = 
> $bib_rec->as_usmarc();. And if you can, can you suggest an alternative way 
> to correctly turn the object into a string?
 
No, that's the way the interface should work.
I think your use of sprintf is mistaken. sprintf takes a format string
and a list of variables, and formats the variables according to the
format string. Use of sprintf and printf is often a danger sign if they
turn up in perl code. What your code was doing was taking the string
returned by the ->as_usmarc method and using it as the format for an
empty list of variables; as soon as a '%' cropped up in a record, the
fact that the list was empty became a problem.

Your only potential problems with the marc string are if you want to
pass it to some interface that will get upset at MARC's embedded control
characters or the character set used in the record. But I suspect that's
not a problem because sprintf would not have changed that.
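
The sprintf problem can be sketched without MARC::Record at all. The string below is a made-up stand-in for what as_usmarc() returns; the point is only what happens when data containing '%' is used as the format:

```perl
use strict;
use warnings;

# Stand-in for the string returned by as_usmarc(); a real record
# would also contain MARC control characters.
my $marc_string = 'Sale: 100%solved';

# Mistaken: the record is used as the FORMAT, so "%s" is treated as a
# conversion with no argument to fill it.
my $mangled = do {
    no warnings;    # silence the "Missing argument" warning for the demo
    sprintf($marc_string);
};

# Correct: no formatting is needed, so just keep the string.
my $plain = $marc_string;

# If sprintf really is wanted, the data belongs in the argument list.
my $formatted = sprintf('%s', $marc_string);

print "mangled:   $mangled\n";
print "plain:     $plain\n";
print "formatted: $formatted\n";
```

Here the "%s" in the data is consumed as a conversion and replaced by an empty string, so the mangled copy silently loses characters, while the other two are unchanged.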

Colin



Re: Bug in MARC::Record ?

2011-01-05 Thread Colin Campbell
On 05/01/11 10:21, Paul Poulain wrote:
> Hello,
>
> replace
> use Carp qw(croak);
> by
> use Carp qw(croak carp);
> and things are going much better.

More simply, just

use Carp;

croak and carp (and confess) are exported by default.
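
A small sketch of the default exports in use. The check_tag helper is hypothetical and exists only to exercise carp and croak; it is not part of MARC::Record:

```perl
use strict;
use warnings;
use Carp;    # exports carp, croak and confess by default

# Hypothetical helper, only here to exercise the imported functions.
sub check_tag {
    my ($tag) = @_;
    croak 'No tag supplied' unless defined $tag;
    carp "Tag '$tag' is not three characters long" unless length($tag) == 3;
    return $tag;
}

# Collect warnings so the carp call can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

check_tag('24');                            # warns via carp
my $ok    = eval { check_tag(undef); 1 };   # dies via croak
my $error = $@;

print "warnings seen: ", scalar(@warnings), "\n";
print "croak message: $error";
```

Other Carp functions such as cluck are not exported by default and would still need to be named in the import list.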

Colin

-- 
Colin Campbell
Chief Software Engineer,
PTFS Europe Limited
Content Management and Library Solutions
+44 (0) 845 557 5634 (phone)
+44 (0) 7759 633626  (mobile)
colin.campb...@ptfs-europe.com
skype: colin_campbell2

http://www.ptfs-europe.com


Re: Moose based Perl library for MARC records

2010-11-10 Thread Colin Campbell
On 10/11/10 11:59, Bill Birthisel wrote:

> 3. There is not really much of a problem (for users) with long names. It
> appears MooseX:: is currently in common use on CPAN - so I would
> recommend MooseX::MARC. That appears to be the clearest choice and the
> one that fits the current naming patterns most closely.

MooseX is wrong. MooseX is the namespace for extensions to Moose such as
new Types. This module should be in the MARC namespace. Something like
MARC::NG is called for.
Looking forward to using this, Frédéric, good idea
Cheers
Colin


-- 
Colin Campbell
Chief Software Engineer,
PTFS Europe Limited
Content Management and Library Solutions
+44 (0) 208 366 1295 (phone)
+44 (0) 7759 633626  (mobile)
colin.campb...@ptfs-europe.com
skype: colin_campbell2

http://www.ptfs-europe.com


Re: MARC Records, XML, and encoding

2006-05-18 Thread Colin Campbell

Edward Summers wrote:

> On May 18, 2006, at 6:48 AM, Joshua Ferraro wrote:
>
> > Anyway, if anyone can shed some light on this I'd be grateful.
>
> I believe the data loss you are seeing is due to your source 
> records--not to do with character translation.

I've only had a quick look, but I think in many cases the actual record
length is 2 more than the length stated in the leader.
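
A rough way to test for this, assuming the raw record is held as a byte string: compare its real length with the five digits in leader positions 00-04. The length_discrepancy function and the tiny hand-built record are illustrative only:

```perl
use strict;
use warnings;

# Difference between a raw record's real length and the length stated
# in leader positions 00-04. Expects a byte string, not decoded text.
sub length_discrepancy {
    my ($raw) = @_;
    my $stated = substr $raw, 0, 5;
    die "Leader length '$stated' is not five digits\n"
        unless $stated =~ /\A[0-9]{5}\z/;
    return length($raw) - $stated;
}

# Tiny hand-built record with one 245 field, made up for illustration.
my $field     = "10\x1FaTitle\x1E";
my $directory = sprintf('245%04d%05d', length($field), 0) . "\x1E";
my $base      = 24 + length($directory);
my $body      = $directory . $field . "\x1D";
my $total     = 24 + length($body);

my $good = sprintf('%05dnam  22%05d   4500', $total, $base) . $body;
# Same record, but with a leader that understates the length by two.
my $bad  = sprintf('%05dnam  22%05d   4500', $total - 2, $base) . $body;

printf "good: %d, bad: %d\n",
    length_discrepancy($good), length_discrepancy($bad);
```

A non-zero result only says that the leader and the data disagree; it does not say which of the two is wrong.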


Cheers
Colin


Re: Corrupt MARC records

2005-05-11 Thread Colin Campbell
Ron Davies wrote:
> Has anybody ever seen a MARC record where the order of the field data 
> wasn't the same as that of the entries in the directory? I'm not 
> questioning the logic of reading a record using the field lengths and 
> offsets, just wondering if anybody had ever seen this occur in the wild. 
> I never have.

I have, although I can't recall where it came from (it was some years 
ago). The problem was exacerbated because the program reading it assumed 
the directory and field sequence matched and was not flagging any 
errors. It was some time later that users spotted that some records were 
odd, and it took a while to trace it back to this cause.
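
A sketch of the safer approach: fetch each field by the offset and length in its directory entry, and flag records whose data is not stored in directory order. The function names and the hand-built two-field record are made up for illustration:

```perl
use strict;
use warnings;

# Parse the directory of a raw record into [tag, length, offset] entries.
sub directory_entries {
    my ($raw) = @_;
    my $base = substr $raw, 12, 5;                 # leader 12-16: base address
    my $dir  = substr $raw, 24, $base - 24 - 1;    # drop the field terminator
    return map { [ unpack 'A3 A4 A5', $_ ] } unpack '(A12)*', $dir;
}

# Fetch each field by its directory offset and length, and report
# whether the data is stored in the same order as the directory.
sub read_by_directory {
    my ($raw) = @_;
    my $base     = substr $raw, 12, 5;
    my $expected = 0;    # offset the next field would have if in sequence
    my $in_order = 1;
    my @fields;
    for my $entry (directory_entries($raw)) {
        my ($tag, $len, $offset) = @{$entry};
        $in_order = 0 if $offset != $expected;
        $expected = $offset + $len;
        push @fields, [ $tag, substr($raw, $base + $offset, $len) ];
    }
    return ($in_order, @fields);
}

# Hand-built record: the directory lists 100 then 245, but the data
# is stored with the 245 field first.
my $f245      = "10\x1FaTitle\x1E";
my $f100      = "1 \x1FaName\x1E";
my $directory = sprintf('100%04d%05d', length($f100), length($f245))
              . sprintf('245%04d%05d', length($f245), 0) . "\x1E";
my $base      = 24 + length($directory);
my $body      = $directory . $f245 . $f100 . "\x1D";
my $raw = sprintf('%05dnam  22%05d   4500', 24 + length($body), $base) . $body;

my ($in_order, @fields) = read_by_directory($raw);
print $in_order ? "in order\n" : "data is NOT in directory order\n";
```

A reader that simply split the data on field terminators and paired the pieces with directory entries in sequence would attach the title to tag 100 here, which is the kind of silent error described above.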

Colin
--
Colin Campbell
Software Development Consultant
Sirsi Ltd


Re: [patch] Accept # as Blank Indicator

2003-11-19 Thread Colin Campbell
On Tue, Nov 18, 2003 at 10:50:22PM -0600, Chuck Bearden [EMAIL PROTECTED] wrote:
> On Tue, Nov 18, 2003 at 07:50:39PM -0600, Ed Summers wrote:
> > On Tue, Nov 18, 2003 at 08:11:39PM -0500, Morbus Iff wrote:
> > > MARC::Field->new('100','1','', a=>'Logan, Robert K.', d=>'1939-'),
> > > MARC::Field->new('100','1','#', a=>'Logan, Robert K.', d=>'1939-'),
> >
> > I don't like this. The # is used simply as a typographical convention in LC's
> > online docs. It has nothing to do with the actual content found in MARC
> > records.
>
> I think Ed is right.  As I recall, OCLC used to use an underscore for
> blank indicator positions, but now they seem to be using the doodad
> represented in this image:
 
I'd second that. I've seen software that generated serious problems
through confusing the typographical conventions with the data. Also,
the fact that the space is enclosed in quotes is doing the same thing
for the human reader that LOC's hash is doing (showing it is a space,
not a null string).
Colin

-- 
  Colin Campbell 
  Technical Services Consultant
  Sirsi Ltd
  [EMAIL PROTECTED]


Re: Manually created records

2003-10-15 Thread Colin Campbell
On Wed, Oct 15, 2003 at 05:41:54PM +0300, Christoffer Landtman [EMAIL PROTECTED] 
wrote:
> The character at position 24 which according to Ashley should be a field 
> terminator instead of blank, did apparently not affect Zebra. I do not 
> know if this is good or bad (Zebra too sloppy or Ashley too strict...=) 
> but for the time being, it seems to work.
 
The first field terminator should be after the directory (which follows
the label, i.e. the leader).
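
That layout can be checked with a short sketch: the first 0x1E byte should sit immediately before the base address of data given in label positions 12-16, after a whole number of 12-byte directory entries. The function name and both sample records are made up for illustration:

```perl
use strict;
use warnings;

# True if the first field terminator (0x1E) sits immediately before the
# base address of data, i.e. at the end of the directory.
sub directory_terminated_correctly {
    my ($raw) = @_;
    return 0 if length($raw) < 25;
    my $base = substr $raw, 12, 5;       # label/leader positions 12-16
    return 0 unless $base =~ /\A[0-9]{5}\z/;
    my $first_ft = index $raw, "\x1E";
    return ($first_ft == $base - 1 && ($first_ft - 24) % 12 == 0) ? 1 : 0;
}

# Hand-built record with a single 245 field.
my $field     = "10\x1FaTitle\x1E";
my $directory = sprintf('245%04d%05d', length($field), 0) . "\x1E";
my $base      = 24 + length($directory);
my $body      = $directory . $field . "\x1D";
my $good = sprintf('%05dnam  22%05d   4500', 24 + length($body), $base) . $body;

# Same record, but with a terminator wrongly placed in the last label byte.
my $bad = $good;
substr($bad, 23, 1) = "\x1E";

printf "good: %d, bad: %d\n",
    directory_terminated_correctly($good),
    directory_terminated_correctly($bad);
```

This only checks where the first terminator falls; it says nothing about whether the rest of the record is well formed.
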
C.

-- 
  Colin Campbell 
  Technical Services Consultant
  Sirsi Ltd
  [EMAIL PROTECTED]