Re: reading and writing of utf-8 with marc::batch [double encoding]

2013-03-28 Thread Ashley Sanders
tional/questions/qa-forms-utf-8 You'll need to add the MARC control characters ^_, ^^, and ^] to the ASCII part of the expression in the above page. (I think the w3c example is aimed at XML1.0 in which the MARC control characters are not allowed.) Ashley. -- Ashley Sanders a.sand...@manchester
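A minimal sketch of the check described above, assuming the expression meant is the commonly cited W3C pattern for well-formed UTF-8. The ASCII branch is widened with the three MARC delimiters: ^] (0x1D), ^^ (0x1E) and ^_ (0x1F). The names $utf8_char and is_valid_utf8 are illustrative and do not come from the original post.

```perl
use strict;
use warnings;

# Pattern for one well-formed UTF-8 character, applied to raw bytes.
# Assumption: the ASCII branch also admits the MARC delimiters
# \x1D (record terminator), \x1E (field terminator) and
# \x1F (subfield delimiter).
my $utf8_char = qr/
      [\x09\x0A\x0D\x1D\x1E\x1F\x20-\x7E]   # ASCII plus MARC delimiters
    | [\xC2-\xDF][\x80-\xBF]                # non-overlong 2-byte
    | \xE0[\xA0-\xBF][\x80-\xBF]            # 3-byte, excluding overlongs
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}     # straight 3-byte
    | \xED[\x80-\x9F][\x80-\xBF]            # 3-byte, excluding surrogates
    | \xF0[\x90-\xBF][\x80-\xBF]{2}         # planes 1 to 3
    | [\xF1-\xF3][\x80-\xBF]{3}             # planes 4 to 15
    | \xF4[\x80-\x8F][\x80-\xBF]{2}         # plane 16
/x;

# Hypothetical helper: returns 1 if the byte string is well-formed
# UTF-8 (with MARC delimiters allowed), otherwise 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return $bytes =~ /\A(?:$utf8_char)*\z/ ? 1 : 0;
}
```

The helper expects undecoded bytes, for example a record read from a filehandle opened with the :raw layer.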

Re: Identifying file formats for older files

2011-09-02 Thread Ashley Sanders
but the results weren't very good. >> >> Anyone have recommendations? I'd prefer it be in perl but it doesn't have to >> be. >> >> Arvin >> -- Ashley Sanders a.sand...@manchester.ac.uk Copac http://copac.ac.uk -- A Mimas service funded by JISC

Re: Invalid UTF-8 characters causing MARC::Record crash.

2011-05-17 Thread Ashley Sanders
-8 data mixed in with UTF-8. Or a MARC-8 record with the wrong leader info. Unfortunately bad UTF-8 is pretty common in my experience. You can use a regexp to check if something is valid utf-8: http://keithdevens.com/weblog/archive/2004/Jun/29/UTF-8.regex Then it's up to you to take appropriate

Re: Splitting a large file of MARC records into smaller files

2010-01-25 Thread Ashley Sanders
e MARC end-of-record characters into newlines. Then use the split command to carve up the output of tr into files of 1000 records. You then may have to use tr to convert the newlines back to MARC end-of-record characters. Ashley.
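For comparison, here is a rough Perl alternative to the tr and split pipeline described above. It is a different technique, not the one from the post: it sets the input record separator to the MARC end-of-record character (0x1D) and writes fixed-size chunks directly. The sub name split_marc and the output naming scheme are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: copy records from $in_path into chunk files
# named "$prefix.0000", "$prefix.0001", ... holding at most $per_file
# records each. Returns the number of chunk files written.
sub split_marc {
    my ($in_path, $prefix, $per_file) = @_;
    local $/ = "\x1D";    # read one MARC record per <$in>
    open my $in, '<:raw', $in_path or die "Cannot open $in_path: $!";
    my ($count, $chunk, $out) = (0, 0, undef);
    while (my $record = <$in>) {
        if ($count % $per_file == 0) {
            # Start a new chunk file every $per_file records.
            close $out if $out;
            my $name = sprintf '%s.%04d', $prefix, $chunk++;
            open $out, '>:raw', $name or die "Cannot open $name: $!";
        }
        print {$out} $record;
        $count++;
    }
    close $out if $out;
    close $in;
    return $chunk;
}
```

A call such as split_marc('all.mrc', 'part', 1000) would give files of 1000 records each; because the records are never converted to lines, no second pass is needed to restore the end-of-record characters.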

Re: Help for z39.50 search in proquest dissertations and theses

2008-02-19 Thread Ashley Sanders
struction, ie: $mgr = new Net::Z3950::Manager(async => 0, user => xxx, pass => yyy); or perhaps adding it here instead: $conn = new Net::Z3950::Connection($mgr, $host, $port, async => 0); Hope this helps, Regards, Ashley.

Re: Obtaining holdings data from library catalogs

2007-12-20 Thread Ashley Sanders
k the same MARC bib record that you are already getting, and attached to it, a series of holdings records either in MARC format or in a z39.50 specific format. How you access these holdings records will, again, vary depending on the client software you are using. Regards, Ashley. -- Ashley Sanders

Re: Character set tests [was MARC::Charset]

2007-03-15 Thread Ashley Sanders
appear somewhere. Ashley.

Re: MARC::Charset

2007-03-14 Thread Ashley Sanders
]|\xf0[cC]) which may be rather too simple. For a critical application I'd come up with something a bit better (after first eye-balling a load of records.) Just as an aside, I'm not using perl -- I'm using the Boost Regexp library for C++ (which is a good implementation of perl regexp

Re: MARC::Charset

2007-03-14 Thread Ashley Sanders
ring of text (which admittedly you don't tend to get in MARC records) that tests as UTF-8 is very unlikely to be anything else. Distinguishing Latin-1 from MARC-8 is a bit more like guesswork. As a test for MARC-8 I look for the common combining diacritics followed by a vowel. Regards, A
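A rough sketch of that MARC-8 heuristic. Which diacritics count as "common" is an assumption here: grave (E1), acute (E2), circumflex (E3), tilde (E4) and umlaut (E8) before a vowel, plus cedilla (F0) before c or C. The names $marc8_hint and looks_like_marc8 are illustrative only.

```perl
use strict;
use warnings;

# In MARC-8 a combining mark precedes the letter it modifies.
# Assumed "common" marks: E1 grave, E2 acute, E3 circumflex,
# E4 tilde, E8 umlaut before a vowel; F0 cedilla before c or C.
my $marc8_hint = qr/[\xE1-\xE4\xE8][aeiouAEIOU]|\xF0[cC]/;

# Hypothetical helper: returns 1 if the raw bytes contain a sequence
# that suggests MARC-8, otherwise 0. This is a guess, not a proof.
sub looks_like_marc8 {
    my ($bytes) = @_;
    return $bytes =~ $marc8_hint ? 1 : 0;
}
```

This only makes sense after the data has already failed a UTF-8 test, and Latin-1 text can still trigger it occasionally (for instance an e-grave directly followed by a vowel), so it is best treated as a hint.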

Re: Slowdown when fetching multiple records

2006-02-21 Thread Ashley Sanders
umber of records it will return in any one request, but I wouldn't have thought 20 records would cause any server a problem (unless they are very large records.)) Ashley.

Re: diff and sed

2005-04-04 Thread Ashley Sanders
) | ed - xx Ashley.

Re: option items sorted in pop-up menus

2005-03-10 Thread Ashley Sanders
a hash reference. I think you need to use the -labels option which does take a reference to a hash. The -labels hash lets you display one thing to the user and return another value to your script. Regards, Ashley.

Re: Ignoring Diacritics accessing Fixed Field Data

2005-01-13 Thread Ashley Sanders
t; by John Doe becomes Doe,Foo in our database. Which is a long-winded way of saying that a simple substr ($TITLE, 0, 4) may not be appropriate in all cases. Regards, Ashley.

Re: Character sets

2004-11-24 Thread Ashley Sanders
cord does not play nicely with Unicode (UTF8). http://rt.cpan.org/NoAuth/Bug.html?id=3707 It is possible they are MARC-8 characters rather than utf-8. In MARC-8 E5 is "macron" and F2 is "dot below." Is MARC::Record trying to treat them as Unicode when in fact they are MARC-

Re: adding a MARC tag called SYS

2004-08-03 Thread Ashley Sanders
MARC 21 specifications for record structure, character sets and exchange media" published in 2000 by the LoC. ISBN 0844410063. Ashley. -- Ashley Sanders [EMAIL PROTECTED] COPAC: A public bibliographic database from MIMAS, funded by JISC http://copac.ac.uk/ - [EMAIL PROTECTED]

Re: adding a MARC tag called SYS

2004-08-02 Thread Ashley Sanders
eroes." So by the above definitions a tag of "00A" is a control field whereas "SYS" is a data field. Regards, Ashley.
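The distinction can be sketched as a one-line test. This assumes, as the example above implies, that a control field is any three-character tag beginning with two zeroes and that every other tag names a data field. The sub name is_control_tag is illustrative, and this says nothing about how MARC::Record itself classifies tags.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the tag begins with two zeroes
# (a control field under the assumption stated above), otherwise 0.
# Assumption: tags are three ASCII letters or digits.
sub is_control_tag {
    my ($tag) = @_;
    return $tag =~ /\A00[0-9A-Za-z]\z/ ? 1 : 0;
}

# Example: classify a few tags, including the two from the post.
for my $tag (qw(001 00A 245 SYS)) {
    printf "%s is a %s field\n", $tag,
        is_control_tag($tag) ? 'control' : 'data';
}
```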

Re: Zeta Perl Opac Format

2004-04-27 Thread Ashley Sanders
cords or of a special Z39.50 defined format. Regards, Ashley.

Re: Adding non standard MARC subfields with MARC::Record

2004-04-05 Thread Ashley Sanders
and '=' or else weirdness will ensue. :) Just to point out that '?' and '=' are (amongst many other non-alphanumeric characters) explicitly allowed by MARC21 for use in local data elements. So they are standards-conforming really. Ashley.

Re: Manuall created records

2003-10-15 Thread Ashley Sanders
Ed, > Thanks for the details Ashley. The full details (my email had a couple of typos) are at: http://www.loc.gov/marc/bibliographic/ecbdldrd.html (I think the above page uses # to represent a space character.) Ashley. -- Ashley Sanders[EMAIL PROTEC

Re: Manuall created records

2003-10-15 Thread Ashley Sanders
Test Author00aTest Title If an application such as zebra is doing things correctly, then it has every right to think the record is bad if it sees these errors. Of course, it may be something else completely. Regards, Ashley. -- Ashley Sanders[EMAIL PROTEC

Re: newbie question: isbn -> marc?

2003-08-22 Thread Ashley Sanders
'isbn' => $isbn);
> foreach my $i (1..$rs->count) {
>     $book = $rs->match($i);
>     print $book->title_proper, "\n";
>     ... other MARC::Record operations here ...
> }

Have you seen the perl binding of Zoom; an easy to u