
Re: [CODE4LIB] more on MARC char encoding

2012-04-26 Thread Joe Atzberger

mobile # do...@uta.edu # http://rocky.uta.edu/doran/ -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Deng, Sai Sent: Friday, April 20, 2012 8:55 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-20 Thread Andrew Cunningham

@LISTSERV.ND.EDU] On Behalf Of Robert Haschart Sent: Thursday, April 19, 2012 2:23 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21 On 4/18/2012 12:08 PM, Jonathan Rochkind wrote: On 4/18/2012 11:09 AM, Doran, Michael D wrote

Re: [CODE4LIB] more on MARC char encoding

2012-04-20 Thread Deng, Sai

, 2012 2:14 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding Ah, thanks Terry. That canned cleaner in MarcEdit sounds potentially useful -- I'm in a continuing battle to keep the character encoding in our local marc corpus clean. (The real blame here is on cataloger

Re: [CODE4LIB] more on MARC char encoding

2012-04-20 Thread Reese, Terry

outside of the general smart quote issue. --TR -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Deng, Sai Sent: Friday, April 20, 2012 6:55 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding If a canned cleaner can

Re: [CODE4LIB] more on MARC char encoding

2012-04-20 Thread Doran, Michael D

# http://rocky.uta.edu/doran/ -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Deng, Sai Sent: Friday, April 20, 2012 8:55 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding If a canned cleaner can be added

Re: [CODE4LIB] more on MARC char encoding

2012-04-19 Thread Deng, Sai

[mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Tod Olson Sent: Tuesday, April 17, 2012 10:13 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21 In practice it seems to mean UTF-8. At least I've only seen UTF-8, and I can't

Re: [CODE4LIB] more on MARC char encoding

2012-04-19 Thread Reese, Terry

AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding If your records are really in MARC8 not UTF8, your best bet is to use a tool to convert them to UTF8 before hitting your XSLT. The open source 'yaz' command line tools can do it for Marc21. The Marc4J package can

Re: [CODE4LIB] more on MARC char encoding

2012-04-19 Thread Jonathan Rochkind

Sent: Thursday, April 19, 2012 11:13 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding If your records are really in MARC8 not UTF8, your best bet is to use a tool to convert them to UTF8 before hitting your XSLT. The open source 'yaz' command line tools can do

Re: [CODE4LIB] more on MARC char encoding

2012-04-19 Thread LeVan,Ralph

quotes/values. --TR -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind Sent: Thursday, April 19, 2012 11:13 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding If your records are really

Re: [CODE4LIB] more on MARC char encoding

2012-04-19 Thread Jonathan Rochkind

On 4/19/2012 3:23 PM, LeVan,Ralph wrote: We see Unicode data pasted into MARC8 records all the time. It happens enough that my MARC8-Unicode converter takes a second look at illegal MARC8 bytes and tries a UTF-8 encoding as well. Right. I see it too. I'm arguing that means cataloger entry
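The "second look" fallback described above could be approximated as follows. This is only a sketch, not the converter mentioned in the message: the function name `looks_like_utf8` is hypothetical, and it assumes the suspect bytes have already been isolated from the field.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Heuristic: True if data contains non-ASCII bytes and is strictly valid UTF-8."""
    if all(b < 0x80 for b in data):
        # Plain ASCII reads the same in the default MARC-8 set and in UTF-8.
        return False
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


pasted = "\u00f1".encode("utf-8")  # n with tilde pasted in as UTF-8: C3 B1
marc8 = b"\xe4n"                   # MARC-8 (ANSEL): combining tilde E4, then base letter n

print(looks_like_utf8(pasted), looks_like_utf8(marc8))
```

The MARC-8 sequence fails the check because E4 would have to be followed by UTF-8 continuation bytes, and an ASCII base letter is not one. A heuristic like this can still misfire on short strings, so it is better suited to flagging records for review than to silent rewriting.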

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-19 Thread Robert Haschart

On 4/18/2012 12:08 PM, Jonathan Rochkind wrote: On 4/18/2012 11:09 AM, Doran, Michael D wrote: I don't believe that is the case. Take UTF-8 out of the picture, and consider the MARC-8 character set with its escape sequences and combining characters. A character such as an n with a tilde
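The snippet is cut off, but the point that one character can occupy more than one byte, with or without UTF-8, can be illustrated with a short sketch. The MARC-8 byte values here are an assumption based on the ANSEL layout, where the combining tilde (E4) comes before the base letter; the Unicode forms come from Python's standard `unicodedata` module.

```python
import unicodedata

precomposed = "\u00f1"                                  # LATIN SMALL LETTER N WITH TILDE
decomposed = unicodedata.normalize("NFD", precomposed)  # "n" followed by U+0303 COMBINING TILDE
marc8 = b"\xe4n"                                        # assumed ANSEL bytes: tilde first, then n

for label, raw in [
    ("UTF-8, precomposed", precomposed.encode("utf-8")),
    ("UTF-8, decomposed", decomposed.encode("utf-8")),
    ("MARC-8", marc8),
]:
    # Print the byte count and the hex bytes for each representation.
    print(label, len(raw), raw.hex())
```

In all three cases the single visible character takes at least two octets, so field lengths and offsets counted in characters would not match lengths counted in bytes.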

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Tod Olson

It has to mean UTF-8. ISO 2709 is very byte-oriented, from the directory structure to the byte-offsets in the fixed fields. The values in these places all assume 8-bit character data; it's completely baked into the file format. -Tod On Apr 17, 2012, at 6:55 PM, Jonathan Rochkind wrote:
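To make the byte-oriented layout concrete, here is a minimal sketch that builds and re-reads a one-field MARC21-style record. The helper names `build_record` and `read_fields` are hypothetical, and the leader values are placeholder choices (position 9 is set to `a` to mark Unicode); it is not a full ISO 2709 implementation.

```python
FT, RT, SF = b"\x1e", b"\x1d", b"\x1f"  # field terminator, record terminator, subfield delimiter


def build_record(fields):
    """Build a minimal MARC21-style record from (tag, data_bytes) pairs."""
    directory, body = b"", b""
    for tag, data in fields:
        data += FT
        # Directory entry: 3-byte tag, 4-digit field length, 5-digit start offset, all in bytes.
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    directory += FT
    base = 24 + len(directory)          # base address of data (leader is 24 bytes)
    total = base + len(body) + len(RT)  # record length in bytes
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + body + RT


def read_fields(record):
    """Walk the directory and slice each field out by byte offset."""
    base = int(record[12:17].decode("ascii"))
    directory = record[24:base - 1]     # exclude the directory's own field terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:])
        fields.append((tag, record[base + start:base + start + length]))
    return fields


title = "A\u00f1o".encode("utf-8")      # 3 characters, 4 bytes
record = build_record([("245", b"10" + SF + b"a" + title)])
tag, raw = read_fields(record)[0]
print(tag, len(raw), raw.rstrip(FT)[4:].decode("utf-8"))
```

Every length and offset in the leader and directory is a count of bytes, which is why the three-character title contributes four bytes to the field length.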

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Peter Noerr

for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Bill Dueber Sent: Tuesday, April 17, 2012 5:50 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21 On Tue, Apr 17, 2012 at 8:46 PM, Simon Spero sesunc

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Jonathan Rochkind

On 4/18/2012 6:04 AM, Tod Olson wrote: It has to mean UTF-8. ISO 2709 is very byte-oriented, from the directory structure to the byte-offsets in the fixed fields. The values in these places all assume 8-bit character data; it's completely baked into the file format. I'm not sure that

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Doran, Michael D

-5326 office # 817-688-1926 mobile # do...@uta.edu # http://rocky.uta.edu/doran/ -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Tod Olson Sent: Wednesday, April 18, 2012 5:04 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread LeVan,Ralph

In fact, I worry that the standard may pre-date UTF-8, with its reference to UCS --- if I understand things right, at one point there was only one Unicode encoding, called UCS, which is basically a backwards-compatible subset of what became UTF-16. So I worry the standard really means

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Karen Coyle

UTF-8 was the marc standard from the beginning: http://www.loc.gov/marc/marbi/1998/98-18.html The first proposals were a character mapping between Unicode and MARC-8 and didn't mention the character encodings, thus the term UCS which was a common term for Unicode at that time. (see:

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Huwig,Steve

. -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Doran, Michael D Sent: Wednesday, April 18, 2012 10:05 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21 Hi Tod, I'm

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Doran, Michael D

/ -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Huwig,Steve Sent: Wednesday, April 18, 2012 9:21 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21 I could be mistaken (never having

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Andy Kohler

I don't know about ISO 2709 itself, but the MARC21 implementation of it refers to octets, aka 8-bit bytes: http://www.loc.gov/marc/specifications/specrecstruc.html Characters may be encoded using one or more than one octet, depending on the character set. All ASCII characters are encoded using

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Houghton,Andrew

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind Sent: Tuesday, April 17, 2012 19:55 To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21 Okay, forget XML for a

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Doran, Michael D

.) ;-) -- Michael -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, April 18, 2012 11:09 AM To: Code for Libraries Cc: Doran, Michael D Subject: Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21 On 4/18/2012 11:09 AM

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-18 Thread Tod Olson

In practice it seems to mean UTF-8. At least I've only seen UTF-8, and I can't imagine the code that processes this stuff being safe for UTF-16 or UTF-32. All of the offsets are byte-oriented, and there's too much legacy code that makes assumptions about null-terminated strings. -Tod On Apr
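A quick way to see why byte-oriented code would struggle with UTF-16 or UTF-32 is to look at the raw bytes. This is an illustrative sketch using Python's built-in codecs; the sample strings are arbitrary choices.

```python
text = "MARC"
for codec in ("utf-8", "utf-16-le", "utf-32-le"):
    raw = text.encode(codec)
    # Report the byte length and whether a zero byte occurs in the encoded text.
    print(codec, len(raw), b"\x00" in raw)

# A character whose UTF-16 code unit contains 0x1E, the ISO 2709 field terminator byte.
ring_below = "\u1e00"  # LATIN CAPITAL LETTER A WITH RING BELOW
print(b"\x1e" in ring_below.encode("utf-16-le"), b"\x1e" in ring_below.encode("utf-8"))
```

UTF-8 never produces a zero byte or a control byte such as 0x1E except for those characters themselves, so code that scans for null terminators or field delimiters keeps working. The wider encodings give no such guarantee.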

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-17 Thread Simon Spero

On Tue, Apr 17, 2012 at 7:55 PM, Jonathan Rochkind rochk...@jhu.edu wrote: Okay, forget XML for a moment, let's just look at marc 'binary'. First, for Anglophone-centric MARC21. Actually Anglo and Francophone centric. And the USMARC style 245 was a poor replacement for the UKMARC approach

Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21

2012-04-17 Thread Bill Dueber

On Tue, Apr 17, 2012 at 8:46 PM, Simon Spero sesunc...@gmail.com wrote: Actually Anglo and Francophone centric. And the USMARC style 245 was a poor replacement for the UKMARC approach (someone at the British Library hosted Linked Data meeting wondered why there were punctuation characters

25 matches
