RE: Character sets - kind of solved

2004-12-09 Thread Bryan Baldus
>For that matter I think it's time to remove MARC.pm as well :) I can do the latter unless there are objections. I'm taking inspiration from MARC.pm for the MARC::File::MARCMaker module I'm working on, but I've got a local copy of v. 1.13 from the SourceForge files page. Removing the older v. 1.07

Re: Character sets - kind of solved

2004-12-09 Thread Ed Summers
On Thu, Dec 09, 2004 at 10:32:25AM -0600, John Hammer wrote: > That fixed the problem, going back a version. That will teach me not to > use a beta version for production. Perhaps v1.39_01 should be removed from CPAN to avoid any further confusion. For that matter I think it's time to remove MARC

Re: Character sets - kind of solved

2004-12-09 Thread John Hammer
That fixed the problem, going back a version. That will teach me not to use a beta version for production. Thanks to all who took the time to help me with this, especially Ed. John On Wed, 8 Dec 2004 19:57:26 -0600 Ed Summers <[EMAIL PROTECTED]> wrote: > On Wed, Dec 08, 2004 at 05:47:23PM -060

Re: Character sets - kind of solved

2004-12-08 Thread Ed Summers
On Wed, Dec 08, 2004 at 05:47:23PM -0600, John Hammer wrote: > MARC::Record version 1.39_01. Using diff there is no difference in the > files when using Perl to read in and write out the data. Can you try downgrading to v1.38? v1.39_01 has some experimental utf8 handling code in it which was rele

Re: Character sets - kind of solved

2004-12-08 Thread John Hammer
MARC::Record version 1.39_01. Using diff there is no difference in the files when using Perl to read in and write out the data. John On Wed, 8 Dec 2004 15:43:29 -0600 Ed Summers <[EMAIL PROTECTED]> wrote: > On Wed, Dec 08, 2004 at 03:31:18PM -0600, John Hammer wrote: > > How would deleting the

Re: Character sets - kind of solved

2004-12-08 Thread Ed Summers
On Wed, Dec 08, 2004 at 03:31:18PM -0600, John Hammer wrote: > How would deleting the illegal characters cause changes to the characters in > lines 680 and 690 above? It doesn't explain it :) What version of MARC::Record are you using? What happens when you use perl to read in the data and write

Re: Character sets - kind of solved

2004-12-08 Thread John Hammer
That's different from what I get. What I get is: 1c1 < 30 32 33 35 36 63 61 6d 20 20 32 32 30 30 34 38 |02356cam 220048| --- > 30 32 33 36 34 63 61 6d 20 20 32 32 30 30 34 38 |02364cam 220048| 21,30c21,30 105,149c105,149 < 0680 20 1f 61 42 69 73 e5 61 f2 74 e5 69 2
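
The first hunk of the diff above changes the start of the record from 02356 to 02364. In a MARC 21 record the first five bytes of the leader hold the record length as zero-padded decimal digits, so the rewritten record is 8 bytes longer than the original. A minimal Python sketch of that arithmetic follows; the helper name `leader_record_length` is made up for illustration, and the hex bytes are copied from the diff.

```python
def leader_record_length(hex_dump: str) -> int:
    """Return the record length stored in leader positions 00-04."""
    raw = bytes.fromhex(hex_dump)        # spaces between hex pairs are ignored
    return int(raw[:5].decode("ascii"))  # five ASCII digits, zero-padded


# First 16 bytes of each record, as shown in the diff.
before = "30 32 33 35 36 63 61 6d 20 20 32 32 30 30 34 38"
after = "30 32 33 36 34 63 61 6d 20 20 32 32 30 30 34 38"

growth = leader_record_length(after) - leader_record_length(before)
print(growth)
```

The sketch only reads the leader; it does not say which bytes in the record body account for the growth.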

Re: Character sets - kind of solved

2004-12-08 Thread Ed Summers
On Tue, Dec 07, 2004 at 12:53:44PM -0600, John Hammer wrote: > Attached are the two files. The Marc file seems to be using a Windows font > (1251?). As for the program, the same changes occur if I just read the Marc > file and write it back out with no changes. The Perl I am using is 5.8.3 Ok, I

Re: Character sets - kind of solved

2004-12-07 Thread Ed Summers
John Hammer wrote: > You are correct in assuming the locale environment is set up for UTF-8 > on my computer. However, that wouldn't explain why the record is > different pre-processing vs. post-processing with MARC::Record. Viewing > the two records with the same app (in this case vi) gives differ

Re: Character sets - kind of solved?

2004-12-06 Thread John Hammer
On Mon, 6 Dec 2004 08:54:21 -0600 "Doran, Michael D" <[EMAIL PROTECTED]> wrote: > The original record from John Hammer did not contain UTF-8, it contained > MARC-8. I believe that the fact that the combining MARC-8 characters > were replaced by a generic replacement character only indicates that
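
The replacement-character effect described here can be reproduced in isolation. The sketch below takes the name bytes from the hex dump quoted elsewhere in this thread (42 69 73 e5 61 f2 74 e5 69) and decodes them as if they were UTF-8. It is an illustration of the general mechanism in Python, under the assumption that some layer treats MARC-8 bytes as UTF-8; it is not a description of what any particular MARC::Record version does internally.

```python
# MARC-8 puts a combining diacritic byte (here 0xE5 and 0xF2) before its base letter.
marc8_bytes = bytes.fromhex("42 69 73 e5 61 f2 74 e5 69")

# Read as UTF-8, each diacritic byte looks like the start of a multi-byte
# sequence that is never completed, so it becomes U+FFFD.
misread = marc8_bytes.decode("utf-8", errors="replace")
replaced = misread.count("\ufffd")

# Writing the misread text back out as UTF-8 makes the data longer,
# because U+FFFD takes three bytes.
rewritten = misread.encode("utf-8")
print(replaced, len(marc8_bytes), len(rewritten))
```

Once a byte has been turned into U+FFFD the original diacritic cannot be recovered from the output, which is why the replacement alone says nothing about the encoding of the source record.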

Updating MARC::File::XML (was Re: Character sets - kind of solved?)

2004-12-06 Thread Mike Rylander
ld rather I not. :) -- Mike Rylander [EMAIL PROTECTED] GPLS -- PINES Development Database Developer http://open-ils.org > > > -- Michael > > # Michael Doran, Systems Librarian > # University of Texas at Arlington > # 817-272-5326 office > # 817-688-1926 cell > # [E

RE: Character sets - kind of solved?

2004-12-06 Thread Doran, Michael D
s Librarian # University of Texas at Arlington # 817-272-5326 office # 817-688-1926 cell # [EMAIL PROTECTED] # http://rocky.uta.edu/doran/ > -Original Message- > From: Mike Rylander [mailto:[EMAIL PROTECTED] > Sent: Saturday, December 04, 2004 1:31 PM > To: [EMAIL PROTECTED] >

Re: Character sets - kind of solved?

2004-12-05 Thread Ed Summers
On Sat, Dec 04, 2004 at 02:30:53PM -0500, Mike Rylander wrote: > I've got a working patch that correctly transcodes records from > USMARC(MARC-8) to MARC21slim(UTF8) and back again. Mike, would you like CVS access privileges on the SourceForge site so you can commit this stuff? I'm not actively u

Re: Character sets - kind of solved?

2004-12-04 Thread Mike Rylander
ar_set=unicode&char_type=he > x&char_value=fffd > (or just go to http://rocky.uta.edu/doran/urdu/search.cgi and plug > in fffd > > -- Michael > > # Michael Doran, Systems Librarian > # University of Texas at Arlington > # 817-272-5326 office > # 817-688-1

RE: Character sets - kind of solved?

2004-12-03 Thread Doran, Michael D
# 817-688-1926 cell # [EMAIL PROTECTED] # http://rocky.uta.edu/doran/ > -Original Message- > From: Ashley Sanders [mailto:[EMAIL PROTECTED] > Sent: Wednesday, November 24, 2004 2:23 AM > Cc: [EMAIL PROTECTED] > Subject: Re: Character sets > > Ed Summers wrote: >

Re: Character sets

2004-11-24 Thread Ed Summers
On Wed, Nov 24, 2004 at 08:22:47AM +, Ashley Sanders wrote: > Is MARC::Record trying to treat them as Unicode when in fact they > are MARC-8? MARC::Record currently does no transformation of character sets that I'm aware of. There is a completely separate module MARC::Charset whi

RE: Character sets

2004-11-24 Thread Jacobs, Jane W
am is looking for G0 character sets or vice versa. It looks like your bracketed characters: {e5} etc. are G1, not G0. I'm no great programmer (all right, no programmer at all really) but my experience is that G0 seems to be the preferred. I hope some of that might be helpful. JJ **Views ex
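
The G0/G1 distinction mentioned here comes down to byte ranges: in the ISO 2022 style layout that MARC-8 uses, graphic characters in 0x21-0x7E belong to G0 and those in 0xA1-0xFE belong to G1, which is why a byte such as {e5} counts as G1. A small Python sketch of that classification, with `graphic_set` as a hypothetical helper name:

```python
def graphic_set(byte: int) -> str:
    """Classify a single byte value by the graphic area it falls in."""
    if 0x21 <= byte <= 0x7E:
        return "G0"
    if 0xA1 <= byte <= 0xFE:
        return "G1"
    return "other"  # controls, space, delete, and similar


for value in (0x61, 0xE5, 0xF2, 0x1F):
    print(f"{value:02x}", graphic_set(value))
```

The sketch reports only which area a byte falls in; which character set is designated to G0 or G1 at that point in a record is a separate question it does not address.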

Re: Character sets

2004-11-24 Thread Ashley Sanders
Ed Summers wrote: On Tue, Nov 23, 2004 at 04:10:05PM -0600, John Hammer wrote: I have a character problem that I hope someone can help me with. In a MARC record I am modifying using MARC::Record, one of the names contains letters with diacritics. Looking at the name with a hex editor, it gives, wit

Re: Character sets

2004-11-23 Thread Ed Summers
On Tue, Nov 23, 2004 at 04:10:05PM -0600, John Hammer wrote: > I have a character problem that I hope someone can help me with. In a MARC > record I am modifying using MARC::Record, one of the names contains letters > with diacritics. Looking at the name with a hex editor, it gives, with hex > v

Character sets

2004-11-23 Thread John Hammer
I apologize if this is not the correct list to ask this question. I'm at a loss, however, to know where to ask. I have a character problem that I hope someone can help me with. In a MARC record I am modifying using MARC::Record, one of the names contains letters with diacritics. Looking at the