Re: combining getc() and unicode strings problem?

tim23456 Thu, 16 Dec 2004 15:13:14 -0800

Hello Jonathan, all

> Not had the misfortune to need to play with this stuff, but I guess
> the documentation for perl is a good place to start:
>
[snip]

Yes, I have read these man pages more than once now (at different times), 
so I think I should not have missed anything.

The Perl man pages in question do say which functions work with Unicode and 
which do not; about the getc() function, however, nothing is mentioned.

I've also searched bugs.perl.org for any issue concerning getc(), but could 
not dig up anything related to Unicode.

> Some aspects are version dependent, so make sure your script
> insists on a minimum version of perl.

This is not (yet) a problem, because I am the developer and the only user at 
the same time.
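
If I understand the advice correctly, insisting on a minimum version would be 
a line near the top of the script, roughly as in this sketch (5.008 as the 
minimum is my assumption, chosen because I run 5.8.5):

```perl
#!/usr/bin/perl
use 5.008;      # refuse to compile on anything older than Perl 5.8.0
use strict;
use warnings;

# $] holds the version of the running interpreter as a number,
# e.g. 5.008005 for Perl 5.8.5.
printf "running under Perl %s\n", $];
```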

> Why are you doing this?  Is most of your experience with C?

I'm afraid I do not qualify as a programmer with any real knowledge at all.

Consequently I am open to any suggestion on how to solve my problem in 
another way. Please help, if you can. What I cannot change, however, is the 
fact that I have to cope with UTF-8 input. That's because my files contain 
characters such as "§", which UTF-8 stores as two bytes (0xC2 0xA7) instead 
of the single byte 0xA7 used by 8859-1 (=Latin1) or 8859-15 Euro (hope I am 
not starting incorrectly at this point!). US-ASCII does not cover my needs 
either.
The Perl man pages in general explicitly state that recent versions of Perl 
are "unicode ready by default".

I am using Perl 5.8.5 on Linux. Any input on this is very much appreciated.
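
To make the goal concrete, here is a minimal sketch of what I am after. It is 
not my actual script. The sub name replace_every_nth, the encoding layer given 
directly to open(), and the use of read() with a length of 1 instead of getc() 
are all assumptions for illustration; I have not confirmed that this avoids 
the problem on 5.8.5.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in_path to $out_path, replacing every $n-th
# character (not byte) with $replacement.
sub replace_every_nth {
    my ($in_path, $out_path, $n, $replacement) = @_;

    # The encoding layer is given at open() time, so all reads are decoded
    # from UTF-8 and all writes are encoded to UTF-8.
    open(my $in, '<:encoding(UTF-8)', $in_path)
        or die "Cannot open $in_path: $!";
    open(my $out, '>:encoding(UTF-8)', $out_path)
        or die "Cannot create $out_path: $!";

    my $count = 0;
    # On a decoding handle, read() with a length of 1 returns one character.
    while (read($in, my $char, 1)) {
        $count++;
        print {$out} ($count % $n == 0 ? $replacement : $char);
    }

    close($in);
    close($out) or die "Cannot close $out_path: $!";
}

# Example call: replace every 3rd character with "X" when two file names
# are given on the command line.
replace_every_nth($ARGV[0], $ARGV[1], 3, 'X') if @ARGV == 2;
```

With an input file containing "a§b§c§", I would expect the output "a§X§cX".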

Thank you, Tim

> Jonathan Paton
> 
> On Thu, 16 Dec 2004 19:18:06 +0200, [EMAIL PROTECTED] <[EMAIL PROTECTED]> 
> wrote:
> > Hello,
> > 
> > I have searched the web intensively for a solution to the following 
> > problem, but could not find any hint.
> > 
> > The following code does basically nothing other than read a file one 
> > character at a time and write it out to a file again. The input file is 
> > encoded as UTF-8, and so is the output file I want to create. I read in 
> > the characters by using getc().
> > However, I still get incorrect results in my output file. Does anybody 
> > know what mistakes I am making when combining getc() with reading Unicode 
> > files?
> > 
> > Any input is greatly appreciated. Thanks very much in advance!
> > 
> > Tim
> > 
> > (I am using Perl 5.8.5 on Intel SuSE 9.2)
> > 
> > ..
> > 
> > open(INFILE, "< $ARGV[0]") || die "\nCannot open from-file!";
> > open(OUTFILE, "> $ARGV[1]") || die "\nCannot create to-file!";
> > 
> > binmode(OUTFILE, ":utf8");
> > binmode(INFILE, ":utf8");
> > 
> > ..
> > 
> > while(!eof(INFILE)) {
> > 
> >   for ($i = 1; $i < $Ntes_Zeichen; $i++) {
> > 
> >     $dummy = getc(INFILE); if (eof(INFILE)) {exit}
> >     print OUTFILE $dummy;
> > 
> >   }
> > 
> >   $dummy = getc(INFILE);
> >   print OUTFILE $ersetze_durch;
> > 
> > }
> > 
> > close(INFILE);
> > close(OUTFILE);

[snip]
