* Pali Rohár [2016-05-12 20:23]:
> If both functions should do the same thing, why do we have the duplication?
Encode.pm is big and fairly slow, because it handles a zillion encodings
and has lots of options for handling invalid input data. Perl needs only
UTF-8 transcoding and needs it fast, so it has code f
On Friday 06 May 2016 09:24:01 Karl Williamson wrote:
> On 05/05/2016 08:37 AM, Pali Rohár wrote:
> >Hi!
> >
> >I thought that I understood UTF-8 encoding/decoding done in perl until I
> >looked into source code of Encode package... (exactly sub encode_utf8)
> >
>
On 05/05/2016 08:37 AM, Pali Rohár wrote:
Hi!
I thought that I understood UTF-8 encoding/decoding done in perl until I
looked into source code of Encode package... (exactly sub encode_utf8)
Before... I only read description of Encode package (not source code):
https://metacpan.org/pod/Encode
* Pali Rohár [2016-05-06 14:50]:
> 1. What is the difference between those two calls?
>
> utf8::encode($str);
>
> and
>
> $str = Encode::encode('utf8', $str);
>
> 2. What is the difference between those?
>
> utf8::decode($str);
> $str = Encode::decode_utf8($str);
They do the same thing with different
Hi!
I thought that I understood UTF-8 encoding/decoding done in perl until I
looked into source code of Encode package... (exactly sub encode_utf8)
Before... I only read description of Encode package (not source code):
https://metacpan.org/pod/Encode#UTF-8-vs.-utf8-vs.-UTF8
I tried to find some
* Geoffrey Leach [2014-02-10 07:35]:
> Is there a way to force (from my module) the choice to be LE? It turns
> out that the library I'm supporting (taglib) works in LE.
Does it need a BOM prepended?
If not, just do the obvious and `encode('UTF-16LE', $str)`.
C.f. `perldoc Encode::Unicode`.
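
A minimal sketch of the difference, using the same two characters quoted elsewhere in this thread. Only `Encode::encode` is used; the variable names and the `hex_bytes` helper are illustrative. The last case covers a consumer that wants little-endian data with a BOM, which is an assumption about the library, not something the thread confirms.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $str = "\x{6211}\x{7684}";

# 'UTF-16' prepends a big-endian BOM (FE FF) and writes big-endian code units.
my $with_bom_be = encode('UTF-16', $str);

# 'UTF-16LE' writes little-endian code units and no BOM.
my $le_no_bom = encode('UTF-16LE', $str);

# For little-endian data *with* a BOM, encode U+FEFF explicitly as the first character.
my $le_with_bom = encode('UTF-16LE', "\x{FEFF}" . $str);

# Illustrative helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;
}

print "UTF-16:        ", hex_bytes($with_bom_be), "\n";
print "UTF-16LE:      ", hex_bytes($le_no_bom),   "\n";
print "UTF-16LE+BOM:  ", hex_bytes($le_with_bom), "\n";
```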
Re
I'm the maintainer of Audio::Taglib.
Summary of my perl5 (revision 5 version 16 subversion 3) configuration:
osname=linux, osvers=3.10.9-200.fc19.x86_64,
archname=x86_64-linux-thread-multi
$utf16 = encode("UTF-16", "\x{6211}\x{7684}") prepends a big-endian BOM. As
best I can tell this r
In <http://stackoverflow.com/q/6281049#comment-7334585>, tchrist asks:
| Is `encoding::warnings` actually still needed given the `/dual` modifiers
| and the `unicode_strings` feature?
Michael Ludwig (mil...@gmx.de) writes:
>> For instance, I use Windows exclusively, so Unicode in file names is
>> no problem.
>
> Did a quick test:
>
> (v5.12.1) built for MSWin32-x86-multi-thread (so ActiveState)
>
> * aââ¬Â¦b.txt
> * not correct
> * doesn't have anything with "uni" or "utf"
tance, I use Windows exclusively, so Unicode in file names is
> no problem.
Did a quick test:
use strict;
use warnings;
use utf8;
my $fn = 'a…b.txt'; # with a Unicode character
open my $fh, '>:encoding(UTF-8)', $fn or die
Michael Ludwig (mil...@gmx.de) writes:
> Erland Sommarskog wrote on 29.01.2011 at 14:02 (+0100):
>
>> Yes, there certainly seems to be some more stuff to do in the Unicode
>> support in Perl. For instance, support for Unicode filenames in open
>> or opendir.
>
> I think there is no portable ans
Erland Sommarskog wrote on 29.01.2011 at 14:02 (+0100):
> Yes, there certainly seems to be some more stuff to do in the Unicode
> support in Perl. For instance, support for Unicode filenames in open
> or opendir.
I think there is no portable answer here, as it depends on the
filesystem's suppor
"Jan Dubois" (j...@activestate.com) writes:
> I've double-checked with Leon, who thinks that this is due to bug 38456:
>
> http://rt.perl.org/rt3//Public/Bug/Display.html?id=38456
>
> He made a patch to fix the bug, and the patch has been applied to
> bleadperl already. I ran you sample scri
On Fri, 21 Jan 2011, Erland Sommarskog wrote:
> "Jan Dubois" (j...@activestate.com) writes:
> > You need to stack the I/O layers in the right order. The :encoding()
> > layer needs to come last (be at the bottom of the stack), *after* the
> > :crlf layer adds the ad
"Jan Dubois" (j...@activestate.com) writes:
> You need to stack the I/O layers in the right order. The :encoding()
> layer needs to come last (be at the bottom of the stack), *after* the
> :crlf layer adds the additional carriage returns. The way to pop the
> default :crlf
I wrote:
> I saw some discussion today that the :raw pseudo-layer in the open()
> call will also remove the buffering layer (it doesn’t do that when you
> use it in a binmode() call). I’ll try to remember to send a followup
> once I actually understand what is going on.
That seems indeed to be the
:40 PM
To: perl-unicode@perl.org
Subject: Re: encoding(UTF16-LE) on Windows
Jan Dubois wrote:
Files opened on Windows already have the :crlf layer pushed by default,
so you somehow need to get the :encoding layer *below* it.
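
A small sketch of the layer ordering described here, writing to a temporary file and dumping the bytes. The file is a throwaway created with `File::Temp`; nothing here is Windows-specific, since `:crlf` is pushed explicitly on top of `:encoding`.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh;

# :raw drops the platform's default layers, :encoding(UTF-16LE) sits below,
# and :crlf goes on top so "\n" is expanded to "\r\n" *before* encoding.
open(my $out, '>:raw:encoding(UTF-16LE):crlf', $filename) or die "open: $!";
print {$out} "1\n2\n";
close $out or die "close: $!";

# Read the file back as plain bytes to inspect what was written.
open(my $in, '<:raw', $filename) or die "open: $!";
my $bytes = do { local $/; <$in> };
close $in;

my $hex = join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;
print "$hex\n";
```

With this ordering every code unit, including CR and LF, should come out as two bytes.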
Is it possible to re-write the working statement
open(my $fh
On Fri, 21 Jan 2011, Erland Sommarskog wrote:
>
> There is still one thing that is not clear to me. The incorrect end-of-line
> was
>
> 0D 00 0A
>
> But the way you describe it, I would expect it to be
>
> 0D 0A 00
I went back to the very first message in the thread, where you write:
| Wh
"Jan Dubois" (j...@activestate.com) writes:
> Now when you print a string to the filehandle, then it will be passed
> to the top-most layer first (:crlf), which will s/\n/\r\n/g on the
> string, and then passes it on to the next lower layer :encoding, which
> will do the
Jan Dubois wrote:
Files opened on Windows already have the :crlf layer pushed by default,
so you somehow need to get the :encoding layer *below* it.
Is it possible to re-write the working statement
open(my $fh, ">:raw:encoding(UTF-16LE):crlf", $filename) or die $!;
in a w
[RE: encoding(UTF16-LE) on Windows]
Jan Dubois wrote on 20.01.2011 at 12:45 (-0800):
> On Thu, 20 Jan 2011, Michael Ludwig wrote:
> > Erland Sommarskog wrote on 20.01.2011 at 08:29 (-):
> > > "Jan Dubois" (j...@activestate.com) writes:
> > > > You n
On Thu, 20 Jan 2011, Erland Sommarskog wrote:
> One can sense some potential for improvements. Not the least in the
> documentation area.
This is open source. Patches welcome! This is how things get better.
Cheers,
-Jan
On Thu, 20 Jan 2011, Michael Ludwig wrote:
> Erland Sommarskog wrote on 20.01.2011 at 08:29 (-):
> > "Jan Dubois" (j...@activestate.com) writes:
> > > You need to stack the I/O layers in the right order. The :encoding()
> > > layer needs to come last (b
Erland Sommarskog wrote on 20.01.2011 at 08:29 (-):
> "Jan Dubois" (j...@activestate.com) writes:
> > You need to stack the I/O layers in the right order. The :encoding()
> > layer needs to come last (be at the bottom of the stack), *after* the
> > :crlf lay
"Jan Dubois" (j...@activestate.com) writes:
> You need to stack the I/O layers in the right order. The :encoding()
> layer needs to come last (be at the bottom of the stack), *after* the
> :crlf layer adds the additional carriage returns. The way to pop the
> default :crlf
Jan Dubois wrote on 19.01.2011 at 11:08 (-0800):
> You need to stack the I/O layers in the right order. The :encoding()
> layer needs to come last (be at the bottom of the stack), *after* the
> :crlf layer adds the additional carriage returns. The way to pop the
> default :crlf
On Wed, 19 Jan 2011, Michael Ludwig wrote:
> Erland Sommarskog wrote on 17.01.2011 at 13:57 (-):
> > I'm on Windows and I have this small script:
> >
> >use strict;
> >open F, '>:encoding(UTF-16LE)', "slask2.txt";
> >
Erland Sommarskog wrote on 17.01.2011 at 13:57 (-):
> I'm on Windows and I have this small script:
>
>use strict;
>open F, '>:encoding(UTF-16LE)', "slask2.txt";
>print F "1\n2\n3\n";
>close F;
>
> When I open t
I'm on Windows and I have this small script:
use strict;
open F, '>:encoding(UTF-16LE)', "slask2.txt";
print F "1\n2\n3\n";
close F;
When I open the output in a hex editor I see
31 00 0D 0A 00 32 00 0D 0A 00 33 0D 0A 00
I would expect to se
harryfm...@comcast.net wrote:
Various places in the Perl docs say, with good and sufficient reason, that
when reading a UTF-8 file, it should be opened '<:encoding(utf8)' rather than
'<:utf8'.
Actually, what you want is ":encoding(UTF-8)" because that is more strict.
-- Darren Duncan
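
To illustrate what "strict" means on the Encode side, here is a sketch that decodes with the strict `UTF-8` encoding and `FB_CROAK`, so malformed input raises an exception instead of passing through. `decode_strict` is a hypothetical helper name. This shows `Encode::decode` only and makes no claim about what the `:encoding(UTF-8)` I/O layer itself does on malformed input.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns the decoded characters,
# or undef if $bytes is not well-formed UTF-8.
sub decode_strict {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its input when a CHECK value is given
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return $chars;
}

my $good = decode_strict("\xC3\xA9");     # e-acute as UTF-8
my $bad  = decode_strict("abc\xFFdef");   # 0xFF never occurs in UTF-8

print defined $good ? "good input decoded\n" : "good input rejected\n";
print defined $bad  ? "bad input decoded\n"  : "bad input rejected\n";
```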
Dear List,
Various places in the Perl docs say, with good and sufficient reason, that when
reading a UTF-8 file, it should be opened '<:encoding(utf8)' rather than
'<:utf8'.
The thing is, nowhere can I find documented what happens when a malformed
character is enc
On 03.02.2010 at 08:55, Aristotle Pagaltzis wrote:
> * Michael Ludwig [2010-02-02 17:35]:
>> use encoding 'utf8';
>
> The `encoding` pragma is broken. Do not use it.
>
> You want
>
>use open ':encoding(UTF-8)', ':std';
Thanks
* Michael Ludwig [2010-02-02 17:35]:
> use encoding 'utf8';
The `encoding` pragma is broken. Do not use it.
You want
use open ':encoding(UTF-8)', ':std';
Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
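
For a single handle rather than the process-wide `use open` default, the same layer can be applied with `binmode`. The sketch below writes two non-ASCII characters to a throwaway temporary file and reads the raw bytes back; the file and variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);

# Encode characters to UTF-8 on output for this one handle.
binmode $fh, ':encoding(UTF-8)';
print {$fh} "\x{E9}\x{20AC}";    # e-acute and the euro sign
close $fh or die "close: $!";

# Read the file back without any decoding to see the stored bytes.
open(my $raw, '<:raw', $path) or die "open: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

print join(' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes), "\n";
```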
I was under the assumption that:
use encoding 'utf8';
was equivalent to:
use utf8; # source in UTF-8
binmode STDIN, ':utf8';
binmode STDOUT, ':utf8';
But that does not seem to be the case. Please consider and run
the following script:
use strict;
use warnings;
es not test, if a string is
really UTF-8. It seems to be to intended to check, if perl stores
the string internally in a multi byte encoding.
Regards, Martin Kögler.
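
A short sketch of the distinction: `utf8::valid` reports on the internal consistency of a scalar, while the return value of `utf8::decode` tells you whether the octets were well-formed UTF-8. The sample octets are illustrative.

```perl
use strict;
use warnings;

my $utf8_octets   = "\xC3\xB6";   # o-umlaut encoded as UTF-8
my $latin1_octets = "\xF6";       # o-umlaut encoded as Latin-1

# utf8::valid is true for any plain byte string, so it cannot
# tell the two inputs apart.
my $valid_utf8   = utf8::valid($utf8_octets);
my $valid_latin1 = utf8::valid($latin1_octets);

# utf8::decode works in place and returns false when the octets
# are not well-formed UTF-8, leaving the string untouched.
my $ok_utf8   = utf8::decode($utf8_octets);
my $ok_latin1 = utf8::decode($latin1_octets);

printf "valid:  utf8=%d latin1=%d\n", $valid_utf8 ? 1 : 0, $valid_latin1 ? 1 : 0;
printf "decode: utf8=%d latin1=%d\n", $ok_utf8 ? 1 : 0, $ok_latin1 ? 1 : 0;
```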
On Tuesday 04 December 2007 10:47:39, Ismail Dönmez wrote:
> On Tuesday 04 December 2007 10:44:12, Martin Koegler wrote:
> > On Tue, Dec 04, 2007 at 10:33:39AM +0200, Ismail Dönmez wrote:
> > > Following to_utf8 function works for me :
> >
> > For me too (Debian sarge+etch).
>
> Thanks for testing.
return $str;
In the original thread, there was some discussion, that some people
might want a different fallback encoding. So maybe you should
keep the second call to decode for the fallback encoding.
> }
Regards, Martin Kögler
On 4/12/2007, at 9:55, Ismail Dönmez wrote:
On Tuesday 04 December 2007 10:47:39, Ismail Dönmez wrote:
On Tuesday 04 December 2007 10:44:12, Martin Koegler wrote:
On Tue, Dec 04, 2007 at 10:33:39AM +0200, Ismail Dönmez wrote:
Following to_utf8 function works for me :
For me too (Debian
On Tue, Dec 04, 2007 at 09:55:04AM +0200, Ismail Dönmez wrote:
> Written on Tuesday 04 December 2007 at 09:50:28:
> > The bug affects old versions of perl (Debian sarge = oldstable).
> > As it works on the newer Debian etch, do you really think, that it is
> > a good idea to report issue?
>
> >
> > if(utf8::valid($str))
> > {
> > utf8::decode($str);
> > }
> >
> > return $str;
>
> In the original thread, there was some discussion, that some people
> might want a different fallback encoding. So maybe you should
On Tuesday 04 December 2007 10:28:59, Ismail Dönmez wrote:
> On Tuesday 04 December 2007 10:16:34, Martin Koegler wrote:
> [...]
>
> > print t("#öäü");
> > print t("#ÀöÌ");
> > print "\n";
>
> How about this one, doesn't even use Encode, uses just built-in utf8
> function :
>
> [~]> cat test.pl
>
On Tuesday 04 December 2007 10:16:34, Martin Koegler wrote:
[...]
> print t("#öäü");
> print t("#ÀöÌ");
> print "\n";
How about this one, doesn't even use Encode, uses just built-in utf8
function :
[~]> cat test.pl
binmode STDOUT, ':utf8';
my $str = "#öäü";
if (utf8::valid($str))
{
utf8:
tr,
> > >>>>>>Encode::FB_DEFAULT);
> > >>>>>>- }
> > >>>>>>+ eval { return ($res = decode_utf8($str, Encode::FB_CROAK));
> > >>>>>>};
> > >>>>>>+ return decode($fallback_encoding, $str, Enc
eval { return ($res = decode_utf8($str, Encode::FB_CROAK));
> >>>>>>};
> >>>>>>+ return decode($fallback_encoding, $str, Encode::FB_DEFAULT);
> >>>>>> }
> >>>>>>
> >>This version
Written on Tuesday 04 December 2007 at 09:50:28:
> The bug affects old versions of perl (Debian sarge = oldstable).
> As it works on the newer Debian etch, do you really think, that it is
> a good idea to report issue?
Same problem here with v5.8.8 which is latest stable perl5 release.
Regar
On Mon, Dec 03, 2007 at 06:02:54PM +0100, Jakub Narebski wrote:
> On Mon, 3 Dec 2007, Martin Koegler wrote:
> > eval { $res = decode_utf8(...); }
> > if ($@)
> > return decode(...);
> > return $res
> >
> > or
> >
> > eval { $res = decode_utf8(...); }
> > if (defined $res)
> > return $
ding, $str,
> > >>>>>> Encode::FB_DEFAULT);
> > >>>>>> -}
> > >>>>>> +eval { return ($res = decode_utf8($str, Encode::FB_CROAK)); };
> > >>>>>> +return decode($fallback_encoding, $str, Encode::FB_DEFAULT);
> >
val { return ($res = decode_utf8($str, Encode::FB_CROAK)); };
> >>>>>> + return decode($fallback_encoding, $str, Encode::FB_DEFAULT);
> >>>>>> }
> >>
> >> This version is broken on Debian sarge and etch. Feeding a UTF-8 and a
> >>
de::FB_CROAK)); };
+ return decode($fallback_encoding, $str, Encode::FB_DEFAULT);
}
This version is broken on Debian sarge and etch. Feeding a UTF-8 and a latin1
encoding of the same character sequence yields to different results.
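
For comparison, here is a sketch of a decode-with-fallback helper that gives the same result for both inputs. It is not the gitweb code; `to_utf8_sketch` and its parameter names are hypothetical. Note that in Perl a `return` inside `eval { ... }` leaves only the eval block, not the enclosing sub, so this sketch captures the eval's value instead.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: try strict UTF-8 first, then a fallback encoding.
sub to_utf8_sketch {
    my ($octets, $fallback_encoding) = @_;

    # LEAVE_SRC keeps $octets intact so the fallback sees the original bytes.
    my $res = eval {
        decode('UTF-8', $octets, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $res if defined $res;

    return decode($fallback_encoding, $octets, Encode::FB_DEFAULT());
}

my $from_utf8   = to_utf8_sketch("\xC3\xB6\xC3\xA4", 'latin1');
my $from_latin1 = to_utf8_sketch("\xF6\xE4",         'latin1');

print $from_utf8 eq $from_latin1 ? "same result\n" : "different results\n";
```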
For the record, this was on a debian
>>> - }
>>>>> + eval { return ($res = decode_utf8($str, Encode::FB_CROAK)); };
>>>>> + return decode($fallback_encoding, $str, Encode::FB_DEFAULT);
>>>>> }
>
> This version is broken on Debian sarge and etch. Feeding a UTF-8 and a latin1
>
On 2007-11-29 01:04, Jenkins, Nicholas S (GE Money) wrote:
I have 2 data files I want to compare...one is in UTF-16BE (Windows
"Unicode" format) and one is in UTF-8 format.
I wrote 3 perl programs:
*)1 to normalize data in the UTF-16BE source and write to a UTF-8
formatted output file
*)1 to
Hi...I found this DL via the perldoc.perl.org/perluniintro page...if I'm
violating protocol for writing directly, please pardon.
I have 2 data files I want to compare...one is in UTF-16BE (Windows
"Unicode" format) and one is in UTF-8 format.
I wrote 3 perl programs:
*)1 to normalize data in the
On 2007-11-13 19:56, Juerd Waalboer wrote:
$rv = open (OUT2, ">:utf8", "sample2");
Should work well. Remember that you shouldn't use :utf8 for input. In
the general case, :encoding(UTF-8) is safest.
Can you elaborate more on the subtle difference betwe
Paul Bijnens wrote 2007-11-15 14:52 (+0100):
> Can you elaborate more on the subtle difference between:
> binmode(STDIN, ":utf8");
> binmode(STDIN, ":encoding(UTF-8)");
http://search.cpan.org/~rgarcia/perl-5.9.5/pod/perlunifaq.pod#Cheat?!_Tell_me,_how_can_I_che
this. \x in Perl takes codepoint numbers, and C384 is
not the codepoint for the character that you want.
Likewise, the codepoint U+00E5 (LATIN SMALL LETTER A WITH RING ABOVE) is
not at all like U+C3A5, even though the UTF-8 encoding is C3 A5.
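
The distinction can be checked directly. In this sketch `\x{E5}` names the code point, and `Encode::encode` produces the two UTF-8 bytes; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $char  = "\x{E5}";                  # one character, code point U+00E5
my $bytes = encode('UTF-8', $char);    # two octets: C3 A5

printf "code point: U+%04X\n", ord $char;
printf "UTF-8:      %s\n", join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;

# "\x{C3A5}" is a different character altogether: code point U+C3A5.
my $other = "\x{C3A5}";
printf "other:      U+%04X\n", ord $other;

# Decoding the octets gives back the original character.
my $round_trip = decode('UTF-8', $bytes);
```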
Please do yourself a big favor and learn about the di
stanå Bruk AB
Open 'open (OUT1, ">", "sample1")' returns 1
Open 'open (OUT2, ">:utf8", "sample2")' returns 1
Open 'open (OUT3, '>:enco
amatti Shashidhar (DS/EES1)
> Sent: Friday, October 26, 2007 12:03 PM
> To: 'perl-unicode@perl.org'
> Subject: Utf8 encoding
>
> Hello,
> I am parsing an xml file using libxml2. The xml file has umlauts(german
> keys ü/ö/ä etc.), ° (degree), etc. as the characters.
rser = XML::LibXML->new();
my $doc = $parser->parse_file( $x_file );
What are the contents of the file? (What is its encoding?)
The first line should contain the encoding:
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker <[EMAIL PROTECTED]
icate encoding !"
Below is the snippet of code that I used.
my $parser = "";
my $doc = "";
$parser = XML::LibXML->new();#
$doc = $parser->parse_file( $x_file );
I tried to encode using "setEncoding" but no results.
-Shashi
Dear Mr. Kogai,
Your Encode supports Vietnamese viscii and CP1258 encoding.
As far as I know, there are more 4 legacy Vietnamese out there:
VNI
VPS
TCVN
VIQR
# Refer to http://vietunicode.sourceforge.net/charset/
# for mapping tables.
As a request, can you make Perl's Encode support
On Fri, 2007-01-19 at 22:01 +0900, Marty Pauley wrote:
> On Thu, 18 Jan 2007 20:50:50 +0100 Kjetil Torgrim Homme
> <[EMAIL PROTECTED]> wrote:
>
> > I request you add "646" as an alias for "ascii".
>
> But it isn't the same as ascii! We shouldn't need to add a bug to Perl
> just to fix a dodgy So
Hello
On Thu, 18 Jan 2007 20:50:50 +0100 Kjetil Torgrim Homme
<[EMAIL PROTECTED]> wrote:
> I request you add "646" as an alias for "ascii".
But it isn't the same as ascii! We shouldn't need to add a bug to Perl
just to fix a dodgy Solaris 8 locale setup.
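
If a script has to cope with such a locale on its own, one option is to define the alias locally instead of changing Perl. This is a sketch assuming the locale reports the codeset name as "646", as the request implies. `define_alias` comes from `Encode::Alias`, and mapping the name to `ascii` is the requester's proposal, not an endorsement of it.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);
use Encode::Alias qw(define_alias);

# Assumed codeset name, standing in for what langinfo(CODESET) returned.
my $codeset = '646';

# Local, script-level alias: treat "646" as plain ASCII.
define_alias('646' => 'ascii');

my $enc = find_encoding($codeset)
    or die "Unknown encoding '$codeset'";

print "codeset '$codeset' resolves to ", $enc->name, "\n";
```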
If it really is ISO 646 then you need t
in Solaris 8, the "C" locale uses a charset which is unknown to Perl:
#! /local/bin/perl
use I18N::Langinfo qw(langinfo CODESET);
my $term_encoding = langinfo(CODESET());
binmode STDOUT, ":encoding($term_encoding)";
pri
Hi,
I’m trying to normalize a filehandle of unknown
encoding to UTF8. There is a lot of documentation about changing/converting
data formats but nothing I’ve tried works. Here is my problem and what I
tried to do to solve it.
I have a form upload which is allowing my clients to
Created ticket # 16663.
SADAHIRO Tomoyuki <[EMAIL PROTECTED]> wrote:
On Mon, 19 Dec 2005 22:28:55 -0800 (PST), rajarshi das <[EMAIL PROTECTED]> wrote
> I am testing this with iso-2022-jp encoding :
>
> use encoding 'iso-2022-jp';
>
On Mon, 19 Dec 2005 22:28:55 -0800 (PST), rajarshi das <[EMAIL PROTECTED]> wrote
> I am testing this with iso-2022-jp encoding :
> ----
> use encoding 'iso-2022-jp';
>
> $a = "^[$B$!^[(B";
> print "a : $a\n";
> ---
--- SADAHIRO Tomoyuki <[EMAIL PROTECTED]> wrote:
>
> On Wed, 14 Dec 2005 05:19:00 -0800 (PST), rajarshi
> das <[EMAIL PROTECTED]> wrote
>
> > Hi,
> >
> > The following two line script gives an error on
> z/OS : "Unknown encoding '
Rajarshi Das <[EMAIL PROTECTED]> writes:
> Hi,
>
> The following two line script gives an error on z/OS : "Unknown encoding
> 'iso-2022-
> jp' at line ..".
> -
> use Encode; use encoding 'iso-2022-jp';
> -
On Wed, 14 Dec 2005 05:19:00 -0800 (PST), rajarshi das <[EMAIL PROTECTED]> wrote
> Hi,
>
> The following two line script gives an error on z/OS : "Unknown encoding
> 'iso-2022-jp' at line ..".
> -
> use Encode;
> use encodi
Hi,
The following two line script gives an error on z/OS : "Unknown encoding 'iso-2022-jp' at line ..".
-
use Encode;
use encoding 'iso-2022-jp';
How do we confirm if iso-2022-jp is supported on z/OS or not ? Or if it i
> On Wed, 7 Sep 2005 20:39:20 -0700, Jerzy Giergiel <[EMAIL PROTECTED]>
> said:
> Neither of those fallbacks is OK, I want á converted to accent
> stripped version of itself i.e. a. The second solution isn't very
> helpful either, it's basically tr replacement table which is not
On Sep 08, 2005, at 12:39 , Jerzy Giergiel wrote:
Neither of those fallbacks is OK, I want á converted to accent
stripped version of itself i.e. a. The second solution isn't very
helpful either, it's basically tr replacement table which is not
much fun to write when majority of upper 128 cha
's gotta be a simpler and more elegant solution.
thanks anyway.
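
One way to get accent stripping without a hand-written replacement table is Unicode decomposition. The sketch below is not taken from the replies in this thread; `macroman_to_ascii` is a hypothetical helper built on `Encode` and `Unicode::Normalize`. Characters with no ASCII base letter still end up as a question mark.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Unicode::Normalize qw(NFD);

# Hypothetical helper: MacRoman octets in, 7-bit ASCII octets out.
sub macroman_to_ascii {
    my ($bytes) = @_;
    my $chars = decode('MacRoman', $bytes);

    # Split accented letters into base letter + combining mark,
    # then drop the combining marks.
    my $decomposed = NFD($chars);
    $decomposed =~ s/\p{Mn}//g;

    # Anything still outside ASCII is replaced by encode()'s default '?'.
    return encode('ascii', $decomposed);
}

# Build MacRoman input from known characters rather than raw byte values.
my $input  = encode('MacRoman', "caf\x{E9} \x{E1}");
my $output = macroman_to_ascii($input);
print "$output\n";
```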
sorry for bugging people here with a trivial question. I need to
convert from MacRoman encoding to asci (7-bit). Encode package
simply replaces out of range characters with a question mark. I
need something intelligent lex
On Sep 08, 2005, at 11:22 , Jerzy Giergiel wrote:
sorry for bugging people here with a trivial question. I need to
convert from MacRoman encoding to asci (7-bit). Encode package
simply replaces out of range characters with a question mark. I
need something intelligent lexically speaking
sorry for bugging people here with a trivial question. I need to
convert from MacRoman encoding to asci (7-bit). Encode package simply
replaces out of range characters with a question mark. I need
something intelligent lexically speaking. For example aacute should
be converted to a. Any
On Fri, Aug 19, 2005 at 05:51:10PM +0530, Sastry wrote:
> Hi
>
> The test case uses the invariant character that is below <127 on
> ISO-8859-16 codepage. Since character 'a' has a codepoint of 129 on
> EBCDIC, is there a place in the code where it should apply
> NATIVE_TO_ASCII macro on the inp
Clark <[EMAIL PROTECTED]> wrote:
> On Fri, Aug 19, 2005 at 05:01:04PM +0530, Sastry wrote:
> > Hi Nicholas
> >
> > With reference to my previous mail on encoding module
> >
> > use Encode;
> > $string = "a";
> > $enc_string = encode("iso
On Fri, Aug 19, 2005 at 05:01:04PM +0530, Sastry wrote:
> Hi Nicholas
>
> With reference to my previous mail on encoding module
>
> use Encode;
> $string = "a";
> $enc_string = encode("iso-8859-16", $string);
> > print "\n String: $string\n"
Hi Nicholas
With reference to my previous mail on encoding module
use Encode;
$string = "a";
$enc_string = encode("iso-8859-16", $string);
print "\n String: $string\n";
print "\n enc_string: $enc_string\n";
a)How different are those ext/Encode/def_t.
On Wed, Aug 10, 2005 at 02:11:45PM +0530, Sastry wrote:
> On 8/9/05, Nicholas Clark <[EMAIL PROTECTED]> wrote:
> > On Tue, Aug 09, 2005 at 10:58:48AM +0530, Sastry wrote:
> > > > $enc_string = encode("iso-8859-16", $string);
> > So $enc_string should be a single byte, 97, everywhere.
> Can you su
ring = "a";
> > > $enc_string = encode("iso-8859-16", $string);
> > >
> > > print ord ($enc_string), "\n";
>
> 73. Odd.
>
> It should print 97 on all platforms. Because:
>
> $string contains 1 byte, the byte that represe
ode("iso-8859-16", $string);
> > >
> > > print ord ($enc_string), "\n";
>
> 73. Odd.
>
> It should print 97 on all platforms. Because:
>
> $string contains 1 byte, the byte that represents 'a' in the platform's
> default character encoding.
>
> The encode call should convert from the default encoding to iso-8859-16
> And 'a' in iso-8859-16 is 97.
> Everywhere.
>
> So $enc_string should be a single byte, 97, everywhere.
>
> Nicholas Clark
>
"\n";
73. Odd.
It should print 97 on all platforms. Because:
$string contains 1 byte, the byte that represents 'a' in the platform's
default character encoding.
The encode call should convert from the default encoding to iso-8859-16
And 'a' in iso-8859-16 is 97.
Everywhere.
So $enc_string should be a single byte, 97, everywhere.
Nicholas Clark
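
The expectation above can be written as a check. This sketch was only reasoned through for an ASCII platform; it makes no claim about what an EBCDIC build prints.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $string     = "a";
my $enc_string = encode("iso-8859-16", $string);

# On an ASCII platform this is expected to be a single byte with value 97.
printf "%d byte(s), first byte %d\n", length $enc_string, ord $enc_string;
```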
Hi
I get 73 printed on EBCDIC platform. I think it is supposed to print
129 as it is the numeric equivalent of 'a'.
-Sastry
On 8/8/05, Nicholas Clark <[EMAIL PROTECTED]> wrote:
> On Thu, Aug 04, 2005 at 11:51:44AM +0530, Sastry wrote:
> > Hi
> >
> > I am running the following script on EBCDI
On Thu, Aug 04, 2005 at 11:51:44AM +0530, Sastry wrote:
> Hi
>
> I am running the following script on EBCDIC
>
> use Encode;
> $string = "a";
> $enc_string = encode("iso-8859-16", $string);
> print "\n String: $string\n";
> print "\n enc_string: $enc_string\n";
>
>
> The output:
>
> String: a
Hi
I am running the following script on EBCDIC
use Encode;
$string = "a";
$enc_string = encode("iso-8859-16", $string);
print "\n String: $string\n";
print "\n enc_string: $enc_string\n";
The output:
String: a
enc_string: ñ (This is the character for codepoint \xF1 on iso-8859-16)
What is th
Hi
I have the following problem which gives different results when I run
on linux machine and on z/OS
Could explain me the reason for so?
On linux the enc_string will still be euro
whereas on z/OS it is / >,
Thanks in advance
regards
Sastry
use Encode;
$string = "eur
I have tried the following script on perl-5.8.[356], and I was wondering if
someone could confirm whether this is a bug or known shortcoming of
use encoding 'utf8';
From the encoding docs:
> Implicit upgrading for byte strings
>
> By default, if strings operating under byte
Good afternoon!
I can't find on the internet a module for perl 5.6.1 that can decode or
encode a string to or from UCS-2 format. Can you send one to me?
Thank you!
Andrew from Russia
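
For reference, on perls that bundle the Encode module (5.8 and later) UCS-2 conversion looks like the sketch below. This does not answer the question for 5.6.1 itself, and the sample string is illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Encode characters to big-endian UCS-2 octets and back again.
my $ucs2 = encode('UCS-2BE', "Hi");
my $back = decode('UCS-2BE', $ucs2);

print join(' ', map { sprintf '%02X', $_ } unpack 'C*', $ucs2), "\n";
print "$back\n";
```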
>>> for (qw/UTF-8 UTF-16LE UTF-16BE UTF-32LE UTF-32BE/)
>>> {
>>>my $backup = $string;
>>>open F, "<:encoding($_)", \$backup;
>>>my $char;
>>>read F, $char, 1, 0;
>>>close F;
>>>
>>
"");
>>
>> for (qw/UTF-8 UTF-16LE UTF-16BE UTF-32LE UTF-32BE/)
>> {
>>my $backup = $string;
>>open F, "<:encoding($_)", \$backup;
>>my $char;
>>read F, $char, 1, 0;
>>close F;
>>
>>die u
> {
> my $backup = $string;
>open F, "<:encoding($_)", \$backup;
>my $char;
>read F, $char, 1, 0;
>close F;
>
>die unless $backup eq $string;
> }
>
>Gives
>
> utf8 "\xFE" does not map to Unicode at ... line 13.
>
s? :
my $uri = 'http://www.lemonde.fr';
my $fin = '/tmp/latin1.html';
my $fout = '/tmp/utf8.html';
my $charsetin = "text/html; charset=iso-8859-1";
my $charsetout = "text/html; charset=UTF-8";
`curl -o $fin $uri` ;
open(FIN, "<:encoding(iso-8859-1)"
om UTF-8
to perl's internal form. Again this is trivial.
>
>However it's not working.
>
>Does that mean that the encoding of the actual characters on the page is
>not in the charset in the meta tag?
Quite possibly - do you mean the chars in the headers or the body?
>
's decode function
> to turn it into 'perl's internal format' .. which in 5.8.5 is utf8
> right? I then store that in the db.
>
> However it's not working.
>
> Does that mean that the encoding of the actual characters on the page is
> not in the charset in
Please forgive this going to both lists but I'm not sure where things
are going wrong...
I have many website around the world that I need to index. They're
straight HTML pages rather than perl-served and thus the headers say the
content-type is 'text/html' .. without mentionin
Hi,
Encode 2.08, PerlIO::scalar 0.02, ActivePerl 5.8.2,
#!perl -w
use strict;
use warnings;
use Encode;
my $string = encode(UTF16 => "");
for (qw/UTF-8 UTF-16LE UTF-16BE UTF-32LE UTF-32BE/)
{
my $backup = $string;
open F, "<:encoding($_)"
de:RETURN_ON_ERR
>>> is
>>> on when the caller is PerlIO::encoding...
>>
>> Or, one could backport PerlIO::encoding (with your patch) to CPAN and
>> require this latest version for Encode 2.08.
>
>That was what came across my mind first but I found it was not
d to return
>> immediately at partial character but now Encode:RETURN_ON_ERR is
>> required, meaning those who installed Encode-2.07 on older perl are in
>> trouble w/ PerlIO. So I am looking for a solution which does that
>> without tweaking PerlIO::encoding.
>
>W
d looks like something is broken in
>> PerlIO::encoding.
>> More precisely, ext/PerlIO/t/encoding.t fails test 14, that tests
>> open(F,'<:encoding(utf-8)',$threebyte).
>
>The easiest solution is the patch below;
>
>--- ext/PerlIO/encoding/encoding.pm.dist