Re: Installing Encode.pm for Perl 5.6.1

2013-07-05 Thread Nicholas Clark
of modules known to pass tests on 5.6.1. There may be an older version of Cache::Memcached that works on 5.6.1. Otherwise I'd guess that your least bad choice to progress is to locally fork Cache::Memcached and remove the code in it which requires Encode. Nicholas Clark

Re: Determining IO layer set on filehandle

2010-01-29 Thread Nicholas Clark
set the STDOUT filehandle to ':raw' and > then to restore the previous IO layers. > Is there a way to determine the IO layers applying to a filehandle > just from the filehandle itself? I think you want PerlIO::get_layers($fh). I'm not sure where it's documented. Nicholas Clark
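A minimal sketch of the save, switch to ':raw', and restore idea asked about above, assuming only the core PerlIO::get_layers function and binmode. The helper name with_raw_layers is hypothetical, and re-pushing the layers that ':raw' removed is an approximation of "restore", not a guaranteed exact round trip on every platform.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with $fh switched to ':raw', then
# re-push whatever layers ':raw' removed from the top of the stack.
sub with_raw_layers {
    my ($fh, $code) = @_;
    my @saved = PerlIO::get_layers($fh);        # layers before the switch
    binmode($fh, ':raw') or die "cannot switch to :raw: $!";
    my @raw = PerlIO::get_layers($fh);          # layers that survived ':raw'
    my @result = $code->($fh);
    for my $layer (@saved[scalar(@raw) .. $#saved]) {
        binmode($fh, ":$layer") or die "cannot restore :$layer: $!";
    }
    return @result;
}

# Demonstration on an in-memory handle opened with the ':utf8' layer.
open(my $fh, '>:utf8', \my $buffer) or die "open failed: $!";
my @before = PerlIO::get_layers($fh);
my @during = with_raw_layers($fh, sub { PerlIO::get_layers($_[0]) });
my @after  = PerlIO::get_layers($fh);
print "before: @before\nduring: @during\nafter:  @after\n";
```

The same helper could be called with \*STDOUT; the in-memory handle is used here only so that the sketch does not disturb the terminal.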

Re: TAP YAML Diagnostics

2008-04-06 Thread Nicholas Clark
are not those which you want to consider reserved. I guess that one needs to loop over all characters in the string, and verify that if $char eq lc $char then also $char ne uc $char. (But one could first short circuit the common pass case with the test above) Nicholas Clark
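A rough sketch of the per-character loop described above, assuming the goal is to accept only strings made entirely of lower-case letters that have a distinct upper-case form. The function name is hypothetical, and the short-circuit "test above" is not shown because it is not part of this excerpt.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if every character satisfies
# $char eq lc $char and $char ne uc $char. Characters without case
# (digits, punctuation) fail the second condition. An empty string
# passes vacuously.
sub all_lowercase_cased {
    my ($string) = @_;
    for my $char (split //, $string) {
        return 0 unless $char eq lc($char) && $char ne uc($char);
    }
    return 1;
}

for my $candidate (qw(abc aBc ab1)) {
    printf "%s: %s\n", $candidate,
        all_lowercase_cased($candidate) ? "passes" : "fails";
}
```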

Re: bytes pragma problems

2007-04-16 Thread Nicholas Clark
print bytes::twtowtdi() > > Were these problems resolved in later releases of Perl? This appears to have been solved by 5.8.7. Nicholas Clark

Re: Unicode::Collate, useful but useless

2007-04-15 Thread Nicholas Clark
CET 5.0.0 with the release of 5.8.9, it could break things for people who have installed Unicode::Collate with 5.8.8 (or earlier) and are currently using DUCET 4.1.0. So it wouldn't be a great idea. Nicholas Clark

Re: List of unsupported unicode characters?

2007-01-10 Thread Nicholas Clark
hink it likely that 5.8.9 will ship with Unicode 5.0.0 data Nicholas Clark

Re: Layers Issue in SVN::Notify

2006-07-11 Thread Nicholas Clark
raw'). But obviously I'm > wrong. Is there something I'm missing about when and/or where the IO > layer should be set? Anyone run into something like this before? I doubt that perl's at fault. What are you piping the data into, and what does it think of $LANG in the environment? Nicholas Clark

Re: Converting between UTF8 and local codepage without specifying local codepage

2005-11-09 Thread Nicholas Clark
> locale but Unix is all over the map. I don't know. I have little to no experience of doing conversion of real data, certainly for data outside of ISO-8859-1 and UTF-8, and I've never used I18N::Langinfo. I hope that someone else on this list can give a decent answer. Nicholas Clark

Re: Converting between UTF8 and local codepage without specifying local codepage

2005-11-09 Thread Nicholas Clark
data, and there is a (buggy) assumption that 8 bit data can be converted to Unicode by assuming that it's ISO-8859-1. Definitely buggy. Not possible to change without breaking backward compatibility. Nicholas Clark

Re: should a non-breaking space character be treated as whitespace in perl source?

2005-10-25 Thread Nicholas Clark
ault. Under use utf8; Unicode word characters can also be used in identifiers. I doubt that this will change in perl 5, because the parser is written in C, and so it would be very hard work to replace it with something that was fully Unicode aware. Nicholas Clark

Re: Encoding iso-8859-16

2005-08-19 Thread Nicholas Clark
; NATIVE_TO_ASCII macro on the input character? I don't know. And if the test is only checking for invariant characters below 127, it doesn't strike me as a very thorough test. Nicholas Clark

Re: Encoding iso-8859-16

2005-08-19 Thread Nicholas Clark
w. How thorough are the tests? Do the tests check for the conversion of characters with Unicode code points >127? You're asking questions beyond my knowledge. Nicholas Clark

Re: Encoding iso-8859-16

2005-08-10 Thread Nicholas Clark
On Wed, Aug 10, 2005 at 02:11:45PM +0530, Sastry wrote: > On 8/9/05, Nicholas Clark <[EMAIL PROTECTED]> wrote: > > On Tue, Aug 09, 2005 at 10:58:48AM +0530, Sastry wrote: > > > > $enc_string = encode("iso-8859-16", $string); > > So $enc_string should

Re: Encoding iso-8859-16

2005-08-09 Thread Nicholas Clark
On Tue, Aug 09, 2005 at 10:58:48AM +0530, Sastry wrote: > Hi > > I get 73 printed on EBCDIC platform. I think it is supposed to print > 129 as it is the numeric equivalent of 'a'. > > -Sastry > > > > On 8/8/05, Nicholas Clark <[EMAIL PROTECTED]

Re: Transliteration operator(tr//)on EBCDIC platform

2005-08-08 Thread Nicholas Clark
a regexp leave a gap, but [\x89-\x91] not. I don't know where ranges in tr/// are parsed, but given that I grepped for EBCDIC and didn't find any analogous code, it looks like tr/\x89-\x91// is treated as tr/i-j// and in turn i-j is treated as letters and always "special cased". I don't know if tr/i-j// and tr/\x89-\x91// should behave differently (i.e. whether we currently have a bug). Nicholas Clark

Re: Encoding iso-8859-16

2005-08-08 Thread Nicholas Clark
On your EBCDIC platform, what does this give?
    use Encode;
    $string = "a";
    $enc_string = encode("iso-8859-16", $string);
    print ord ($enc_string), "\n";
    __END__
Nicholas Clark

Re: gmake (perl-5.8.6) fails on z/OS

2005-07-28 Thread Nicholas Clark
ete. > Thanks for all your help on this. Do you have any idea *why* this change makes things work? Nicholas Clark

Re: bareword test on ebcdic.

2005-07-27 Thread Nicholas Clark
* would explain the failures, and be the > > thing that needs > > correcting. The test file would need if/else with a > > different test on EBCDIC. > what would you suggest be put in the if/else? I think that the regression tests tended to do something like
    if (ord 'A' == 65) {
        # Do the ASCII/UTF-8 version
    } else {
        # Assume EBCDIC
    }
Nicholas Clark
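The platform check suggested above can be wrapped up as follows. This is only a sketch: the helper name is_ascii_platform is hypothetical, and the expected EBCDIC value of 129 for 'a' is taken from the related iso-8859-16 thread in this listing, not from a test run.

```perl
use strict;
use warnings;

# Hypothetical helper: true where 'A' has code 65 (ASCII and its
# supersets); anything else is assumed to be EBCDIC.
sub is_ascii_platform { return ord('A') == 65 }

# Choose the platform-specific expectation before running the test.
my $expected_a = is_ascii_platform() ? 97 : 129;

print ord('a') == $expected_a ? "ok 1\n" : "not ok 1\n";
```

Keeping the check in one place means each test only has to supply its two expected values.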

Re: gmake (perl-5.8.6) fails on z/OS

2005-07-26 Thread Nicholas Clark
that make a > difference to miniperl ? Well, the code is linked into miniperl, so I can only assume that it's getting called. If so, does removing the second instance of NATIVE_TO_UTF() improve things? Nicholas Clark

Re: gmake (perl-5.8.6) fails on z/OS

2005-07-26 Thread Nicholas Clark
thing I can think of is that I notice that further down that function there is:
    #ifdef EBCDIC
        uv = NATIVE_TO_UTF(uv);
    #else
        if ((uv == 0xfe || uv == 0xff) && !(flags & UTF8_ALLOW_FE_FF)) {
            warning = UTF8_WARN_FE_FF;
            goto malformed;
        }
    #endif
Is that second call to NATIVE_TO_UTF still present in your modified code? Nicholas Clark

Re: gmake (perl-5.8.6) fails on z/OS

2005-07-26 Thread Nicholas Clark
y change you've made to the source code? 2: Without that change, how does your build fail? How do the errors differ? Nicholas Clark

Re: value stored in $1

2005-06-10 Thread Nicholas Clark
On Fri, Jun 10, 2005 at 12:02:27PM +0100, Nicholas Clark wrote: > It would be better if you sent 1 e-mail to both perl-unicode.perl.org > and perl5-porters@perl.org to ask a given question, rather than two. It would help if I got the address correct. (or avoided using the format used in DN

Re: request for categorising unicode test failures on z/OS.

2005-04-22 Thread Nicholas Clark
spective ? I also can't answer this, but my hunch is that from a debugging perspective, tackling 7) and 5) first is the way to go. Until these bugs are solved, it's quite probable that attempts to solve the other problems will be hindered by errors introduced by these bugs. Nicholas Clark

Re: is it utf8 or unicode?

2005-03-16 Thread Nicholas Clark
e bugs. CPAN module authors have also fixed bugs. Nicholas Clark

Re: is it utf8 or unicode?

2005-03-16 Thread Nicholas Clark
ecifically its XS code is not checking the internal UTF8 flag before doing things with the PV. Nicholas Clark

Re: is it utf8 or unicode?

2005-03-16 Thread Nicholas Clark
ncode(_utf8_on); my $data = "\xC3\x84"; _utf8_on($data); open FH, ">:utf8", "aa"; print FH $data ; print length($data); to tell perl that the file handle is expecting UTF8 rather than the default, then you get a 2 byte file output. Nicholas Clark
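A small sketch of the effect described above, under some assumptions: it uses Encode's decode rather than the internal _utf8_on switch to obtain the one-character string, and a File::Temp temporary file instead of the file named "aa". It compares the character count with the number of bytes written through a ':utf8' handle.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Two UTF-8 bytes decoded into a single character (U+00C4).
my $chars = decode('UTF-8', "\xC3\x84");

# Write it through a handle that is told to expect UTF-8.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh, ':utf8');
print {$fh} $chars;
close($fh) or die "close failed: $!";

printf "characters: %d, bytes on disk: %d\n", length($chars), -s $path;
```

On an ASCII platform this should report one character and two bytes, matching the 2 byte file mentioned in the message.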

Re: real UTF-8 vs. utf8n_to_uvuni()

2004-12-06 Thread Nicholas Clark
est 107 in ext/Unicode/Normalize/t/illegal.t at line 59 fail #11
not ok 108
# Failed test 108 in ext/Unicode/Normalize/t/illegal.t at line 60 fail #11
not ok 109
# Failed test 109 in ext/Unicode/Normalize/t/illegal.t at line 61 fail #11
not ok 110
# Failed test 110 in ext/Unicode/Normalize/t/illegal.t at line 62 fail #11
ok 111
ok 112
I don't know what is at fault here, the tests, or the patch. Nicholas Clark

Re: Segfault using HTML::Entities

2004-07-07 Thread Nicholas Clark
know how. I don't think that you need to file a report, as we're now aware of it. Jarkko managed to cut the test case down to something very small, but we can't manage to make a fix that doesn't break regexps in something else, seemingly completely unrelated. Nicholas Clark

Re: Segfault using HTML::Entities

2004-06-30 Thread Nicholas Clark
On Wed, Jun 30, 2004 at 10:15:13PM +0100, Richard Jolly wrote: > > On 30 Jun 2004, at 17:52, Nicholas Clark wrote: > > >On Tue, Jun 29, 2004 at 06:49:16PM +0100, Richard Jolly wrote: > >> Script > > > >Could you resend the script/data test case as an attac

Re: Segfault using HTML::Entities

2004-06-30 Thread Nicholas Clark
On Tue, Jun 29, 2004 at 06:49:16PM +0100, Richard Jolly wrote: > Script Could you resend the script/data test case as an attachment please? It's been mangled by the format flowed on your mailer and currently I'm getting errors which suggest that I can't undo that damage.

Re: Unicode filenames on Windows with Perl >= 5.8.2

2004-06-22 Thread Nicholas Clark
for arguing's sake these days. Suggest a workable solution, volunteer to actually do it and I think that everyone will be happy. My only thought is should the API be full SVs, or char pointer plus utf8/not flag? (possibly as 1 bit in a flags word) Nicholas Clark

Re: BOM and principle of least surprise

2004-05-18 Thread Nicholas Clark
rated it into the maintenance branch, but I intend to, so unless it causes really really strange errors (very unlikely) it will be in 5.8.5. (Which in turn will be in mid July) Nicholas Clark

Re: How to handle unicode strings in utf8 and pre-utf8 pragma perls

2003-05-31 Thread Nicholas Clark
if $] >= 5.006, utf8; On CPAN as http://search.cpan.org/author/ILYAZ/if-0.0101/ In the core since 5.8.0 Nicholas Clark
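For illustration, a complete statement using the if module might look like the sketch below. Quoting the module name is an addition here, needed because 'use strict' rejects the bareword; the %INC check is only there to show that the pragma was loaded.

```perl
use strict;
use warnings;

# Load the utf8 pragma only when the running perl is 5.6 or later.
use if $] >= 5.006, 'utf8';

print "utf8 pragma loaded\n" if $INC{'utf8.pm'};
```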

Re: Reading/writing non-Unicode files with perl5.8?

2003-01-14 Thread Nicholas Clark
L and LC_CTYPE don't contain a string matching /utf-?8/i) I don't know what sets these variables on RedHat systemwide, so I don't know how to change them. My personal opinion is that it was premature of RedHat to make RedHat 8.0 *default* to using UTF8 locales, given the general state o
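A sketch of the kind of check described above, using the pattern quoted in the message. The helper name and the list of variables consulted (LC_ALL, LC_CTYPE, LANG) are assumptions, since the excerpt is cut off before the first variable name, and this is not a copy of what perl itself does at startup.

```perl
use strict;
use warnings;

# Hypothetical helper: true if any of the assumed locale variables
# in the given environment hash names a UTF-8 locale.
sub locale_mentions_utf8 {
    my ($env) = @_;
    for my $name (qw(LC_ALL LC_CTYPE LANG)) {
        my $value = $env->{$name};
        return 1 if defined $value && $value =~ /utf-?8/i;
    }
    return 0;
}

print locale_mentions_utf8(\%ENV)
    ? "environment names a UTF-8 locale\n"
    : "environment does not name a UTF-8 locale\n";
```

Passing the environment as a hash reference makes it easy to try the check against settings other than the current ones.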

Re: [not-yet-a-PATCH] compress Encode better

2002-12-21 Thread Nicholas Clark
On Mon, Nov 04, 2002 at 03:26:16AM +, [EMAIL PROTECTED] wrote: > Nicholas Clark <[EMAIL PROTECTED]> wrote: > :I've been experimenting with how enc2xs builds the C tables that turn into the > :shared objects. enc2xs is building tables (arrays of struct encpage_t) which >

Re: [not-yet-a-PATCH] compress Encode better

2002-12-20 Thread Nicholas Clark
On Mon, Nov 04, 2002 at 03:26:16AM +, [EMAIL PROTECTED] wrote: > Nicholas Clark <[EMAIL PROTECTED]> wrote: > > :The default method is to see if my substring is already present somewhere, > :if so note where, if not append at the end. The (currently buggy) -O optimiser >
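The default method quoted above (reuse a substring if it is already present, otherwise append it) can be sketched in a few lines. This is an illustration with a hypothetical helper name, not the code from enc2xs, and it says nothing about the -O optimiser.

```perl
use strict;
use warnings;

# Hypothetical helper: return the offset of $substring in the pool,
# appending it only when it is not already present somewhere.
sub pool_offset {
    my ($pool_ref, $substring) = @_;
    my $offset = index($$pool_ref, $substring);
    return $offset if $offset >= 0;     # already present: note where
    $offset = length $$pool_ref;
    $$pool_ref .= $substring;           # not present: append at the end
    return $offset;
}

my $pool = '';
my @offsets = map { pool_offset(\$pool, $_) }
    ("\x01\x02\x03", "\x02\x03", "\x04");
printf "offsets: %s, pool length: %d\n", join(', ', @offsets), length $pool;
```

The second string is found inside the first, so it costs no extra space; only strings that are genuinely new make the pool grow.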

Re: CGI and UTF

2002-11-20 Thread Nicholas Clark
t 5.8.0?] Basically is there something that the perl development community needs to do (or change) that would avoid this in future? Nicholas Clark -- Befunge better than perl? http://www.perl.org/advocacy/spoofathon/

Re: [not-yet-a-PATCH] compress Encode better

2002-11-04 Thread Nicholas Clark
ogai wrote: > > oh wait! Encode.xs remains unchanged so Encode::* may still work > > Confirmed. The NC patch works w/ preexisting shlibs. Good. It would have been worrying if it had not. The idea was not to change any of the internal data structures visible to any code anywhere, just to change how the U8 strings they point to were arranged. Nicholas Clark -- z-code better than perl? http://www.perl.org/advocacy/spoofathon/

Re: [not-yet-a-PATCH] compress Encode better

2002-11-03 Thread Nicholas Clark
On Sun, Nov 03, 2002 at 11:13:25PM +, Nicholas Clark wrote: > Currently the appended patch passes all regression tests on FreeBSD on > bleadperl. However, having experimented I know that the new -O function it > provides is buggy in some way, as running -O on the Chinese encodi

[not-yet-a-PATCH] compress Encode better

2002-11-03 Thread Nicholas Clark
Encode-O0-Agg/lib/auto/Encode/Symbol/Symbol.so
937328 18075-Encode-O0-Agg/lib/auto/Encode/TW/TW.so
12039 18075-Encode-O0-Agg/lib/auto/Encode/Unicode/Unicode.so
4706245 total
Nicholas Clark -- XSLT better than perl? http://www.perl.org/advocacy/spoofathon/ --- ext/Encode/bin/enc2xs.orig S

Re: RFC 2231 (was Re: Encode::MIME::Header...)

2002-10-09 Thread Nicholas Clark
obsessed with speed currently, so I doubt I can find them by searching for "speed". And I can't remember why I might have suggested that allowing optional arguments would induce serious slowdown. (by implication even when no optional arguments are used) Nicholas Clark PS shameless plug for optimising your perl code talk: http://www.ccl4.org/~nick/P/Fast_Enough/

Re: Unicode to UTF-8

2002-09-08 Thread Nicholas Clark
e considering upgrading from something like 5.005, is there any reason not to consider going straight to 5.8.0? The Unicode support in 5.8.0 is much better than in 5.6.1, and it also fixes many of the bugs still present in 5.6.1. (Nothing is perfect - a few new bugs have been reported in 5.8.0,

Re: Unicode::Collate 0.23 Released

2002-09-05 Thread Nicholas Clark
cpan.org, enter Unicode::Collate in the box, hit go, top of the returned list) CPAN's usually the best place to start when looking for anything perl. Nicholas Clark

Re: Pattern matching with Unicode (5.6.1)

2002-08-15 Thread Nicholas Clark
On Thu, Aug 15, 2002 at 05:28:43PM -0400, David Gray wrote: > > I'm having a bit of a problem getting Unicode pattern > > matching to do what I would like it to. > > I guess my question wasn't entirely clear. I'm reading in the attached > file and trying to split it on "\n\n". > > When I'm loo

Re: how to utf8::encode and ::decode in 5.6.1

2002-08-10 Thread Nicholas Clark
On Tue, Aug 06, 2002 at 10:36:09PM +0900, SADAHIRO Tomoyuki wrote: > > On Mon, 5 Aug 2002 22:17:10 +0100 > Nicholas Clark <[EMAIL PROTECTED]> wrote: > > > I'm trying to backport ExtUtils::Constant from 5.8.0 to work on perl pre > > 5.8.0. Currently ExtUtils:

Re: Tk804 + Encode-1.50 :-) again

2002-04-19 Thread Nicholas Clark
e power of the dark side. M-x flyspell-mode Definitely part of the dark side because here it defaults to American. And then refuses to start because I don't have American dictionaries installed. ispell has no problem "just running" and finding the correct dictionaries. Nicholas C

Re: [Encode] Encode::Guide ? (Was: Re: Encode::CJKguide ...)

2002-03-27 Thread Nicholas Clark
the various hoops the different encoding systems are forced to jump through, when one only knows languages which use the Roman alphabet and therefore has had no direct experience of anything other than ASCII and ISO 8859-1. But it's more important to get Encode working well than spend time on th

Re: Encode::XS for CJK

2002-02-21 Thread Nicholas Clark
$seen{$uch} = [$page,$ch]; } enter($e2u,$ech,$uch,$e2u,0); enter($u2e,$uch,$ech,$u2e,0); } else { # No character at this position # enter($e2u,$ech,undef,$e2u); } $ch++; } Is there a bug? Should the $ch++ hap

Re: Encode; Should we aggregate all EUCs?

2002-02-06 Thread Nicholas Clark
On Wed, Feb 06, 2002 at 09:59:44AM +, Nick Ing-Simmons wrote: > Nicholas Clark <[EMAIL PROTECTED]> writes: > >On Tue, Feb 05, 2002 at 04:29:34PM +, Nick Ing-Simmons wrote: > >> If I throw jis208.enc into the pot, then without -O it is 12s > >> and with

Re: Encode; Should we aggregate all EUCs?

2002-02-05 Thread Nicholas Clark
isk buffers, so the OS is doing its best however you config things]. Or am I barking up the wrong tree? Nicholas Clark -- EMCFT http://www.ccl4.org/~nick/CV.html

Re: Encode; Should we aggregate all EUCs?

2002-02-05 Thread Nicholas Clark
It saves 22K but is that worth while? Then surely this extra searching becomes the configure question? Try harder to compress CJK encodings (this will slow your build considerably)? [no] Unless we find a more efficient algorithm to search for common substrings. Nicholas Clark -- EMCFT http://www.ccl4.org/~nick/CV.html