Re: iso-2022-jp, adding encodings..

2001-06-16 Thread Dan Kogai
Ken Lunde's book http://www.oreilly.com/catalog/cjkvinfo/ Dan the Man with Too Many Charsets to Handle -- Dan Kogai, CEO, DAN co. ltd., 2-8-14-418 Shiomi Koto-ku Tokyo 135-0052 Japan, mailto: [EMAIL PROTECTED], http://www.dan.co.jp/ Tel

Re: Japanese text search problem

2001-08-07 Thread Dan Kogai
on 01.8.7 9:34 PM, Jarkko Hietaniemi at [EMAIL PROTECTED] wrote: On Tue, Aug 07, 2001 at 05:37:00PM +0530, Ashutosh Salgarkar wrote: Hi all, We are trying to search japanese keyword using a search string(in perl using pattern matching). We are facing problem while searching a particular

Re: Japanese text search problem

2001-08-07 Thread Dan Kogai
on 01.8.7 10:53 PM, Jarkko Hietaniemi at [EMAIL PROTECTED] wrote: * Use perl 5.6.0 or above I would strongly urge using 5.6.1 at this point: several Unicode bugs in 5.6.0 were fixed for 5.6.1. Everyone hear that? Use 5.6.1. That is also the version that is covered by 3rd Camel Book (It

Re: Japanese text search problem

2001-08-07 Thread Dan Kogai
on 01.8.8 1:14 AM, Benjamin Franz at [EMAIL PROTECTED] wrote: On Tue, 7 Aug 2001, Ashutosh Salgarkar wrote: my $safe_key = quotemeta($key1); $searchStr =~ m/$safe_key/; is probably what you want. I am presuming you are trying to use m// to search for exact string matches rather than
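A minimal sketch of the quotemeta() approach suggested in the snippet above. The variable names $key1, $searchStr and $safe_key come from the post; the sample strings are hypothetical and chosen only because they contain regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical data: a keyword that contains regex metacharacters.
my $key1      = 'C++ (draft)';
my $searchStr = 'notes on C++ (draft) encodings';

# quotemeta() backslash-escapes every non-word character, so the
# keyword is matched literally instead of being parsed as a pattern.
my $safe_key = quotemeta($key1);
my $found    = ( $searchStr =~ m/$safe_key/ ) ? 1 : 0;

# The same thing written inline with \Q...\E.
my $found_inline = ( $searchStr =~ m/\Q$key1\E/ ) ? 1 : 0;

print "found=$found inline=$found_inline\n";
```

For a plain substring test with no pattern features at all, index($searchStr, $key1) >= 0 is another option that avoids the regex engine entirely.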

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
On 2002.02.02, at 00:32, Jarkko Hietaniemi wrote: So far as I see Linux iconv is ascii-preservative while ICU's is Unicode-strict. From Perl's point of view ASCII preservative should be default. Why? I have already answered in the previous mail (Subject:More on Unicode Mappings,

Re: ICU's uconv vs Linux iconv and UTF-8

2002-02-01 Thread Dan Kogai
Marco, Thank you for elaborating my points. On 2002.02.02, at 01:40, Marco Cimarosti wrote: The entire former contents of this directory are obsolete and have been moved to the OBSOLETE directory. The latest information may be found in the Unihan.txt file in the latest Unicode

Encode; Should we aggregate all EUCs?

2002-02-04 Thread Dan Kogai
Folks, First, thank you for perl@14550. Based upon that, I tried aggregating all EUCs (euc-(cn|jp|kr)) as Nick suggested. It did work nicely except for the time it compiles. Awful lot of time. The tty gets silent for some 3 minutes at "Writing compiled form". With EUC alone taking

Encode-0.40.tar.gz

2002-02-16 Thread Dan Kogai
Folks, I have made Encode-0.40 available at http://www.dan.co.jp/~dankogai/Encode-0.40.tar.gz Changes Implemented based upon Encode/EUC_JP Encode::JP (Changed from Encode::Japanese to make room for submodules) euc-jp shiftjis 7bit-jis

Re: Some Encode::TW test results.

2002-02-18 Thread Dan Kogai
Autrijus, welcome to the club :) On 2002.02.19, at 09:43, Autrijus Tang wrote: Being a native Big5/GB (and HKSCS, Big5+, etc) user, I'm extremely happy to see Dan's work on Encode.pm. :-) Actually more credit should go to Nick Ing-Simmons; Encode::XX is so far all based upon his

Re: Encode test problems in EBCDIC

2002-02-21 Thread Dan Kogai
jhi, On 2002.02.22, at 07:54, Jarkko Hietaniemi wrote: Hi, the new JP test and the old Tcl test are doing somewhat okay in EBCDIC (I'm using an OS/390 mainframe). I wish I had an access to it... Failed Test Stat Wstat Total Fail Failed List of Failed

5.8 roadmap and Encode

2002-02-28 Thread Dan Kogai
jhi, Sorry for the long silence. My load average has been high this week. On 2002.02.26, at 05:38, Jarkko Hietaniemi wrote: Reminder: 5.7.3 one week from now (2002-03-04), code freeze (*) two weeks from now (2002-03-11), 5.8.0 RC1 three weeks from now (2002-03-18) (*) Small limited bug fixes

Re: 5.8 roadmap and Encode

2002-03-04 Thread Dan Kogai
On 2002.03.05, at 05:28, Jarkko Hietaniemi wrote: Yup, the extra Encode::XX were needed for static builds. I wonder how many folks need static perl these days. Note that the CNa JPa KRa TWa bring in extra 55 MB, which is quite a bit extra, the whole Perl being here (x86 linux) only 6 MB. I

Re: What to do with non-assigned points?

2002-03-18 Thread Dan Kogai
On Monday, March 18, 2002, at 07:12 , Nick Ing-Simmons wrote: What should Encode:: do in such cases: A. U+FFFD B. Map octet to Unicode/iso-8859-1 C. Use a private use page... We should leave option C for those who want to write their own encoding modules. Whether we should use A
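For reference, option A (substituting U+FFFD) is what the Encode module as it ships today does by default when decode() meets octets it cannot map. The sketch below uses an invalid US-ASCII byte as a stand-in; whether it corresponds exactly to the unassigned code points discussed in this thread is an assumption.

```perl
use strict;
use warnings;
use Encode qw(decode);

# 0xFF has no meaning in US-ASCII, so it cannot be mapped.
my $octets = "a\xFFb";

# With the default CHECK value, decode() substitutes U+FFFD
# (REPLACEMENT CHARACTER) rather than dying.
my $text = decode( 'ascii', $octets );

print join( ' ', map { sprintf 'U+%04X', ord } split //, $text ), "\n";
```

Passing Encode::FB_CROAK as a third argument makes the same call die instead of substituting.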

Encode-0.90 is now available

2002-03-19 Thread Dan Kogai
From: Dan Kogai [EMAIL PROTECTED] Date: Tue Mar 19, 2002 11:04:28 Asia/Tokyo To: Dan Kogai [EMAIL PROTECTED] Subject: Encode-0.90 now available Encode Hackers, I have made Encode-0.90 available at http://www.dan.co.jp/~dankogai/Encode-0.90.tar.gz Changes since perl@15300

Encode-0.90 revised

2002-03-19 Thread Dan Kogai
Encode Hackers, On Tuesday, March 19, 2002, at 11:05 , Dan Kogai wrote: Encode Hackers, I have made Encode-0.90 available at http://www.dan.co.jp/~dankogai/Encode-0.90.tar.gz I have revised Encode-0.90 so that it checks jisx0212-1990. The URL remains the same, http://www.dan.co.jp

Time to upload Encode to CPAN?

2002-03-19 Thread Dan Kogai
On Wednesday, March 20, 2002, at 01:01 , Jarkko Hietaniemi wrote: You may want to grab http://www.iki.fi/jhi/Encode-for-Dan.tgz and make that 0.91 (Sadahiro's tweaks, one tweak from NI-S) Yikes. I just made 0.91 with Sadahiro's but without the from NI-S. But it is only at

Re: [craigberry@mac.com: + not a valid filename char on VMS]

2002-03-19 Thread Dan Kogai
Oops. I meant to send it to [EMAIL PROTECTED], not [EMAIL PROTECTED] Address completion is sometimes too smart. Craig and Encode hackers, Okay, Here is the quick solution which will also be the part of official changes. In the instructions that follow, I assume perl/ext/Encode or

Re: [tagunov@motor.ru: Better names for JIS X 0201/0208/0212? (was: ISO-8859-1 vs ISO 8859-1 (typo + UTF8 case too :)]

2002-03-19 Thread Dan Kogai
Anton, You may not know this but JISX2\d\d does not determine encoding at all. They are tables that actual encodings are based upon. In other words, RAW JISX2\d\d are NEVER used in actual encodings. Here is how EUC-JP uses ALL of them. EUC-JP consists of ASCII:
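The point that the JIS X tables are building blocks rather than encodings can be seen by encoding one character from each table as EUC-JP. This is a rough sketch using the current Encode module; the sample characters are arbitrary, and JIS X 0212 is left out.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One sample character per underlying table (arbitrary choices).
my @samples = (
    [ 'ASCII "A"',                      "A" ],
    [ 'JIS X 0208 kanji U+65E5',        "\x{65E5}" ],
    [ 'JIS X 0201 halfwidth kana U+FF71', "\x{FF71}" ],
);

my %hex;
for my $sample (@samples) {
    my ( $label, $char ) = @$sample;
    my $octets = encode( 'euc-jp', $char );
    # Render the resulting octets as upper-case hex pairs.
    $hex{$label} = join ' ', map { sprintf '%02X', ord } split //, $octets;
    printf "%-34s %s\n", $label, $hex{$label};
}
```

The ASCII character stays a single 7-bit byte, the JIS X 0208 character becomes two bytes with the high bit set, and the JIS X 0201 kana is introduced by the 0x8E single-shift byte.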

No phonetic scripts and still happy?

2002-03-20 Thread Dan Kogai
On Thursday, March 21, 2002, at 01:14 , Larry Wall wrote: Jean-Michel Hiver writes: : Do Chinese use Katakana? Does such encoding make : sense? : : Not as far as I know. Hiragana / Katakana is exclusively a Japanese : thing ain't it? Yes. Bopomofo is the Chinese equivalent. Not

Re: Encode-0.93 released; Uploaded to CPAN

2002-03-20 Thread Dan Kogai
On Thursday, March 21, 2002, at 02:51 , Jarkko Hietaniemi wrote: You really need to have this since Sadahiro's patch removed Tcl::Extended... Change 15336 by jhi@alpha on 2002/03/19 18:41:50 Begone, Encode::Tcl::Extended. Thank you. Applied. Proof as follows; rcsdiff -u

Re: No phonetic scripts and still happy?

2002-03-20 Thread Dan Kogai
On Thursday, March 21, 2002, at 05:54 , Nicholas Clark wrote: By Alphabet you mean the Roman Alphabet? Yes. Roman alphabet that is. But Greek Alphabet is no stranger here in Japan (They all exist in JIS X 0208). Any high school students have to learn to spell theta to mean angle (but I

Re: ext/Encode/t/CJKalias.t

2002-03-20 Thread Dan Kogai
On Thursday, March 21, 2002, at 07:53 , Jarkko Hietaniemi wrote: Change 15377 by jhi@alpha on 2002/03/20 21:50:11 A plan is better. Affected files ... ... //depot/perl/ext/Encode/t/CJKalias.t#3 edit Thank you. Applied. Come to think of it, there is no reason not to test

Alert: Encode alias implementation is buggy

2002-03-20 Thread Dan Kogai
Encode Hackers, I have found a big bug in alias implementation; you can't override alias later on! Suppose there is a user who wants to use vendor encoding. S/he may Encode::define_alias( qr/sjis$/i => 'cp932'); # Windoze or die! But you will still get shiftjis. I checked
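For comparison, here is how the same override looks with the Encode::Alias module as it ships today. This sketch assumes a current Encode, where later definitions are consulted first. Note that for a regex alias the value is evaluated as a Perl expression, hence the inner double quotes, which differ from the 2002 snippet.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);
use Encode::Alias;    # exports define_alias

# Redirect anything ending in "sjis" to the vendor code page.
# The right-hand side is a string that gets eval'ed, so the
# encoding name needs its own quotes.
define_alias( qr/sjis$/i => '"cp932"' );

my $enc  = find_encoding('sjis');
my $name = $enc ? $enc->name : '(not found)';
print "sjis resolves to $name\n";
```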

Encode alias implementation fixed!

2002-03-21 Thread Dan Kogai
On Thursday, March 21, 2002, at 10:55 , Dan Kogai wrote: sub define_alias [snip] Sorry. My finger has slipped. I am pleased to announce that define_alias() has been fixed now. It was tougher than I thought. Changes involve... + Encode/ibm-125?.ucm Added from icu distribution

Alert: Module Layout Updated

2002-03-21 Thread Dan Kogai
Encode Hackers, Here are changes since 0.95. I decided to let ISO-8859 and other single-byte encodings demand-load like Encode::XX. + Byte/Byte.pm + Byte/Makefile.PL + EBCDIC/EBCDIC.pm + EBCDIC/Makefile.PL + Symbol/Makefile.PL + Symbol/Symbol.pm ! Encode.pm ! Encode.xs Latin and single

Re: Encode::JP unsupported while Encode::CN and others are

2002-03-23 Thread Dan Kogai
Oh, I forgot to say that the patch has been applied anyhow. I will release 0.97 very soon (within an hour) INCLUDING so you may wait before that. Dan On Sunday, March 24, 2002, at 04:07 , Dan Kogai wrote: On Sunday, March 24, 2002, at 03:21 , Jarkko Hietaniemi wrote: Change 15442 by jhi

Re: [gp@familiehaase.de: [PATCH] Re: coexist of ebcdic.c EBCDIC.c on Cygwin not possible]

2002-03-23 Thread Dan Kogai
Gerrit, I've fixed that one before the patch arrives but yours is more beautiful than mine (I used 'bcd' in place of your 'ebcdic_t' and 'dingbats' instead of 'symbol_t'), I will use your version. I will also fix other Makefile.PL to be consistent. Dan the Encode Maintainer On Sunday,

Encode::(CN|JP|KR|TW) unsupported on EBCDIC env

2002-03-23 Thread Dan Kogai
On Sunday, March 24, 2002, at 04:18 , Jarkko Hietaniemi wrote: The CN.t and TW.t are marked do not go here if EBCDIC, so they are just skipped. The respective CN/TW/KR.pm should be marked, too, that's just an oversight, since as Encode now stands, it does not work. Okay, I will then mark

Re: Smoke 15435

2002-03-23 Thread Dan Kogai
On Sunday, March 24, 2002, at 05:34 , Jarkko Hietaniemi wrote: The JP used to be okay on Win32, at least for a few days now... Could I possibly have broken something when integrating the latest Encode to bleadperl? I didn't modify any of the Encode files, though...? I am not sure. I need

Smarter EBCDIC handling

2002-03-23 Thread Dan Kogai
On Sunday, March 24, 2002, at 05:19 , Jarkko Hietaniemi wrote: It seems that I sadly have to conclude that any encodings that use MSB in bytes as multibyte character mark won't work The Golden Rule of Software Engineering: Adding Another Layer of Indirection Helps. Usually. The

Re: Introduce new alias for GB 18030

2002-03-23 Thread Dan Kogai
On Sunday, March 24, 2002, at 01:03 , Anton Tagunov wrote: Hello Dan! Maybe we would be better off with having something like define_alias( qr/^GB(?:\s|-)?(.+)/i => 'gb$1' ); this will make things like GB 18030 GB 12345 work. It looks good enough for me but I will wait for

Re: URL in L, non-ascii text in pod

2002-03-24 Thread Dan Kogai
On Sunday, March 24, 2002, at 05:39 , Autrijus Tang wrote: Yes. Please consult L<perlpodspec>: L<name> -- a hyperlink ...Notably, the content has to be checked for whether it looks like a URL, or whether it has to be split on literal | and/or / (in the right order!), and so on,

[Encode] Proposal; Make them all .ucm and detach Encode::Tcl

2002-03-24 Thread Dan Kogai
On Monday, March 25, 2002, at 04:06 , Dan Kogai wrote: *.pm and *.pod are easy to fix but *.enc is tough because Encode::Tcl and compile faithfully generate canonical encoding name out of filenames. For the time being I will fix *.pod and *.pm but I am stoked on *.euc. Tell me what

Re: [Encode] Proposal; Make them all .ucm and detach Encode::Tcl

2002-03-25 Thread Dan Kogai
On Monday, March 25, 2002, at 04:41 , Dan Kogai wrote: 0.Generate all the necessary *.ucm via perl5.7.3 ./compile -n $encname -o Encode/$encname.ucm Encode/$encname.enc 1. rename non-8.3-compliant *.ucm and reflect changes to */Makefile.PL 2. detach Encode::Tcl and upload it to CPAN

[Encode] Encode::Perl?

2002-03-25 Thread Dan Kogai
On Monday, March 25, 2002, at 06:59 , Nick Ing-Simmons wrote: It should not be too hard to take the .ucm file parsing from 'compile' and teach Encode::Tcl-like all-perl code to read .ucm-s. We can then rename it Encode::Perl ;-) I am considering that kind of option but I am not sure if it

Long name rocks! But how about *.ecm?

2002-03-25 Thread Dan Kogai
On Monday, March 25, 2002, at 09:37 , Nick Ing-Simmons wrote: in trouble? Or perl on such systems are smart enough to load UNIVERSA.pm (I guess this is the case). They load UNIVERSAL.pm and the OS truncates it and finds UNIVERSA.pm. Size reduction was a byproduct of */Makefile.PL

Configure patch to make static perl happy

2002-03-25 Thread Dan Kogai
jhi, I know you have already come up to the same conclusion but here is the patch to Configure to make static perl all right. Tested under FreeBSD. diff -u Configure Configure.orig --- Configure Mon Mar 25 23:09:16 2002 +++ Configure.orig Sat Mar 23 01:00:07 2002 -19455,7

So many Dans!

2002-03-25 Thread Dan Kogai
Now that I have submitted Encode-0.99, I am ready to go casual a little while On Tuesday, March 26, 2002, at 02:56 , Nick Ing-Simmons wrote: My earliest exposure to Japanese was from Judo terms - in Judo black-belt grades are known as 1st Dan, 2nd Dan etc. I seem to recall (it was years

[Encode] Encoding vs. Charset

2002-03-25 Thread Dan Kogai
Encode hackers (Especially Autrijus) I am now fairly content with the feature set of Encode so I decided to write some programs based upon it. And I have found that most of Chinese (Continental; seems like Taiwanese are much more technically correct) and Korean mails and web pages

Encode::CJKguide ( 500 lines Long!)

2002-03-26 Thread Dan Kogai
and the modern rocks, this module is for you. Perl 5.6 tackled the modern when it added Unicode support internally. Now in Perl 5.8 we tackled the classic by adding support for other encodings externally. I hope you like it. =head1 Author(s) By Dan Kogai E<lt>[EMAIL PROTECTED]E<gt>. Send your comments to E<lt>

More words on CJK-Guide (Was: Re: CJK-Guide)

2002-03-26 Thread Dan Kogai
On Wednesday, March 27, 2002, at 01:52 , Jarkko Hietaniemi wrote: I looked at it once more, and I'm torn. It does contain a lot good information, so I would like to have it Encode for 5.8.0, but at the same time, I must say that it is quite Unicode-negative. Not quite as negative as I've

Re: missing =head1:s

2002-03-26 Thread Dan Kogai
Let's work on easy ones first... On Wednesday, March 27, 2002, at 11:11 , Jarkko Hietaniemi wrote: buildtoc: ../ext/Encode/TW/TW.pm: cannot find =head1 NAME buildtoc: ../ext/Encode/Symbol/Symbol.pm: cannot find =head1 NAME buildtoc: ../ext/Encode/lib/Encode/10646_1.pm: cannot find =head1 NAME

[Encode] *.ucm files

2002-03-27 Thread Dan Kogai
On Wednesday, March 27, 2002, at 01:35 , Autrijus Tang wrote: CP949 is there in Encode::KR. CP950 is in Encode::TW. CP936 is in Encode::CN. CP932 is in Encode::JP. I have made it that way to make perl on the run save as much memory as possible. Only those in need get loaded. I've put

[Encode] Johab support, et al.

2002-03-27 Thread Dan Kogai
On Wednesday, March 27, 2002, at 01:40 , Jungshik Shin wrote: I've looked around ext/Encode and I found that CP949 is supported. So, what has to be added is JOHAB and what needs to be modified is EUC-KR to support 8byte seq. representation of Hangul syllables (see

Copyright of the generated image via http://www.unicode.org/cgi-bin/refglyph

2002-03-27 Thread Dan Kogai
As part of an effort to make Encode module successful, I am writing a simple web typesetter that renders whole text, not just each character via cgi (or mod_perl if the memory is enough). I would like to use the gif-rendered glyphs that are available via

Re: Is it true that there are no longer no official mappings from JIS X to Unicode?

2002-03-27 Thread Dan Kogai
On Wednesday, March 27, 2002, at 08:06 , Anton Tagunov wrote: Hello, Dan! BTW, is the guy speaking http://www.debian.or.jp/~kubota/unicode-symbols.html.en right or not? His article is dated september.. He is speaking about lack of official tables, that the tables have been withdrawn and

[Encode] Encode::Guide ? (Was: Re: Encode::CJKguide ...)

2002-03-27 Thread Dan Kogai
Anton, I am glad you liked it but as I announced Encode::CJKguide has been dropped. I am instead planning to make even more comprehensive guide that is not limited to CJK and upload it as Encode::Guide to CPAN. I will definitely call for your help. Dan On Wednesday, March 27, 2002, at

[Encode] Todo before release 1.00

2002-03-27 Thread Dan Kogai
Jungshik, I am ALMOST ready to release version 1.00 of Encode. But with a last minute, welcome addition of Jungshik Shin to the team, I am tempted to wait until he is done with Johab. So Jungshik, how long will you take to submit Johab.pm ? If you think it takes longer (FYI we are

Re: Is it true that there are no longer no official mappings from JIS X to Unicode?

2002-03-28 Thread Dan Kogai
Maintainer On Thursday, March 28, 2002, at 11:14 , Kenneth Whistler wrote: Dan Kogai passed on this question: [snip] So here is your definitive answer. [snip]

Re: a new patch for 2022_KR.pm

2002-03-28 Thread Dan Kogai
On Thursday, March 28, 2002, at 08:17 , Jungshik Shin wrote: Dan, Sorry again. Hopefully, this is the last one. The decoder still has a problem when used in a stream. Cheers, Jungshik All three of them carefully studied, patched against Encode-0.99 then merged back to

Fwd: [Encode] compile - bin/enc2xs

2002-03-28 Thread Dan Kogai
Oops, forgot to see [EMAIL PROTECTED] Dan Begin forwarded message: From: Dan Kogai [EMAIL PROTECTED] Date: Fri Mar 29, 2002 05:39:42 Asia/Tokyo To: Nick Ing-Simmons [EMAIL PROTECTED], Jarkko Hietaniemi [EMAIL PROTECTED], Autrijus Tang [EMAIL PROTECTED], Jungshik Shin [EMAIL PROTECTED

[Encode] 1.00 released at last!

2002-03-28 Thread Dan Kogai
, shiftjis, symbol, UCS-2, UCS-2le, US-ascii, UTF-8, viscii =head1 AUTHORS Anton Tagunov Autrijus Tang Dan Kogai Gerrit P. Haase Jarkko Hietaniemi Jungshik Shin Mark-Jason Dominus Michael G Schwern Nicholas Clark Nick Ing-Simmons Paul Marquess SADAHIRO Tomoyuki Spider Boardman =head1 Thanks! to all

[Encode 1.00] Even @INC manipulation and nasty #! line

2002-03-28 Thread Dan Kogai
On Friday, March 29, 2002, at 12:30 , Jarkko Hietaniemi wrote: Looks good so far (still compiling, except for one nit): in enc2xs I needed to add the dotdot-dotdot: BEGIN { unshift @INC, qw(../../lib ../../../lib ../../../../lib); $ENV{PATH} .= ';../..;../../..;../../../..' if $^O

[Encode 1.00] EveL @INC manipulation and nasty #! line

2002-03-28 Thread Dan Kogai
Hypnos is still dragging me :( typos everywhere... On Friday, March 29, 2002, at 04:22 , Dan Kogai wrote: BEGIN { unshift @INC, qw(../../lib ../../../lib ../../../../lib); $ENV{PATH} .= ';../..;../../..;../../../..' if $^O eq 'MSWin32'; } Yes, This @INC manipulation is ugly

[Encode] Charset-0.01 released

2002-03-29 Thread Dan Kogai
, open, PerlIO AUTHOR Dan Kogai [EMAIL PROTECTED] COPYRIGHT AND LICENSE Copyright 2002 by Dan Kogai, all rights reserved. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Re: Encode and Unicode 3.2 and CJK

2002-03-29 Thread Dan Kogai
On Saturday, March 30, 2002, at 04:44 , Jarkko Hietaniemi wrote: Gentlemen, you may want to read Unicode 3.2 ( http://www.unicode.org/unicode/reports/tr28/ ) It does say something about Han, Katakana, and Hangul (sections 10.1, 10.3, and 10.4). (No, I don't know what happened to 10.2). What

Re: [Encode] Charset-0.01 released

2002-03-29 Thread Dan Kogai
On Saturday, March 30, 2002, at 04:43 , Martin 'Kingpin' Thurn wrote: Very nice, but shouldn't you name it Filter::Charset to prevent namespace explosion? I first thought so but the temptation to perl -MCharset=your-encoding -e was too insatiable. Dan the Man with Too Many Toplevel

[Encode] 1.01 released; changes minor

2002-03-29 Thread Dan Kogai
I have uploaded Encode-1.01 as http://www.dan.co.jp/~dankogai/Encode-1.01.tar.gz and CPAN. Here is the Changes. 1.01 $Date: 2002/03/29 20:59:39 $ ! Makefile.PL ! README s/USE_SCRIPTS/MORE_SCRIPTS/ ! Makefile.PL installs enc2xs by default for external Encode:: modules in CPAN, such

[Encode] poll; should *.ucm be relocated out of Encode?

2002-03-29 Thread Dan Kogai
Encode Hackers, I think I have overlooked the impact of ucm-transition. It is possible that *.ucm be relocated elsewhere so they won't be installed in lib/perl5. Unlike *.enc files which are actually used by modules *.ucm are not needed for runtime because they are all compiled-in.

Re: patch 15589/Encode 1.00 update: icelend-icelandic

2002-03-29 Thread Dan Kogai
On Saturday, March 30, 2002, at 12:52 , Andreas J. Koenig wrote: I think, the renaming is a mishap, right? Ping? Pong. Sorry. I'm listening now. -code_set_name maciceland +code_set_name MacIcelandic I see no justification for the change, I mean the appended ic, not the StudlyCaps. The

[Encode] How to support (Apple's) compound Unicode characters?

2002-03-29 Thread Dan Kogai
On Saturday, March 30, 2002, at 03:24 , Dan Kogai wrote: Okay. I've checked http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ One more time and it seems that other missing encodings are available as well, such as korean. I'll look into that. I think I have found the reason why

[Encode] *.ucm relocated; Encode/ - ucm/

2002-03-30 Thread Dan Kogai
Autrijus, On Saturday, March 30, 2002, at 07:20 , Autrijus Tang wrote: On Sat, Mar 30, 2002 at 07:07:34AM +0900, Dan Kogai wrote: mm. In a sense, .ucm files is like .c files -- it can contain comments and even embedded documentations (CWEB, heh), but in the end of day it's the .so

Re: [Encode] encoding vs. Charset

2002-03-30 Thread Dan Kogai
On Sunday, March 31, 2002, at 12:46 , Jarkko Hietaniemi wrote: Feel free to to adopt encoding.pm as a part of Encode and further develop it. Okay. I will add it from the next version of Encode. Oh, one more thing. I would like you to copy ext/Encode/bin/enc2xs to utils/enc2xs.PL Come

[Encode] should piconv be installed in INST_SCRIPT as well?

2002-03-30 Thread Dan Kogai
On Sunday, March 31, 2002, at 01:52 , Jarkko Hietaniemi wrote: I will. How about the piconv? Well, let's take a poll first on this. While this script is not as imperative as enc2xs, It is a part of perl culture to reinvent a better wheel, such as a2p and find2perl. For those of you

[Encode] is there any way to tell the current I/O disciplines ?

2002-03-30 Thread Dan Kogai
NI-S, I am now working on encoding.pm. I wrote a more elaborate version of jperl.t and the prognosis is bright. Now I am adding I/O control for encoding.pm and found one thing; To make no encoding work with I/O control, the old state of line disciplines must be stored somewhere.

[Encode] 1.10 released

2002-03-31 Thread Dan Kogai
Folks, Encode-1.10 is now available via http://www.dan.co.jp/~dankogai/Encode-1.10.tar.gz As well as CPAN -- before the Fool's day (in Zulu, that is. It already is in JST (+0900)). Detailed Changes after the sig. Here is the summary of changes that may affect others. *

[Encode] ahem, 1.11

2002-03-31 Thread Dan Kogai
I am embarrassed to announce that I have released Encode-1.11 in less than an hour after 1.10. Available at http://www.dan.co.jp/~dankogai/Encode-1.11.tar.gz As well as CPAN. Still on April Fool's Eve on GMT. 1.11 $Date: 2002/03/31 22:12:13 $ + t/encoding.t + t/jperl.t ! MANIFEST

Re: [ml1050@freemail.hu: perl@15605 on dos]

2002-03-31 Thread Dan Kogai
Laszlo, Thanks for the report. On Monday, April 1, 2002, at 07:33 , Jarkko Hietaniemi wrote: - Forwarded message from Laszlo Molnar [EMAIL PROTECTED] - Subject: perl@15605 on dos From: Laszlo Molnar [EMAIL PROTECTED] Date: Mon, 1 Apr 2002 00:26:34 +0200 Message-ID: [EMAIL

[Encode] piconv lints

2002-04-01 Thread Dan Kogai
jhi, Thank you so much for 3 consecutive patches. All applied successfully. Combined result right after the sig. BTW, How do you say thanks in Suomi? I tried to cheat via http://www.freedict.com/ but gave me no clue (there were 10 listed under Japanese!) Dan the Encode Maintainer

Re: [Encode] How to support (Apple's) compound Unicode characters?

2002-04-01 Thread Dan Kogai
On Monday, April 1, 2002, at 07:33 , Nick Ing-Simmons wrote: Dan Kogai [EMAIL PROTECTED] writes: I think I have found the reason why some of the encodings were missing from Tcl's *.enc, which later turned into *.ucm. Apple makes use of Unicode compound characters too extensively, which

Re: Encode seriously broken

2002-04-01 Thread Dan Kogai
Andreas, On Monday, April 1, 2002, at 10:14 , Andreas J. Koenig wrote: This isn't an April's fool email though I feel as if this perl were an April's fool perl. With todays perl I cannot get the same job done that was working with perl@14354. My application (an indexer) runs for a while and

Re: [Encode] enc2txt missing under perl-current/utils/

2002-04-01 Thread Dan Kogai
On Tuesday, April 2, 2002, at 12:26 , Jarkko Hietaniemi wrote: Just checking: is enc2xs really a widely enough useful tool to install for everybody? It's pretty specialized, after all. It is special indeed for those wanting to add ucm-based new encodings. However, External encoding

Re: [Encode] MacIceland(ic)?, once again.

2002-04-01 Thread Dan Kogai
On Tuesday, April 2, 2002, at 01:00 , Jarkko Hietaniemi wrote: I also found MacR(o|u)manian disambiguous. There were both ROMANIAN.TXT and RUMANIAN.TXT with different table. So # ouououou alias have to be gone... Beware of MacRoman.ucm vs MacRomanian.ucm in 8.3, you can't just simply

Re: [FYI] JIS X 0213 - Unicode 3.2.0

2002-04-02 Thread Dan Kogai
On Wednesday, April 3, 2002, at 12:29 , SADAHIRO Tomoyuki wrote: I've update the table today. (if you like, please check it at the same place as before) The mapping for 1-11-69 and 1-11-70 must be solved, according to http://www.unicode.org/unicode/uni2book/ch07.pdf 7.8 Modifier Letters

[Encode] 1.20 released!

2002-04-04 Thread Dan Kogai
jhi and Perl Porters, I am relieved to announce that I have uploaded Encode-1.20, the final release before 5.8.0-RC1. It is available at http://www.dan.co.jp/~dankogai/Encode-1.20.tar.gz as well as CPAN. =head1 Summary of Changes =head2 Fire Extinguishers The following problem has been

A bug. [Re: qr/^UCS2-le$/i = 'UCS-2' -- what is it?]

2002-04-04 Thread Dan Kogai
On Friday, April 5, 2002, at 11:20 , Anton Tagunov wrote: Hello, Dan! A two-pence question (very quick and probably foolish :-) What is the marked alias about? define_alias( qr/^UCS-2LE$/i => 'UTF-16LE', qr/^UCS2-le$/i => 'UCS-2', );

A FIX. [Re: qr/^UCS2-le$/i = 'UCS-2' -- what is it?]

2002-04-04 Thread Dan Kogai
On Friday, April 5, 2002, at 11:33 , Dan Kogai wrote: - qr/^UCS2-le$/i => 'UCS-2', ); + qr/^UCS-2LE$/i => 'UTF-16LE'); ^^^aaaggh! Forget the last one. This one is correct. Dan-the-Encode-Maintainer diff -du -r1.20 lib/Encode

Re: [PATCH 1/2 + 0.1] Supported.pod

2002-04-05 Thread Dan Kogai
Anton, I am now working on the new revision of Supported.pod AFTER this patch is applied. I will post the whole thing tonight. Dan

Re: [Encode] In what character encoding legacy scripts are written?

2002-04-05 Thread Dan Kogai
On Friday, April 5, 2002, at 08:49 , Dan Kogai wrote: I REPEAT. until perl 6, PERL KNEW NOTHING ABOUT ENCODING. =~ s/6/5.6/ Dan

[Encode] Endian consistency and missing raw encodings for TK

2002-04-05 Thread Dan Kogai
On Friday, April 5, 2002, at 08:40 , Nick Ing-Simmons wrote: It is _really_ sad, Tk only really _needs_ one encoding which it expects it to be called ucs-2be or iso10464-1 We don't support either name. Instead we claim UCS-2 without specifying an endian :-( As a matter of fact, I don't

[Encode] Farsi is Okay. The problem is in Indics!

2002-04-05 Thread Dan Kogai
On Friday, April 5, 2002, at 11:18 , Jarkko Hietaniemi wrote: Since it seems that we won't make it for Monday the 8th (MakeMaker is still unfinished, and UTF-8 keys are still a bit dodgy, and so on), I guess small updates on Encode (docs certainly, and obvious bugs) are still okay-- and even

[Encode] UCS/UTF mess and Surrogate Handlings

2002-04-05 Thread Dan Kogai
On Friday, April 5, 2002, at 11:10 , Jarkko Hietaniemi wrote: Change 15745 by jhi@alpha on 2002/04/05 13:07:21 Integrate perlio; Not only did UCS-2 have dodgy name it was buggy. Affected files ... ... //depot/perl/ext/Encode/lib/Encode/10646_1.pm#4 integrate

Re: [Encode] UCS/UTF mess and Surrogate Handlings

2002-04-05 Thread Dan Kogai
On Saturday, April 6, 2002, at 12:18 , Jarkko Hietaniemi wrote: P.S. Does utf8 support surrogates? Surrogate pair is definitely the No. Surrogates are solely for UTF-16. There's no need for surrogates in UTF-8 -- if we wanted to encode U+D800 using UTF-8, we *could* -- BUT we should not.

Re: [Encode] UCS/UTF mess and Surrogate Handlings

2002-04-05 Thread Dan Kogai
On Saturday, April 6, 2002, at 01:16 , Jarkko Hietaniemi wrote: Yes. I know that. My question is whether we support CONVERSION. Internals have nothing to do with that. When we say UCS-2, \x{10000}-\x{10FFFF} must be discarded or croak for error. When we say I suggest croak. UTF-16,
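The distinction being argued over in this thread can be illustrated with the Encode module as it exists today (a sketch, not the 2002 code): UTF-16 represents a character outside the BMP as a surrogate pair, UTF-8 encodes it directly in four bytes, and UCS-2 cannot represent it at all. The sample code point U+10400 is an arbitrary choice.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $char = "\x{10400}";    # arbitrary code point outside the BMP

# UTF-16 needs a surrogate pair; UTF-8 needs no surrogates.
my $hex16 = uc unpack 'H*', encode( 'UTF-16BE', $char );
my $hex8  = uc unpack 'H*', encode( 'UTF-8',    $char );
print "UTF-16BE: $hex16\n";
print "UTF-8:    $hex8\n";

# UCS-2 has no surrogates, so with FB_CROAK the conversion dies.
# A copy is passed because a non-zero CHECK may modify its argument.
my $copy    = $char;
my $ucs2_ok = eval { encode( 'UCS-2BE', $copy, Encode::FB_CROAK() ); 1 } ? 1 : 0;
print 'UCS-2BE:  ', ( $ucs2_ok ? 'encoded' : 'croaked' ), "\n";
```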

Re: [PATCH] Encode tweaks for VMS

2002-04-06 Thread Dan Kogai
G'morning (FYI It's 16:50 JST). On Sunday, April 7, 2002, at 02:58 , Jarkko Hietaniemi wrote: Any chance of an Encode release soonish? Like, after I've eaten my cereal? :-) Take a looong time to finish your cereal. I will be released today (in JST). Schwern's patch will be in. NI-S will

[Encode] 1.26 Released

2002-04-07 Thread Dan Kogai
jhi and porters, I *must be* relieved to release version 1.26 of Encode. Available at http://www.dan.co.jp/~dankogai/Encode-1.26.tar.gz And CPAN. =h1 major changes * All (UCS-2|UTF-(16|32)(BE|LE)? are now supported. Will we support UTF-7 :? * jis02(01|08|12).ucm are back to

Re: A modest patch [Encode] 1.26

2002-04-07 Thread Dan Kogai
On Monday, April 8, 2002, at 08:33 , Anton Tagunov wrote: Hello, Dan! Very modest: typos, C, wording, uhc, x-windows-949, Windows-31J /Anton/ Thanks. Applied. It will appear in the next release. Dan

[Encode] 1.30 released

2002-04-07 Thread Dan Kogai
I am too sleepy to abstain from releasing ver. 1.30 of Encode, available at http://www.dan.co.jp/~dankogai/Encode-1.30.tar.gz as well as CPAN. Since the diff against perl-current was only less than 700 lines, patch was made available as http://www.dan.co.jp/~dankogai/current-1.30-diff.gz

[Encode 1.30] BOM32LE was incorrect - fixed

2002-04-08 Thread Dan Kogai
Anton, On Monday, April 8, 2002, at 10:05 , Anton Tagunov wrote: --- ext/Encode-1.30/lib/Encode/Unicode.pm.orig Mon Apr 8 14:06:28 2002 +++ ext/Encode-1.30/lib/Encode/Unicode.pm Mon Apr 8 17:00:47 2002 -12,7 +12,7 sub FBCHAR(){ 0xFFFd } sub BOM_BE(){ 0xFeFF } sub BOM16LE(){

[Encode 1.30] Patch to correct BOM value for 32LE

2002-04-08 Thread Dan Kogai
jhi, The following patch will correct incorrect value for BOM for 32LE. The first one is essentially identical to that of Anton. And the second will fix t/Unicode.t so it is more independent of Encode::Unicode (that is, should there be an error there t/Unicode.t will find it -- currently
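As a quick cross-check of the BOM values at issue, the byte order mark for each form can be obtained by encoding U+FEFF with an explicit-endian encoding name. This sketch relies on the current Encode module and does not reproduce the constants in Encode::Unicode from 2002.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $bom = "\x{FEFF}";    # the BOM character itself

# Explicit-endian encodings add no BOM of their own, so encoding
# U+FEFF yields exactly the BOM byte sequence for that form.
my %bytes;
for my $name (qw(UTF-16BE UTF-16LE UTF-32BE UTF-32LE)) {
    $bytes{$name} = uc unpack 'H*', encode( $name, $bom );
    printf "%-8s %s\n", $name, $bytes{$name};
}
```

A test written this way still goes through the encoder, so a check that is independent of Encode::Unicode, as the post proposes for t/Unicode.t, would compare against literal byte strings instead.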

[Encode] 1.31 in a few hours

2002-04-08 Thread Dan Kogai
On Tuesday, April 9, 2002, at 02:04 , Anton Tagunov wrote: Other items in my '[PATCH]s and questions [Encode] 1.30' mail were: - a cosmetic patch to Supported.pod This one must be the most acceptable by pumpkins since it has no piece of code :) - a question whether Encoder.pm sub

Re: [PATCH]s and questions [Encode] 1.30

2002-04-08 Thread Dan Kogai
On Tuesday, April 9, 2002, at 04:33 , Philip Newton wrote: This bit appears not to have been applied? Here it is again, together with another few tweaks to Encode::Unicode. Yikes. Too late for 1.31 but applied. Patch failed in two places but it was trivial to manually roll it back.

[Encode] 3 patches

2002-04-09 Thread Dan Kogai
On Tuesday, April 9, 2002, at 10:39 , Nick Ing-Simmons wrote: I _think_ that gets me back to where I was. Now I can see if I can get jis0208 to work ... Thank you and your patches are applied flawlessly. --- lib/Encode/Unicode.pm.shipTue Apr 9 14:28:13 2002 +++

[Encode] 1.32 released

2002-04-09 Thread Dan Kogai
I am longing for the day when 5.8.0-RC1 be released when I uploaded version 1.32 of Encode. Get one via http://www.dan.co.jp/~dankogai/Encode-1.32.tar.gz Or CPAN. diff against current is also available as http://www.dan.co.jp/~dankogai/current-1.32.diff.gz And here are Changes. I would

[Encode] to make -Uusecjk

2002-04-09 Thread Dan Kogai
On Wednesday, April 10, 2002, at 06:01 , Jarkko Hietaniemi wrote: Yes, something like that could be easier to implement. Maybe installperl could do the appropriate magic (i.e. skip the CJK)? For static builds Configure needs tweaking, I think. Though I am still not sure of the irrelevance,

[Encode] Your patch applied

2002-04-10 Thread Dan Kogai
Anton, - several typos - excludes GBK from that section because it is discussed in Microsoft-related Just routine. Thank you. Applied. Since this is purely of documentation, chances are it will go with 5.8.0-RC2 at least and should there be any code change BEFORE RC1, 5.8.0-RC1.

Re: My email address in the Encode AUTHORS file

2002-04-10 Thread Dan Kogai
On Thursday, April 11, 2002, at 04:30 , Philip Newton wrote: Can you please change my email address to '[EMAIL PROTECTED]', please? Thanks! (That's also what's in the main Perl AUTHORS file.) Sure. Fixed. Dan

[Encode] 1.33 released -- minuscule changes

2002-04-10 Thread Dan Kogai
I've got a feeling 5.8.0 will be a reality, not something like a horizon which is always there in front of you but you can never reach, when I release ver. 1.33 of Encode. Available as follows; Whole: http://www.dan.co.jp/~dankogai/Encode-1.33.tar.gz and CPAN Diff against

[Encode] 1.34 released only as diff

2002-04-12 Thread Dan Kogai
On Saturday, April 13, 2002, at 05:41 , Jarkko Hietaniemi wrote: Could you rsync from AS and send me the diff? I've uploaded it as http://www.dan.co.jp/~dankogai/current-1.34.diff.gz Well, it was just 161 lines in total so I could've pasted here but for the sake of whitespaces use the one

Re: iso-2022-jp snags.

2002-04-11 Thread Dan Kogai
On Friday, April 12, 2002, at 02:30 , Nick Ing-Simmons wrote: Having hacked RFC2047 support into tkmail I have now seen some non-latin1 characters in a real perl/Tk app. There seem to be a few snags with mime's iso-2022-jp: - It failed to demand load given upper-case form ISO-2022-JP
