Re: Thai word list

2002-04-22 Thread KUSANO Takayuki
At Thu, 18 Apr 2002 15:47:19 +0200 (CEST), Werner LEMBERG wrote: > > > I may be able to help you in these area. See > > http://developer.thai.net/libinthai/ - an open-source word-break > > library > > All links are broken on this page... The Japanese host no longer > exists apparently under the

Re: SCSU compression (WAS: RE: Thai word list)

2002-04-19 Thread Markus Scherer
Yves Arrouye wrote: > Seriously, SCSU is fine for some uses, but in this example, was definitely > not the best way to appreciate a reduction in file size. Not alone, right. > By the 20% you mean an additional 20% by doing SCSU+gzip versus just gzip, > right? Yep. markus

RE: SCSU compression (WAS: RE: Thai word list)

2002-04-19 Thread Yves Arrouye
> This looks like a nice endorsement of SCSU: :D > It saves 59% just as a charset, > and it saves almost 20% in a system with a "real compression". I am all for SCSU as a charset (after my tools can view it properly), but that was not the use there. OTOH there is gzip encoding in HTTP 1.1 :) Se

RE: Thai word list

2002-04-19 Thread Miikka-Markus Alhonen
On 19-Apr-02 Yves Arrouye wrote: >> If you can process SCSU, and would appreciate a 59% reduction in file >> size, try: >> >> http://home.adelphia.net/~dewell/th18057-scsu.txt (135,731 bytes) > > Not to knock down SCSU, but if it had been gzipped instead, the resulting > file would be ab

Re: Thai word list

2002-04-19 Thread Markus Scherer
Yves Arrouye wrote: > Not to knock down SCSU, but if it had been gzipped instead, the resulting > file would be about half that size: 70,912 bytes. (The gzipped SCSU-encoded > file is 57,987 itself). This looks like a nice endorsement of SCSU: It saves 59% just as a charset, and it saves almos
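The figures quoted in this exchange can be checked with a little arithmetic. The sketch below uses only the three byte counts given in the thread (135,731 for the SCSU file, 70,912 for the gzipped file, 57,987 for the gzipped SCSU file); the variable and function names are illustrative and do not come from the original messages.

```python
# Byte counts as quoted in the thread.
scsu_bytes = 135_731       # SCSU-encoded word list
gzip_bytes = 70_912        # word list gzipped instead of SCSU-encoded
scsu_gzip_bytes = 57_987   # SCSU-encoded word list, then gzipped


def saving(before: int, after: int) -> float:
    """Fraction of bytes saved when going from `before` to `after`."""
    return 1 - after / before


# "about half that size": gzip output relative to the SCSU file.
gzip_vs_scsu = gzip_bytes / scsu_bytes

# "almost 20%": extra saving of SCSU+gzip over gzip alone.
extra_saving = saving(gzip_bytes, scsu_gzip_bytes)

print(f"gzip is {gzip_vs_scsu:.1%} of the SCSU size")
print(f"SCSU+gzip saves a further {extra_saving:.1%} over gzip alone")
```

Under these numbers the gzipped file comes to roughly 52% of the SCSU file, and SCSU followed by gzip saves roughly 18% more than gzip alone, which matches the "about half" and "almost 20%" wording above.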

RE: Thai word list

2002-04-18 Thread Yves Arrouye
> If you can process SCSU, and would appreciate a 59% reduction in file > size, try: > > http://home.adelphia.net/~dewell/th18057-scsu.txt (135,731 bytes) Not to knock down SCSU, but if it had been gzipped instead, the resulting file would be about half that size: 70,912 bytes. (The gzipp

Re: Thai word list

2002-04-18 Thread Eric Mader
At 09:42 AM 4/18/2002, Markus Scherer wrote: >Doug Ewell wrote: > >>The ICU package includes a sorted Thai word list in a UTF-8 file called >>th18057.txt. Since you may not wish to download the whole package and I >>don't know if the Thai file is available separately, I have uploaded it >>(for a

Re: Thai word list

2002-04-18 Thread Markus Scherer
Doug Ewell wrote: > The ICU package includes a sorted Thai word list in a UTF-8 file called > th18057.txt. Since you may not wish to download the whole package and I > don't know if the Thai file is available separately, I have uploaded it > (for a limited time only) to: Note that ICU has CVS

Re: Thai word list

2002-04-18 Thread Werner LEMBERG
> I may be able to help you in these area. See > http://developer.thai.net/libinthai/ - an open-source word-break > library All links are broken on this page... The Japanese host no longer exists apparently under the referenced name. Werner

Re: Thai word list

2002-04-17 Thread Doug Ewell
Werner LEMBERG <[EMAIL PROTECTED]> wrote: > I'm searching a large word list for Thai which is freely available, > i.e., either under a license similar to GPL (resp. compatible to the > GPL) or in the public domain. > > Do you know whether such a file is available? The ICU package includes a sort

Re: Thai word list

2002-04-17 Thread Samphan Raruenrom
Werner LEMBERG wrote: > I'm searching a large word list for Thai which is freely available, > i.e., either under a license similar to GPL (resp. compatible to the > GPL) or in the public domain. > Do you know whether such a file is available? This is the standard public domain (3+ words) word