Re: Thai word list

2002-04-22 Thread KUSANO Takayuki
At Thu, 18 Apr 2002 15:47:19 +0200 (CEST), Werner LEMBERG wrote: I may be able to help you in this area. See http://developer.thai.net/libinthai/ - an open-source word-break library. All links are broken on this page... The Japanese host no longer exists apparently under the

RE: SCSU compression (WAS: RE: Thai word list)

2002-04-19 Thread Yves Arrouye
This looks like a nice endorsement of SCSU: :D It saves 59% just as a charset, and it saves almost 20% in a system with a real compression. I am all for SCSU as a charset (after my tools can view it properly), but that was not the use there. OTOH there is gzip encoding in HTTP 1.1 :)

Re: Thai word list

2002-04-18 Thread Doug Ewell
Thai word list in a UTF-8 file called th18057.txt. Since you may not wish to download the whole package and I don't know if the Thai file is available separately, I have uploaded it (for a limited time only) to: http://home.adelphia.net/~dewell/th18057.txt (334,028 bytes) If you can process

Re: Thai word list

2002-04-18 Thread Werner LEMBERG
I may be able to help you in this area. See http://developer.thai.net/libinthai/ - an open-source word-break library. All links are broken on this page... The Japanese host no longer exists apparently under the referenced name. Werner

Re: Thai word list

2002-04-18 Thread Markus Scherer
Doug Ewell wrote: The ICU package includes a sorted Thai word list in a UTF-8 file called th18057.txt. Since you may not wish to download the whole package and I don't know if the Thai file is available separately, I have uploaded it (for a limited time only) to: Note that ICU has CVS

Re: Thai word list

2002-04-18 Thread Eric Mader
At 09:42 AM 4/18/2002, Markus Scherer wrote: Doug Ewell wrote: The ICU package includes a sorted Thai word list in a UTF-8 file called th18057.txt. Since you may not wish to download the whole package and I don't know if the Thai file is available separately, I have uploaded it (for a limited

RE: Thai word list

2002-04-18 Thread Yves Arrouye
If you can process SCSU, and would appreciate a 59% reduction in file size, try: http://home.adelphia.net/~dewell/th18057-scsu.txt (135,731 bytes) Not to knock down SCSU, but if it had been gzipped instead, the resulting file would be about half that size: 70,912 bytes. (The gzipped
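For anyone who wants to check the size figures quoted in this message, here is a small sketch of the arithmetic. It only uses the byte counts given in the thread (334,028 for UTF-8, 135,731 for SCSU, 70,912 for gzip); the helper name `percent_reduction` is made up for illustration, and it assumes the gzip figure is measured against the same th18057.txt file.

```python
def percent_reduction(original_bytes: int, compressed_bytes: int) -> float:
    """Return the size saving as a percentage of the original size."""
    return 100.0 * (1 - compressed_bytes / original_bytes)


# Byte counts as quoted in the thread (not re-measured here).
UTF8_BYTES = 334_028   # th18057.txt in UTF-8
SCSU_BYTES = 135_731   # th18057-scsu.txt
GZIP_BYTES = 70_912    # gzipped size reported in the reply

scsu_saving = percent_reduction(UTF8_BYTES, SCSU_BYTES)
gzip_saving = percent_reduction(UTF8_BYTES, GZIP_BYTES)
gzip_vs_scsu = GZIP_BYTES / SCSU_BYTES  # fraction of the SCSU size

print(f"SCSU saves {scsu_saving:.1f}% relative to UTF-8")
print(f"gzip saves {gzip_saving:.1f}% relative to UTF-8")
print(f"gzip output is {gzip_vs_scsu:.2f} times the SCSU size")
```

The first figure matches the "59% reduction" claim, and the last ratio is what "about half that size" refers to.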

Thai word list

2002-04-17 Thread Werner LEMBERG
Dear Unicoders, I'm searching for a large Thai word list that is freely available, i.e., either under a license similar to (or compatible with) the GPL, or in the public domain. Do you know whether such a file is available? Werner