I did not know of RAR, but have given it a try. Even here there is a serious problem: if the filename is non-ASCII, the name of the compressed file comes out as _____.rar, with as many underscores as there were characters in the original name. In fact it is a bit less predictable: if the name is Greek, for example, you get Latin letters; if it is Cyrillic, just underscores.
This is useless, then, if you have a number of filenames that all have the same number of characters, since they all collapse to the same name.
Certainly more work is needed on RAR (at least on the Win 2000 version).
I know about that, since I made my Fontlist 5 work properly with arbitrary non-ASCII names: http://ourworld.compuserve.com/homepages/RaymondM/fontlist5.htm .
Raymond Mercier
At 22:58 30/05/2003 -0500, you wrote:
I wonder if anyone here has ideas on these matters.
Peter
----- Forwarded by Peter Constable/IntlAdmin/WCT on 05/30/2003 10:56 PM -----
I have 3 LinguaLinks lexicons that I have converted into HTML pages - one for each entry. The languages use non-ANSI characters, so I also did a Unicode conversion at the same time.
[snip]
Everything works very well except that I cannot burn the files onto a CD because of the Unicode values in the filenames. The Roxio and Nero CD burners don't accept some of the higher values found in the filenames (using Joliet, ISO 9660 and UDF). Does anyone have any ideas how to deal with this? For example, a filename containing U+026B, a lowercase L with middle tilde, causes problems.
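As a first step toward narrowing this down, the non-ASCII characters that actually occur in the filenames could be listed, so that each one can be tested against the burner. The following is only a sketch; the function name non_ascii_inventory is made up for illustration, and it says nothing about which characters a given burner accepts.

```python
import unicodedata
from collections import defaultdict


def non_ascii_inventory(filenames):
    """Map each non-ASCII character found in the filenames to the names using it.

    Keys look like 'U+026B LATIN SMALL LETTER L WITH MIDDLE TILDE'.
    """
    found = defaultdict(set)
    for name in filenames:
        for ch in name:
            if ord(ch) > 0x7F:  # anything outside 7-bit ASCII
                found[ch].add(name)
    return {
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}": sorted(names)
        for ch, names in sorted(found.items())
    }
```

It could be fed with something like os.listdir() on the folder holding the generated HTML pages.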
In the meantime, to get it onto CD, I decided to try to zip all the files. It turns out that almost all the zippers out there DO NOT support Unicode filenames. Doug Rintoul found WinRAR (http://www.rarlab.com/rar_archiver.htm), which does the trick in the RAR format only. There is a RAR expander for Macintosh and Linux systems as well (all of these are $29 USD). So far, I have not found a freeware solution that meets Unicode filename needs. Have any of you run into this yet?
I could try to determine which Unicode values are causing problems on the CD burner and do an unacceptable-to-acceptable character translation in the filenames and in the links to those filenames ... but that seems like a huge compromise. Also, it will be difficult to come up with a generic solution ... that is to say, I don't know what RANGE of values is unacceptable for characters in a CD filename. Joliet is supposed to allow Unicode filenames, according to the documentation I have seen.
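One way such a translation might look, avoiding the need to know the unacceptable range at all, is to escape every character outside a conservative ASCII set. This is a sketch under assumptions, not a tested solution for any particular burner: the _XXXXXX escape scheme and the function names are invented for illustration, links are assumed to have the plain form href="name", and the longer escaped names may run into filename length limits on the CD file system.

```python
import re

# Characters kept as-is; everything else (including '_') is escaped.
SAFE = re.compile(r"[A-Za-z0-9.\-]")


def to_ascii_name(name: str) -> str:
    """Escape each unsafe character as '_' plus six hex digits of its code point."""
    return "".join(
        ch if SAFE.fullmatch(ch) else f"_{ord(ch):06X}" for ch in name
    )


def from_ascii_name(safe: str) -> str:
    """Reverse to_ascii_name; every '_' in an escaped name starts an escape."""
    return re.sub(
        r"_([0-9A-F]{6})", lambda m: chr(int(m.group(1), 16)), safe
    )


def rewrite_links(html: str, mapping: dict[str, str]) -> str:
    """Replace href="old" with href="new" for each renamed file."""
    for old, new in mapping.items():
        html = html.replace(f'href="{old}"', f'href="{new}"')
    return html
```

Because the escape is reversible and fixed-width, two different original names never map to the same ASCII name, unlike a scheme that substitutes one placeholder character for everything.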
Larry

