On Tue, Sep 07, 2004 at 02:41:02AM +0300, Amir Hardon wrote:
> After Didi gave me the solution for the encoding problem with unzip I decided 
> to write a script for converting the filenames (Maybe I'll patch unzip in the 
> future but that's my temporary solution).
> 
> The script I wrote just changes the filenames and directory names using 
> iconv, but if it tries to handle an already iso8859-8 encoded filename iconv 
> gives an error (The validity check fails, it expects the input string to be 
> encoded as iso8859-1 but it is not).
> Is there any standard unix tool I can use to first check the string's encoding 
> and then decide whether to convert it or not?

You do realize this is pure guesswork, don't you? No one can tell you the
correct answer for sure. That said, there is a tool that tries to do
this, called 'enca'. I have never tried it myself, though.

If all you want is to know whether a given string is iso8859-1 or
iso8859-8, then I would personally just do a range check - a Hebrew
name should contain only bytes below 128 (plain ASCII) or between 224
and 250 (the Hebrew letters). Note that this isn't accurate: iso8859-1
has accented letters in the same range. I would never do this on a
large mission-critical tree without a good backup.
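
A minimal sketch of such a range check, in Python. The function name is
made up for illustration, and it works on the raw bytes of the filename.
It is only a heuristic: an all-ASCII name passes trivially, and an
iso8859-1 name whose accented letters fall in 224..250 passes as well.

```python
def looks_like_iso8859_8(name: bytes) -> bool:
    """Heuristic only: True if every byte is plain ASCII (below 128)
    or falls in the iso8859-8 Hebrew letter range (224..250)."""
    return all(b < 128 or 224 <= b <= 250 for b in name)


if __name__ == "__main__":
    # Hebrew letters shin, lamed, vav, final mem encoded as iso8859-8.
    print(looks_like_iso8859_8(b"\xf9\xec\xe5\xed.txt"))
    # iso8859-1 name with u-umlaut (byte 252), outside the Hebrew range.
    print(looks_like_iso8859_8(b"Z\xfcrich.txt"))
    # iso8859-1 name with e-acute (byte 233): a false positive.
    print(looks_like_iso8859_8(b"caf\xe9.txt"))
```

The last case shows why this is guesswork: the byte alone cannot tell
you whether 233 was meant as a Hebrew letter or as an accented e.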

> 
> Thanks,
>  -Amir.
-- 
Didi

