AFAIK there is no way to determine a file's encoding with certainty. You
can use a "best effort" algorithm to guess it, but even Notepad++
sometimes fails to show the correct encoding.
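
To make the "best effort" idea concrete, here is a rough sketch in Python
(not VFP). It is only a heuristic: it looks for a byte order mark, then
tries a strict UTF-8 decode, then gives up and returns a fallback. The
function name guess_encoding and the cp1252 fallback are my own
assumptions, not something from your code.

```python
import codecs

# The 4-byte UTF-32 marks are tested before the 2-byte UTF-16 marks,
# because FF FE 00 00 (UTF-32 LE) starts with FF FE (UTF-16 LE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Return a best-effort Python codec name for raw file bytes."""
    for bom, name in BOMS:
        if data.startswith(bom):
            # These codecs read the mark themselves and strip it on decode.
            return name
    try:
        # Strict UTF-8 rejects most text written in a legacy code page.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything else is the sender's usual ANSI code page.
        return fallback
```

A file without a mark that happens to be valid UTF-8 will be reported as
UTF-8 even if it was written in something else, so this can still guess
wrong, just like Notepad++.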

That's why XML, HTML and some other markup languages carry a declaration
such as encoding="utf-8" or charset="utf-8": the encoding has to be
stated explicitly so that the contents are not misread.
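
As an illustration of how such a declaration can be used, this Python
sketch pulls the encoding name out of an XML declaration at the start of
the raw bytes. The helper name declared_xml_encoding is made up for the
example, and it assumes an ASCII-compatible encoding; in a UTF-16 file the
declaration itself is stored with two bytes per character and this
pattern will not match.

```python
import re

# encoding="..." (or '...') inside an XML declaration at the very start.
XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def declared_xml_encoding(data: bytes):
    """Return the declared encoding name, or None if nothing is declared."""
    # Skip a UTF-8 byte order mark if one precedes the declaration.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    match = XML_DECL.match(data)
    return match.group(1).decode("ascii") if match else None
```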

Similarly, when delivering text files to someone, the encoding must be
explicitly defined and agreed between the parties so that the contents are
not misinterpreted.

UTF-16 is a little strange to me and I have never dealt with it. Isn't it
used for double-byte characters, like Chinese or similar?
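
For what it's worth, a quick check in Python suggests UTF-16 is not
specific to Chinese: it stores plain ASCII and accented letters in two
bytes as well, and characters outside the basic range in four. The helper
name utf16_le_hex is only for this example.

```python
def utf16_le_hex(text: str) -> str:
    """Return the UTF-16 Little Endian bytes of text as spaced hex."""
    return text.encode("utf-16-le").hex(" ")


# "A", "a" with acute accent, a Chinese character, and an emoji.
for ch in ("A", "\u00e1", "\u4e2d", "\U0001F600"):
    encoded = ch.encode("utf-16-le")
    print(f"U+{ord(ch):04X}: {len(encoded)} bytes ({utf16_le_hex(ch)})")
```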

One idea is to ask the sender for a header that states the encoding (like
XML does), or even for a predefined string with some special characters
(always the same, like "Test header - áàä") that you can compare with your
own copy. If the start of the file does not match your string encoded as
UTF-16, you can assume it is UTF-8 or, better, re-check by comparing with
the same string encoded as UTF-8.
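
The predefined-string idea could look roughly like this in Python. The
sentinel text is the one suggested above; the list of candidate encodings
and the name encoding_from_sentinel are assumptions for the sake of the
example, and the sender would have to agree to write that exact first
line.

```python
SENTINEL = "Test header - áàä"  # agreed first line; the accents matter
CANDIDATES = ("utf-8", "utf-16-le", "utf-16-be", "cp1252")


def encoding_from_sentinel(data: bytes):
    """Return the candidate whose encoding of SENTINEL starts the data."""
    # Drop a byte order mark, if any, before comparing.
    for bom in (b"\xef\xbb\xbf", b"\xff\xfe", b"\xfe\xff"):
        if data.startswith(bom):
            data = data[len(bom):]
            break
    for name in CANDIDATES:
        if data.startswith(SENTINEL.encode(name)):
            return name
    # No candidate matched: the header is missing or in another encoding.
    return None
```

The accented characters are what separate UTF-8 from a single-byte code
page here; a pure ASCII header would look identical in both.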


Regards.-


2018-08-01 20:00 GMT+02:00 Paul H. Tarver <p...@tpcqpc.com>:

> Ok, this may be a dumb question, but is there a reliable and easy way to
> detect and determine the file encoding on simple text files?
>
> I have a client sending me files with UTF-16 Little Endian encoding. I have
> some code in place to try to determine if a file is UNICODE based on the
> first two or four characters once the file is loaded to memory and then
> convert it using STRCONV, but I'm concerned that although it works, it is a
> bit of a hack and maybe there is a better way.
>
> Any thoughts?
>
> Paul
