Ingo Meyer wrote:
To anybody: Does this mean that HTML uses "iso-8859-1" by default?
Then I will always call: new String(bytes, "iso-8859-1");
Hi Ingo,
no, you shouldn't assume ISO-8859-1 in all cases, although it's a good
guess when everything else fails. There are loads of HTML documents on
the Web using other encodings. E.g. XHTML pages should be treated as
UTF-8 by default unless they specify otherwise.
Finding the proper encoding for an HTML page may require a couple of
checks. Here's what you can do:
1. Check the charset parameter of the Content-Type HTTP header.
2. Look for Unicode Byte Order Marks (BOMs) at the beginning of the data.
3. Look for an XML declaration and check the encoding attribute (XHTML
pages).
4. Look for <meta http-equiv="Content-Type"> elements.
Only after all of the above checks fail would I use ISO-8859-1 as a guess.
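In case it helps, here is a rough sketch of those checks in Java. It is only an illustration under some assumptions: the class name CharsetSniffer and the method detect are made up, the Content-Type header value is passed in by the caller, only the first 4096 bytes are scanned, and the regular expressions are a simplification rather than a real HTML parser. UTF-32 BOMs are not handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class CharsetSniffer {

    // Matches charset=... in a Content-Type value (HTTP header or meta element).
    private static final Pattern CHARSET_PARAM = Pattern.compile(
        "charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);

    // Matches encoding="..." in an XML declaration at the start of the data.
    private static final Pattern XML_ENCODING = Pattern.compile(
        "^\\s*<\\?xml[^>]*?encoding\\s*=\\s*[\"']([\\w.:-]+)[\"']",
        Pattern.CASE_INSENSITIVE);

    // Matches any <meta ...> tag; the attributes are checked afterwards.
    private static final Pattern META_TAG = Pattern.compile(
        "<meta\\s[^>]*>", Pattern.CASE_INSENSITIVE);

    static String detect(String contentTypeHeader, byte[] data) {
        // 1. charset parameter of the Content-Type HTTP header
        if (contentTypeHeader != null) {
            Matcher m = CHARSET_PARAM.matcher(contentTypeHeader);
            if (m.find()) return m.group(1);
        }

        // 2. Unicode Byte Order Marks (UTF-8 and UTF-16 only)
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";

        // Decode the head as ISO-8859-1 so every byte maps to one char;
        // the markup we are looking for is plain ASCII anyway.
        int len = Math.min(data.length, 4096);
        String head = new String(data, 0, len, StandardCharsets.ISO_8859_1);

        // 3. encoding attribute of the XML declaration (XHTML pages)
        Matcher xml = XML_ENCODING.matcher(head);
        if (xml.find()) return xml.group(1);

        // 4. <meta http-equiv="Content-Type" content="...; charset=...">
        Matcher meta = META_TAG.matcher(head);
        while (meta.find()) {
            String tag = meta.group();
            String lower = tag.toLowerCase(Locale.ROOT);
            if (lower.contains("http-equiv") && lower.contains("content-type")) {
                Matcher cs = CHARSET_PARAM.matcher(tag);
                if (cs.find()) return cs.group(1);
            }
        }

        // Last resort: guess ISO-8859-1.
        return "ISO-8859-1";
    }

    // True if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

You could then call new String(bytes, CharsetSniffer.detect(header, bytes)). Note that the returned name comes straight from the document, so the String constructor may throw UnsupportedEncodingException for names Java doesn't know, and a leading BOM is not stripped from the decoded text.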
Cheers, Oliver