> -----Original Message-----
> From: Roland Weber [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 25 January 2006 16:25
> To: HttpClient User Discussion
> Subject: Re: AW: german umlauts öüä
>
> Hi Ingo,
>
> > To anybody: Does this means that in html standardly the "iso-8859-1"
> > is taken?
>
> No, it doesn't. Guessing the default character set is up to
> the user agent (browser). But if you only want to access a
> single web site, and you are reasonably sure that they won't
> change the character encoding, you can still work with a default.
>

Thanks Roland,
my program will access many different pages. Anyway, I will summarize what I have learned about handling the charset:

1. Look at the "Content-Type" header; if it gives a charset, use that.
2. If the site has text content and no charset was found in the header, fall back to a default (maybe "iso-8859-1").
3. If the content is "text/html" and no charset has been found so far, search for a <meta http-equiv="content-type"> tag and check whether a charset is given there.

cheers,
Ingo

> > then i will always call: new String (bytes, "iso-8859-1");
> >
> cheers,
> Roland
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
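
For illustration, the three steps above could be sketched in plain Java roughly as follows. This is only a sketch under assumptions: the class name CharsetSniffer and its methods are hypothetical (they are not part of HttpClient), the meta tag is searched only in the first 4096 bytes, and the default is applied last so that the meta check in step 3 gets a chance to run before the fallback from step 2.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class CharsetSniffer {
    // Matches a charset=... parameter, optionally quoted.
    private static final Pattern CHARSET_PARAM = Pattern.compile(
        "charset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)", Pattern.CASE_INSENSITIVE);

    // Matches a <meta ... http-equiv="content-type" ...> tag.
    private static final Pattern META_TAG = Pattern.compile(
        "<meta[^>]*http-equiv\\s*=\\s*[\"']?content-type[^>]*>",
        Pattern.CASE_INSENSITIVE);

    // Assumed fallback, as suggested in the mail.
    static final Charset DEFAULT = StandardCharsets.ISO_8859_1;

    // Returns the charset named in the given text, or null if there is
    // none or the JVM does not know it.
    static Charset fromParameter(String value) {
        if (value == null) return null;
        Matcher m = CHARSET_PARAM.matcher(value);
        if (!m.find()) return null;
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) {
            // Illegal or unsupported charset name: treat as "not given".
            return null;
        }
    }

    // Assumes the caller already knows the body is text content.
    static Charset detect(String contentTypeHeader, byte[] body) {
        // Step 1: charset from the Content-Type header.
        Charset cs = fromParameter(contentTypeHeader);
        if (cs != null) return cs;

        // Step 3: for text/html, look for a meta tag near the start.
        if (contentTypeHeader != null && contentTypeHeader.trim()
                .toLowerCase(Locale.ROOT).startsWith("text/html")) {
            // ISO-8859-1 maps every byte to a char, so ASCII markup
            // stays readable whatever the real encoding is.
            int len = Math.min(body.length, 4096);
            String head = new String(body, 0, len, StandardCharsets.ISO_8859_1);
            Matcher m = META_TAG.matcher(head);
            if (m.find()) {
                cs = fromParameter(m.group());
                if (cs != null) return cs;
            }
        }

        // Step 2: nothing found, fall back to the default.
        return DEFAULT;
    }

    static String decode(String contentTypeHeader, byte[] body) {
        return new String(body, detect(contentTypeHeader, body));
    }
}
```

The idea would be to pass in the raw "Content-Type" header value and the response body bytes from whatever HTTP library is in use, and call CharsetSniffer.decode(header, bytes) instead of always calling new String(bytes, "iso-8859-1").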
