Michiel Meeuwissen <[EMAIL PROTECTED]> wrote:
> Martijn Houtman <[EMAIL PROTECTED]> wrote:
> > I did not use your new code yet, because I thought it was easier to wait for
> > the binary version. I don't expect troubles with the new code, but I
> > understand that you are curious about the result of your work. With the old
> > code everything works for ISO-8859-1 if I remove the ISO-8859-1 encoding
> > from the xml header.
>
> For external includes it depends now completely on what the external server
> reports the encoding to be. I suppose that the encoding is not reported for
> XML files (because XML files are binary..).
>
> I don't really understand how removing the header could make a difference for
> an external XML, because it should be present if the encoding is like that,
> and I can't understand that the external server succeeds serving it like
> UTF-8 when the file is actually ISO-8859-1. But anyhow, I'll make sure that
> it works in my setup..

I have now fixed it for 'external' includes too (includes from another
server).

If the content is XML (it starts with a <?xml introduction), the encoding is
taken from that header. If the header does not specify one, it defaults to
UTF-8.

If the content is not XML, the encoding is taken from the Content-Type
header (charset=..). If that cannot be found either, the encoding is assumed
to be ISO-8859-1 (the default for HTML).

That means that if the Content-Type is text/xml;charset=ISO-8859-1, that
charset is actually ignored when the body starts with a <?xml header. I
think that is good, because:

1. A lot of Tomcat versions append this ISO-8859-1 charset without being
   asked to, and you cannot override it.

2. I think text/xml is actually binary/xml, and no charset can be attributed
   to it externally, because IIUC XML defaults to UTF-8 and otherwise the
   encoding is in the <?xml header, so inside the stream itself.

Michiel

-- 
Michiel Meeuwissen
Mediacentrum 140 H'sum
+31 (0)35 6772979
nl_NL eo_XX en_US
mihxil'
 [] ()
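
The decision order described above can be sketched roughly as follows. This
is an illustrative Python sketch, not the actual code of the fix: the
function name detect_encoding and the regular expressions are assumptions
made for the example, and it does not try to handle byte-order marks or
UTF-16 input.

```python
import re
from typing import Optional

# Hypothetical helpers: only the order of the checks comes from the mail.
# Matches an XML declaration at the very start of the body.
XML_DECL = re.compile(rb'^<\?xml\s[^>]*?\?>')
# Matches encoding="..." (or single quotes) inside that declaration.
XML_ENCODING = re.compile(rb'encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')
# Matches charset=... in a Content-Type header value.
CHARSET = re.compile(r'charset\s*=\s*"?([^\s;"]+)', re.IGNORECASE)


def detect_encoding(body: bytes, content_type: Optional[str] = None) -> str:
    """Pick an encoding for externally included content."""
    if body.startswith(b'<?xml'):
        # XML: trust the declaration and ignore any Content-Type charset.
        decl = XML_DECL.match(body)
        if decl:
            enc = XML_ENCODING.search(decl.group(0))
            if enc:
                return enc.group(1).decode('ascii')
        # XML without an explicit encoding defaults to UTF-8.
        return 'UTF-8'
    if content_type:
        # Not XML: use the charset parameter of the Content-Type header.
        found = CHARSET.search(content_type)
        if found:
            return found.group(1)
    # Nothing found: fall back to the HTML default.
    return 'ISO-8859-1'
```

For example, detect_encoding(b'<?xml version="1.0"?><a/>',
'text/xml;charset=ISO-8859-1') gives 'UTF-8', because the charset appended
by the server is ignored once the body starts with a <?xml header.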
