Hi Nick,

> Hi, are you by any chance the Raving Loony I once knew at Cambridge?

Yes indeed - that must be 35 years ago now. These days I'm a bit more sensible (although the legacy of the OMRLP lives on).

> Basically there are three parts to working with character encodings:
> * Detecting them in incoming data.
> * Converting them to order.
> * Correctly labelling outgoing data.
> mod_xml2enc will do all that for libxml2-based filters, and could easily be
> tweaked to drop the libxml2-specific optimisations for general-purpose use.
> Alternatively the charset-detection from mod_xml2enc could probably be
> folded into mod_charset_lite.

So basically mod_xml2enc will detect the incoming encoding (whatever it may be)? Are there not HTTP headers which give a good indication of the input format (albeit that you still have to detect the format and read the stream to confirm it)?

I'm new to Apache coding/configuration - how would the mod_xml2enc/mod_charset_lite input and output filters be set up in configuration and/or chained in code?

Do you have any views on libxml2's suitability for use within Apache module code? It appears to have good all-round performance compared to other XML libraries. I note that it has a C++ wrapper which is LGPL'ed, so there are likely to be licensing/distribution issues if I ever decided to try to release code under an Apache License.

Regards,
John
