On Fri, 2 May 2008 13:46:16 -0700 (PDT)
John Zhang <[EMAIL PROTECTED]> wrote:

> In my output filter, I need to parse the document to search for
> certain patterns.
> 
> Where can I get the information about the (character) encoding so
> that I can parse the document correctly?  E.g. the document may contain
> Unicode characters encoded in a special encoding.

See http://apache.webthing.com/mod_xml2enc/

If your filter uses libxml2, just use mod_xml2enc alongside it.
If not, you can still use its charset detection and transcoding.
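
To give a rough idea of what charset detection involves, here is a
minimal standalone sketch in C. It is not mod_xml2enc's code or API:
the function names (charset_from_content_type, charset_from_bom,
detect_charset) are hypothetical, and it assumes only two sources of
information, the charset parameter of the Content-Type header and a
byte-order mark at the start of the body. A real filter would likely
also want to look at an XML declaration or an HTML META element.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive test: does s start with prefix? */
int starts_with_ci(const char *s, const char *prefix)
{
    while (*prefix) {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Copy the charset parameter of a Content-Type value into out.
 * Returns 1 on success, 0 if there is no charset or it does not fit. */
int charset_from_content_type(const char *ctype, char *out, size_t outlen)
{
    const char *p;
    size_t n = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    if (ctype == NULL)
        return 0;

    /* Find "charset=" anywhere in the header value. */
    for (p = ctype; *p; p++) {
        if (starts_with_ci(p, "charset="))
            break;
    }
    if (*p == '\0')
        return 0;

    p += strlen("charset=");
    if (*p == '"')              /* the value may be quoted */
        p++;
    while (*p && *p != '"' && *p != ';' && !isspace((unsigned char)*p)) {
        if (n + 1 >= outlen) {  /* value too long for the buffer */
            out[0] = '\0';
            return 0;
        }
        out[n++] = *p++;
    }
    out[n] = '\0';
    return n > 0;
}

/* Guess an encoding from a byte-order mark; NULL if none is present.
 * The UTF-32 checks come first because the UTF-32LE mark begins with
 * the same two bytes as the UTF-16LE mark. */
const char *charset_from_bom(const unsigned char *buf, size_t len)
{
    if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE && buf[2] == 0 && buf[3] == 0)
        return "UTF-32LE";
    if (len >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0xFE && buf[3] == 0xFF)
        return "UTF-32BE";
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return "UTF-8";
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return "UTF-16BE";
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return "UTF-16LE";
    return NULL;
}

/* Prefer the header; fall back to the BOM; NULL if neither helps.
 * scratch receives the header value when one is found. */
const char *detect_charset(const char *ctype,
                           const unsigned char *buf, size_t len,
                           char *scratch, size_t scratchlen)
{
    if (charset_from_content_type(ctype, scratch, scratchlen))
        return scratch;
    return charset_from_bom(buf, len);
}
```

In an output filter you would call something like detect_charset()
once, on the response's content type and the first bytes of the first
data bucket, and then transcode or parse accordingly. Which source
should win when the header and the BOM disagree is a design choice;
this sketch simply trusts the header.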

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/
