Following the BoF, I'll put down a brief marker on-list on the
theme of content-awareness.  More when I'm back home
and not totally knackered.

We already handle certain important encodings, namely
SSL and compression (albeit not quite bug-free), as
standard in current versions.  I'd be interested in
expanding that with some new filter modules.

1.  Character Encoding.  We have very limited capability
in mod_charset_lite.  We can expand that to support
automatic detection of charset, and either setting a request
field or transforming to a selected charset.

We can also provide an API for modules to configure this,
in cases where more than one transformation is wanted.
A real-life use case for this is where users of libxml2-based
modules such as mod_proxy_html need to use charsets
other than UTF-8, and particularly charsets that are not
supported by libxml2.
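
To make the detect-then-transform idea concrete, here is a minimal
sketch in Python rather than as an httpd filter.  It is not
mod_charset_lite's behaviour or API: the function names, the BOM
table and the ISO-8859-1 fallback are all assumptions made for
illustration.  The detected name is what a filter could record in a
request field; transcode() shows the other option of converting to
a selected charset.

```python
import codecs

# Hypothetical BOM table. Order matters: the UTF-32 LE BOM begins with
# the same two bytes as the UTF-16 LE BOM, so UTF-32 is checked first.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_charset(body, fallback="iso-8859-1"):
    """Guess the charset of raw bytes; return (charset, bom_length)."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name, len(bom)
    try:
        # No BOM: accept UTF-8 if the bytes are valid UTF-8.
        body.decode("utf-8")
        return "utf-8", 0
    except UnicodeDecodeError:
        # Assumed fallback; a real filter would make this configurable.
        return fallback, 0


def transcode(body, target="utf-8"):
    """Convert bytes from the detected charset to the target charset."""
    charset, skip = detect_charset(body)
    return body[skip:].decode(charset).encode(target)
```

A real filter would see the body in pieces, so detection would have
to work on a buffered prefix and conversion would need to be
incremental; the sketch ignores both.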

2.  Generic XML support.

In mod_xmlns, a SAX2 parser parses XML to a stream of
SAX events.  Events are keyed on namespace, and
application modules can register handlers for a namespace.
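
The dispatch model can be sketched with Python's standard SAX
bindings (which sit on expat).  This is only an illustration of
keying events on namespace, not mod_xmlns's actual API: the
NamespaceDispatcher class, its register() method and the handler
signature are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceDispatcher(ContentHandler):
    """Route SAX element events to a handler registered per namespace."""

    def __init__(self):
        super().__init__()
        self.handlers = {}  # namespace URI -> callable(event, local, attrs)

    def register(self, ns_uri, handler):
        self.handlers[ns_uri] = handler

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        handler = self.handlers.get(uri)
        if handler is not None:
            # Attribute keys are (namespace, localname) pairs; keep the
            # local name only to keep the sketch short.
            plain = {key[1]: value for key, value in attrs.items()}
            handler("start", local, plain)

    def endElementNS(self, name, qname):
        uri, local = name
        handler = self.handlers.get(uri)
        if handler is not None:
            handler("end", local, {})


def parse_with_dispatch(data, dispatcher):
    """Parse XML bytes with namespace processing switched on."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(dispatcher)
    parser.parse(io.BytesIO(data))
```

Usage: register a callable for a namespace URI, then parse.
Elements in namespaces with no registered handler are simply
ignored here; a real filter would pass them through unchanged.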

A good illustrative use case was my parser for the
ESI (Edge Side Includes) namespace.  I've also used it
to generate HTML and RDF from a common source:
a task you might otherwise use XSLT for, at a much
higher performance cost.  I also hacked it to support
scripting and embedded SQL queries, but that's a
line I find less interesting, because it gets us
into the territory of well-established alternatives
including PHP and JSP.

Joachim Zobel's mod_xml2 abstracts this further by
defining SAX event buckets (e.g. startElement bucket)
and passing them down the filter chain.  We could build
on the same approach to pass DOM or similar nodes as
buckets for applications like XSLT.
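
As a rough illustration of events-as-buckets, the sketch below
turns expat callbacks into small event objects and passes them
through a chain of generator "filters".  It is not mod_xml2's
bucket API: SaxBucket, parse_to_buckets(), drop_element() and
serialize() are hypothetical names, and Python generators stand in
for the httpd filter chain.

```python
import xml.parsers.expat
from dataclasses import dataclass, field
from xml.sax.saxutils import escape, quoteattr


@dataclass
class SaxBucket:
    """One SAX event: kind is "start", "end" or "text"."""
    kind: str
    name: str = ""
    attrs: dict = field(default_factory=dict)
    text: str = ""


def parse_to_buckets(data):
    """Parse XML bytes with expat into a list of SaxBucket events."""
    buckets = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver contiguous text as one event
    parser.StartElementHandler = lambda name, attrs: buckets.append(
        SaxBucket("start", name, dict(attrs)))
    parser.EndElementHandler = lambda name: buckets.append(
        SaxBucket("end", name))
    parser.CharacterDataHandler = lambda text: buckets.append(
        SaxBucket("text", text=text))
    parser.Parse(data, True)
    return buckets


def drop_element(stream, name):
    """Filter: remove the named element together with its content."""
    depth = 0
    for bucket in stream:
        if depth or (bucket.kind == "start" and bucket.name == name):
            if bucket.kind == "start":
                depth += 1
            elif bucket.kind == "end":
                depth -= 1
            continue
        yield bucket


def serialize(stream):
    """Final filter: turn the event stream back into markup."""
    out = []
    for bucket in stream:
        if bucket.kind == "start":
            attrs = "".join(
                " %s=%s" % (k, quoteattr(v)) for k, v in bucket.attrs.items())
            out.append("<%s%s>" % (bucket.name, attrs))
        elif bucket.kind == "end":
            out.append("</%s>" % bucket.name)
        else:
            out.append(escape(bucket.text))
    return "".join(out)
```

Filters compose by nesting, e.g.
serialize(drop_element(parse_to_buckets(data), "script")), and
only the last stage needs to know about bytes on the wire.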

If we use expat for this, we avoid introducing any new
dependencies.

3.  Data type library.

Our filter architecture works well for tasks such as (some)
image processing.  I don't think that's something we want
to do too much of in core, but it might add something if
we provided some basics, such as encoding/decoding
of the regular Web image formats (GIF/JPEG/PNG, and SVG
using xmlns dispatch).  A similar approach might apply
more widely to other media.


I can contribute some of this from my existing work,
including relicensing where necessary.  That is,
if there's interest in adding some of these things
as standard in 2.4.

--
Nick Kew
