I am sponsoring this case for Nick Kew. It is a straightforward addition of two more Apache httpd modules. Timeout set to 10/20/200.
Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems

1. Introduction
    1.1. Project/Component Working Name:
         modules mod_proxy_html and mod_xml2enc for Apache HTTPD 2.2
    1.2. Name of Document Author/Supplier:
         Author:  Nick Kew
    1.3  Date of This Document:
         12 October, 2009

4. Technical Description

1. Introduction

1.1 Title
    modules mod_proxy_html and mod_xml2enc for Apache HTTPD 2.2

1.2 Author
    Nick Kew <niq at sun.com>

1.3 Date
    2009-10-09

1.4 Customers
    Users of Apache HTTPD as a reverse proxy/gateway

1.5 Email Aliases
    1.5.2 Responsible Engineer
          niq at sun.com
    1.5.4 Interest List
          webstack-discuss at opensolaris.org

2. Summary

2.1 Description
    mod_proxy_html is a markup-aware filter capable of rewriting HTML
    and XHTML on-the-fly.  It is commonly required in a reverse proxy
    situation, such as where Apache HTTPD is used as a gateway
    providing external access to servers hosted on an internal or
    private network.

    mod_xml2enc deals with managing character encodings (charset)
    support on behalf of mod_proxy_html and other filters, and is
    required to support internationalisation correctly where a proxied
    server delivers character sets that are not ASCII- or Unicode
    (UTF-8)-compatible.

2.2 Risks and Assumptions
    N/A

3. Business Summary

3.1 Problem Area
    A web gateway, or reverse proxy, is often used to provide external
    access to servers hosted on a private network.  Often such servers
    include or generate documents containing links that cannot be
    reached or even resolved from outside the network.  mod_proxy_html
    serves to rewrite such links into the gateway's address space, so
    that they can be addressed from outside the private network.

    mod_proxy_html is one of several Apache filter modules based on
    parsing documents with libxml2.  This library uses Unicode (UTF-8)
    internally, and is capable of reading documents in a limited range
    of other character encodings.
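
    As a rough illustration of the link rewriting described above, a
    gateway configuration might look like the following sketch.  The
    host name internal.example.com and the /app/ path are hypothetical.
    The module paths assume the delivery locations listed in the
    appendices, and the sketch assumes that the packaged sample
    proxy_html.conf supplies the ProxyHTMLLinks and ProxyHTMLEvents
    definitions.  The directive usage follows the upstream
    mod_proxy_html documentation and is not defined by this case.

```apache
# Sketch only: host names and URL paths are hypothetical.
# Assumes mod_proxy and mod_proxy_http are already loaded.
LoadModule xml2enc_module    /usr/apache2/2.2/libexec/mod_xml2enc.so
LoadModule proxy_html_module /usr/apache2/2.2/libexec/mod_proxy_html.so

# Assumption: the sample file defines which elements and attributes
# are treated as links (ProxyHTMLLinks) and scripting events
# (ProxyHTMLEvents).
Include /etc/apache2/2.2/samples-conf.d/proxy_html.conf

# Act as a reverse proxy (gateway) only, never as a forward proxy.
ProxyRequests Off

<Location /app/>
    # Forward requests under /app/ to the internal server.
    ProxyPass        http://internal.example.com/
    # Rewrite redirect headers back into the gateway's address space.
    ProxyPassReverse http://internal.example.com/

    # Enable the markup-aware filter for this location.
    ProxyHTMLEnable On
    # Rewrite absolute links that name the internal server ...
    ProxyHTMLURLMap http://internal.example.com/ /app/
    # ... and server-relative links such as /images/logo.png.
    ProxyHTMLURLMap / /app/
</Location>
```

    With such a configuration, a link to
    http://internal.example.com/docs/ in a proxied page would be
    expected to reach the client as /app/docs/.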

    mod_xml2enc is required to detect automatically the character set
    of incoming documents so they can be correctly parsed, and is
    required to support character encodings not directly supported by
    libxml2.  mod_xml2enc provides strong internationalisation support
    for mod_proxy_html, and other libxml2-based modules such as
    mod_transform (XInclude/XSLT processing) and mod_xml2 (SAX2
    event-based parsing).

3.2 Market/Requester
    mod_proxy_html has been requested by Web Stack users.

3.3 Business Justification
    Packaging these modules provides significant added value for Web
    Stack users.  Specifically, it enables use of the Web Stack in a
    major class of applications that are not possible without these
    modules.

3.4 Competitive Analysis
    The reverse proxy/gateway is a common application, and one in
    which Apache HTTPD has a strong market share.  In addition to
    widely supported proxy functionality such as load balancing and
    caching/acceleration, and a web application firewall (a
    third-party product, mod_security), Apache has a significant
    advantage in its filter architecture.  This enables a wide range
    of applications in the areas of content transformation and
    aggregation.  One such is mod_proxy_html, which enables Apache to
    serve as gateway to systems that would fail if accessed through a
    basic non-content-aware gateway.

    Earlier versions of mod_proxy_html have been packaged for some
    years by many competitors, such as Linux distributions, FreeBSD,
    and even Windows packagers.

3.5 Opportunity Window/Exposure
    These modules are relevant for the foreseeable future.  Earlier
    versions have been in widespread use for several years.

3.6 How will you know when you are done?
    IPS packages will be available.  They will be verified as
    rewriting links in a proxy context, and as correctly handling
    "exotic" character sets, using test cases such as Pravda (Cyrillic
    script, incorporating complex transcluded content).

4. Technical Description:

4.1 Details
    This is a straightforward packaging effort.

4.2 Bug/RFE Number(s)
    CR 6716096 (mod_xml2enc) and CR 6716092 (mod_proxy_html)
    OSR 9711 (mod_xml2enc) and OSR 9712 (mod_proxy_html)

4.3 In Scope
    Documentation will be updated.

4.4 Out of Scope
    The tutorial at www.apachetutor.org should be updated and may
    additionally be duplicated at Sun for Web Stack users.  However,
    this falls outside this project.

4.5 Interfaces
    Both modules introduce a number of configuration options for
    Apache's httpd.conf, and mod_proxy_html packages a sample
    configuration file proxy_html.conf configuring it to parse
    documents by default as standard W3C HTML 4.01 or XHTML 1.0.

    In addition, mod_xml2enc exports an API/ABI for
    internationalisation support in file mod_xml2enc.h.

    These are expected to remain compatible for at least the lifetime
    of Apache 2.2.

    Exported Interfaces

    NAME                                             STABILITY    DESCRIPTION
    ------------------------------------------------------------------------------------------------
    /usr/apache2/2.2/include/mod_xml2enc.h           Uncommitted  Header file exporting i18n API/ABI
    /etc/apache2/2.2/samples-conf.d/proxy_html.conf  Uncommitted  Configuration sample
    xml2EncDefault                                   Uncommitted  Configuration directive
    xml2EncAlias                                     Uncommitted  Configuration directive
    xml2StartParse                                   Uncommitted  Configuration directive
    ProxyHTMLEvents                                  Uncommitted  Configuration directive
    ProxyHTMLLinks                                   Uncommitted  Configuration directive
    ProxyHTMLURLMap                                  Uncommitted  Configuration directive
    ProxyHTMLDoctype                                 Uncommitted  Configuration directive
    ProxyHTMLFixups                                  Uncommitted  Configuration directive
    ProxyHTMLMeta                                    Uncommitted  Configuration directive
    ProxyHTMLInterp                                  Uncommitted  Configuration directive
    ProxyHTMLExtended                                Uncommitted  Configuration directive
    ProxyHTMLStripComments                           Uncommitted  Configuration directive
    ProxyHTMLLogVerbose                              Uncommitted  Configuration directive
    ProxyHTMLBufSize                                 Uncommitted  Configuration directive
    ProxyHTMLCharsetOut                              Uncommitted  Configuration directive
    ProxyHTMLEnable                                  Uncommitted  Configuration directive
    SUNWapch22m-xml2enc                              Committed    Package name
    SUNWapch22m-proxy-html                           Committed    Package name

    Imported Interfaces

    NAME          Stability    Description   ARC Case ref
    -----------------------------------------------------------------------
    libxml2       Stable       XML library   PSARC/2008/032
    Apache HTTPD  Uncommitted  Apache HTTPD  PSARC/2007/169

4.6 Doc Impact
    Documentation is available at the originator's website.  The
    configuration sample will direct users there.

4.7 Admin/Config Impact
    New Apache configuration options as documented at
    http://apache.webthing.com/mod_proxy_html/ and
    http://apache.webthing.com/mod_xml2enc/

4.8 HA Impact
    N/A

4.9 I18N/L10N Impact
    By default, all contents are served as UTF-8, regardless of the
    character encoding of the original contents.  mod_xml2enc supports
    automatic detection and administrator overrides of different
    character encodings, and conversion of output to an
    administrator's choice of encoding.

4.10 Packaging/Delivery
    This project is part of the Sun Webstack, and will be packaged as
    SUNWapch22m-xml2enc and SUNWapch22m-proxy-html.
    It has no impact on existing packages.

4.11 Security Impact
    None known.

4.12 Dependencies
    This proposal depends on Apache HTTPD 2.2.x and libxml2 2.6 or
    later.

    libxml2:      http://arc.opensolaris.org/caselog/PSARC/2008/032/
    Apache HTTPD: http://arc.opensolaris.org/caselog/PSARC/2007/169/

5. Reference Documents
    http://apache.webthing.com/mod_proxy_html/
    http://apache.webthing.com/mod_xml2enc/
    http://www.apachetutor.org/admin/reverseproxies
    OSR 9711 (mod_xml2enc) and OSR 9712 (mod_proxy_html)
    cr6716096 (mod_xml2enc) and cr6716092 (mod_proxy_html)
    http://arc.opensolaris.org/caselog/PSARC/2008/032/
    http://arc.opensolaris.org/caselog/PSARC/2007/169/

6. Projected Availability
    2009/12

7. Prototype Availability
    The modules are available, and can be compiled "by hand" as
    prototype.
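
    To make the administrator overrides described in section 4.9 more
    concrete, the fragment below is a minimal sketch of how the
    character-encoding directives from section 4.5 might be combined.
    The encodings, the alias and the /app/ location are hypothetical
    examples, and the argument order shown follows the upstream
    mod_xml2enc and mod_proxy_html documentation and is not defined by
    this case.

```apache
# Sketch only: encodings, alias and location are hypothetical.

# Register a non-standard charset label as an alias for an encoding
# that is already supported (server-wide setting).
xml2EncAlias iso-8859-1 latin1

<Location /app/>
    ProxyHTMLEnable On

    # Encoding to assume when none can be detected from the HTTP
    # headers or from the document itself.
    xml2EncDefault windows-1251

    # Convert output to the administrator's chosen encoding instead
    # of the UTF-8 default.
    ProxyHTMLCharsetOut windows-1251
</Location>
```

    Leaving out ProxyHTMLCharsetOut keeps the default behaviour
    described in section 4.9, where all content is served as UTF-8.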

APPENDIX A: Files to be delivered in SUNWapch22m-xml2enc
    /usr/apache2/2.2/include/mod_xml2enc.h
    /usr/apache2/2.2/libexec/mod_xml2enc.so
    /usr/apache2/2.2/libexec/amd64/mod_xml2enc.so
    /usr/apache2/2.2/libexec/sparc9/mod_xml2enc.so

APPENDIX B: Files to be delivered in SUNWapch22m-proxy-html
    /etc/apache2/2.2/samples-conf.d/proxy_html.conf
    /usr/apache2/2.2/libexec/mod_proxy_html.so
    /usr/apache2/2.2/libexec/amd64/mod_proxy_html.so
    /usr/apache2/2.2/libexec/sparc9/mod_proxy_html.so

6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
               sfw
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open