martin 01/05/08 04:38:58 Modified: htdocs/manual ebcdic.html htdocs/manual/mod core.html Log: Move EBCDIC conversion blurb to where it fits better. Suggested by: Joshua Slive <[EMAIL PROTECTED]> Revision Changes Path 1.11 +149 -37 httpd-docs-1.3/htdocs/manual/ebcdic.html Index: ebcdic.html =================================================================== RCS file: /home/cvs/httpd-docs-1.3/htdocs/manual/ebcdic.html,v retrieving revision 1.10 retrieving revision 1.11 diff -u -u -r1.10 -r1.11 --- ebcdic.html 2001/03/09 10:09:47 1.10 +++ ebcdic.html 2001/05/08 11:38:35 1.11 @@ -16,7 +16,7 @@ <H1 ALIGN="CENTER">Overview of the Apache EBCDIC Port</H1> <P> - Version 1.3 of the Apache HTTP Server is the first version which + As of Version 1.3, the Apache HTTP Server includes a port to (non-ASCII) mainframe machines which use the EBCDIC character set as their native codeset.<BR> (Initially, that support covered only the Fujitsu-Siemens family of @@ -27,42 +27,148 @@ systems TPF and OS/390 were added). </P> - <P> - The port was started initially to - </P> +<HR> - <UL> - <LI> prove the feasibility of porting - <A HREF="http://dev.apache.org/">the Apache HTTP server</A> - to this platform - <LI> find a "worthy and capable" successor for the venerable - <A HREF="http://www.w3.org/Daemon/">CERN-3.0</A> daemon - (which was ported a couple of years ago), and to - <LI> prove that Apache's preforking process model can on this platform - easily outperform the accept-fork-serve model used by CERN by a - factor of 5 or more. - </UL> +<H2 ALIGN=CENTER><A NAME="ebcdic">EBCDIC-related conversion functions</A></H2> - <P> - This document serves as a rationale to describe some of the design - decisions of the port to this machine. 
- </P> + The EBCDIC related directives + <A HREF="mod/core.html#ebcdicconvert">EBCDICConvert</A>, + <A HREF="mod/core.html#ebcdicconvertbytype">EBCDICConvertByType</A>, and + <A HREF="mod/core.html#ebcdickludge">EBCDICKludge</A> + are available + <B>only if the platform's character set is EBCDIC</B> + (This is currently only the case on Fujitsu-Siemens' + BS2000/OSD and IBM's OS/390 and TPF operating systems). EBCDIC + stands for <EM>Extended Binary-Coded-Decimal Interchange Code</EM> + and is the codeset used on mainframe machines, in contrast to + ASCII which is ubiquitous on almost all micro computers today. + ASCII (or its extension <EM>latin1</EM>) is the basis for the HTTP + transfer protocol, therefore all EBCDIC-based platforms need a + way to configure the code set conversion rules required between + the EBCDIC based mainframe host and the HTTP socket protocol.<BR> + +<P> + On an EBCDIC based system, HTML files and other text files are + usually saved encoded in the native EBCDIC code set, while image + files and other binary data are stored with identical encoding as + on ASCII based machines. When the Apache server accesses documents, + it must therefore make a distinction between text files (to be + converted to/from ASCII, depending on the transfer direction) + and binary files (to be delivered unconverted). + Such a distinction can be made based on the assigned MIME type, or + based on the file extension (<EM>i.e.</EM>, files sharing a common file + suffix). +</P> + +<P> + By default, the configuration is symmetric for input and output + (<EM>i.e.</EM>, when a PUT request is executed for a document which was + returned by a previous GET request, then the resulting uploaded + copy should be identical to the original file). However, the + conversion directives allow for specifying different conversions + for input and output. 
+</P> + +<P> + The directives <a href="mod/core.html#ebcdicconvert">EBCDICConvert</a> and + <a href="mod/core.html#ebcdicconvertbytype">EBCDICConvertByType</a> are used to + assign the conversion setting (On or Off) based on file + extensions or MIME types. Each configuration setting can be defined + for input only (<EM>e.g.</EM>, PUT method), output only (<EM>e.g.</EM>, GET method), + or both input and output. By default, the conversion setting is + applied for input and output. +</P> + +<P> + Note that after modifying the conversion settings for a group of + files, it is not sufficient to restart the server. The reason for + this is the fact that a cached copy of a document (in a browser or + proxy cache) will not get revalidated by contents, but only by + date. Since the modification time of the document did not change, + browsers will assume they can reuse the cached copy.<BR> + To recover from this situation, you must either clear all cached + copies (browser and proxy cache!), or update the modification time + of the documents (using the <CODE>touch</CODE> command on the server). +</P> + +<P> + Note also that server-parsed documents (CGI scripts, .shtml files, + and other interpreted files like PHP scripts etc.) are not subject to + any input conversion and must therefore be stored in EBCDIC form + on the server side. +</P> + +<P> + In absense of any + <A HREF="mod/core.html#ebcdicconvertbytype">EBCDICConvertByType</A> directive, + and if no matching <A HREF="mod/core.html#ebcdicconvert">EBCDICConvert</A> was + found, Apache falls back to an internal heuristic which assumes + that all documents with MIME types starting with + <SAMP>"text/"</SAMP>, <SAMP>"message/"</SAMP> or + <SAMP>"multipart/"</SAMP> as well as the MIME type + <SAMP>"application/x-www-form-urlencoded"</SAMP> are text documents + stored in EBCDIC, whereas all other documents are binary files. 
+</P> + +<P> + In order to provide backward compatibility with older versions of + apache, the <A HREF="mod/core.html#ebcdickludge">EBCDICKludge</A> directive + allows for a less powerful mechanism to control the conversion of + documents to and from EBCDIC. +</P> + +<P> + <STRONG>Note</STRONG>:<BLOCKQUOTE> + The EBCDICKludge directive is deprecated, since its functionality + is superseded by the more powerful + <A HREF="mod/core.html#ebcdicconvert">EBCDICConvert</A> and + <A HREF="mod/core.html#ebcdicconvertbytype">EBCDICConvertByType</A> + directives.</BLOCKQUOTE> +</P> + +<P> + The directives are applied in the following order: + <OL> + <LI>First, the configured <A HREF="mod/core.html#ebcdicconvert">EBCDICConvert</A> + directives in the current context are evaluated in + configuration file order. As soon as a matching file extension + is found, the search stops and the configured conversion is + applied.<BR> + + EBCDICConvert settings inherited from parent directories are + tested after the more specific (deeper) directory levels. + </LI> + <LI>If the <A HREF="mod/core.html#ebcdickludge">EBCDICKludge</A> is in effect, + the next step tests for a MIME type of the format + <SAMP><I>type/</I><B>x-ascii-</B><I>subtype</I></SAMP>. If the + document has such a type, then the + <SAMP>"<B>x-ascii-</B>"</SAMP> substring is removed and the + conversion set to <SAMP>Off</SAMP>. + </LI> + <LI>In the next step, the configured + <A HREF="mod/core.html#ebcdicconvertbytype">EBCDICConvertByType</A> + directives are evaluated in configuration file order. 
If + the document has a matching MIME type, the search stops and + the configured conversion is applied.<BR> + + EBCDICConvertByType settings inherited from parent + directories are tested after the more specific (deeper) + directory levels.<BR> + + If no <A HREF="mod/core.html#ebcdicconvertbytype">EBCDICConvertByType</A> + directive at all exists in the current context, the server + falls back to the simple heuristics which assume that MIME + types starting with "text/", "message/" or "multipart/" (plus + the special type "application/x-www-form-urlencoded" used in + simple POST requests) imply a conversion, while all the rest + is delivered unconverted (<EM>i.e.</EM>, binary). + </LI> + </OL> +</P> - <H2 ALIGN=CENTER>Design Goals</H2> - <P> - One objective of the EBCDIC port was to maintain enough backwards - compatibility with the (EBCDIC) CERN server to make the transition to - the new server attractive and easy. This required the addition of - a configurable method to define whether a HTML document was stored - in ASCII (the only format accepted by the old server) or in EBCDIC - (the native document format in the POSIX subsystem, and therefore - the only sensible format in which the other POSIX tools like grep - or sed could operate on documents). Later, special EBCDIC conversion - directives were added which allow for the flexible definition of - conversion rules based on the documents' MIME type or file extension. - </P> +<HR> - <H2 ALIGN=CENTER>Technical Solution</H2> + <H2 ALIGN=CENTER><A NAME="tech">Technical Details</A></H2> <P> Since all Apache input and output is based upon the BUFF data type and its methods, the easiest solution was to add the actual @@ -111,7 +217,8 @@ requests. (See RFC2616 and src/main/http_protocol.c for details.) 
-<H2 ALIGN=CENTER>Porting Notes</H2> + <HR> + <H2 ALIGN=CENTER><A NAME="port">Porting Notes</A></H2> <OL> <LI> @@ -184,8 +291,9 @@ are text documents and are stored as EBCDIC files, whereas all other files are binary files (and stored in a byte-identical encoding as on an ASCII machine).<BR> - These defaults <A HREF="mod/core.html#ebcdic">can be overridden</A> - on a by-MIME-type and/or by-file-extension basis, using the + These defaults can be overridden + on a <A HREF="mod/core.html#ebcdicconvertbytype">by-MIME-type</A> and/or + <A HREF="mod/core.html#ebcdicconvert">by-file-extension</A> basis, using the directives<PRE> <A HREF="mod/core.html#ebcdicconvertbytype">EBCDICConvertByType</A> {On|Off}[={In|Out|InOut}] <EM>mimetype</EM> [...] <A HREF="mod/core.html#ebcdicconvert">EBCDICConvert</A> {On|Off}[={In|Out|InOut}] <EM>fileext</EM> [...] @@ -219,8 +327,10 @@ <BR> </LI> </OL> + + <HR> - <H2 ALIGN=CENTER>Document Storage Notes</H2> + <H2 ALIGN=CENTER><A NAME="store">Document Storage Notes</A></H2> <H3 ALIGN=CENTER>Binary Files</H3> <P> When exchanging binary files between the mainframe host and a @@ -242,6 +352,8 @@ <P> SSI documents must currently be stored in EBCDIC only. No provision is made to convert them from ASCII before processing. + The same holds for other interpreted languages, like + mod_perl or mod_php. 
</P> <!--#include virtual="footer.html" --> 1.188 +7 -128 httpd-docs-1.3/htdocs/manual/mod/core.html Index: core.html =================================================================== RCS file: /home/cvs/httpd-docs-1.3/htdocs/manual/mod/core.html,v retrieving revision 1.187 retrieving revision 1.188 diff -u -u -r1.187 -r1.188 --- core.html 2001/05/04 22:38:24 1.187 +++ core.html 2001/05/08 11:38:48 1.188 @@ -796,134 +796,6 @@ <P><HR> -<H2><A NAME="ebcdic">EBCDIC-related conversion functions</A></H2> - - The following EBCDIC related directives are available - <B>only if the platform's character set is EBCDIC</B> - (This is currently only the case on Fujitsu-Siemens' - BS2000/OSD and IBM's OS/390 and TPF operating systems). EBCDIC - stands for <EM>Extended Binary-Coded-Decimal Interchange Code</EM> - and is the codeset used on mainframe machines, in contrast to - ASCII which is ubiquitous on almost all micro computers today. - ASCII (or its extension <EM>latin1</EM>) is the basis for the HTTP - transfer protocol, therefore all EBCDIC-based platforms need a - way to configure the code set conversion rules required between - the EBCDIC based mainframe host and the HTTP socket protocol.<BR> - -<P> - On an EBCDIC based system, HTML files and other text files are - usually saved encoded in the native EBCDIC code set, while image - files and other binary data are stored with identical encoding as - on ASCII based machines. When the Apache server accesses documents, - it must therefore make a distinction between text files (to be - converted to/from ASCII, depending on the transfer direction) - and binary files (to be delivered unconverted). - Such a distinction can be made based on the assigned MIME type, or - based on the file extension (<EM>i.e.</EM>, files sharing a common file - suffix). 
-</P> - -<P> - By default, the configuration is symmetric for input and output - (<EM>i.e.</EM>, when a PUT request is executed for a document which was - returned by a previous GET request, then the resulting uploaded - copy should be identical to the original file). However, the - conversion directives allow for specifying different conversions - for input and output. -</P> - -<P> - The directives <a href="#ebcdicconvert">EBCDICConvert</a> and - <a href="#ebcdicconvertbytype">EBCDICConvertByType</a> are used to - assign the conversion setting (On or Off) based on file - extensions or MIME types. Each configuration setting can be defined - for input only (<EM>e.g.</EM>, PUT method), output only (<EM>e.g.</EM>, GET method), - or both input and output. By default, the conversion setting is - applied for input and output. -</P> - -<P> - Note that after modifying the conversion settings for a group of - files, it is not sufficient to restart the server. The reason for - this is the fact that a cached copy of a document (in a browser or - proxy cache) will not get revalidated by contents, but only by - date. Since the modification time of the document did not change, - browsers will assume they can reuse the cached copy.<BR> - To recover from this situation, you must either clear all cached - copies (browser and proxy cache!), or update the modification time - of the documents (using the <CODE>touch</CODE> command on the server). -</P> - -<P> - In absense of any - <A HREF="#ebcdicconvertbytype">EBCDICConvertByType</A> directive, - and if no matching <A HREF="#ebcdicconvert">EBCDICConvert</A> was - found, Apache falls back to an internal heuristic which assumes - that all documents with MIME types starting with - <SAMP>"text/"</SAMP>, <SAMP>"message/"</SAMP> or - <SAMP>"multipart/"</SAMP> as well as the MIME type - <SAMP>"application/x-www-form-urlencoded"</SAMP> are text documents - stored in EBCDIC, whereas all other documents are binary files. 
-</P> - -<P> - In order to provide backward compatibility with older versions of - apache, the <A HREF="#ebcdickludge">EBCDICKludge</A> directive - allows for a less powerful mechanism to control the conversion of - documents to and from EBCDIC. -</P> - -<P> - <STRONG>Note</STRONG>:<BLOCKQUOTE> - The EBCDICKludge directive is deprecated, since its functionality - is superseded by the more powerful - <A HREF="#ebcdicconvert">EBCDICConvert</A> and - <A HREF="#ebcdicconvertbytype">EBCDICConvertByType</A> - directives.</BLOCKQUOTE> -</P> - -<P> - The directives are applied in the following order: - <OL> - <LI>First, the configured <A HREF="#ebcdicconvert">EBCDICConvert</A> - directives in the current context are evaluated in - configuration file order. As soon as a matching file extension - is found, the search stops and the configured conversion is - applied.<BR> - - EBCDICConvert settings inherited from parent directories are - tested after the more specific (deeper) directory levels. - </LI> - <LI>If the <A HREF="#ebcdickludge">EBCDICKludge</A> is in effect, - the next step tests for a MIME type of the format - <SAMP><I>type/</I><B>x-ascii-</B><I>subtype</I></SAMP>. If the - document has such a type, then the - <SAMP>"<B>x-ascii-</B>"</SAMP> substring is removed and the - conversion set to <SAMP>Off</SAMP>. - </LI> - <LI>In the next step, the configured - <A HREF="#ebcdicconvertbytype">EBCDICConvertByType</A> - directives are evaluated in configuration file order. 
If - the document has a matching MIME type, the search stops and - the configured conversion is applied.<BR> - - EBCDICConvertByType settings inherited from parent - directories are tested after the more specific (deeper) - directory levels.<BR> - - If no <A HREF="#ebcdicconvertbytype">EBCDICConvertByType</A> - directive at all exists in the current context, the server - falls back to the simple heuristics which assume that MIME - types starting with "text/", "message/" or "multipart/" (plus - the special type "application/x-www-form-urlencoded" used in - simple POST requests) imply a conversion, while all the rest - is delivered unconverted (<EM>i.e.</EM>, binary). - </LI> - </OL> -</P> - -<HR> - <H2><A NAME="ebcdicconvert">EBCDICConvert</A></H2> <!--%plaintext <?INDEX {\tt EBCDICConvert} directive> --> <A @@ -995,6 +867,7 @@ </P> <P> <STRONG>See also</STRONG>: <A HREF="#ebcdicconvertbytype">EBCDICConvertByType</A> + and <A HREF="../ebcdic.html#ebcdic">Overview of the EBCDIC Conversion Functions</A> </P> <hr> @@ -1062,6 +935,7 @@ </pre> <P> <STRONG>See also</STRONG>: <A HREF="#ebcdicconvert">EBCDICConvert</A> + and <A HREF="../ebcdic.html#ebcdic">Overview of the EBCDIC Conversion Functions</A> </P> <P><HR> @@ -1130,6 +1004,11 @@ conversion. (Before Apache version 1.3.19, there was no way at all to force these binary documents to be treated as EBCDIC text files.) +</P> +<P> + <STRONG>See also</STRONG>: <A HREF="#ebcdicconvert">EBCDICConvert</A>, + <A HREF="#ebcdicconvertbytype">EBCDICConvertByType</A> + and <A HREF="../ebcdic.html#ebcdic">Overview of the EBCDIC Conversion Functions</A> </P> <HR>