hirokawa                Thu Jun 28 23:20:29 2001 EDT

  Modified files:              
    /phpdoc/en/functions        mbstring.xml 
  Log:
  fixed some typos.
  
  
Index: phpdoc/en/functions/mbstring.xml
diff -u phpdoc/en/functions/mbstring.xml:1.2 phpdoc/en/functions/mbstring.xml:1.3
--- phpdoc/en/functions/mbstring.xml:1.2        Sun Jun 24 11:27:21 2001
+++ phpdoc/en/functions/mbstring.xml    Thu Jun 28 23:20:28 2001
@@ -1,117 +1,305 @@
  <reference id="ref.mbstring">
   <title>Multi-Byte String Functions</title> 
-  <titleabbrev>Multi-Byte String</titleabbrev>
+  <titleabbrev>
+   Multi-Byte String
+  </titleabbrev>
   <partintro>
    &warn.experimental;
    <sect1 id="mb-intro">
     <title>Introduction</title>
     <warning>
      <simpara>
-      This module is EXPERIMENTAL. Function name/API is subject to be
-      changed. Current conversion filter supports Japanese only.
+      This module is EXPERIMENTAL. Function name/API is subject to
+      change. Current conversion filter supports Japanese only.
      </simpara>
     </warning>
     <para>
-     There are many languages that all characters cannot be expressed
+     There are many languages in which not all characters can be expressed
      by single byte. Multi-byte character codes are used to express
      many characters for many languages.  <literal>mbstring</literal>
      is developed to handle Japanese characters. However, many
      <literal>mbstring</literal> functions are able to handle
-     character codes other than Japanese.
+     character encodings other than Japanese.
     </para>
     <para>
-     Multi-byte character encoding represents single character with
+     A multi-byte character encoding represents a single character with
      consecutive bytes. Some character encoding has shift(escape)
-     sequences to start/end multi-byte character string. Therefore,
+     sequences to start/end multi-byte character strings. Therefore, a
      multi-byte character string may be destroyed when it is divided
-     and/or counted, unless multi-byte character encoding safe method
-     is used. <literal>mbstring</literal> functions support multi-byte
-     character safe string functions and other utility functions such
-     as conversion functions.
+     and/or counted unless a multi-byte character encoding safe method
+     is used. This module provides multi-byte character safe string
+     functions and other utility functions such as conversion
+     functions.
     </para>
+    <para>
+     Since PHP is basically designed for ISO-8859-1, some multi-byte
+     character encodings do not work well with PHP. Therefore, it is
+     important to set <literal>mbstring.internal_encoding</literal> to
+     a character encoding that works with PHP.
+    </para>
+    <para>
+     PHP4 Character Encoding Requirements 
+    </para>
+    <para>
+     <itemizedlist>
+      <listitem>
+       <simpara>
+       Per byte encoding
+       </simpara>
+      </listitem>
+      <listitem>
+       <simpara>
+       Single byte characters in range of <literal>00h-7fh</literal>
+       which is compatible with <literal>ASCII</literal>
+       </simpara>
+      </listitem>
+      <listitem>
+       <simpara>
+       Multi-byte characters that contain no bytes in <literal>00h-7fh</literal>
+       </simpara>
+      </listitem>
+     </itemizedlist>
+    </para>
+    <para>
+     The following are examples of internal character encodings that
+     work with PHP and ones that do NOT work with PHP.
+     <informalexample>
+      <programlisting>
 
-    <sect2 id="mb-ja-basic">
-     <title>Basics for Japanese multi-byte character</title>
+Character encodings that work with PHP:
+ISO-8859-*, EUC-JP, UTF-8
+
+
+Character encodings that do NOT work with PHP:
+JIS, SJIS
+      </programlisting>
+     </informalexample>
+    </para>
+    <para>
+     A character encoding that does not work with PHP may be converted
+     with <literal>mbstring</literal>'s HTTP input/output conversion
+     feature/function.
+    </para>
+    <note>
+     <para>
+      SJIS should not be used for internal encoding unless the reader
+      is familiar with parser/compiler, character encoding and
+      character encoding issues.
+     </para>
+    </note>
+    <note>
      <para>
-      Most Japanese characters need more than 1 byte for a
-      character. In addition to this, several character encodings are
-      used under Japanese environment. There are EUC-JP, Shift_JIS and
-      ISO-2022-JP character encoding. As Unicode is getting popular,
-      UTF-8 is used also. To develop Web application for Japanese
-      environment, it is important to use these character codes depend
-      on its purpose, HTTP input/output, RDBMS and E-mail.
+      If you use a database with PHP, it is recommended that you use the
+      same character encoding for both the database and <literal>internal
+      encoding</literal> for ease of use and better performance.
+      </para>
+     <para>
+      If you are using PostgreSQL, it supports a client character
+      encoding that is different from the backend character encoding. See
+      the PostgreSQL manual for details.
      </para>
+    </note>
+
+    <sect2 id="mb-enable">
+     <title>How to Enable mbstring</title>
      <para>
+      <literal>mbstring</literal> is an extended module. You must
+      enable the module with the <literal>configure</literal> script. Refer
+      to the <link linkend="installation">Install</link> section for
+      details.
+     </para>
+     <simpara>
+      The following configure options are related to
+      the <literal>mbstring</literal> module.
+     </simpara>
+     <para>
       <itemizedlist>
-       <listitem>
-       <simpara>
-        Storage for a character can be upto four bytes
-       </simpara>
-       </listitem>
        <listitem>
-       <simpara>
-        A multi-byte character usually has twice of width compare to
-        single byte characters. Wider character is called "zen-kaku"
-        - meaning full width, narrower character called "han-kaku" -
-        meaning half width. "zen-kaku" characters are fixed width
-        usually.
-       </simpara>
+       <para>
+        <option role="configure">--enable-mbstring</option> : Enable
+        <literal>mbstring</literal> functions. This option is
+        required to use <literal>mbstring</literal> functions.
+       </para>
        </listitem>
        <listitem>
-       <simpara>
-        Some character encoding defines shift sequence for
-        entering/exiting multi-byte character strings.
-       </simpara>
+       <para>
+        <option role="configure">--enable-mbstr-enc-trans</option> :
+        Enable HTTP input character encoding conversion using
+        the <literal>mbstring</literal> conversion engine. If this
+        feature is enabled, HTTP input character encoding may be
+        converted to <literal>mbstring.internal_encoding</literal>
+        automatically.
+       </para>
        </listitem>
+      </itemizedlist>
+     </para>
+    </sect2>
+
+    <sect2 id="mb-conv">
+     <title>HTTP Input and Output</title>
+     <para>
+      HTTP input/output character encoding conversion may convert
+      binary data also. Users are supposed to control character
+      encoding conversion if binary data is used for HTTP
+      input/output.
+     </para>
+     <para>
+      If <literal>enctype</literal> for HTML form is set to
+      <literal>multipart/form-data</literal>,
+      <literal>mbstring</literal> does not convert character encoding
+      in POST data. In that case, strings need to be converted to
+      the internal character encoding in the script.
+     </para>
+     <para>
+      <itemizedlist>
        <listitem>
        <simpara>
-        Database may allocate storage for characters that differs
-        from size used in PHP even if the same character encoding is
-        used. (For example, PostgreSQL)
+        HTTP Input
        </simpara>
+       <para> There is no way to control HTTP input character
+        conversion from a PHP script. HTTP input character conversion
+        can only be disabled in <literal>php.ini</literal>.
+        <example>
+         <title>
+          Disable HTTP input conversion in php.ini
+         </title>
+         <programlisting role="php">
+
+;; Disable HTTP Input conversion
+mbstring.http_input = pass
+         </programlisting>
+        </example>
+       </para>
+       <para>
+        When using PHP as an Apache module, it is possible to
+        override PHP ini settings per virtual host in
+        <literal>httpd.conf</literal> or per directory with
+        <literal>.htaccess</literal>. Refer to the <link
+         linkend="configuration">Configuration</link> section and
+        Apache Manual for details.
+       </para>
        </listitem>
        <listitem>
        <simpara>
-        E-mail is supposed to use ISO-2022-JP.
+        HTTP Output
        </simpara>
-       </listitem>
-       <listitem>
        <para>
-        &quot;i-mode&quot; web site is supposed to use Shift_JIS.
+        There are several ways to enable output character encoding
+        conversion. One is using <literal>php.ini</literal>, another
+        is using <function>ob_start</function> with
+        <function>mb_output_handler</function> as
+        the <literal>ob_start</literal> callback function.
        </para>
+       <note>
+        <para>
+         For PHP3-i18n users, <literal>mbstring</literal>'s output
+         conversion differs from PHP3-i18n. Character encoding is
+         converted using the output buffer.
+        </para>
+       </note>
        </listitem>
       </itemizedlist>
      </para>
+     <para>
+      <example>
+       <title><literal>php.ini</literal> setting example</title>
+       <programlisting role="php">
+
+;; Enable output character encoding conversion for all PHP pages
+
+;; Enable Output Buffering
+output_buffering    = On
+
+;; Set mb_output_handler to enable output conversion
+output_handler      = mb_output_handler
+       </programlisting>
+      </example>
+     </para>
+     <para>
+      <example>
+       <title>Script example</title>
+       <programlisting role="php">
+
+&lt;?php
+
+// Enable output character encoding conversion only for this page
+
+// Set HTTP output character encoding to SJIS
+mb_http_output('SJIS');
+
+// Start buffering and specify "mb_output_handler" as
+// callback function
+ob_start('mb_output_handler');
+
+?&gt;
+       </programlisting>
+      </example>
+     </para>
     </sect2>
 
     <sect2 id="mb-code">
-     <title>Supported character encodings</title>
+     <title>Supported Character Encoding</title>
+     <simpara>
+      Currently, the following character encodings are supported by
+      the <literal>mbstring</literal> module. A character encoding may
+      be specified for <literal>mbstring</literal> functions'
+      <literal>encoding</literal> parameter.  </simpara>
+     <para>
+      The following character encodings are supported in this PHP
+      extension:
+     </para>
      <para>
-      Following character encodings are supported in this PHP
-      extension : <literal>UCS-4</literal>,
-      <literal>UCS-4BE</literal>, <literal>UCS-4LE</literal>,
-      <literal>UCS-2</literal>, <literal>UCS-2BE</literal>,
-      <literal>UCS-2LE</literal>, <literal>UTF-32</literal>, 
-      <literal>UTF-32BE</literal>, <literal>UTF-32LE</literal>,  
-      <literal>UCS-2LE</literal>, <literal>UTF-16</literal>, 
-      <literal>UTF-16BE</literal>, <literal>UTF-16LE</literal>,  
-      <literal>UTF-8</literal>, <literal>UTF-7</literal>, 
-      <literal>ASCII</literal>, <literal>EUC-JP</literal>,
-      <literal>SJIS</literal>, <literal>eucJP-win</literal>,  
-      <literal>SJIS-win</literal>,
-      <literal>ISO-2022-JP</literal>(<literal>JIS</literal>),
+      <literal>UCS-4</literal>, <literal>UCS-4BE</literal>,
+      <literal>UCS-4LE</literal>, <literal>UCS-2</literal>,
+      <literal>UCS-2BE</literal>, <literal>UCS-2LE</literal>,
+      <literal>UTF-32</literal>, <literal>UTF-32BE</literal>,
+      <literal>UTF-32LE</literal>,
+      <literal>UTF-16</literal>, <literal>UTF-16BE</literal>,
+      <literal>UTF-16LE</literal>, <literal>UTF-8</literal>,
+      <literal>UTF-7</literal>, <literal>ASCII</literal>,
+      <literal>EUC-JP</literal>, <literal>SJIS</literal>,
+      <literal>eucJP-win</literal>, <literal>SJIS-win</literal>,
+      <literal>ISO-2022-JP</literal>, <literal>JIS</literal>,
       <literal>ISO-8859-1</literal>, <literal>ISO-8859-2</literal>,
       <literal>ISO-8859-3</literal>, <literal>ISO-8859-4</literal>,
       <literal>ISO-8859-5</literal>, <literal>ISO-8859-6</literal>,
       <literal>ISO-8859-7</literal>, <literal>ISO-8859-8</literal>,
       <literal>ISO-8859-9</literal>, <literal>ISO-8859-10</literal>,
       <literal>ISO-8859-13</literal>, <literal>ISO-8859-14</literal>,
-      <literal>ISO-8859-15</literal>.
+      <literal>ISO-8859-15</literal>, <literal>byte2be</literal>,
+      <literal>byte2le</literal>, <literal>byte4be</literal>,
+      <literal>byte4le</literal>, <literal>BASE64</literal>,
+      <literal>7bit</literal>, <literal>8bit</literal> and
+      <literal>UTF7-IMAP</literal>.
+     </para>
+     <para>
+      Any <literal>php.ini</literal> entry that accepts an encoding
+      name also accepts &quot;<literal>auto</literal>&quot; and
+      &quot;<literal>pass</literal>&quot;.
+      <literal>mbstring</literal> functions that accept an encoding
+      name also accept &quot;<literal>auto</literal>&quot;.
+     </para>
+     <para>
+      If &quot;<literal>pass</literal>&quot; is set, no character
+      encoding conversion is performed.
+     </para>
+     <para>
+      If &quot;<literal>auto</literal>&quot; is set, it is expanded to
+      &quot;<literal>ASCII,JIS,UTF-8,EUC-JP,SJIS</literal>&quot;.
+     </para>
+     <para>
+      See also <function>mb_detect_order</function>.
      </para>
+     <note>
+      <para>
+       &quot;Supported character encoding&quot; does not mean that it
+       works as an internal character encoding.
+      </para>
+     </note>
     </sect2>
     
     <sect2 id="mb-ini">
-     <title> php.ini settings </title>
+     <title>php.ini settings</title>
      <para>
       <itemizedlist>
        <listitem>
@@ -122,63 +310,311 @@
        </listitem>
        <listitem>
        <simpara>
-        <literal>mbstring.http_input</literal> defines default HTTP input
-        character encoding.
+        <literal>mbstring.http_input</literal> defines default HTTP
+        input character encoding.
        </simpara>
        </listitem>
        <listitem>
        <simpara>
-        <literal>mbstring.http_output</literal> defines default HTTP output
-        character encoding.
+        <literal>mbstring.http_output</literal> defines default HTTP
+        output character encoding.
        </simpara>
        </listitem>
        <listitem>
        <simpara>
-        <literal>mbstring.detect_order</literal> defines default character
-        encoding detection order.
+        <literal>mbstring.detect_order</literal> defines default
+        character encoding detection order. See also
+        <function>mb_detect_order</function>. 
        </simpara>
        </listitem>
        <listitem>
        <simpara>
-        <literal>mbstring.substitute_character</literal> defines character
-        to substitute for invalid character codes.
+        <literal>mbstring.substitute_character</literal> defines
+        character to substitute for invalid character encoding.
        </simpara>
        </listitem>
       </itemizedlist>
      </para>
      <para>
+      Web browsers are supposed to use the same character encoding
+      when submitting a form. However, browsers may not use the same
+      character encoding. See <function>mb_http_input</function> to
+      detect character encoding used by browsers.
+     </para>
+     <para>
+      If <literal>enctype</literal> is set to
+      <literal>multipart/form-data</literal> in HTML forms,
+      <literal>mbstring</literal> does not convert character encoding
+      in POST data. The user must convert them in the script, if
+      conversion is needed.
+     </para>
+     <para>
+      Although browsers are smart enough to detect the character
+      encoding of HTML, it is better to set <literal>charset</literal>
+      in the HTTP header. Change <literal>default_charset</literal>
+      according to the character encoding in use.
+     </para>
+     <para>
       <example>
        <title><literal>php.ini</literal> setting example</title>
-       <programlisting role="php.ini">
+       <programlisting role="php">
+
 ;; Set default internal encoding
+;; Note: Make sure to use a character encoding that works with PHP
 mbstring.internal_encoding    = UTF-8  ; Set internal encoding to UTF-8
 
-;; Set default HTTP input character code
-mbstring.http_input = auto     ; Set HTTP input to auto
-; or
-; mbstring.http_input = SJIS     ; Set HTTP input to  SJIS
-; mbstring.http_input = eucjp-win, sjis-win, UTF-8 ; Specify order
-
-;; Set default HTTP output character code 
-mbstring.http_output = UTF-8   ; Set HTTP output encoding to UTF-8
-
-;; Set default character code detection order
-mbstring.detect_order = auto   ; Set HTTP output to auto
-; or 
-; mbstring.detect_order = eucjp-win, sjis-win, UTF-8 ; Specify order
+;; Set default HTTP input character encoding
+;; Note: Script cannot change http_input setting.
+mbstring.http_input           = pass    ; No conversion. 
+mbstring.http_input           = auto    ; Set HTTP input to auto
+                                       ; "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS"
+mbstring.http_input           = SJIS    ; Set HTTP input to SJIS
+mbstring.http_input           = UTF-8,SJIS,EUC-JP ; Specify order
+
+;; Set default HTTP output character encoding 
+mbstring.http_output          = pass    ; No conversion
+mbstring.http_output          = UTF-8   ; Set HTTP output encoding to UTF-8
+
+;; Set default character encoding detection order
+mbstring.detect_order         = auto    ; Set detect order to auto
+mbstring.detect_order         = ASCII,JIS,UTF-8,SJIS,EUC-JP ; Specify order
 
 ;; Set default substitute character
-mbstring.substitute_character = 12307 ; Specify character code
-; or
-; mbstring.substitute_character = none  ; Null character
-; mbstring.substitute_character = long  ; Long 
+mbstring.substitute_character = 12307   ; Specify Unicode value
+mbstring.substitute_character = none    ; Do not print character
+mbstring.substitute_character = long    ; Long Example: U+3000,JIS+7E7E
        </programlisting>
       </example>
      </para>
+     <para>
+      <example>
+       <title><literal>php.ini</literal> setting for <literal>EUC-JP</literal> users</title>
+       <programlisting role="php">
+
+;; Disable Output Buffering
+output_buffering      = Off
+
+;; Set HTTP header charset
+default_charset       = EUC-JP    
+
+;; Set HTTP input encoding conversion to auto
+mbstring.http_input   = auto 
+
+;; Convert HTTP output to EUC-JP
+mbstring.http_output  = EUC-JP    
+
+;; Set internal encoding to EUC-JP
+mbstring.internal_encoding = EUC-JP    
+
+;; Do not print invalid characters
+mbstring.substitute_character = none   
+       </programlisting>
+      </example>
+     </para>
+     <para>
+      <example>
+       <title><literal>php.ini</literal> setting for <literal>SJIS</literal> users</title>
+       <programlisting role="php">
+
+;; Enable Output Buffering
+output_buffering     = On
+
+;; Set mb_output_handler to enable output conversion
+output_handler       = mb_output_handler
+
+;; Set HTTP header charset
+default_charset      = Shift_JIS
+
+;; Set http input encoding conversion to auto
+mbstring.http_input  = auto 
+
+;; Convert to SJIS
+mbstring.http_output = SJIS    
+
+;; Set internal encoding to EUC-JP
+mbstring.internal_encoding = EUC-JP    
+
+;; Do not print invalid characters
+mbstring.substitute_character = none   
+       </programlisting>
+      </example>
+     </para>
     </sect2>
+
+    <sect2 id="mb-ja-basic">
+     <title>Basics for Japanese multi-byte character</title>
+     <para>
+      Most Japanese characters need more than 1 byte per character. In
+      addition, several character encoding schemes are used in a
+      Japanese environment: EUC-JP, Shift_JIS(SJIS) and
+      ISO-2022-JP(JIS). As Unicode becomes popular, UTF-8 is used
+      also. To develop Web applications for a Japanese environment,
+      it is important to use the character encoding suited to the
+      task in hand, whether HTTP input/output, RDBMS or E-mail.
+     </para>
+     <para>
+      <itemizedlist>
+       <listitem>
+       <simpara>Storage for a character can be up to four
+        bytes</simpara>
+       </listitem>
+       <listitem>
+       <simpara>
+        A multi-byte character is usually twice the width of a
+        single-byte character. Wider characters are called
+        "zen-kaku" - meaning full width, narrower characters are
+        called "han-kaku" - meaning half width. "zen-kaku" characters
+        are usually fixed width.
+       </simpara>
+       </listitem>
+       <listitem>
+       <simpara>
+        Some character encodings define shift(escape) sequences for
+        entering/exiting multi-byte character strings.
+       </simpara>
+       </listitem>
+       <listitem>
+       <simpara>
+         ISO-2022-JP must be used for SMTP/NNTP.
+       </simpara>
+       </listitem>
+       <listitem>
+       <para>
+        An &quot;i-mode&quot; web site is supposed to use SJIS.
+       </para>
+       </listitem>
+      </itemizedlist>
+     </para>
+    </sect2>
+
+    <sect2 id="mb-ref">
+     <title>References</title>
+     <para>
+      Multi-byte character encoding and its related issues are very
+      complex. They cannot be covered in sufficient detail
+      here. Please refer to the following URLs and other resources for
+      further reading.
+      <itemizedlist>
+       <listitem>
+       <para>
+        Unicode/UTF/UCS/etc
+       </para>
+       <para>
+         <literal>http://www.unicode.org/</literal>
+       </para>
+       </listitem>
+       <listitem>
+       <para>
+        Japanese/Korean/Chinese character
+        information
+       </para>
+       <para>
+        <literal>
+        ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
+        </literal>
+       </para>
+       </listitem>
+      </itemizedlist>
+     </para>
+    </sect2>
+
    </sect1>
   </partintro>
 
+  <refentry id="function.mb-language">
+   <refnamediv>
+    <refname>mb_language</refname>
+    <refpurpose>
+     Set/Get current language
+    </refpurpose>
+   </refnamediv>
+   <refsect1>
+    <title>Description</title>
+    <funcsynopsis>
+     <funcprototype>
+      <funcdef>string
+       <function>mb_language</function></funcdef>
+      <paramdef>string
+       <parameter><optional>language</optional></parameter></paramdef>
+     </funcprototype>
+    </funcsynopsis>
+    <para>
+     <function>mb_language</function> sets the language. If
+     <parameter>language</parameter> is omitted, it returns the current
+     language as a string.
+    </para>
+    <para>
+     The <parameter>language</parameter> setting is used for encoding
+     e-mail messages. Valid languages are "Japanese",
+     "ja", "English", "en" and "uni"
+     (UTF-8). <function>mb_send_mail</function> uses this setting to
+     encode e-mail.
+    </para>
+    <para> The encoding settings are ISO-2022-JP/Base64 for
+    Japanese, UTF-8/Base64 for uni and ISO-8859-1/quoted-printable for
+    English.
+    </para>
+    <para>
+     Return Value: If <parameter>language</parameter> is set and
+     <parameter>language</parameter> is valid, it returns
+     TRUE. Otherwise, it returns FALSE. When
+     <parameter>language</parameter> is omitted, it returns language
+     name as string. If no language is set previously, it returns
+     FALSE.
+    </para>
+    <para>
+     See also <function>mb_send_mail</function>.
+    </para>
+   </refsect1>
+  </refentry>
+
+  <refentry id="function.mb-parse-str">
+   <refnamediv>
+    <refname>mb_parse_str</refname>
+    <refpurpose>
+      Parse GET/POST/COOKIE data and set global variable
+    </refpurpose>
+   </refnamediv>
+   <refsect1>
+    <title>Description</title>
+    <funcsynopsis>
+     <funcprototype>
+      <funcdef>bool
+       <function>mb_parse_str</function>
+      </funcdef>
+      <paramdef>string
+       <parameter>encoded_string</parameter>
+      </paramdef>
+      <paramdef>array
+       <parameter><optional>result</optional></parameter>
+      </paramdef>
+     </funcprototype>
+    </funcsynopsis>
+    <para>
+     <function>mb_parse_str</function> parses GET/POST/COOKIE data and
+     sets global variables. Since PHP does not provide raw POST/COOKIE
+     data, it can only be used for GET data for now. It parses URL
+     encoded data, detects the encoding, converts it to the internal
+     encoding and sets the values in the <parameter>result</parameter> array or
+     global variables.
+    </para>
+    <para>
+     <parameter>encoded_string</parameter>: URL encoded data.
+    </para>
+    <para>
+     <parameter>result</parameter>: Array that contains the decoded
+     and character encoding converted values.
+    </para>
+    <para>
+     Return Value: It returns TRUE for success or FALSE for failure.
+    </para>
+    <para>
+     See also <function>mb_detect_order</function>,
+     <function>mb_internal_encoding</function>.
+    </para>
+   </refsect1>
+  </refentry>
+
   <refentry id="function.mb-internal-encoding">
    <refnamediv>
     <refname>mb_internal_encoding</refname>
@@ -211,7 +647,7 @@
      <parameter>encoding</parameter>: Character encoding name
     </para>
     <para>
-     Return Value: If encoding is
+     Return Value: If <parameter>encoding</parameter> is
      set,<function>mb_internal_encoding</function> returns
      <literal>TRUE</literal> for success, otherwise returns
      <literal>FALSE</literal>. If <parameter>encoding</parameter> is
@@ -232,7 +668,7 @@
     <para>
      See also <function>mb_http_input</function>,
      <function>mb_http_output</function>,
-     <function>mb_detect_order</function>
+     <function>mb_detect_order</function>.
     </para>
    </refsect1>
   </refentry>
@@ -270,7 +706,7 @@
     <para>
      See also <function>mb_internal_encoding</function>,
      <function>mb_http_output</function>,
-     <function>mb_detect_order</function>
+     <function>mb_detect_order</function>.
     </para>
    </refsect1>
   </refentry>
@@ -294,9 +730,10 @@
      If <parameter>encoding</parameter> is set,
      <function>mb_http_output</function> sets HTTP output character
      encoding to <parameter>encoding</parameter>. Output after this
-     function is converted to <parameter>encoding</parameter>. 
-     <function>mb_http_output</function> returns TRUE for success and
-     FALSE for failure.
+     function is converted to <parameter>encoding</parameter>.
+     <function>mb_http_output</function> returns
+     <literal>TRUE</literal> for success and <literal>FALSE</literal>
+     for failure.
     </para>
     <para>
      If <parameter>encoding</parameter> is omitted,
@@ -306,7 +743,7 @@
     <para>
      See also <function>mb_internal_encoding</function>,
      <function>mb_http_input</function>,
-     <function>mb_detect_order</function>
+     <function>mb_detect_order</function>.
     </para>
    </refsect1>
   </refentry>
@@ -331,11 +768,12 @@
     <para>
      <function>mb_detect_order</function> sets automatic character
      encoding detection order to <parameter>encoding-list</parameter>.
-     It returns TRUE for success, FALSE for failure.
+     It returns <literal>TRUE</literal> for success,
+     <literal>FALSE</literal> for failure.
     </para>
     <para>
      <parameter>encoding-list</parameter> is array or comma separated
-     list of character encodings. ("auto" is expanded to
+     list of character encodings. ("auto" is expanded to
      "ASCII, JIS, UTF-8, EUC-JP, SJIS")
     </para>
     <para>
@@ -346,6 +784,42 @@
      This setting affects <function>mb_detect_encoding</function> and
      <function>mb_send_mail</function>.
     </para>
+    <note>
+     <para>
+      <literal>mbstring</literal> currently implements the following
+      encoding detection filters. If there is an invalid byte sequence
+      for one of the following encodings, encoding detection will fail.
+     </para>
+     <simpara>
+       <literal>UTF-8</literal>, <literal>UTF-7</literal>,
+       <literal>ASCII</literal>,
+       <literal>EUC-JP</literal>, <literal>SJIS</literal>,
+       <literal>eucJP-win</literal>, <literal>SJIS-win</literal>,
+       <literal>JIS</literal>, <literal>ISO-2022-JP</literal> 
+     </simpara>
+     <para>
+      For <literal>ISO-8859-*</literal>, <literal>mbstring</literal>
+      always detects as <literal>ISO-8859-*</literal>.
+     </para>
+     <para>
+      For <literal>UTF-16</literal>, <literal>UTF-32</literal>,
+      <literal>UCS-2</literal> and <literal>UCS-4</literal>, encoding
+      detection will always fail.
+     </para>
+     <para>
+      <example>
+       <title>Useless detect order example</title>
+       <programlisting>
+; Always detect as ISO-8859-1
+detect_order = ISO-8859-1, UTF-8
+
+; Always detect as UTF-8, since ASCII/UTF-7 values are 
+; valid for UTF-8
+detect_order = UTF-8, ASCII, UTF-7
+       </programlisting>
+      </example>
+     </para>
+    </note>
     <para>
      <example>
       <title><function>mb_detect_order</function> examples</title>
@@ -368,7 +842,7 @@
      See also <function>mb_internal_encoding</function>,
      <function>mb_http_input</function>,
      <function>mb_http_output</function>
-     <function>mb_send_mail</function>
+     <function>mb_send_mail</function>.
     </para>
    </refsect1>
   </refentry>
@@ -393,7 +867,7 @@
      substitution character when input character encoding is invalid
      or character code is not exist in output character
      encoding. Invalid characters may be substituted null(no output),
-     string or hex value (Unicode character code value).
+     string or integer value (Unicode character code value).
     </para>
     <para>
      This setting affects <function>mb_detect_encoding</function>
@@ -410,16 +884,17 @@
       </listitem>
       <listitem>
        <simpara>
-       &quot;long&quot; :  Output hex value  (Example: U+3000,JIS+7E7E)
+       &quot;long&quot; : Output character code value (Example:
+       U+3000,JIS+7E7E)
        </simpara>
       </listitem>
      </itemizedlist>
     </para>
     <para>
      Return Value: If <parameter>substchar</parameter> is set, it
-     returns TRUE for success, otherwise returns FALSE. If
-     <parameter>substchar</parameter> is not set, it returns Unicode
-     value or
+     returns <literal>TRUE</literal> for success, otherwise returns
+     <literal>FALSE</literal>. If <parameter>substchar</parameter> is
+     not set, it returns Unicode value or
      &quot;<literal>none</literal>&quot;/&quot;<literal>long</literal>&quot;.
     </para>
     <para>
@@ -461,9 +936,29 @@
      <function>ob_start</function> callback
      function. <function>mb_output_handler</function> converts
      characters in output buffer from internal character encoding to
-     HTTP output character encoding.
+     HTTP output character encoding. 
+     </para>
+    <para>
+     In PHP 4.0.7 or later, this handler adds a charset HTTP header
+     when the following conditions are met:
     </para>
     <para>
+     <itemizedlist>
+      <listitem>
+       <simpara><literal>Content-Type</literal> is not set by
+       header()</simpara>
+      </listitem>
+      <listitem>
+       <simpara>Default MIME type begins with
+       <literal>text/</literal></simpara>
+      </listitem>
+      <listitem>
+       <simpara><literal>http_output</literal> setting is other than
+       pass</simpara>
+      </listitem>
+     </itemizedlist>
+    </para>
+    <para>
      <parameter>contents</parameter> : Output buffer contents
     </para>
     <para>
@@ -483,8 +978,8 @@
     </para>
     <note>
      <para>
-      If you want to output some binary data such as image from php
-      script, you must set output encoding to "pass" using
+      If you want to output some binary data such as image from PHP
+      script, you must set output encoding to &quot;pass&quot; using
       <function>mb_http_output</function>.
      </para>
     </note>
@@ -520,7 +1015,7 @@
 $outputenc = "sjis-win";
 mb_http_output($outputenc);
 ob_start("mb_output_handler");
-Header("Content-Type: text/html; charset=" . mb_preferred_mime_name($outputenc));
+header("Content-Type: text/html; charset=" . mb_preferred_mime_name($outputenc));
       </programlisting>
      </example>
     </para>
@@ -550,6 +1045,11 @@
      counted as 1.
     </para>
     <para>
+     <parameter>encoding</parameter> is character encoding for
+     <parameter>str</parameter>. If <parameter>encoding</parameter> is
+     omitted, internal character encoding is used.
+    </para>
+    <para>
      See also <function>mb_internal_encoding</function>, 
      <function>strlen</function>.
     </para>
@@ -567,7 +1067,7 @@
     <title>Description</title>
     <funcsynopsis>
      <funcprototype>
-      <funcdef>string <function>mb_strpos</function></funcdef>
+      <funcdef>int <function>mb_strpos</function></funcdef>
       <paramdef>string <parameter>haystack</parameter></paramdef>
       <paramdef>string <parameter>needle</parameter></paramdef>
       <paramdef>int 
@@ -605,7 +1105,7 @@
     </para>
     <para>
      <parameter>encoding</parameter> is character encoding name. If it
-     is not specified, internal character encoding is used.
+     is omitted, internal character encoding is used.
     </para>
     <para>
      See also <function>mb_strpos</function>,
@@ -626,7 +1126,7 @@
     <title>Description</title>
     <funcsynopsis>
      <funcprototype>
-      <funcdef>string <function>mb_strrpos</function></funcdef>
+      <funcdef>int <function>mb_strrpos</function></funcdef>
       <paramdef>string <parameter>haystack</parameter></paramdef>
       <paramdef>string <parameter>needle</parameter></paramdef>
       <paramdef>string 
@@ -649,7 +1149,7 @@
      0. Second character position is 1. 
     </para>
     <para>
-     If <parameter>encoding</parameter> is not set, internal encoding
+     If <parameter>encoding</parameter> is omitted, internal encoding
      is assumed. <function>mb_strrpos</function> accepts
      <literal>string</literal> for <parameter>needle</parameter> where
      <function>strrpos</function> accepts only character.
@@ -709,7 +1209,7 @@
      omitted, internal character encoding is used.
     </para>
     <para>
-     See also <function>mb_struct</function>, 
+     See also <function>mb_strcut</function>, 
      <function>mb_internal_encoding</function>.
     </para>
    </refsect1>
@@ -822,7 +1322,7 @@
     <title>Description</title>
     <funcsynopsis>
      <funcprototype>
-      <funcdef>string <function>mb_strmwidth</function></funcdef>
+      <funcdef>string <function>mb_strimwidth</function></funcdef>
       <paramdef>string <parameter>str</parameter></paramdef>
       <paramdef>int <parameter>start</parameter></paramdef>
       <paramdef>int <parameter>width</parameter></paramdef>
@@ -833,7 +1333,7 @@
      </funcprototype>
     </funcsynopsis>
     <para>
-     <function>mb_strmwidth</function> truncates string
+     <function>mb_strimwidth</function> truncates string
      <parameter>str</parameter> to specified
      <parameter>width</parameter>. It returns truncated string.
     </para>
@@ -1164,6 +1664,12 @@
      before conversion for success, FALSE for failure.
     </para>
     <para>
+     <function>mb_convert_variables</function> joins strings in an Array
+     or Object to detect encoding, since encoding detection tends to
+     fail for short strings. Therefore, it is impossible to mix
+     encodings in a single array or object.
+    </para>
+    <para>
      It <parameter>from-encoding</parameter> is specified by 
      array or comma separated string, it tries to detect encoding from
      <parameter>from-coding</parameter>. When
@@ -1172,7 +1678,9 @@
     </para>
     <para>
      <parameter>vars (3rd and larger)</parameter> is reference to
-     variable to be converted. String, Array and Object are accepted. 
+     variable to be converted. String, Array and Object are accepted.
+     <function>mb_convert_variables</function> assumes all parameters
+     have the same encoding.
     </para>
     <para>
      <example>
@@ -1296,7 +1804,8 @@
      convert.
     </para>
     <para>
-     <parameter>encoding</parameter> is character encoding.
+     <parameter>encoding</parameter> is character encoding. If it is
+     omitted, internal character encoding is used.
     </para>
     <para>
      <example>
@@ -1323,7 +1832,7 @@
    <refnamediv>
     <refname>mb_send_mail</refname>
     <refpurpose>
-     Send mail with ISO-2022-JP character code. (Japanese specific)
+     Send encoded mail.
     </refpurpose>
    </refnamediv>
    <refsect1>
@@ -1344,7 +1853,8 @@
     </funcsynopsis>
     <para>
      <function>mb_send_mail</function> sends email. Headers and
-     message are converted and encoded in ISO-2022-JP.
+     message are converted and encoded according to
+     <function>mb_language</function> setting. 
      <function>mb_send_mail</function> is wrapper
      function of <function>mail</function>. See
      <function>mail</function> for details.
@@ -1361,21 +1871,23 @@
      <parameter>message</parameter> is mail message.
     </para>
     <para>
-     string <parameter>additional_headers</parameter> is inserted at
-     the end of the header. This is typically used to add
-     extra headers.  Multiple extra headers are separated with a
+     <parameter>additional_headers</parameter> is inserted at
+     the end of the header. This is typically used to add extra
+     headers.  Multiple extra headers are separated with a
      newline(\n).
     </para>
     <para>
-     It returns TRUE for success, otherwise it returns FALSE.
+     <parameter>additional_parameter</parameter> is an MTA command line
+     parameter. It is useful when setting the correct Return-Path
+     header when using sendmail.
     </para>
     <para>
-     <parameter>additional_parameter</parameter> is added this
-     data to the call to the mailer by PHP. This is useful when
-     setting the correct Return-Path header when using sendmail.
+     It returns <literal>TRUE</literal> for success, otherwise it
+     returns <literal>FALSE</literal>.
     </para>
     <para>
-     See also: <function>mail</function>.
+     See also: <function>mb_language</function>,
+     <function>mail</function>.
     </para>
    </refsect1>
   </refentry>
