jeroen Wed May 23 15:13:58 2001 EDT Modified files: /phpdoc/en/language types.xml Log: updated the strings section a bit.
Index: phpdoc/en/language/types.xml diff -u phpdoc/en/language/types.xml:1.27 phpdoc/en/language/types.xml:1.28 --- phpdoc/en/language/types.xml:1.27 Tue May 22 16:18:54 2001 +++ phpdoc/en/language/types.xml Wed May 23 15:13:58 2001 @@ -478,107 +478,216 @@ <sect1 id="language.types.string"> <title>Strings</title> <para> - Strings can be specified using one of two sets of delimiters. + A <type>string</type> is series of characters. In PHP, + a character is the same as a byte, that is, there are exactly + 256 different characters possible. This also implies that PHP + has no native support of Unicode. + <!-- how about unicode? will we support that eventually? Are + there current any ways to work with unicode? + --> </para> - <para> - If the string is enclosed in double-quotes ("), variables within - the string will be expanded (subject to some parsing - limitations). As in C and Perl, the backslash ("\") character can - be used in specifying special characters: - <table> - <title>Escaped characters</title> - <tgroup cols="2"> - <thead> - <row> - <entry>sequence</entry> - <entry>meaning</entry> - </row> - </thead> - <tbody> - <row> - <entry><literal>\n</literal></entry> - <entry>linefeed (LF or 0x0A (10) in ASCII)</entry> - </row> - <row> - <entry><literal>\r</literal></entry> - <entry>carriage return (CR or 0x0D (13) in ASCII)</entry> - </row> - <row> - <entry><literal>\t</literal></entry> - <entry>horizontal tab (HT or 0x09 (9) in ASCII)</entry> - </row> - <row> - <entry><literal>\\</literal></entry> - <entry>backslash</entry> - </row> - <row> - <entry><literal>\$</literal></entry> - <entry>dollar sign</entry> - </row> - <row> - <entry><literal>\"</literal></entry> - <entry>double-quote</entry> - </row> - <row> - <entry><literal>\[0-7]{1,3}</literal></entry> - <entry> - the sequence of characters matching the regular - expression is a character in octal notation - </entry> - </row> - <row> - <entry><literal>\x[0-9A-Fa-f]{1,2}</literal></entry> - <entry> - the sequence of characters matching the regular - expression is a character in hexadecimal notation - </entry> - </row> - </tbody> - </tgroup> - </table> - </para> - - <para> - If you attempt to escape any other character, both the backslash - and the character will be output. In PHP 3, a warning will - be issued at the <literal>E_NOTICE</literal> level when this - happens. In PHP 4, no warning is generated. - </para> + <note> + <simpara> + It is no problem for a string to become very large. + There is no practical bound to the size + of strings imposed by PHP, so there is no reason at all + to worry about long strings. + </simpara> + </note> + <sect2 id="language.types.string.syntax"> + <title>Syntax</title> + <para> + A string literal can be specified in three different + ways. + <itemizedlist> + + <listitem> + <simpara> + <link linkend="language.types.string.syntax.single">single quoted</link> + </simpara> + </listitem> + <listitem> + <simpara> + <link linkend="language.types.string.syntax.double">double quoted</link> + </simpara> + </listitem> + <listitem> + <simpara> + <link linkend="language.types.string.syntax.heredoc">heredoc syntax</link> + </simpara> + </listitem> - <para> - The second way to delimit a string uses the single-quote ("'") - character. When a string is enclosed in single quotes, the only - escapes that will be understood are "\\" and "\'". This is for - convenience, so that you can have single-quotes and backslashes in - a single-quoted string. Variables will <emphasis>not</emphasis> be - expanded inside a single-quoted string. - </para> + </itemizedlist> + </para> + <sect3 id="language.types.string.syntax.single"> + <title>Single quoted</title> + <para> + The easiest way to specify a simple string is to + enclose it in single quotes (the character <literal>'</literal>). + </para> + <para> + To specify a literal single + quote, you will need to escape it with a backslash + (<literal>\</literal>), like in many other languages. + If a backslash needs to occur before a single quote or at + the end of the string, you need to double it. + Note that if you try to escape any + other character, the backslash too will be printed! So + usually there is no need to escape the backslash itself. + <note> + <simpara> + In PHP 3, a warning will + be issued at the <literal>E_NOTICE</literal> level when this + happens. + </simpara> + </note> + <note> + <simpara> + Unlike the two other syntaxes, variables will <emphasis>not</emphasis> + be expanded when they occur in single quoted strings. + </simpara> + </note> + <informalexample> + <programlisting role="php"> +echo 'this is a simple string'; +echo 'You can also have embedded newlines in strings, +like this way.'; +echo 'Arnold once said: "I\'ll be back"'; +// output: ... "I'll be back" +echo 'Are you sure you want to delete C:\\*.*?'; +// output: ... delete C:\*.*? +echo 'Are you sure you want to delete C:\*.*?'; +// output: ... delete C:\*.*? +echo 'I am trying to include at this point: \n a newline'; +// output: ... this point: \n a newline + </programlisting> + </informalexample> + </para> + </sect3> + <sect3 id="language.types.string.syntax.double"> + <title>Double quoted</title> + <para> + If the string is enclosed in double-quotes ("), + PHP understands more escape sequences for special + characters: + </para> + <table> + <title>Escaped characters</title> + <tgroup cols="2"> + <thead> + <row> + <entry>sequence</entry> + <entry>meaning</entry> + </row> + </thead> + <tbody> + <row> + <entry><literal>\n</literal></entry> + <entry>linefeed (LF or 0x0A (10) in ASCII)</entry> + </row> + <row> + <entry><literal>\r</literal></entry> + <entry>carriage return (CR or 0x0D (13) in ASCII)</entry> + </row> + <row> + <entry><literal>\t</literal></entry> + <entry>horizontal tab (HT or 0x09 (9) in ASCII)</entry> + </row> + <row> + <entry><literal>\\</literal></entry> + <entry>backslash</entry> + </row> + <row> + <entry><literal>\$</literal></entry> + <entry>dollar sign</entry> + </row> + <row> + <entry><literal>\"</literal></entry> + <entry>double-quote</entry> + </row> + <row> + <entry><literal>\[0-7]{1,3}</literal></entry> + <entry> + the sequence of characters matching the regular + expression is a character in octal notation + </entry> + </row> + <row> + <entry><literal>\x[0-9A-Fa-f]{1,2}</literal></entry> + <entry> + the sequence of characters matching the regular + expression is a character in hexadecimal notation + </entry> + </row> + </tbody> + </tgroup> + </table> + <para> + Again, if you try to escape any other character, the + backspace will be printed too! + </para> + <para> + But the most important pre of double-quoted strings + is the fact that variable names will be expanded. + See <link linkend="language.types.string.parsing">string + parsing</link> for details. + </para> + </sect3> + + <sect3 id="language.types.string.syntax.heredoc"> + <title>Heredoc</title> + <simpara> + Another way to delimit strings is by using here doc syntax + ("<<<"). One should provide an identifier after + <literal><<<</literal>, then the string, and then the + same identifier to close the quotation. + </simpara> + <simpara> + The closing identifier <emphasis>must</emphasis> begin in the + first column of the line. Also, the identifier used must follow + the same naming rules as any other label in PHP: it must contain + only alphanumeric characters and underscores, and must start with + a non-digit character or underscore. + </simpara> + + <warning> + <simpara> + It is very important to note that the line with the closing + identifier contains no other characters, except + <emphasis>possibly</emphasis> a <literal>;</literal>. + That means especially that the identifier + <emphasis>may not be indented</emphasis>, and there + may not be any spaces or tabs after or before the <literal>;</literal>. + </simpara> + <simpara> + Probably the nastiest gotcha is that there may also + not be a carriage return (<literal>\r</literal>) at the end of + the line, only + a form feed, a.k.a. newline (<literal>\n</literal>). + Since Microsoft Windows uses the sequence + <literal>\r\n</literal> as a line + terminator, your heredoc may not work if you write your + script in a windows editor. However, most programming + editors provide a way to save your files with UNIX + line terminator. + <!-- + FTP will sometimes automatically convert \r\n to \n while + transferring your files to your webserver (which + is *nix, of course) + --> + </simpara> + </warning> - <simpara> - Another way to delimit strings is by using here doc syntax - ("<<<"). One should provide an identifier after - <literal><<<</literal>, then the string, and then the - same identifier to close the quotation. - </simpara> - - <simpara> - The closing identifier <emphasis>must</emphasis> begin in the - first column of the line. Also, the identifier used must follow - the same naming rules as any other label in PHP: it must contain - only alphanumeric characters and underscores, and must start with - a non-digit character or underscore. - </simpara> - - <para> - Here doc text behaves just like a double-quoted string, without - the double-quotes. This means that you do not need to escape quotes - in your here docs, but you can still use the escape codes listed - above. Variables are expanded, but the same care must be taken - when expressing complex variables inside a here doc as with - strings. - <example> - <title>Here doc string quoting example</title> - <programlisting> + <para> + Here doc text behaves just like a double-quoted string, without + the double-quotes. This means that you do not need to escape quotes + in your here docs, but you can still use the escape codes listed + above. Variables are expanded, but the same care must be taken + when expressing complex variables inside a here doc as with + strings. + <example> + <title>Here doc string quoting example</title> + <programlisting> <?php $str = <<<EOD Example of string @@ -606,68 +715,22 @@ This should print a capital 'A': \x41 EOT; ?> - </programlisting> - </example> - </para> - - <note> - <para> - Here doc support was added in PHP 4. - </para> - </note> - <para> - Strings may be concatenated using the '.' (dot) operator. Note - that the '+' (addition) operator will not work for this. Please - see <link linkend="language.operators.string">String - operators</link> for more information. - </para> - <para> - Characters within strings may be accessed by treating the string - as a numerically-indexed array of characters, using C-like - syntax. See below for examples. - </para> - <para> - <example> - <title>Some string examples</title> - <programlisting role="php"> -<?php -/* Assigning a string. */ -$str = "This is a string"; - -/* Appending to it. */ -$str = $str . " with some more text"; - -/* Another way to append, includes an escaped newline. */ -$str .= " and a newline at the end.\n"; - -/* This string will end up being '<p>Number: 9</p>' */ -$num = 9; -$str = "<p>Number: $num</p>"; - -/* This one will be '<p>Number: $num</p>' */ -$num = 9; -$str = '<p>Number: $num</p>'; - -/* Get the first character of a string */ -$str = 'This is a test.'; -$first = $str[0]; - -/* Get the last character of a string. */ -$str = 'This is still a test.'; -$last = $str[strlen($str)-1]; -?> - </programlisting> - </example> - </para> - <sect2 id="language.types.string.parsing"> - <title>String parsing</title> - <!-- - I used simpara all over, because I don't know when - to use para. There will also probably some typo's - and misspellings. - --> + </programlisting> + </example> + </para> + + <note> + <para> + Here doc support was added in PHP 4. + </para> + </note> + + </sect3> + <sect3 id="language.types.string.parsing"> + <title>Variable parsing</title> <simpara> - When a string is specified in double quotes, variables are + When a string is specified in double quotes or with + heredoc, variables are parsed within it. </simpara> <simpara> @@ -685,10 +748,10 @@ and can by recognised by the curly braces surrounding the expression. </simpara> - <sect3 id="language.types.string.parsing.simple"> + <sect4 id="language.types.string.parsing.simple"> <title>Simple syntax</title> <simpara> - If a $ is encoutered, the parser will + If a <literal>$</literal> is encoutered, the parser will greedily take as much tokens as possible to form a valid variable name. Enclose the the variable name in curly braces if you want to explicitely specify the end of the @@ -696,10 +759,10 @@ </simpara> <informalexample> <programlisting role="php"> - $beer = 'Heineken'; - echo "$beer's taste is great"; // works, "'" is an invalid character for varnames - echo "He drunk some $beers"; // won't work, 's' is a valid character for varnames - echo "He drunk some ${beer}s"; // works +$beer = 'Heineken'; +echo "$beer's taste is great"; // works, "'" is an invalid character for varnames +echo "He drunk some $beers"; // won't work, 's' is a valid character for varnames +echo "He drunk some ${beer}s"; // works </programlisting> </informalexample> <simpara> @@ -720,29 +783,29 @@ </simpara> <informalexample> <programlisting role="php"> - $fruits = array( 'strawberry' => 'red' , 'banana' => 'yellow' ); - echo "A banana is $fruits[banana]."; - echo "This square is $square->width meters broad."; - echo "This square is $square->width00 centimeters broad."; // won't work, - // for a solution, see the <link linkend="language.types.string.parsing.complex">complex syntax</link>. - - <!-- XXX this won't work: - echo "This square is $square->{width}00 centimeters broad."; - // XXX: php developers: it would be consequent to make this work. - // XXX: like the $obj->{expr} syntax outside a string works, - // XXX: analogously to the ${expr} syntax for variable var's. - --> - +$fruits = array( 'strawberry' => 'red' , 'banana' => 'yellow' ); +echo "A banana is $fruits[banana]."; +echo "This square is $square->width meters broad."; +echo "This square is $square->width00 centimeters broad."; // won't work, + // for a solution, see the <link +linkend="language.types.string.parsing.complex">complex syntax</link>. + +<!-- XXX this won't work: +echo "This square is $square->{width}00 centimeters broad."; +// XXX: php developers: it would be consequent to make this work. +// XXX: like the $obj->{expr} syntax outside a string works, +// XXX: analogously to the ${expr} syntax for variable var's. +--> + </programlisting> </informalexample> <simpara> For anything more complex, you should use the complex syntax. </simpara> - </sect3> - <sect3 id="language.types.string.parsing.complex"> + </sect4> + <sect4 id="language.types.string.parsing.complex"> <title>Complex (curly) syntax</title> <simpara> - I didn't call this complex because the syntax is complex, + This isn't called complex because the syntax is complex, but because you can include complex expressions this way. </simpara> <simpara> @@ -756,29 +819,110 @@ </simpara> <informalexample> <programlisting role="php"> - $great = 'fantastic'; - echo "This is { $great}"; // won't work, outputs: This is { fantastic} - echo "This is {$great}"; // works, outputs: This is fantastic - echo "This square is {$square->width}00 centimeters broad."; - echo "This works: {$arr[4][3]}"; - echo "This is wrong: {$arr[foo][3]}"; // for the same reason - // as $foo[bar] is wrong outside a string. - <!-- XXX see the still-to-write explaination in the arrays-section. --> - echo "You should do it this way: {$arr['foo'][3]}"; - echo "You can even write {$obj->values[3]->name}"; - echo "This is the value of the var named $name: {${$name}}"; - - <!-- <xxx> maybe it's better to leave this out?? --> - // this works, but i disencourage its use, since this is NOT - // involving functions, rather than mere variables, arrays and objects. - $beer = 'Heineken'; - echo "I'd like to have another {${ strrev('reeb') }}, hips"; - <!-- </xxx> --> - +$great = 'fantastic'; +echo "This is { $great}"; // won't work, outputs: This is { fantastic} +echo "This is {$great}"; // works, outputs: This is fantastic +echo "This square is {$square->width}00 centimeters broad."; +echo "This works: {$arr[4][3]}"; +echo "This is wrong: {$arr[foo][3]}"; // for the same reason + // as <link linkend="language.types.array.foo-bar">$foo[bar]</link + > is wrong outside a string. +echo "You should do it this way: {$arr['foo'][3]}"; +echo "You can even write {$obj->values[3]->name}"; +echo "This is the value of the var named $name: {${$name}}"; +<!-- <xxx> maybe it's better to leave this out?? +// this works, but i disencourage its use, since this is NOT +// involving functions, rather than mere variables, arrays and objects. +$beer = 'Heineken'; +echo "I'd like to have another {${ strrev('reeb') }}, hips"; + </xxx> --> </programlisting> </informalexample> - </sect3> - </sect2> + </sect4> + </sect3> + + <sect3 id="language.types.string.substr"> + <title>String access by character</title> + <para> + Characters within strings may be accessed by specifying the + zero-based offset of the desired character after the string + in curly braces. + </para> + <note> + <simpara> + For backwards compatibility, you can still use the array-braces. + However, this syntax is deprecated as of PHP 4. + </simpara> + </note> + <para> + <example> + <title>Some string examples</title> + <programlisting role="php"> +<!-- TODO: either move these examples to a example section, +as with arrays, or distribute them under the applicable +sections. --> +<?php +/* Assigning a string. */ +$str = "This is a string"; + +/* Appending to it. */ +$str = $str . " with some more text"; + +/* Another way to append, includes an escaped newline. */ +$str .= " and a newline at the end.\n"; + +/* This string will end up being '<p>Number: 9</p>' */ +$num = 9; +$str = "<p>Number: $num</p>"; + +/* This one will be '<p>Number: $num</p>' */ +$num = 9; +$str = '<p>Number: $num</p>'; + +/* Get the first character of a string */ +$str = 'This is a test.'; +$first = $str{0}; + +/* Get the last character of a string. */ +$str = 'This is still a test.'; +$last = $str{strlen($str)-1}; +?> + </programlisting> + </example> + </para> + </sect3> + + </sect2><!-- end syntax --> + + <sect2 id="language.types.string.useful-funcs"> + <title>Useful functions</title><!-- and operators --> + <para> + Strings may be concatenated using the '.' (dot) operator. Note + that the '+' (addition) operator will not work for this. Please + see <link linkend="language.operators.string">String + operators</link> for more information. + </para> + <para> + There are a lot of useful functions for string modification. + </para> + <simpara> + See the <link linkend="ref.strings">string functions section</link> + for general functions, the regular expression functions for + advanced find&replacing (in two tastes: + <link linkend="ref.pcre">Perl</link> and + <link linkend="ref.regex">POSIX extended</link>). + </simpara> + <simpara> + There are also <link linkend="ref.url">functions for URL-strings</link>, + and functions to encrypt/decrypt strings + (<link linkend="ref.mcrypt">mcrypt</link> and + <link linkend="ref.mhash">mhash</link>). + </simpara> + <simpara> + Finally, if you still didn't find what you're looking for, + see also the <link linkend="ref.ctype">character type functions</link>. + </simpara> + </sect2> <sect2 id="language.types.string.conversion"> <title>String conversion</title> @@ -832,7 +976,7 @@ </para> </sect2> - </sect1> + </sect1><!-- end string --> <sect1 id="language.types.array"> <title>Arrays</title>