ID: 11904
User Update by: [EMAIL PROTECTED]
Status: Assigned
Bug Type: Unknown/Other Function
Operating system: Linux 2.4.6 Slackware
PHP Version: 4.0.6
Description: ext/standard/string.c patch for nl2br()
I see ... newline on Mac is just a carriage return '\r'
(#x000D). Newline on Unix is just line feed '\n' (#x000A).
And then newline on Dos/Windows is carriage return line
feed "\r\n". This is more complicated if a file is
treated as 16-bit unicode instead of ascii 8-bit and I'm
NOT an expert on character encodings - other systems like
mainframes use something called NEL (#x0085) to represent
endofline. The general handling of all these sequences is
to convert them all to a plain line feed '\n'.
"\r\n" -> '\n'
'\r' -> '\n'
'\r'NEL -> '\n'
NEL -> '\n'
In the nl2br() function, before newline 2 br substitution
occurs, a "newline normalization" can be added to:
1) replace all "\r\n" with '\n' (#x000A)
2) replace all '\r' followed by [NEL] (if unicode?) with
'\n' (#x000A)
3) replace all remaining (Mac?) '\r' with '\n'
Then do the actual nl2br sub replacing all '\n' with
"\n<br>" (5 characters).
The general standard for a new line is '\n'; too bad not
all systems use just that.
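
For illustration, the normalization steps above could be sketched in C
roughly as follows. This is only a sketch under assumptions:
normalize_newlines() is a made-up name, not anything in string.c, and it
works on plain 8-bit bytes. NEL is left out on purpose, because its byte
representation depends on the character set in use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of string.c: rewrites "\r\n" and lone
 * '\r' to '\n' in place and returns the new length. NEL is deliberately
 * not handled here. Assumes buf[len] is the string terminator. */
static size_t normalize_newlines(char *buf, size_t len)
{
    size_t in = 0, out = 0;

    assert(buf != NULL);
    while (in < len) {
        if (buf[in] == '\r') {
            buf[out++] = '\n';
            /* Swallow the '\n' of a "\r\n" pair so it yields one newline. */
            if (in + 1 < len && buf[in + 1] == '\n') {
                in++;
            }
        } else {
            buf[out++] = buf[in];
        }
        in++;
    }
    buf[out] = '\0'; /* out <= len, so this stays inside the buffer */
    return out;
}
```

The result never grows, so the rewrite can be done in place before the
actual '\n' substitution runs.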
Previous Comments:
---------------------------------------------------------------------------
[2001-07-05 12:56:36] [EMAIL PROTECTED]
Before committing it as it is, note that we are not using
<br>, but a <br /> now, and the patch seems to ignore that
fact. Also, shouldn't it be <br />\r\n instead for that matter?
Shouldn't we define what 'nl' in 'nl2br()' stands for? \r?
\n? \r\n? If just \n, then no patch needed. If any of the
three, then we need to handle \r\n properly, and also handle
\r somehow (just a bit more code).
Just a thought... I want to be consistent, so mac users
won't be unhappy, like it happened recently...
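
To make the "any of the three" option concrete, here is a rough C sketch
that puts "<br />" in front of each "\r\n", lone '\r', or lone '\n' and
leaves the newline bytes themselves untouched. The function name
br_before_newlines() is invented for this example, and the sketch does
not claim to match how string.c does or should do it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of `in` with "<br />"
 * inserted before each "\r\n", lone '\r', or lone '\n'. The newline
 * bytes are copied through unchanged. The caller frees the result. */
static char *br_before_newlines(const char *in)
{
    static const char br[] = "<br />";
    const size_t br_len = sizeof(br) - 1;
    size_t len, i, o = 0;
    char *out;

    assert(in != NULL);
    len = strlen(in);
    /* Worst case: every byte is a lone newline that gains br_len bytes. */
    out = malloc(len * (br_len + 1) + 1);
    if (out == NULL) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        if (in[i] == '\r' || in[i] == '\n') {
            memcpy(out + o, br, br_len);
            o += br_len;
            /* Keep a "\r\n" pair together behind a single break. */
            if (in[i] == '\r' && i + 1 < len && in[i + 1] == '\n') {
                out[o++] = in[i++];
            }
        }
        out[o++] = in[i];
    }
    out[o] = '\0';
    return out;
}
```

With this approach a "\r\n" pair is never split, and a lone '\r' from a
Mac still gets its break.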
---------------------------------------------------------------------------
[2001-07-05 10:41:45] [EMAIL PROTECTED]
I'll commit this after I check the patch.
Derick
---------------------------------------------------------------------------
[2001-07-05 08:56:03] [EMAIL PROTECTED]
I suggest the following patch for string.c to fix the
nl2br() function.
Instead of replacing "\n" with "<br />\n", I think
"\n<br>" is better. For example, in a string, there might
be a "\r\n" and when nl2br() is applied, it would become
"\r<br />\n". Having a "\r" by itself and not paired with
a "\n" causes problems for me because most programs afaik
expect either just a "\n" (unix systems) or "\r\n"
(microsoft systems) and know how to deal with them. But
when a "\r" is by itself, strange things sometimes happen
and the file will print corrupted. But applying the patch
below, "\r\n" would become "\r\n<br>" which keeps the
carriage return and newline pair intact and adds the
break... unix and windows programs understand the file
still and html renderers don't care about formatting.
*** string.c.orig Thu Jul 5 08:41:46 2001
--- string.c Thu Jul 5 08:38:04 2001
***************
*** 2511,2517 ****
convert_to_string_ex(str);
! php_char_to_str((*str)->value.str.val,(*str)->value.str.len,'\n',"<br />\n",7,return_value);
}
/* }}} */
--- 2511,2517 ----
convert_to_string_ex(str);
! php_char_to_str((*str)->value.str.val,(*str)->value.str.len,'\n',"\n<br>",5,return_value);
}
/* }}} */
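
The difference between the two replacement strings can be seen with a
small standalone C sketch. replace_char() below is a hypothetical
stand-in written for this example only; it is NOT php_char_to_str() and
does not share its signature.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a char-to-string replacement; NOT the real
 * php_char_to_str(). Returns a malloc'ed string; the caller frees it. */
static char *replace_char(const char *in, char from, const char *to)
{
    size_t to_len, count = 0, o = 0;
    const char *p;
    char *out;

    assert(in != NULL && to != NULL);
    to_len = strlen(to);
    /* First pass: count how many characters will be replaced. */
    for (p = in; *p != '\0'; p++) {
        if (*p == from) {
            count++;
        }
    }
    /* Slightly over-allocates (by `count` bytes), which is harmless. */
    out = malloc(strlen(in) + count * to_len + 1);
    if (out == NULL) {
        return NULL;
    }
    /* Second pass: copy, substituting `to` for every `from`. */
    for (p = in; *p != '\0'; p++) {
        if (*p == from) {
            memcpy(out + o, to, to_len);
            o += to_len;
        } else {
            out[o++] = *p;
        }
    }
    out[o] = '\0';
    return out;
}
```

Called on "x\r\ny", replacing '\n' with "<br />\n" gives "x\r<br />\ny"
(the pair is split), while replacing it with "\n<br>" gives
"x\r\n<br>y" (the pair stays together).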
---------------------------------------------------------------------------
Full Bug description available at: http://bugs.php.net/?id=11904