ID: 11904
User Update by: [EMAIL PROTECTED]
Status: Assigned
Bug Type: Unknown/Other Function
Operating system: Linux 2.4.6 Slackware
PHP Version: 4.0.6
Description: ext/standard/string.c patch for nl2br()

I see ... a newline on Mac is just a carriage return '\r' (#x000D). A newline on Unix is just a line feed '\n' (#x000A). And a newline on DOS/Windows is carriage return plus line feed, "\r\n".

This is more complicated if a file is treated as 16-bit Unicode instead of 8-bit ASCII, and I'm NOT an expert on character encodings. Other systems, like mainframes, use something called NEL (#x0085) to represent end of line.

The general handling of all these sequences is to convert them all to a plain line feed '\n':

"\r\n"   -> '\n'
'\r'     -> '\n'
'\r' NEL -> '\n'
NEL      -> '\n'

In the nl2br() function, before the newline-to-br substitution occurs, a "newline normalization" step can be added to:

1) replace all "\r\n" with '\n' (#xA)
2) replace all '\r' followed by NEL (if Unicode?) with '\n' (#x000A)
3) replace all remaining (Mac?) '\r' with '\n'

Then do the actual nl2br substitution, replacing all '\n' with "\n<br>" (5 characters).

The general standard for a new line is '\n'; too bad not all systems use just that.

Previous Comments:
---------------------------------------------------------------------------
[2001-07-05 12:56:36] [EMAIL PROTECTED]

Before committing it as it is, note that we are now using <br />, not <br>, and the patch seems to ignore that fact. Also, shouldn't it be <br />\r\n instead, for that matter?

Shouldn't we define what 'nl' in 'nl2br()' stands for? \r? \n? \r\n? If just \n, then no patch is needed. If any of the three, then we need to handle \r\n properly, and also handle \r somehow (just a bit more code).

Just a thought... I want to be consistent, so Mac users won't be unhappy, like it happened recently...

---------------------------------------------------------------------------
[2001-07-05 10:41:45] [EMAIL PROTECTED]

I'll commit this after I check the patch.
Derick

---------------------------------------------------------------------------
[2001-07-05 08:56:03] [EMAIL PROTECTED]

I suggest the following patch for string.c to fix the nl2br() function. Instead of replacing "\n" with "<br />\n", I think "\n<br>" is better.

For example, a string might contain "\r\n", and when nl2br() is applied it becomes "\r<br />\n". Having a "\r" by itself, not paired with a "\n", causes problems for me because most programs, AFAIK, expect either just "\n" (Unix systems) or "\r\n" (Microsoft systems) and know how to deal with them. But when a "\r" is by itself, strange things sometimes happen and the file prints corrupted.

With the patch below, "\r\n" becomes "\r\n<br>", which keeps the carriage return and newline pair intact and adds the break. Unix and Windows programs still understand the file, and HTML renderers don't care about formatting.

*** string.c.orig Thu Jul 5 08:41:46 2001
--- string.c Thu Jul 5 08:38:04 2001
***************
*** 2511,2517 ****
  convert_to_string_ex(str);
! php_char_to_str((*str)->value.str.val,(*str)->value.str.len,'\n',"<br />\n",7,return_value);
  }
  /* }}} */
--- 2511,2517 ----
  convert_to_string_ex(str);
! php_char_to_str((*str)->value.str.val,(*str)->value.str.len,'\n',"\n<br>",5,return_value);
  }
  /* }}} */

---------------------------------------------------------------------------

Full Bug description available at: http://bugs.php.net/?id=11904

--
PHP Development Mailing List <http://www.php.net/>
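
For reference, the "newline normalization" steps proposed in the update at the top of this report might be sketched in standalone C roughly as below. This is only an illustration under stated assumptions, not the code in string.c: normalize_newlines() and nl2br_sketch() are hypothetical names, the input is treated as plain 8-bit bytes (so the NEL case in step 2 is left out, since how NEL is encoded depends on the character set), and the output uses "<br />\n" as the second comment asks. Swapping in "\n<br>" would only change the BR_NL constant.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Replacement emitted for every newline; an assumption, see the note above. */
static const char BR_NL[] = "<br />\n";

/* Hypothetical helper (not part of PHP): returns a malloc'd copy of `in`
 * with "\r\n" and lone '\r' both collapsed to '\n'. The caller frees the
 * result. Returns NULL if allocation fails. */
char *normalize_newlines(const char *in)
{
    assert(in != NULL);
    size_t len = strlen(in);
    char *out = malloc(len + 1);      /* output is never longer than input */
    if (out == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\r') {
            /* in[len] is the terminating '\0', so in[i + 1] is always readable */
            if (in[i + 1] == '\n')
                i++;                  /* step 1: "\r\n" counts as one newline */
            out[j++] = '\n';          /* step 3: a lone '\r' also becomes '\n' */
        } else {
            out[j++] = in[i];
        }
    }
    out[j] = '\0';
    return out;
}

/* Hypothetical nl2br variant: normalize first, then replace each '\n'
 * with BR_NL. The caller frees the result. */
char *nl2br_sketch(const char *in)
{
    char *norm = normalize_newlines(in);
    if (norm == NULL)
        return NULL;

    size_t newlines = 0;
    for (const char *p = norm; *p != '\0'; p++)
        if (*p == '\n')
            newlines++;

    size_t br_len = sizeof(BR_NL) - 1;            /* 7 characters */
    char *out = malloc(strlen(norm) + newlines * (br_len - 1) + 1);
    if (out == NULL) {
        free(norm);
        return NULL;
    }

    size_t j = 0;
    for (const char *p = norm; *p != '\0'; p++) {
        if (*p == '\n') {
            memcpy(out + j, BR_NL, br_len);
            j += br_len;
        } else {
            out[j++] = *p;
        }
    }
    out[j] = '\0';
    free(norm);
    return out;
}
```

With this sketch, a string that mixes all three conventions, such as "a\r\nb\rc\nd", should come out with one break per line ending and no stray '\r' left over, which is the case the original patch was worried about.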