> So for reasons of performance, simplicity, and practicality, I would
> say str_pad should:
>
> 1) of course surrogates must not be broken up
> 2) The pad string can have combining characters.
> 3) The length the user specifies should be a character count.

I presume that "character" above refers to codepoint, and not UChar.

> 4) The string can be truncated to the user's requested character
> length. The string will be trimmed from the right one unicode utf-8
> character (not grapheme, not byte) at a time until the length limit is
> met. (So a combining character is one character for this purpose.)

Shouldn't characters/codepoints be trimmed at both ends, rather than
just at the right end?

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
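
For illustration only, here is a minimal sketch of the codepoint-based padding rule discussed in this thread. It is written in Python rather than PHP because a Python str is already a sequence of code points, so len() and slicing cannot split a surrogate pair. The function name pad_codepoints, its parameters, and the "both" mode (extra pad goes on the right) are assumptions made for this sketch, not part of any proposed str_pad implementation. It also assumes that "truncated" refers to the repeated pad string, which is one possible reading of point 4.

```python
def pad_codepoints(text: str, length: int, pad: str = " ", side: str = "right") -> str:
    """Pad `text` to `length` code points (hypothetical helper, not PHP's API).

    Assumption: "character" means Unicode code point, so a combining mark
    counts as one character and a grapheme cluster may count as several.
    """
    if not pad:
        raise ValueError("pad string must not be empty")

    missing = length - len(text)  # len() counts code points in Python 3
    if missing <= 0:
        return text  # already long enough; the input itself is never cut

    def fill(count: int) -> str:
        # Repeat the pad string, then trim the excess from the right.
        # The slice drops whole code points, never bytes or code units.
        repeats = -(-count // len(pad))  # ceiling division
        return (pad * repeats)[:count]

    if side == "right":
        return text + fill(missing)
    if side == "left":
        return fill(missing) + text
    if side == "both":
        left = missing // 2  # assumption: the odd code point goes right
        return fill(left) + text + fill(missing - left)
    raise ValueError(f"unknown side: {side!r}")
```

With this sketch, pad_codepoints("abc", 6, "xy") gives "abcxyx". A pad string of "o" followed by U+0308 can be cut between the base letter and its combining mark, which is the consequence of counting code points rather than graphemes that point 4 accepts.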
