Hello,

  As shown below, "strlen(utf8_decode())" and the "/u" regex
  modifier do not behave as I expect.

  1) What is my misunderstanding?  

      <?php
      
          $the_string = '&#1052;&#1072;&#1088;&#1080;&#1085;&#1072; &#1054;&#1088;&#1083;&#1086;&#1074;&#1072;';
          echo "<p>author (85 bytes):$the_string," . strlen($the_string) . ','
              . strlen( utf8_decode( $the_string ) ) . ','
              . strlen( utf8_decode( utf8_encode($the_string) ) ) . ',' . "</p>";
          // all the numbers echoed are 85; I expected at least one to be 13

          
          $max_length = 20;
          $is_short = preg_match( '/^.{1,$max_length}$/u', utf8_encode( $the_string ) );
          // I expect the above to return 1
          
          $max_length = 10;
          $is_short = preg_match( '/^.{1,$max_length}$/u', utf8_encode( $the_string ) );
          // I expect the above to return 0
      
      ?>

  More generally, given a string $the_string:

  2) How can I determine which encoding it uses?

  3) How can I determine the number of visible characters?

  4) If it has more than N visible characters, how can I
     truncate it after N visible characters?

  Thanks!


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
