Re: Help me...

2003-09-07 Thread Skip Tavakkolian
Hi,

As I mentioned in my original posting, I am not a PHP programmer, but
it seems that for keyboard input you WILL have to convert UTF-8 to the
HTML encoding if you want to use your function, for example by calling
a function like htmlentities() or utf8_decode() before calling
utf8_to_int().  The pseudo code looks like this:

if (input is in UTF-8) {    # for example, input is from a Linux keyboard
    $str = htmlentities($str, ENT_COMPAT, 'UTF-8');
}
$intvalue = utf8_to_int($str);
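
A caveat on the pseudo code above: as far as I know, htmlentities() only emits named entities, and the Persian digits have none, so it would probably leave raw UTF-8 digits untouched. A minimal sketch of an alternative, assuming the keyboard input arrives as raw UTF-8, is to translate the UTF-8 byte sequences for U+06F0 through U+06F9 directly with strtr(). The name persian_digits_to_ascii() is hypothetical and not part of the original code.

```php
<?php
// Hypothetical helper: maps raw UTF-8 Persian digits (U+06F0..U+06F9) to ASCII.
function persian_digits_to_ascii($str)
{
    $map = array();
    for ($d = 0; $d <= 9; $d++) {
        // U+06F0 + d is encoded in UTF-8 as the two bytes 0xDB, 0xB0 + d.
        $map["\xDB" . chr(0xB0 + $d)] = (string) $d;
    }
    return strtr($str, $map);
}

// The Persian digits one, two, three as raw UTF-8 bytes.
$input = "\xDB\xB1\xDB\xB2\xDB\xB3";
echo persian_digits_to_ascii($input), "\n"; // prints 123
```

The result is an ASCII string, so it can be passed on to (int) or to utf8_to_int() unchanged.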

I've included the function again; I added extra code for the "is x in set y"
test.  Also note the use of the built-in strtr() function.  These builtins are
written in C and are usually faster than equivalent PHP code.  This code has
not been tested.

function utf8_to_int($str)
{
    $transtbl = array (
        '&#1776;' => '0',
        '&#1777;' => '1',
        '&#1778;' => '2',
        '&#1779;' => '3',
        '&#1780;' => '4',
        '&#1781;' => '5',
        '&#1782;' => '6',
        '&#1783;' => '7',
        '&#1784;' => '8',
        '&#1785;' => '9'
    );

    foreach ($transtbl as $key => $value) {
        # strpos() tests whether $str contains the entity anywhere
        if (strpos($str, $key) !== false) {    # it has an HTML encoded numeric
            $str = strtr($str, $transtbl);     # convert all of them to ASCII [0-9]
            return (int) $str;                 # convert the whole thing to integer
        }
    }
    return (int) $str;
}

One last thing, this function (and your version) has a fatal flaw. Do you
want to guess what it is? 

Hint:  What happens if the string is 'ABCDE'?  What is the difference in the
return value from the function for the strings 'ABCD' and '&#1776;'?
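
One way to read the hint: the (int) cast returns 0 for any string that does not start with a number, so 'ABCD' and '&#1776;' (a genuine zero) both come back as 0, and the caller cannot tell an error from a valid result. A minimal sketch of one possible remedy, assuming it is acceptable to return false for invalid input; utf8_to_int_checked() is a hypothetical name, not part of the original code.

```php
<?php
// Hypothetical variant: returns an int, or false when the input is not purely digits.
function utf8_to_int_checked($str)
{
    $transtbl = array();
    for ($d = 0; $d <= 9; $d++) {
        // Builds '&#1776;' => '0' through '&#1785;' => '9'.
        $transtbl['&#' . (1776 + $d) . ';'] = (string) $d;
    }
    $ascii = strtr($str, $transtbl);

    // Reject empty input and anything other than ASCII digits after translation.
    if (preg_match('/^[0-9]+\z/', $ascii) !== 1) {
        return false;
    }
    return (int) $ascii;
}

var_dump((int) 'ABCD');                          // int(0), same as a real zero
var_dump(utf8_to_int_checked('ABCD'));           // bool(false)
var_dump(utf8_to_int_checked('&#1776;'));        // int(0)
var_dump(utf8_to_int_checked('&#1777;&#1778;')); // int(12)
```

Callers would need to compare the result with === false, since a valid 0 is also falsy in PHP.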

-Fariborz
---BeginMessage---
Hi all.
Mr. Tavakkolian helped me until my function was complete.
This function converts UTF-8 digits to an integer.
$farsi_table_linux = array("NONE",
    "&#1776;",
    "&#1777;",
    "&#1778;",
    "&#1779;",
    "&#1780;",
    "&#1781;",
    "&#1782;",
    "&#1783;",
    "&#1784;",
    "&#1785;");

function search_index_array($str)
{
    global $farsi_table_linux;

    // Index 0 holds "NONE", so a real digit never has index 0.
    for ($i = 0; $i < 11; $i++)
    {
        if ($farsi_table_linux[$i] == $str)
            return $i;
    }
    return FALSE;
}// end of search_index_array

function utf8_to_int($str)
{
    $len = strlen($str);
    $out = "";
    $char = explode(";", $str);
    for ($i = 0; $i < $len; $i++)
    {
        $char[$i] .= ";";    // put back the ";" removed by explode()
        if (search_index_array($char[$i]) != False)
            $out .= search_index_array($char[$i]) - 1;
    }//end of for ($i)
    return $out;
}//end of utf8_to_int
When I call utf8_to_int("&#1776;"."&#1785;"), it returns 09, but when I enter a
number (UTF-8) via the keyboard, the function returns 0.
Please tell me how to get the correct answer when a Persian number is entered via the keyboard.
--regards

_
Thank you for choosing LinuxQuestions.
http://www.linuxquestions.org
___
FarsiWeb mailing list
[EMAIL PROTECTED]
http://lists.sharif.edu/mailman/listinfo/farsiweb
---End Message---


Re: Help.....

2003-09-05 Thread Skip Tavakkolian
The online document below has the information you need.

http://us4.php.net/manual/en/function.utf8-encode.php

The only thing that the code segment you sent seems to do is use
the (int) cast to convert a string to an integer.  It first tries
to match the strings '&#1776;' through '&#1785;' against the
input string.  If one matches, it casts the string to an integer.
PHP uses the standard C library function strtod() to do the cast.
I suspect that PHP converts the HTML encoding back to UTF-8
(probably with the utf8_encode function) before calling strtod(); basically
'&#1776;' becomes 0x6F0, etc.  strtod() then tries to convert the string, paying
attention to the locale, and interprets the Unicode values 0x6F0-0x6F9 as the
digits 0-9.  I don't know why it wouldn't work on all browsers.
 
If you want details, continue reading.

The script is very obfuscated.  Starting with this array:

 <?php
 $farsi_table = array("4758678",
     "38354955555459",  # zero
     "38354955555559",  # one
     "38354955555659",  # two
     "38354955555759",  # three
     "38354955564859",  # four
     "38354955564959",  # five
     "38354955565059",  # six
     "38354955565159",  # seven
     "38354955565259",  # eight
     "38354955565359"   # nine
 );

If you take the string 38354955555459 (the zero element), split it
into two-digit groups 38 35 49 55 55 54 59, and then look up those
ordinal positions in an ASCII table -- the 38th ASCII character is '&',
the 35th is '#', the 49th is '1', the 55th is '7', etc. -- you'll see
that these strings just represent the ASCII values for '& # 1 7 7 6 ;'
in decimal!  Why not just leave those as '&#1776;', etc.?
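
The decoding described above can be sketched in a few lines. This is only an illustration, assuming every character code in the string is exactly two decimal digits (true for the codes 38 through 59 used here); decode_decimal_pairs() is a hypothetical name.

```php
<?php
// Hypothetical helper: turns a string of two-digit decimal ASCII codes into text.
function decode_decimal_pairs($digits)
{
    $out = '';
    foreach (str_split($digits, 2) as $pair) {
        $out .= chr((int) $pair);    // e.g. "38" becomes '&'
    }
    return $out;
}

echo decode_decimal_pairs('38354955555459'), "\n"; // prints &#1776;
echo decode_decimal_pairs('38354955564859'), "\n"; // prints &#1780;
```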

The string '&#1776;', as you all know, is the HTML notation for
specifying a character that can't be typed for whatever reason.  In
this case it is the representation of the Unicode value 0x6F0, which
is EXTENDED ARABIC-INDIC DIGIT ZERO.  The others follow the same
logic, all the way to '&#1785;'.  The function utf8_to_int() is in fact
not a UTF8-to-int conversion, but an HTML-encoding-of-UTF8-to-int
conversion.
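
To make the relationship concrete, here is a small sketch that decodes the entity and prints the code point and the raw UTF-8 bytes. It uses only the built-in html_entity_decode(), dechex() and bin2hex(); the variable names are just for illustration.

```php
<?php
$entity = '&#1776;';

// Decode the numeric character reference into the actual UTF-8 character.
$utf8 = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');

echo dechex(1776), "\n";   // 6f0: the code point, in hex
echo bin2hex($utf8), "\n"; // dbb0: the two bytes a UTF-8 keyboard would send
echo strlen($utf8), "\n";  // 2: the entity is 7 bytes, the character only 2
```

This is why a table keyed on '&#1776;' never matches raw keyboard input: the bytes being compared are different.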

I think the following will do what you want, and it might help avoid
problems with strtod() versions that might not be unicode safe.

function utf8_to_int($str)
{
    $transtbl = array (
        '&#1776;' => '0',
        '&#1777;' => '1',
        '&#1778;' => '2',
        '&#1779;' => '3',
        '&#1780;' => '4',
        '&#1781;' => '5',
        '&#1782;' => '6',
        '&#1783;' => '7',
        '&#1784;' => '8',
        '&#1785;' => '9'
    );

    # Could add the following line to make sure that
    # the string is HTML encoded first
    # $str = htmlentities($str, ENT_COMPAT, 'UTF-8');

    $str = strtr($str, $transtbl);
    return (int) $str;
}

Disclaimer: I have not tested the code above and in fact I've never
written a PHP script before tonight.  I spent a few hours reading the
language manual and looking at the sources tonight.  I can't guarantee
the results :)

-Fariborz
