Re: Help me...
Hi,

As I mentioned in my original posting, I am not a PHP programmer. But it seems that for keyboard input you WILL have to convert UTF-8 to HTML encoding if you want to use your function, for example by calling a function like htmlentities() or utf8_decode() before calling utf8_to_int(). The pseudo code looks like this:

    if (input is in UTF-8) {    # for example, input is from a Linux keyboard
        $str = htmlentities($str, ENT_COMPAT, 'UTF-8');
    }
    $intvalue = utf8_to_int($str);

I've included the function again; I added extra code for the "x in set y" test, i.e. to check whether the string contains any of the HTML-encoded digits. Also note the use of the built-in strtr() function. Builtins are written in C and are usually faster than an equivalent loop in PHP. This code has not been tested.

    function utf8_to_int($str)
    {
        $transtbl = array(
            '&#1776;' => '0',
            '&#1777;' => '1',
            '&#1778;' => '2',
            '&#1779;' => '3',
            '&#1780;' => '4',
            '&#1781;' => '5',
            '&#1782;' => '6',
            '&#1783;' => '7',
            '&#1784;' => '8',
            '&#1785;' => '9'
        );
        foreach ($transtbl as $key => $value) {
            if (strpos($str, $key) !== false) {  # it has an HTML-encoded numeric
                $str = strtr($str, $transtbl);   # convert all of them to ASCII [0-9]
                return (int) $str;               # convert the whole thing to integer
            }
        }
        return (int) $str;
    }

One last thing: this function (and your version) has a fatal flaw. Do you want to guess what it is? Hint: What happens if the string is ABCDE? What is the difference in the return value from the function for the strings ABCD and &#1776;?

-Fariborz

---BeginMessage---
Hi dears.

Mr. Tavakkolian was helping me until my function was completed. This function converts UTF-8 digits to an integer.

    $farsi_table_linux = array("NONE", "&#1776;", "&#1777;", "&#1778;", "&#1779;", "&#1780;",
                               "&#1781;", "&#1782;", "&#1783;", "&#1784;", "&#1785;");

    function search_index_array($str)
    {
        global $farsi_table;
        for ($i = 0; $i < 11; $i++) {
            if ($farsi_table[$i] == $str)
                return $i;
        }
        return FALSE;
    } // end of search_index_array

    function utf8_to_int($str)
    {
        $len = strlen($str);
        $out = "";
        $char = explode(";", $str);
        for ($i = 0; $i < $len; $i++) {
            $char[$i] .= ";";
            if (search_index_array($char[$i]) != False)
                $out .= search_index_array($char[$i]) - 1;
        } // end of for ($i)
        return $out;
    } // end of utf8_to_int

When I call utf8_to_int("&#1776;"."&#1785;"), it returns 09. But when I enter a UTF-8 number via the keyboard, the function returns 0. Please tell me how to get the correct answer when I enter a number via a Persian keyboard.

--regards
___
FarsiWeb mailing list [EMAIL PROTECTED] http://lists.sharif.edu/mailman/listinfo/farsiweb
---End Message---
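For anyone following the hint in the reply above: the (int) cast yields 0 for any string that does not start with a number, so "ABCD" and "&#1776;" become indistinguishable once converted. A minimal sketch of one way around that, assuming it is acceptable to return null for input that is not made up entirely of digits. The name entity_digits_to_int_or_null() is hypothetical and does not appear anywhere in the thread.

```php
<?php
// Hypothetical helper (not from the thread): like utf8_to_int(), but returns
// null instead of 0 when the input contains anything other than digits.
function entity_digits_to_int_or_null($str)
{
    // Build the same '&#1776;'..'&#1785;' => '0'..'9' table used in the thread.
    $transtbl = array();
    for ($d = 0; $d <= 9; $d++) {
        $transtbl['&#' . (1776 + $d) . ';'] = (string) $d;
    }

    $ascii = strtr($str, $transtbl);

    // Reject empty input and anything that is not purely ASCII digits.
    if ($ascii === '' || strspn($ascii, '0123456789') !== strlen($ascii)) {
        return null;
    }
    return (int) $ascii;
}

var_dump((int) 'ABCD');                             // int(0): the plain cast hides the error
var_dump(entity_digits_to_int_or_null('ABCD'));     // NULL
var_dump(entity_digits_to_int_or_null('&#1776;'));  // int(0)
```

Whether null, false, or an exception is the right signal depends on the caller; the point is only that the error case needs a value that a valid zero cannot produce.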
Re: Help.....
The online document below has the information you need.

http://us4.php.net/manual/en/function.utf8-encode.php

The only thing that the code segment you sent seems to do is use the (int) cast to convert a string to an integer. It first tries to match the strings '&#1776;' through '&#1785;' against the input string. If one matches, it casts the string to an integer. PHP uses the standard C library function strtod() to do the cast. I suspect that PHP converts the HTML encoding back to UTF-8 encoding (probably with the utf8_encode function) before calling strtod(); basically '&#1776;' becomes 0x6F0, etc. strtod() then tries to convert the string, paying attention to the locale, and interprets the Unicode values 0x6F0-0x6F9 as the digits 0-9. I don't know why it wouldn't work on all browsers.

If you want details, continue reading.

The script is very obfuscated. Starting with this array:

    <?php
    $farsi_table = array("4758678",
        "38354955555459",   # Zero
        "38354955555559",   # one
        "38354955555659",   # two
        "38354955555759",   # three
        "38354955564859",   # four
        "38354955564959",   # five
        "38354955565059",   # six
        "38354955565159",   # seven
        "38354955565259",   # eight
        "38354955565359"    # nine
    );

If you take the string 38354955555459 (the #Zero element), split it into the two-character sequences 38 35 49 55 55 54 59, and then look up those ordinal positions in an ASCII table (the 38th ASCII character is '&', the 35th is '#', the 49th is '1', the 55th is '7', etc.), you'll see that these strings just represent the ASCII values for '& # 1 7 7 6 ;' in decimal! Why not just leave those as &#1776;, etc.?

The string '&#1776;', as you all know, is the HTML notation for specifying a character that can't be typed for whatever reason. In this case it represents the Unicode value 0x6F0, which is EXTENDED ARABIC-INDIC DIGIT ZERO. The others follow the same logic, all the way to &#1785;. The function utf8_to_int() is in fact not a UTF-8-to-int conversion, but an HTML-encoding-of-UTF-8-to-int conversion.
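The decoding step described above can be sketched in a few lines. This is only an illustration, assuming every character code in the obfuscated string is exactly two decimal digits (true for '&', '#', ';' and the ASCII digits); decode_decimal_ascii() is a hypothetical name, not something from the original script.

```php
<?php
// Hypothetical helper: split a string of decimal ASCII codes into two-digit
// groups and map each group back to its character.
function decode_decimal_ascii($digits)
{
    $out = '';
    foreach (str_split($digits, 2) as $code) {
        $out .= chr((int) $code);
    }
    return $out;
}

// 38 35 49 55 55 54 59  ->  & # 1 7 7 6 ;
echo decode_decimal_ascii('38354955555459'), "\n";   // prints &#1776;

// Decoding the entity itself gives the UTF-8 bytes of U+06F0.
echo bin2hex(html_entity_decode('&#1776;', ENT_QUOTES, 'UTF-8')), "\n";   // prints dbb0
```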
I think the following will do what you want, and it might help avoid problems with strtod() versions that might not be Unicode safe.

    function utf8_to_int($str)
    {
        $transtbl = array(
            '&#1776;' => '0',
            '&#1777;' => '1',
            '&#1778;' => '2',
            '&#1779;' => '3',
            '&#1780;' => '4',
            '&#1781;' => '5',
            '&#1782;' => '6',
            '&#1783;' => '7',
            '&#1784;' => '8',
            '&#1785;' => '9'
        );
        # Could add the following line to make sure that
        # the string is HTML encoded first
        # $str = htmlentities($str, ENT_COMPAT, 'UTF-8');
        $str = strtr($str, $transtbl);
        return (int) $str;
    }

Disclaimer: I have not tested the code above, and in fact I've never written a PHP script before tonight. I spent a few hours reading the language manual and looking at the sources tonight. I can't guarantee the results :)

-Fariborz
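A possible variation on the function above, for input that arrives as raw UTF-8 (for example typed on a Persian keyboard) rather than as HTML entities. It is a sketch under one assumption: that the digits are the Extended Arabic-Indic digits U+06F0 to U+06F9, which encode in UTF-8 as the bytes DB B0 to DB B9. As far as I know, htmlentities() only emits named entities, so it would leave these digits untouched; mapping the bytes directly with strtr() sidesteps that step. The name persian_digits_to_int() is hypothetical.

```php
<?php
// Hypothetical variant of utf8_to_int(): accepts both raw UTF-8 Persian
// digits and their '&#1776;'..'&#1785;' HTML-entity forms.
function persian_digits_to_int($str)
{
    $transtbl = array();
    for ($d = 0; $d <= 9; $d++) {
        // Raw UTF-8 bytes of U+06F0 + $d, i.e. 0xDB followed by 0xB0 + $d.
        $transtbl["\xDB" . chr(0xB0 + $d)] = (string) $d;
        // The HTML numeric entity for the same character.
        $transtbl['&#' . (1776 + $d) . ';'] = (string) $d;
    }
    return (int) strtr($str, $transtbl);
}

var_dump(persian_digits_to_int("\xDB\xB1\xDB\xB2\xDB\xB3"));   // int(123)
var_dump(persian_digits_to_int('&#1777;&#1778;'));             // int(12)
```

Like the original, this still returns 0 for input that contains no digits at all, so it inherits the same ambiguity between an invalid string and a genuine zero.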
Re: [farsiweb]unicode fields in database
On Sat, November 9, 2002, [EMAIL PROTECTED] wrote:

Still, I'm curious. How come everyone on this discussion board is using the Latin alphabet? :-)

Well, most discussions here are also in English. Should everyone in Iran switch over to English? Going back to the original point, would you explain how switching to the Latin alphabet solves the Farsi text sorting problem in a database?