Hello,
I am trying to convert HTML character entities to the characters they represent, for example converting &#8221; to " . My string is:

    $str = "&#8220; test &#8221; ניסיון ";

The string also contains Hebrew letters.

1. First I did:

    $str = decode_entities($str);

   The entities are converted correctly, but the Hebrew is corrupted. When I print the value of $str, the Hebrew comes out as ×ס×××

2. Then I decided to write a regular expression that changes only the HTML entities. I wrote:

    $str = "&#8220; test &#8221; ניסיון ";
    $str =~ s/(&#(?=[0-9])*.{2,5};)/decode_entities($1)/ge;

   Even though this should work only on the matched substrings, it seems to affect the Hebrew letters as well. The Hebrew letters again come out as ×ס×××
   Parts 1 and 2 give the same output.

3. To check the regular expression, I removed the 'e' modifier at the end so that I could see what gets substituted. I wrote:

    $str = "&#8220; test &#8221; ניסיון ";
    $str =~ s/(&#(?=[0-9])*.{2,5};)/decode_entities($1)/g;

   The output was:

    decode_entities(&#8220;) test decode_entities(&#8221;) ניסיון

   The Hebrew came out fine, of course.

4. I can do:

    $str =~ s/&#8220;|&#8221;/"/g;

   which doesn't affect the Hebrew and does convert the entities. The problem is that the data contains other HTML entities as well. I would like something more generic that will also work in the future.

Any ideas are welcome!

Shlomit.
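
One direction that may be worth trying, sketched under assumptions rather than confirmed against the real data: if $str holds raw UTF-8 bytes (for example a literal in a script without "use utf8", or input read without an encoding layer), then decode_entities() inserts wide characters into a byte string and the Hebrew bytes can end up double-encoded when printed. The sketch below decodes the bytes to characters first, using the core Encode module, and encodes back to UTF-8 only for output. The variable names and the assumption that the input is UTF-8 are illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use HTML::Entities qw(decode_entities);

# Assumption: the input arrives as raw UTF-8 bytes (no "use utf8" in effect).
# The Hebrew word is spelled out as bytes so this file stays pure ASCII.
my $hebrew_bytes = "\xD7\xA0\xD7\x99\xD7\xA1\xD7\x99\xD7\x95\xD7\x9F";
my $bytes = "&#8220; test &#8221; $hebrew_bytes ";

# Turn the bytes into Perl characters first, then expand the entities.
my $text = decode_entities(decode('UTF-8', $bytes));

# Encode back to UTF-8 bytes only at output time.
my $out = encode('UTF-8', $text);
print $out, "\n";
```

If the string is a literal in the script itself, adding "use utf8;" together with binmode(STDOUT, ':encoding(UTF-8)') may achieve the same effect without the explicit decode and encode calls.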
_______________________________________________
Perl mailing list
Perl@perl.org.il
http://mail.perl.org.il/mailman/listinfo/perl