Hi Perl Gurus,

I am using decode_entities() (from HTML::Entities) and decode_utf8() (from Encode) to decode HTML entities and UTF-8 encoded (Latin) characters respectively. Both functions work as I expect for character codes up to decimal 255, but above that they behave differently. This is the URL I referred to for the list of HTML codes and Latin characters: [http://www.ascii.cl/htmlcodes.htm].
I have attached a sample script. The input values in it are what I got from an XML SOAP response and need to decode (the SOAP response doesn't give the HTML numbers or HTML names shown in the list at the URL above). The script gives me what I expect for $arr_val[0] to $arr_val[4] (i.e. character codes in the range 0-255), but for $arr_val[5] (which contains characters with codes greater than 255) the decoded values are different. I will also need to do the reverse and encode these values.

Here are the array values and their expected decoded values. Decoding fails only for $arr_val[5].

$arr_val[0] = '!"#$%&'()*+,-./ 0123456789:;<=>?' ;
expected decoded values -- !"#$%&'()*+,-./ 0123456789:;<=>?

$arr_val[1] = '@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~' ;
expected decoded values -- @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

$arr_val[2] = '�...@Ã~aÃ~bÃ~cÃ~dÃ~eÃ~fÃ~gÃ~hÃ~iÃ~jÃ~kÃ~lÃ~mÃ~nÃ~oÃ~pÃ~qÃ~rÃ~sÃ~tÃ~uÃ~vÃ~wÃ~xÃ~yÃ~zÃ~[Ã~\Ã~]Ã~^Ã~_' ;
expected decoded values -- ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß

$arr_val[3] = 'Ã| áâãäåæçèéêëìÃîïðñòóôõö÷øùúûüýþÿ' ;
expected decoded values -- àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ

$arr_val[4] = '¡¢£¤¥¦§¨©ª«¬Â®¯°±²³´µ¶·¸¹º»¼½¾¿' ;
expected decoded values -- ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿

$arr_val[5] = 'others Å~RÅ~SÅ| šŸÆ~r...@~sâ~@~t...@~xâ~@~y...@~zâ~@~\...@~]â~@~^â~@| �...@¡â~@¢...@¦' ;
expected decoded values -- others ŒœŠšŸƒ–—‘’‚“”„†‡•…‰€™

Could you please help me understand what I am missing or doing wrong? I'd greatly appreciate the help.

Thanks,
Saravanan Balaji.
#!/ms/dist/perl5/bin/perl5.8 -I ../

use MSDW::Version
    'HTML-Parser' => '3.56',   # HTML::Entities may be used by HTTP::Response
    ;

use Encode;
use strict;
use Data::Dumper;
use HTML::Entities;
use HTML::Entities qw(encode_entities_numeric);

# Sample input values taken from the SOAP response
my @arr_val = ();
$arr_val[0] = '!"#$%&\'()*+,-./ 0123456789:;<=>?' ;
$arr_val[1] = '@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~' ;
$arr_val[2] = '懒旅呐魄壬仕掏蜗醒矣哉肿刭谯茌捱' ;
$arr_val[3] = '噌忏溴骁栝觌祉铒瘃蝮趱鲼��������' ;
$arr_val[4] = '、¥ウЖ┆�����氨渤吹斗腹夯冀究' ;
$arr_val[5] = 'others ������������������' ;

my $bcp_in_file = "/tmp/testbcp.in" ;
my $out_str = "" ;

# Stop here if the output file cannot be opened, rather than printing to a closed handle
open ( TEMP_OUT, '>', $bcp_in_file )    ##REVISIT##
    or die "Error: cannot open the file $bcp_in_file: $!\n";

foreach my $temp_var (@arr_val)
{
    print "\nProcessing value [$temp_var] \n";

    # Replace HTML entities in place
    decode_entities($temp_var) ;
    print "After HTML decode [$temp_var] \n";

    # Turn UTF-8 bytes into Perl characters
    my $temp_var2 = decode_utf8($temp_var);
    print "After UTF8 decode [$temp_var2] \n\n";

    print TEMP_OUT $temp_var2 ;

    #my $temp_var3 = encode_utf8($temp_var2);
    #print "After UTF8 encode [$temp_var3] \n";
    #my $temp_var4 = encode_entities($temp_var3, '"&<>' );
    #print "After HTML encode [$temp_var4] \n";
}

close(TEMP_OUT);

1;
############ End of Script #################
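
A reduced sketch of the decode and re-encode round trip described above, offered as a test case rather than a confirmed fix. Everything in it beyond decode_entities(), encode_entities() and the Encode functions is an assumption: the sample string is hypothetical (the UTF-8 bytes for "Œ", "€" and "À" written as hex escapes so the result does not depend on how the file is saved), the order of the two decode steps is a guess at what behaves consistently, and the binmode() call assumes the terminal expects UTF-8.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);
use HTML::Entities qw(decode_entities encode_entities);

# Hypothetical sample: UTF-8 bytes for U+0152, U+20AC and U+00C0 as hex
# escapes, plus one HTML entity. This is not taken from the SOAP response.
my $raw = "others \xC5\x92\xE2\x82\xAC &amp; \xC3\x80";

# Step 1 (assumed order): UTF-8 bytes -> Perl characters.
my $chars = decode('UTF-8', $raw);

# Step 2: HTML entities -> characters. Called in void context,
# decode_entities() converts its argument in place.
decode_entities($chars);

# List the code points so the result can be checked without relying on
# how the terminal renders the characters.
my @cps = map { sprintf('U+%04X', ord($_)) } split //, $chars;

# Assumption: output should be UTF-8. With an encoding layer on the handle,
# characters below and above 255 are written the same way.
binmode(STDOUT, ':encoding(UTF-8)');
print "$chars\n";
print "@cps\n";

# Reverse direction: escape the markup characters, then encode to UTF-8 bytes.
my $escaped = encode_entities($chars, '<>&"');
my $bytes   = encode('UTF-8', $escaped);

print "round trip ", ($bytes eq $raw ? "matches" : "differs"), "\n";
```

The order here differs from the attached script, which calls decode_entities() on the raw bytes before decode_utf8(). Whether that ordering, or the missing output layer, is what makes the values above 255 come out differently is only a guess.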