a little more help understanding diacritical encoding

Steve Pruitt Thu, 25 Sep 2003 10:03:37 -0700

Thanks for the excellent responses.  I now understand how C3 and 89 are derived.  I 
tried getting everything set the way I interpreted what the list responses said to do. 
 The scenario is:
I have a page with some diacritical characters displayed, an input text box, and a 
submit button.  I copy and paste one of the displayed characters into the input box and 
then submit.  What is submitted gets echoed back.  The pages use style sheets, so I cut 
and pasted only the relevant tags, etc.


I thought I had found the problem.  My response had a character encoding of null.  I read 
that null defaults to ISO-8859-1, which seemed consistent with my echoed page.  So, I 
explicitly set the response character encoding to UTF-8 via the setContentType method.

I used a TCP tunneler to see what my request and responses look like.  My browser is 
set to utf-8 also.

From the tunneler, my request had the following posted data:  v904=%C3%89   this is 
correct according to how the UTF-8 encoding algorithm was explained.
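For reference, here is a minimal sketch of how that posted value can be reproduced. It assumes Java 10 or later (for the Charset overload of URLEncoder.encode); the class and method names are made up for illustration and are not part of my application.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class PercentEncodeDemo {
    // Percent-encode a form value using UTF-8, as a browser set to utf-8 would.
    // (Hypothetical helper, for illustration only.)
    static String percentEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00C9 is E-acute (decimal 201, hex C9). In UTF-8 it becomes the
        // two bytes C3 89, which are then percent-escaped.
        System.out.println("v904=" + percentEncode("\u00C9")); // v904=%C3%89
    }
}
```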

The http response had the following:

Content-Type: text/html; charset=UTF-8   this is correct.

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">  is a child of the 
<head> tag

<span class="text29">&#201; &#234; &#235; &#237; &#238; &#239; &#240; &#241; &#243; 
&#244; &#245; &#246;</span>  these are the characters listed on the previous page that I 
cut and pasted from; they are listed on this page just for reference.  #201 (hex C9) is É.

<span class="text17">Accented Characters from&nbsp;&nbsp;previous 
form:&nbsp;&nbsp;&#195;&#137; </span>
this is what is echoed back.  #195 = C3 and #137 = 89.  These, of course, are displayed as Ă?.
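To illustrate where the numbers 195 and 137 could come from, here is a small sketch. It assumes, and this is only an assumption and not something I have confirmed in my application, that the posted bytes are decoded as ISO-8859-1 somewhere on the server side. The class and method names are hypothetical, and the Charset overload of URLDecoder.decode needs Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeMismatchDemo {
    // Decode a percent-encoded form value with the given charset.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    // Render each character as an HTML numeric character reference,
    // the same form that appears in the echoed page.
    static String toNumericRefs(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append("&#").append((int) s.charAt(i)).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String posted = "%C3%89";
        // Each byte treated as its own character: C3 -> 195, 89 -> 137.
        System.out.println(toNumericRefs(decode(posted, StandardCharsets.ISO_8859_1))); // &#195;&#137;
        // Both bytes treated as one UTF-8 sequence: a single character, 201.
        System.out.println(toNumericRefs(decode(posted, StandardCharsets.UTF_8)));      // &#201;
    }
}
```

If this is what is happening, the response headers can be correct and the echoed value will still be wrong, because the damage is done when the request is read rather than when the response is written.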

I checked the browser to be sure, and its encoding is still set to utf-8.  
This is everything I know to check.  What am I missing?
