Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tomcat Wiki" for change 
notification.

The "FAQ/CharacterEncoding" page has been changed by ChristopherSchultz.
http://wiki.apache.org/tomcat/FAQ/CharacterEncoding?action=diff&rev1=9&rev2=10

--------------------------------------------------

  
  If a character encoding is not specified, the Servlet specification requires 
that an encoding of ISO-8859-1 is used. The character encoding for the body of 
an HTTP message (request ''or'' response) is specified in the `Content-Type` 
header field. An example of such a header is `Content-Type: text/html; 
charset=ISO-8859-1` which explicitly states that the default (ISO-8859-1) is 
being used.
  
+ References: 
[[http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1|HTTP 1.1 
Specification, Section 3.7.1]]
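
As a rough illustration of the rule above, the following sketch pulls the `charset` parameter out of a `Content-Type` header value and falls back to ISO-8859-1 when none is given. The class and method names (`ContentTypeCharset`, `charsetOf`) are hypothetical and are not part of Tomcat or the Servlet API. The parsing is deliberately naive: it splits on `;` and does not handle quoted parameter values that contain a semicolon.

```java
import java.util.Locale;

class ContentTypeCharset {
    // Default the Servlet specification requires when no charset is given.
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    // Returns the charset parameter of a Content-Type header value,
    // or the default when the header or the parameter is missing.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return DEFAULT_CHARSET;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }
        return DEFAULT_CHARSET;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=ISO-8859-1"));
        System.out.println(charsetOf("text/html; charset=UTF-8"));
        System.out.println(charsetOf("text/html")); // no charset: default applies
    }
}
```

In a real servlet the container does this work; `ServletRequest.getCharacterEncoding()` reports the encoding named in the request, if any.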
+ 
  <<Anchor(Q2)>>'''How do I change how GET parameters are interpreted?'''
  
  Tomcat will use ISO-8859-1 as the default character encoding of the entire 
URL, including the query string ("GET parameters").
@@ -26, +28 @@

  
   1. Set the `URIEncoding` attribute on the <Connector> element in server.xml 
to something specific (e.g. `URIEncoding="UTF-8"`).
   1. Set the `useBodyEncodingForURI` attribute on the <Connector> element in 
server.xml to `true`. This will cause the Connector to use the request body's 
encoding for GET parameters.
+ 
+ References: [[http://tomcat.apache.org/tomcat-6.0-doc/config/http.html|Tomcat 
6 HTTP Connector]], 
[[http://tomcat.apache.org/tomcat-6.0-doc/config/ajp.html|Tomcat 6 AJP 
Connector]]
  
  <<Anchor(Q3)>>'''How do I change how POST parameters are interpreted?'''
  
@@ -92, +96 @@

  
   1. [[http://jcp.org/aboutJava/communityprocess/mrel/jsr154/index2.html|Java 
Servlet Specification 2.5]]
   1. [[http://jcp.org/aboutJava/communityprocess/final/jsr154/index.html|Java 
Servlet Specification 2.4]]
-  1. [[http://www.w3.org/Protocols/rfc2616/rfc2616.txt|HTTP 1.1 Protocol]]] 
([[http://www.w3.org/Protocols/rfc2616/rfc2616.html|hyperlinked version]])
+  1. [[http://www.w3.org/Protocols/rfc2616/rfc2616.txt|HTTP 1.1 Protocol]] 
([[http://www.w3.org/Protocols/rfc2616/rfc2616.html|hyperlinked version]])
   1. [[http://www.ietf.org/rfc/rfc2396.txt|URI Syntax]]
   1. [[http://www.w3.org/Protocols/rfc822/|ARPA Internet Text Messages]]
   1. [[http://www.w3.org/TR/html4|HTML 4]]
  
+ ''Default encoding for request and response bodies''
+ 
+ See 'Default Encoding for POST' below.
+ 
  ''Default encoding for GET''
  
- The character set for HTTP query strings (that's the technical term for 'GET 
parameters') can be found in sections 2 and 2.1 the "URI Syntax" specification. 
The character set is defined to be 
[[http://en.wikipedia.org/wiki/ASCII|US-ASCII]]. Any character that does not 
map to US-ASCII must be encoded in some way. Section 2.1 of the URI Syntax 
specification says that characters outside of US-ASCII must be encoded using 
`%` escape sequences: each character is encoded as a literal `%` followed by 
the two hexadecimal codes which indicate its character code. Thus, `a` 
(US-ASCII character code 0x97) is equivalent to `%97`.
+ The character set for HTTP query strings (that's the technical term for 'GET 
parameters') can be found in sections 2 and 2.1 of the "URI Syntax" 
specification. The character set is defined to be 
[[http://en.wikipedia.org/wiki/ASCII|US-ASCII]]. Any character that does not 
map to US-ASCII must be encoded in some way. Section 2.1 of the URI Syntax 
specification says that characters outside of US-ASCII must be encoded using 
`%` escape sequences: each character is encoded as a literal `%` followed by 
the two hexadecimal digits of its character code. Thus, `a` 
(US-ASCII character code 0x61, decimal 97) is equivalent to `%61`. There ''is 
no default encoding for URIs'' specified anywhere, which is why there is a lot 
of confusion when it comes to decoding these values.
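
To make the ambiguity concrete, here is a small sketch using the standard `java.net.URLEncoder` and `java.net.URLDecoder` classes. The wrapper class `QueryStringDecoding` and its helper methods are hypothetical names chosen for this example. It shows that `a` is byte 0x61 (decimal 97), so `%61` decodes to `a`, and that the same escaped bytes yield different text depending on which encoding the decoder assumes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class QueryStringDecoding {
    // Decodes %XX escapes (and '+' as space) using the given charset name.
    static String decode(String escaped, String charset) {
        try {
            return URLDecoder.decode(escaped, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Encodes a string as %XX escapes using the given charset name.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // 'a' is 0x61 in US-ASCII, so %61 is an escaped 'a'.
        System.out.println(decode("%61", "ISO-8859-1")); // a

        // U+00E9 (e with acute accent) has no US-ASCII code, so it must be
        // escaped, and the escape depends on the encoding the sender chose.
        String eAcute = "\u00e9";
        System.out.println(encode(eAcute, "ISO-8859-1")); // %E9
        System.out.println(encode(eAcute, "UTF-8"));      // %C3%A9

        // Decoding UTF-8 escapes as ISO-8859-1 yields two characters
        // (U+00C3, U+00A9) instead of one: the classic garbled result.
        String wrong = decode("%C3%A9", "ISO-8859-1");
        System.out.println(wrong.length()); // 2
    }
}
```

The receiver cannot tell from the URI alone which of the two encodings the sender used, which is why the decoding charset has to be agreed on out of band.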
  
  Some notes about the character encoding of URIs:
   1. ISO-8859-1 and ASCII are compatible for character codes 0x20 to 0x7E, so 
they are often used interchangeably. Most of the web uses ISO-8859-1 as the 
default for query strings.

