RE: UTF-8 setting for Japanese characters
Thanks David.

The links discussing these UTF-8 issues were very educational indeed. Taking a cue from them, I contacted our web host and asked them to change the server's default setting of charset=ISO-8859-1, which was overriding ALL meta tag charset settings.

Things work now, and UTF-8 is recognized correctly.

Thanks for the information and help.

Best wishes
Praveen

-Original Message-
From: David Crossley [mailto:cross...@apache.org]
Sent: Tuesday, December 15, 2009 7:25 AM
To: user@forrest.apache.org
Subject: Re: UTF-8 setting for Japanese characters

David Crossley wrote:
> Dr. Praveen Bhatia wrote:
> >
> > Clearly, the charset is not getting set to UTF-8 in spite of the settings
> > that I made in forrest.properties, web.xml, forrest.xconf, and sitemap.xmap
> > (xml serializer and html serializer).
> >
> > What settings could I be missing?
>
> Some time in the past we had similar issues for our forrest.a.o site.
>
> See $FORREST_HOME/site-author/content/.htaccess
>
> #
> # FIXME: Do we still need this? See FOR-877
> AddDefaultCharset UTF-8
> #
>
> That issue links to some other issues which might provide
> some background.

Ah, following through from
http://issues.apache.org/jira/browse/FOR-877
to the linked issue:
https://issues.apache.org/bugzilla/show_bug.cgi?id=23421
provides very educational reading on this matter.

-David
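As a rough illustration of why the server default won here: browsers generally give a charset declared in the HTTP Content-Type header priority over a meta tag inside the page. The Python sketch below mimics that precedence under simplifying assumptions. The function names and the sample page are hypothetical, and byte order marks and browser heuristics are ignored.

```python
import re
from email.message import Message
from typing import Optional

# Loose pattern for a charset declared inside a <meta> tag (illustrative only).
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def header_charset(content_type: Optional[str]) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def effective_charset(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Prefer the HTTP header charset; fall back to the meta tag in the page."""
    declared = header_charset(content_type)
    if declared:
        return declared
    match = META_CHARSET.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None


# Hypothetical page that declares UTF-8 in its meta tag.
PAGE = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=UTF-8"></head><body></body></html>'
)

if __name__ == "__main__":
    # Server default present: the meta tag is ignored.
    print(effective_charset("text/html; charset=ISO-8859-1", PAGE))
    # No charset in the header: the meta tag takes effect.
    print(effective_charset("text/html", PAGE))
```

With the server default removed (or changed to UTF-8), the meta tag or the new header value decides the encoding, which matches the behaviour reported above.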
Re: UTF-8 setting for Japanese characters
David Crossley wrote:
> Dr. Praveen Bhatia wrote:
> >
> > Clearly, the charset is not getting set to UTF-8 in spite of the settings
> > that I made in forrest.properties, web.xml, forrest.xconf, and sitemap.xmap
> > (xml serializer and html serializer).
> >
> > What settings could I be missing?
>
> Some time in the past we had similar issues for our forrest.a.o site.
>
> See $FORREST_HOME/site-author/content/.htaccess
>
> #
> # FIXME: Do we still need this? See FOR-877
> AddDefaultCharset UTF-8
> #
>
> That issue links to some other issues which might provide
> some background.

Ah, following through from
http://issues.apache.org/jira/browse/FOR-877
to the linked issue:
https://issues.apache.org/bugzilla/show_bug.cgi?id=23421
provides very educational reading on this matter.

-David
Re: UTF-8 setting for Japanese characters
Dr. Praveen Bhatia wrote:
>
> Clearly, the charset is not getting set to UTF-8 in spite of the settings
> that I made in forrest.properties, web.xml, forrest.xconf, and sitemap.xmap
> (xml serializer and html serializer).
>
> What settings could I be missing?

Some time in the past we had similar issues for our forrest.a.o site.

See $FORREST_HOME/site-author/content/.htaccess

#
# FIXME: Do we still need this? See FOR-877
AddDefaultCharset UTF-8
#

That issue links to some other issues which might provide
some background.

-David
Re: UTF-8 setting for Japanese characters
Dr. Praveen Bhatia wrote:
> Hello,

May I respectfully suggest that in future you start a new mail thread
for each new topic, rather than replying to a discussion from someone else.
This greatly assists people who will later read these mail archives.

-David
RE: UTF-8 setting for Japanese characters
Hello,

Here is some further information that I could gather on this. I wrote a program to read the response header from the server, and this is the result:

Message = GET http://www.sumpurn.com/com.sumpurn.web/index.html HTTP/1.0

HTTP/1.1 200 OK
Date: Mon, 14 Dec 2009 10:30:06 GMT
Server: Apache/2.0.63 (CentOS)
X-Cocoon-Version: 2.2.0-dev
Set-Cookie: JSESSIONID=07213F0EE92415A0E5B8B4D3BCDA0107; Path=/com.sumpurn.web
Content-Length: 9665
Connection: close
Content-Type: text/html; charset=ISO-8859-1

Clearly, the charset is not getting set to UTF-8 in spite of the settings that I made in forrest.properties, web.xml, forrest.xconf, and sitemap.xmap (xml serializer and html serializer).

What settings could I be missing?

Best wishes
Praveen

-Original Message-
From: Dr. Praveen Bhatia [mailto:praveen.bha...@sumpurn.com]
Sent: Monday, December 14, 2009 5:59 PM
To: user@forrest.apache.org
Subject: UTF-8 setting for Japanese characters

Hello,

On my Forrest 0.8 based website, I have configured UTF-8 in order to build a Japanese website. On local Tomcat and Jetty it works fine, showing the Japanese characters correctly. (My machine runs Japanese Vista.)

The problem is that when the site is uploaded to the shared server (Linux with Tomcat/Apache), the browser does not treat the pages as UTF-8 encoded. The Japanese characters display correctly only if the browser encoding is manually set to UTF-8 for EACH page, again and again.

The generated HTML file also has a meta tag declaring the UTF-8 charset, so generation seems to be OK up to this point.

This behavior can be observed on my Forrest website www.sumpurn.com (or www.sumpurn.com/com.sumpurn.web/index.html), where the text is garbled at first but becomes OK once the browser encoding is set to UTF-8 for EACH page. (The characters are entirely Japanese, encoded in UTF-8.)

I followed all the instructions given in Forrest for UTF-8 and the instructions given on the Cocoon website http://cocoon.apache.org/2.2/1366_1_1.html#theory for UTF-8.

However, I am still unable to make it work. My gut feeling is that the Apache server's HTTP header is sending a non-UTF-8 encoding to the browser, and that this needs to be set via Forrest/Cocoon/Apache Tomcat.

Could someone please guide me as to what other settings are required?

Thanks

Best wishes
Praveen
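For anyone wanting to repeat the header check described above, here is a minimal Python sketch. It is not the program used in this thread. The helper names are hypothetical, and it only inspects the Content-Type header of the response.

```python
import sys
from email.message import Message
from typing import Optional
from urllib.request import urlopen


def charset_of(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def served_as_utf8(content_type: str) -> bool:
    """True when the header explicitly declares UTF-8."""
    return charset_of(content_type) in ("utf-8", "utf8")


def fetch_content_type(url: str, timeout: float = 10.0) -> str:
    """Fetch the URL and return the raw Content-Type response header."""
    with urlopen(url, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_charset.py http://example.org/index.html
    value = fetch_content_type(sys.argv[1])
    verdict = "UTF-8" if served_as_utf8(value) else "not UTF-8"
    print(f"Content-Type: {value} -> {verdict}")
```

A header such as "text/html; charset=ISO-8859-1" is reported as not UTF-8, which is the situation shown in the response dump above.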