Hi,


As external JavaScript files can be in any charset, one possible solution is to detect the charset dynamically.
Here is an implementation using the Mozilla charset detection library:



<%@ page import="java.io.*" %>
<%@ page import="java.net.*" %>
<%@ page import="org.jahia.utils.fileparsers.CharsetDetection" %>

...

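<%
// Detect the charset of the language file so it can be declared on the script tag.
CharsetDetection charsetDet = new CharsetDetection();
String path = getServletContext().getRealPath("/jsp/jahia/htmleditors/htmlarea-3.0-rc1/lang/" + editorLang + ".js");
String charSet = null; // stays null if detection is not possible
if (path != null) { // getRealPath() can return null, e.g. for an unexpanded WAR
    try {
        URL url = new File(path).toURL();
        charsetDet.charsetDetection(url);
        charSet = charsetDet.getCharset();
    } catch (MalformedURLException e) {
        // Leave charSet null; the script tag is then written without a charset attribute.
    }
}
%>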
<script type="text/javascript" src="<%=request.getContextPath()%>/jsp/jahia/htmleditors/htmlarea-3.0-rc1/lang/<%=editorLang%>.js" <% if (charSet != null){%>charset="<%=charSet%>"<%}%> ></script>



Thanks a lot for all the useful feedback.


Regards, Khue Nguyen



----- Original Message -----
From: "Clemens Düpmeier" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, January 17, 2005 10:28 AM
Subject: Re: Jahia 4.05, HTMLArea does not work in german language



Hi, I just tried converting the German language files of HTMLArea from UTF-8 to ISO-8859-1. Because ISO-8859-1 includes the German umlauts, I stored these characters natively (that is, I did not use \uXXXX escapes for them). Now the HTMLArea editor also shows up in German, and the JavaScript error message in Mozilla about the wrong file encoding has vanished. Note that the default encoding set in my Mozilla is ISO-8859-1, and the HTML file referencing the HTMLArea JavaScript language files (the HTML of the editor popup window) does not specify any encoding. So I think my Mozilla uses ISO-8859-1 here.

So, I searched the web and found the following:

-----------------------
The problem is that your web server is serving JavaScript files with the
header:

  Content-Type: text/javascript

and the embedding web page does not specify an encoding for the
javascript file. So, this does not tell the web browser what character
encoding the file is using. It has to guess. Firefox's guess is "the
same encoding as the HTML page that referenced it".

To fix, either make the server specify a character encoding for JS files
by sending a header such as:

  Content-Type: text/javascript;charset=utf-8

or set the right character encoding in the HTML file.
--------------------------
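
For illustration, here is a minimal Java sketch of the first option (the server naming the charset itself). The class and method names are hypothetical and not part of Jahia; in a servlet or filter the resulting value could be passed to HttpServletResponse.setContentType(), assuming the real encoding of the file is already known.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class JsContentType {

    // Builds the Content-Type header value for a JavaScript file,
    // always naming the charset explicitly so the browser need not guess.
    static String headerValue(Charset charset) {
        return "text/javascript;charset=" + charset.name().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Prints the two header lines relevant to this thread.
        System.out.println("Content-Type: " + headerValue(StandardCharsets.UTF_8));
        System.out.println("Content-Type: " + headerValue(StandardCharsets.ISO_8859_1));
    }
}
```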

So I took a look at how the HTMLArea language JavaScript files are
referenced in the HTML code embedding the HTMLArea editor and found the
following:

--------------
<script type="text/javascript">
  _editor_url = "/jahia/jsp/jahia/htmleditors/htmlarea-3.0-rc1/";
  _editor_lang = "de";

</script>

<script type="text/javascript"
src="/jahia/jsp/jahia/htmleditors/htmlarea-3.0-rc1/htmlarea.js"></script>
<script type="text/javascript"
src="/jahia/jsp/jahia/htmleditors/htmlarea-3.0-rc1/dialog.js"></script>
<script type="text/javascript"
src="/jahia/jsp/jahia/htmleditors/htmlarea-3.0-rc1/popupwin.js"></script>
<script type="text/javascript"
src="/jahia/jsp/jahia/htmleditors/htmlarea-3.0-rc1/lang/de.js"></script>
-----------------

Look at the last line: the script tag doesn't set a character encoding.
I believe the last line should be

<script type="text/javascript" charset="UTF-8"
src="/jahia/jsp/jahia/htmleditors/htmlarea-3.0-rc1/lang/de.js"></script>

if the encoding of de.js is UTF-8. Otherwise the browser assumes the file
is in the same encoding as the HTML page, and since Jahia doesn't
specify any encoding for the HTML page, Mozilla-based browsers use
the default encoding set by the user.

So, to fix this, the question is now: if we want to supply the right
charset information for the JavaScript language files in the
corresponding Jahia JSP page, what is the right way to find out which
charset a file is really encoded in?
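
One simple heuristic, sketched here in plain Java, is to try a strict UTF-8 decode and fall back to a default charset if the bytes are not valid UTF-8. This is only an assumption-laden illustration: the class name, method name and fallback value are hypothetical, and it is not the Mozilla-based detector used elsewhere in this thread. Note that a pure ASCII file is reported as UTF-8, which is harmless because ASCII is a subset of both encodings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CharsetGuess {

    // Returns "UTF-8" if the bytes decode strictly as UTF-8, else the given fallback.
    static String guess(byte[] data, String fallback) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return "UTF-8";
        } catch (CharacterCodingException e) {
            // Invalid UTF-8 byte sequence: assume the caller's default charset.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // The same word with an umlaut, encoded in the two charsets discussed here.
        byte[] utf8 = "K\u00fcrzel".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "K\u00fcrzel".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(guess(utf8, "ISO-8859-1"));   // prints UTF-8
        System.out.println(guess(latin1, "ISO-8859-1")); // prints ISO-8859-1
    }
}
```

In a JSP, the bytes of the language file would be read first and the result written into the charset attribute of the script tag.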

I hope this rather lengthy explanation clears things up for you. So how
should we fix this issue now?

Best regards,
Clemens Düpmeier

Khue Nguyen wrote:
Hi,

Could you please send us the de.js file for debugging?
I just tested with Mozilla and couldn't reproduce this issue.

So, my question is:


An easy fix looks like changing the default encoding of all de.js
language files under the HTMLArea directory to ISO-8859-1, but is this the
right solution? Or should the HTML be modified to supply UTF-8 encoding
for languages where the language files are encoded in UTF-8?


If you change the encoding to ISO-8859-1, non-ASCII characters such as the German umlauts have to be written as Unicode escapes (\uXXXX).
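
As a rough sketch of that escaping step, the following Java helper rewrites every non-ASCII character as a \uXXXX escape, which JavaScript string literals accept regardless of the file encoding. The class and method names are hypothetical and only meant as an illustration, not as a Jahia or HTMLArea utility.

```java
public class UnicodeEscaper {

    // Replaces each non-ASCII char with a backslash-u escape (four hex digits);
    // ASCII characters are copied unchanged.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input contains u-umlaut and sharp s; the output is pure ASCII.
        System.out.println(escapeNonAscii("Gr\u00fc\u00dfe"));
    }
}
```

The escaped file is then plain ASCII, so it is read identically whether the browser assumes UTF-8 or ISO-8859-1.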

Regards,
Khue Nguyen





