[ http://jira.codehaus.org/browse/MNG-1409?page=comments#action_49947 ] 

Vincent Siveton commented on MNG-1409:
--------------------------------------

This issue currently affects the Japanese translation, and probably other 
East Asian languages (CJK charsets) as well.

- A good starting point is the VM parameter -Dfile.encoding=UTF-8 (add it to 
MAVEN_OPTS).

- Java reads resource bundle streams with the ISO-8859-1 charset.
The PropertyResourceBundle class uses Properties internally, and Properties 
always uses the ISO 8859-1 character encoding to load its input. 
Have a look at the API:
http://java.sun.com/j2se/1.4.2/docs/api/java/util/PropertyResourceBundle.html
http://java.sun.com/j2se/1.4.2/docs/api/java/util/Properties.html
So I propose to fix plexus-i18n and use it instead of the 
ResourceBundle.getBundle() calls (specifically, I think, in the 
maven-project-info-reports-plugin subproject). See plexus-i18n.diff.
Another solution would be to run native2ascii on each bundle, but IMHO the 
result is not really human readable. 
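
To illustrate the point above, here is a minimal sketch of the difference 
between loading a UTF-8 encoded bundle through an InputStream and through a 
Reader with an explicit charset. It is not the content of plexus-i18n.diff. 
The class name, the method names and the "report.title" key are made up for 
the example, and Properties.load(Reader) only exists since Java 6, so this 
does not apply as-is to the 1.4.2 API linked above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not part of Maven or Plexus.
final class BundleEncodingDemo {

    // Properties.load(InputStream) always decodes the bytes as ISO-8859-1.
    static String loadAsLatin1(byte[] raw, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(raw));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    // Properties.load(Reader) lets the caller choose the charset (Java 6+).
    static String loadAsUtf8(byte[] raw, String key) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(raw), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // A one-line bundle with a Japanese value, stored as UTF-8 bytes.
        String expected = "\u6982\u8981";
        byte[] raw = ("report.title=" + expected + "\n")
                .getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 load intact: "
                + expected.equals(loadAsLatin1(raw, "report.title")));
        System.out.println("UTF-8 load intact:      "
                + expected.equals(loadAsUtf8(raw, "report.title")));
    }
}
```

With the stream variant each UTF-8 byte becomes one character, so the value 
comes back garbled; the Reader variant returns the original text.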

- Xpp3DomBuilder in plexus-utils does not seem to handle the encoding 
parameter in the XML header correctly. As a result, the plexus-site-renderer 
component cannot generate a site descriptor that contains special characters.
Have a look at plexus-utils.diff and plexus-site-renderer.diff.
Another issue could be the toString() method of the Xpp3Dom class: we need to 
add a default encoding. See plexus-utils_2.diff.
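
As a rough illustration of what honoring the encoding parameter means, the 
sketch below reads the encoding name from the XML declaration before decoding 
the bytes. It is not what plexus-utils.diff does; the class and method names 
are hypothetical, and it assumes an ASCII-compatible encoding with no byte 
order mark (so it does not cover UTF-16).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of plexus-utils.
final class XmlEncodingSniffer {

    // Matches encoding="..." inside a leading <?xml ... ?> declaration.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml[^>]*\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    // Returns the declared charset, or UTF-8 (the XML default) if none is usable.
    static Charset detect(byte[] xml) {
        int len = Math.min(xml.length, 128);
        // The declaration itself is ASCII, so ISO-8859-1 decodes it safely.
        String head = new String(xml, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(head);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return StandardCharsets.UTF_8;
    }

    // Decodes the whole document with the declared charset.
    static String decode(byte[] xml) {
        return new String(xml, detect(xml));
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<name>caf\u00e9</name>";
        byte[] bytes = doc.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(detect(bytes));
        System.out.println(doc.equals(decode(bytes)));
    }
}
```

A Reader built with the default system encoding skips this step, which is 
why the result then depends on the machine that runs the build.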

- Finally, IMHO, the StringInputStream class in the plexus-utils component is 
not well implemented because it defines no encoding. Maybe we could migrate 
to the StringInputStream class from the Ant project.
http://svn.apache.org/repos/asf/ant/core/trunk/src/main/org/apache/tools/ant/filters/StringInputStream.java
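
For comparison, a minimal sketch of a string-to-stream conversion with an 
explicit encoding, using only the JDK. This is neither the Ant class nor the 
plexus-utils one; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of plexus-utils or Ant.
final class EncodedStringStreams {

    // The caller always states which charset turns characters into bytes.
    static InputStream toStream(String text, Charset charset) {
        return new ByteArrayInputStream(text.getBytes(charset));
    }

    // Reads the stream fully and decodes it with the same charset.
    static String readBack(InputStream in, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String text = "\u6982\u8981";
        InputStream in = toStream(text, StandardCharsets.UTF_8);
        System.out.println(text.equals(readBack(in, StandardCharsets.UTF_8)));
    }
}
```

Because the charset is a parameter on both sides, the round trip does not 
depend on file.encoding or on the platform default.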

Charset problems are hard to debug and depend on several factors. 
Other ideas are welcome.


> Various encoding problems with InputStream and XML
> --------------------------------------------------
>
>          Key: MNG-1409
>          URL: http://jira.codehaus.org/browse/MNG-1409
>      Project: Maven 2
>         Type: Bug
>   Components: maven-site-plugin
>     Reporter: Naoki Nose

>
>
> There are various encoding problems with InputStream and XML in different 
> components.
> - The property resource file is encoded in UTF-8, but Java reads the bundle 
> as ISO-8859-1.
> - In various components, a Reader is constructed with the default system 
> encoding.
> - MXParser ignores the encoding attribute in the XML declaration.


