On Jan 22, 2009, at 4:50 PM, Hervé BOUTEMY wrote:

Sorry, I was working on other things and missed this discussion.
I just commented on the issue (and closed it as "Not A Bug" :) ).

I agree that autodetecting is not a bullet-proof feature, but an absolute guarantee is not required in this case. I share Jason van Zyl's view: "If it's right most of the time, and it saves the user from having to know or worry about it then yes I would use it." [1]

Another issue is that without autodetection, supporting more than one type of character encoding for the APT files in a Maven project is impossible.
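
To make the idea concrete, here is a minimal sketch of one possible autodetection heuristic: decode strictly as UTF-8 and fall back to ISO-8859-1 only if the bytes are not valid UTF-8. This is not Doxia's code. The class name EncodingSketch and the method decodeWithFallback are hypothetical names chosen for illustration, and the sketch uses java.nio.charset.StandardCharsets, which requires Java 7 or later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Doxia or Maven.
public class EncodingSketch {

    // Decode as UTF-8 when the bytes form valid UTF-8,
    // otherwise fall back to ISO-8859-1.
    static String decodeWithFallback(byte[] bytes) {
        CharsetDecoder utf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return utf8.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume the legacy default encoding.
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        String text = "Herv\u00e9"; // escaped so the source file's own encoding does not matter
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);

        // A fixed ISO-8859-1 default garbles the UTF-8 input.
        String garbled = new String(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(text));                          // false

        // The heuristic recovers the original text from both inputs.
        System.out.println(decodeWithFallback(utf8Bytes).equals(text));    // true
        System.out.println(decodeWithFallback(latin1Bytes).equals(text));  // true
    }
}
```

The order of the attempts matters: every byte sequence is valid ISO-8859-1, so trying it first would never fail. The heuristic is also not bullet-proof, since an ISO-8859-1 file whose non-ASCII bytes happen to form valid UTF-8 sequences would be misread.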

That said, if autodetection is simply out of the question, let me suggest a different tack. Doxia appears to require ISO-8859-1 for APT files by default. This is a Western-centric encoding that lacks support for Asian languages. It is also no longer maintained. According to Wikipedia:

"The ISO/IEC working group responsible for maintaining eight-bit coded character sets disbanded and ceased all maintenance of ISO 8859, including ISO 8859-1, in order to concentrate on the Universal Character Set and Unicode." [2]

I would also say that, with the increasing popularity of UTF-8, the number of encoding problems users encounter because Doxia favors ISO-8859-1 is already larger than the number that bad autodetection might cause. In other words, autodetection might be wrong some of the time, but for many users, ISO-8859-1 is wrong all of the time.

In light of this, I suggest changing Doxia's APT handling so that it defaults to UTF-8 rather than ISO-8859-1. Not only will this help UTF-8 users (who may be a majority), it will also help increase Maven's acceptance in the Asian world, a trend that is already happening [3].

I can work on a patch for this, if there's a chance it will be accepted.

Trevor

[1] http://www.nabble.com/Re%3A--VOTE--POM-Element-for-Source-File-Encoding-p16566779.html
[2] http://en.wikipedia.org/wiki/ISO_8859-1
[3] http://blogs.sonatype.com/people/2008/07/apache-maven-the-definitive-chinese-guide/