Taking this together, one might argue to have UTF-8 the default, not ISO-8859-1.
In general, I completely agree with your preference for Unicode and fail-fast behavior. If I had been involved when the Maven story started, I would have proposed UTF-8 as the default value, no doubt. As for today, I tried to stay consistent with existing behavior. The Maven Site Plugin was already using Latin-1 as the default value for inputEncoding and outputEncoding, so I proposed the same for other plugins, too. Indeed, one of the patches (MJAVADOC-165) was just released, so two plugins already teach users this default value. Therefore I fear it might be too late to introduce another default value. If the community believes this change is worth the confusion it causes for users, I'm the first one running the other way round ;-)
It should be checked whether plugins really die for invalid UTF-8 sequences, and what the output looks like.
That's a good point. It appears we need to do some extra homework here: the simplistic use of InputStreamReader and OutputStreamWriter will silently convert malformed or unmappable input to a default replacement character (e.g. '?', see also [0]). I guess we could nicely hide the required implementation by means of the existing methods in Reader-/WriterFactory from plexus-utils.
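To illustrate the difference, here is a minimal sketch of lenient versus fail-fast decoding using only the JDK. It is not what plexus-utils does today; the class name StrictDecoding and the methods decodeLenient and decodeStrict are made up for this example. The strict variant configures a CharsetDecoder with CodingErrorAction.REPORT so that bad input raises an exception instead of being replaced.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, for illustration only.
class StrictDecoding {

    // Plain InputStreamReader: malformed input is silently replaced.
    static String decodeLenient(byte[] bytes, String encoding) throws IOException {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), encoding);
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            text.append((char) c);
        }
        return text.toString();
    }

    // Decoder set to REPORT: malformed or unmappable input throws
    // a CharacterCodingException instead of being replaced.
    static String decodeStrict(byte[] bytes, String encoding) throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName(encoding).newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception {
        // 'a', then 0xE4 (Latin-1 a-umlaut, not valid UTF-8 on its own), then 'b'
        byte[] latin1Bytes = { 0x61, (byte) 0xE4, 0x62 };
        System.out.println("lenient: " + decodeLenient(latin1Bytes, "UTF-8"));
        try {
            decodeStrict(latin1Bytes, "UTF-8");
        } catch (CharacterCodingException e) {
            System.out.println("strict: rejected with " + e.getClass().getSimpleName());
        }
    }
}
```

A reader built with new InputStreamReader(stream, decoder) would give the same fail-fast behavior for streams, which is roughly what a factory method could hide from plugin code.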
Note that ASCII-only sources will compile cleanly no matter the default encoding
Most of the time, but UTF-16 and EBCDIC do not even have ASCII in common.

Benjamin

[0] http://java.sun.com/javase/6/docs/api/java/io/OutputStreamWriter.html