On 8-Apr-08, at 11:27 AM, Benjamin Bentmann wrote:
Jason van Zyl wrote:
If it's right most of the time, and it saves the user from having to know
or worry about it then yes I would use it.

Could you elaborate on this a little more? Say we start easy and have a build with just about 100 Java source files. Do you suggest peeking at each of
them before passing them to a tool like javac, or at just a subset, and how
should this subset be determined?

It would be reasonable to assume the detection could be based on a subset. For an organization working on one project you could reasonably assume the same encoding throughout. That would not be the case in an open source project, as the tools would vary.
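As a rough illustration of subset-based checking, the sketch below samples a few files and keeps only the candidate charsets under which every sampled file decodes without errors. It is not JChardet and does no statistical detection; it uses strict decoding from the JDK as a stand-in. The class and method names (SubsetEncodingCheck, sample, plausibleCharsets) are hypothetical and exist only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Maven or JChardet.
final class SubsetEncodingCheck {

    /** Returns true if the bytes decode under the charset with no malformed or unmappable input. */
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Picks at most maxSamples items, evenly spaced through the list. */
    static <T> List<T> sample(List<T> items, int maxSamples) {
        List<T> picked = new ArrayList<>();
        if (items.isEmpty() || maxSamples <= 0) {
            return picked;
        }
        int step = Math.max(1, items.size() / maxSamples);
        for (int i = 0; i < items.size() && picked.size() < maxSamples; i += step) {
            picked.add(items.get(i));
        }
        return picked;
    }

    /** Returns the candidates under which every sampled file content decodes cleanly. */
    static List<Charset> plausibleCharsets(List<byte[]> sampledContents, List<Charset> candidates) {
        List<Charset> plausible = new ArrayList<>();
        for (Charset candidate : candidates) {
            boolean allDecode = true;
            for (byte[] content : sampledContents) {
                if (!decodesCleanly(content, candidate)) {
                    allDecode = false;
                    break;
                }
            }
            if (allDecode) {
                plausible.add(candidate);
            }
        }
        return plausible;
    }
}
```

Note that a single-byte charset such as ISO-8859-1 accepts any byte sequence, so this check can rule encodings out but cannot confirm one.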

What should be done when the charset
detection reports different encodings for the set of files to process?

What happens when the encoding is different than what is stated? It's the same problem really: how to deal with the actual versus the declared encoding.

Will
the charset detection happen over and over again for each plugin (javac, javadoc, jxr)? What do you consider "most of the time"? Telling the various ISO-8859 families apart is not really easy. My impression is that usage of JChardet will significantly increase code complexity without giving me a
solid build.
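To make the ISO-8859 point concrete, here is a small sketch showing that the same byte is valid in two members of the family but maps to different characters, so validity alone cannot tell them apart. The class name Iso8859Ambiguity is hypothetical, and the example assumes the running JVM supports the ISO-8859-15 charset (only ISO-8859-1 is guaranteed by the Java specification).

```java
import java.nio.charset.Charset;

// Hypothetical demo class; assumes ISO-8859-15 is available in this JVM.
final class Iso8859Ambiguity {

    /** Decodes the bytes under the named charset and returns the first code point as hex. */
    static String firstCodePointHex(byte[] bytes, String charsetName) {
        String decoded = new String(bytes, Charset.forName(charsetName));
        return Integer.toHexString(decoded.codePointAt(0));
    }
}
```

For the single byte 0xA4, firstCodePointHex returns "a4" (the generic currency sign) under ISO-8859-1 and "20ac" (the euro sign) under ISO-8859-15. Both decodings succeed, so a detector has to guess from context which one the author meant.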

That would depend on what kinds of problems can arise if things are not consistent.



Also, I believe it's a bad idea to free users from worrying about the
encoding.

You have to deal with the very real possibility that no one is going to set it, that users will not know what it is, and that they will report issues related to encoding even if the whole system works.

I'm all for literal and declarative. In practice this does not happen all the time. I also didn't say use one over the other, but the detection may help in cases where it's not stated. The JChardet library was created for a reason, and this looks like one of them.

For the system you are proposing there would be touch points at which you would look for encoding parameters. If those values are not stated you will need a strategy to detect them, or you will never be able to support any encoding alignment in older versions of Maven without the encoding parameterization.

This would be similar to the doubtful magic the JRE provides with
its default encoding: It encourages developers to ignore the encoding issue, leading to platform-dependent behavior. Platform-dependent Java code is a bad practice and Maven, as far as I heard, aims at promoting best practices.
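A minimal sketch of the platform-dependence described above: decoding with the JVM default charset gives results that vary from machine to machine, while passing the charset explicitly gives the same result everywhere. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical demo class contrasting implicit and explicit decoding.
final class ExplicitEncoding {

    /** Platform-dependent: the result varies with the JVM's default charset. */
    static String decodeWithPlatformDefault(byte[] bytes) {
        return new String(bytes, Charset.defaultCharset());
    }

    /** Platform-independent: the caller states the charset, so the result is the same on every machine. */
    static String decodeExplicit(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }
}
```

For the bytes 0xC3 0xA9, decodeExplicit yields "é" under UTF-8 and the two characters "Ã©" under ISO-8859-1, whereas decodeWithPlatformDefault yields one or the other depending on where the build runs.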

Of course it is, but that doesn't negate the fact that people don't necessarily follow best practices. You are going to have to deal with

1) versions of Maven that don't support this encoding parameterization, and
2) the case where it's stated wrong

We should know the combinations of encoding parameters that will work together, and if they aren't stated, or are stated wrong, it's better to provide some fallback instead of just dying.
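One way the fallback idea could look, as a sketch only: trust the declared encoding when it is present, known to the JVM, and actually decodes a sample of the content; otherwise try a list of fallback charsets; otherwise return a last resort instead of failing. None of this is Maven API. The names EncodingFallback and resolve, and the strict-decoding check used in place of real detection, are assumptions for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.List;

// Hypothetical helper, not part of Maven.
final class EncodingFallback {

    /** Returns true if the bytes decode under the charset with no malformed or unmappable input. */
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /**
     * Resolves the charset to use for the sample:
     * 1. the declared charset, if it is stated, supported, and decodes the sample;
     * 2. otherwise the first fallback that decodes the sample;
     * 3. otherwise lastResort, so the build can continue rather than die.
     */
    static Charset resolve(String declaredName, byte[] sample,
                           List<Charset> fallbacks, Charset lastResort) {
        try {
            Charset declared = Charset.forName(declaredName);
            if (decodesCleanly(sample, declared)) {
                return declared;
            }
        } catch (IllegalArgumentException e) {
            // Not stated (null), illegal name, or unsupported charset: fall through.
        }
        for (Charset fallback : fallbacks) {
            if (decodesCleanly(sample, fallback)) {
                return fallback;
            }
        }
        return lastResort;
    }
}
```

A real implementation would presumably also warn the user whenever the declared value is overridden, so that the mismatch gets fixed at the source.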


File encoding is a parameter affecting your build output just like the
source/target settings used for the compiler and hence should be explicitly
controlled.


Absolutely, but look at all the questions on the mailing list that expect many of these things to just be detected. People using Java 1.5 just expect to be able to compile 1.5 code. That's not the case. Users in this case expect the right thing to happen. I'm willing to bet that if you asked the average user about encoding, they would have no clue and would wonder why it wasn't detected.

It was a suggestion based on experience with typical users.



As we talk about it: What is the agreed file encoding for the Maven sources
(MNGSITE-46)?


Benjamin


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Thanks,

Jason

----------------------------------------------------------
Jason van Zyl
Founder,  Apache Maven
jason at sonatype dot com
----------------------------------------------------------

We all have problems. How we deal with them is a measure of our worth.

-- Unknown


