What to do if project.build.sourceEncoding is not provided?

Michael Osipov Thu, 13 Nov 2014 14:15:29 -0800

Hi folks,

I'd like to know if we have a general consensus on this:

I am investigating MPIR-242 and figured out the cause. The input stream is obtained from the HTTP URL and no encoding is given, so ISO-8859-1 is provided as default (yuck!). While I know that some reporting-related modules have default values for input/output encoding, this contradicts our general approach to use platform encoding when project.build.sourceEncoding is not given.

In that special case, the behavior would be consistent if changed. Setting project.build.sourceEncoding to UTF-8 would solve the problem but is just a workaround. HTML resources carry their encoding with them but the ProjectInfoReportUtils treats everything as input streams (not helpful with XML/HTML). I would really like to avoid peeking with a pushback input stream.


What is your opinion on this?

I have two solutions in mind for the issue above:

1. Easy: remove ISO-8859-1, assume platform encoding if project.build.sourceEncoding is not provided.
2. Complex: use an HTML parser (JSoup is awesome and license-compatible [1]) to get correctly encoded content.

But how do you know that this URL really points to an HTML file and not a license.txt? Inspect the content type?
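To make the fallback order concrete, here is a rough sketch in Java. It is only an illustration, not the actual ProjectInfoReportUtils code: the class name EncodingFallback and the method resolveCharset are made up for this mail. It assumes we would first honor a charset parameter in the HTTP Content-Type header if the server sends one, then project.build.sourceEncoding, and only then the platform encoding (option 1). It does not cover the HTML parser route.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper, not part of any Maven plugin.
public class EncodingFallback {

    /**
     * Picks a charset in this order:
     * 1. charset parameter of the Content-Type header, if present and supported
     * 2. the configured source encoding (project.build.sourceEncoding), if set
     * 3. the platform default encoding
     */
    static Charset resolveCharset(String contentType, String sourceEncoding) {
        if (contentType != null) {
            for (String param : contentType.split(";")) {
                String p = param.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: fall through to the next rule.
                    }
                }
            }
        }
        if (sourceEncoding != null && !sourceEncoding.trim().isEmpty()) {
            // Throws if the configured encoding is invalid, which seems right for user config.
            return Charset.forName(sourceEncoding.trim());
        }
        // No ISO-8859-1 default here: platform encoding, as in option 1.
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(resolveCharset("text/html; charset=UTF-8", null));
        System.out.println(resolveCharset("text/plain", "ISO-8859-1"));
        System.out.println(resolveCharset(null, null)); // platform dependent
    }
}
```

The header value would come from something like URLConnection.getContentType(), which may return null. Checking the media type in the same header would also tell an HTML page apart from a plain license.txt.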


[1] http://apache.org/legal/resolved.html#category-a

Michael
