On 2014-11-14 at 04:02, Kristian Rosenvold wrote:
Isn't this normally handled by the Content-Type headers?
No, for two reasons:
1. The current code does not inspect the content type.
2. The server does send text/html, but not the encoding used. That is not
necessary, because the encoding is declared within the file itself.
The only option would be to inspect the Content-Type header and make
further assumptions.
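
To make "inspect the content type header" concrete, here is a minimal sketch of pulling a charset parameter out of a Content-Type value. The class and method names are hypothetical and not taken from ProjectInfoReportUtils. When the server sends plain text/html, as in this case, the result is empty and a fallback is still needed.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, for illustration only.
class ContentTypeCharset {
    /** Returns the charset named in a Content-Type header value, if any. */
    static Optional<Charset> charsetOf(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // Parameters follow the media type, separated by ';'.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // The value may be quoted, e.g. charset="utf-8".
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Optional.of(Charset.forName(name));
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: treat as "not given".
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```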
Michael
2014-11-13 23:15 GMT+01:00 Michael Osipov <[email protected]>:
Hi folks,
I'd like to know if we have a general consensus on this:
I am investigating MPIR-242 and figured out the cause. The input stream is
obtained from the HTTP URL and no encoding is given, so ISO-8859-1 is
provided as default (yuck!). While I know that some reporting related
modules have default values for input/output encoding, this contradicts our
general approach to use platform encoding when project.build.sourceEncoding
is not given.
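
For illustration, assuming the remote resource is actually UTF-8, the effect of the hard-coded ISO-8859-1 default can be reproduced with the JDK alone. The class name below is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class MojibakeDemo {
    /** Encodes text as UTF-8, then decodes those bytes as ISO-8859-1. */
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00FC is two bytes in UTF-8 (0xC3 0xBC); ISO-8859-1 maps each
        // byte to its own character, so one character turns into two.
        String garbled = misdecode("\u00fc");
        System.out.println(garbled.equals("\u00c3\u00bc")); // prints true
    }
}
```

Pure ASCII content survives this round trip unchanged, which is why the problem only shows up with non-ASCII characters.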
In that special case, the behavior would be consistent if changed. Setting
project.build.sourceEncoding to UTF-8 would solve the problem but is just a
workaround. HTML resources carry their encoding with them but the
ProjectInfoReportUtils treats everything as input streams (not helpful with
XML/HTML). I would really like to avoid peeking with a pushback input
stream.
What is your opinion on this?
I have two solutions in mind for the issue above:
1. Easy: remove ISO-8859-1, assume platform encoding if
project.build.sourceEncoding is not provided.
2. Complex: use an HTML parser (JSoup is awesome and license-compatible [1])
to get correctly encoded content.
But how do you know that this URL really points to an HTML file and not a
license.txt? Inspect the content type?
[1] http://apache.org/legal/resolved.html#category-a
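
As a rough sketch of option 1, plus the content-type check raised in the question above: the helper names are hypothetical, and the set of media types treated as HTML is an assumption. Option 2 would hand such resources to an HTML parser instead, which is not shown here.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper, for illustration only.
class EncodingFallback {
    /** Option 1: the configured encoding if given, otherwise the platform default. */
    static Charset resolve(String sourceEncoding) {
        if (sourceEncoding == null || sourceEncoding.trim().isEmpty()) {
            return Charset.defaultCharset();
        }
        return Charset.forName(sourceEncoding.trim());
    }

    /** Rough check whether a Content-Type value names an HTML media type. */
    static boolean looksLikeHtml(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop any parameters such as "; charset=...".
        int semi = contentType.indexOf(';');
        String mediaType = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                .trim()
                .toLowerCase(Locale.ROOT);
        // Assumption: only these two media types count as HTML.
        return mediaType.equals("text/html") || mediaType.equals("application/xhtml+xml");
    }
}
```

Here resolve(null) falls back to the platform encoding, and looksLikeHtml("text/plain") is false, so a license.txt would be read as plain text with the resolved encoding.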
Michael
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]