On Friday, 14 November 2014 17:58:44 Michael Osipov wrote:
> On 2014-11-14 at 17:47, Hervé BOUTEMY wrote:
> > since it is the encoding of a downloaded license, it has nothing to do with
> > the encoding of project sources: using ${project.build.sourceEncoding} is IMHO
> > the wrong algorithm (which happens to give good results since a lot of people
> > use UTF-8)
> > 
> > then I'd go either for a parameter for the goal, or JSoup that does the
> > magic to detect effective content encoding
> 
> While this seems sound, what if the resource is plain text and no
> encoding can be deduced?
true: our only bet is a parameter
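
To make the fallback order concrete, here is a minimal sketch, assuming we first look at the charset in the Content-Type header and only then fall back to the goal parameter. LicenseCharsetResolver and resolve() are hypothetical names, not existing plugin code, and this does not cover detection inside the HTML itself:

```java
import java.nio.charset.Charset;
import java.util.Locale;

/** Hypothetical helper, not part of the plugin: picks the charset for a downloaded license. */
final class LicenseCharsetResolver {

    private LicenseCharsetResolver() {
    }

    /**
     * Returns the charset named in a Content-Type header value, or the
     * fallback (for example the goal parameter) when the header names none.
     */
    static Charset resolve(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // Parameters follow the media type, separated by ';'
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            if (part.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = part.substring("charset=".length()).trim();
                // The value may be quoted: charset="utf-8"
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback
                    return fallback;
                }
            }
        }
        return fallback;
    }
}
```

With "text/html" and no charset (the case Michael describes), resolve() returns whatever the parameter says, so a plain-text license is handled the same way.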

> 
> The parameter won't help if there are several licenses with several
> encodings used.
looks like the parameter would need both a simple and a complex form: we need a syntax for that

or we just ignore this case: is it theory or reality?
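
Whatever syntax we pick, I would expect it to boil down to one default encoding plus optional per-license overrides. A rough sketch of that lookup, where LicenseEncodings and forUrl() are made-up names and keying the overrides by license URL is only an assumption:

```java
import java.nio.charset.Charset;
import java.util.Map;

/** Hypothetical holder for a "simple or complex" encoding parameter. */
final class LicenseEncodings {

    private final Charset defaultEncoding;
    private final Map<String, Charset> perLicenseUrl;

    LicenseEncodings(Charset defaultEncoding, Map<String, Charset> perLicenseUrl) {
        this.defaultEncoding = defaultEncoding;
        this.perLicenseUrl = perLicenseUrl;
    }

    /** Returns the override for this license URL if one exists, else the default. */
    Charset forUrl(String licenseUrl) {
        Charset specific = perLicenseUrl.get(licenseUrl);
        return specific != null ? specific : defaultEncoding;
    }
}
```

The simple form is then just an empty override map.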

> 
> > On Friday, 14 November 2014 10:37:22 Michael Osipov wrote:
> >> On 2014-11-14 at 04:02, Kristian Rosenvold wrote:
> >>> Isn't this handled by the content-type headers normally ?
> >> 
> >> No, for two reasons:
> >> 
> >> 1. The current code does not inspect the content type
> >> 2. The server does send text/html but not the encoding used, which is not
> >> necessary because it is declared within the file itself
> >> 
> >> The only option would be to inspect the content type header and make
> >> further assumptions.
> >> 
> >> Michael
> >> 
> >>> 2014-11-13 23:15 GMT+01:00 Michael Osipov <micha...@apache.org>:
> >>>> Hi folks,
> >>>> 
> >>>> I'd like to know if we have a general consensus on this:
> >>>> 
> >>>> I am investigating MPIR-242 and figured out the cause. The input stream is
> >>>> obtained from the HTTP URL and no encoding is given, so ISO-8859-1 is
> >>>> provided as default (yuck!). While I know that some reporting-related
> >>>> modules have default values for input/output encoding, this contradicts our
> >>>> general approach to use platform encoding when project.build.sourceEncoding
> >>>> is not given.
> >>>> 
> >>>> In that special case, the behavior would be consistent if changed. Setting
> >>>> project.build.sourceEncoding to UTF-8 would solve the problem but is just a
> >>>> workaround. HTML resources carry their encoding with them but the
> >>>> ProjectInfoReportUtils treats everything as input streams (not helpful with
> >>>> XML/HTML). I would really like to avoid peeking with a pushback input
> >>>> stream.
> >>>> 
> >>>> What is your opinion on this?
> >>>> 
> >>>> I have two solutions in mind for the issue above:
> >>>> 
> >>>> 1. Easy: remove ISO-8859-1, assume platform encoding if
> >>>> project.build.sourceEncoding is not provided.
> >>>> 2. Complex: use an HTML parser (JSoup is awesome and license-compatible
> >>>> [1]) to get correctly encoded content.
> >>>> But how do you know that this URL really points to an HTML file and not
> >>>> a license.txt? Inspect the content type?
> >>>> 
> >>>> [1] http://apache.org/legal/resolved.html#category-a
> >>>> 
> >>>> Michael
> >>>> 
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: dev-unsubscr...@maven.apache.org
> >>>> For additional commands, e-mail: dev-h...@maven.apache.org