Just to be clear, you are favoring:

Alternative:
1. If the parameter is set, use it regardless of the rest
2. If it is not set, obtain the content type
3. Check whether it contains a charset qualifier; if yes, use that
4. If not, check whether this is an HTML file and pass it to JSoup (do magic)
5. If nothing else works, assume UTF-8
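
For illustration only, the precedence in this alternative could be sketched as below. This is a rough sketch, not code from any plugin: the class name, the method name and the two Optional inputs (the charset already taken from the content type, and whatever the HTML sniffing returned) are assumptions made for the example; only the licenseEncoding parameter name comes from this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical sketch of the "parameter first" order; names are illustrative.
final class AlternativeOrderSketch {

    static Charset resolve(String licenseEncoding,
                           Optional<Charset> fromContentType,
                           Optional<Charset> fromHtmlSniffing) {
        // Step 1: an explicitly configured parameter wins over everything else.
        if (licenseEncoding != null && !licenseEncoding.isEmpty()) {
            return Charset.forName(licenseEncoding);
        }
        // Steps 2-3: charset qualifier of the content type, if there was one.
        // Step 4: result of HTML sniffing (assumed to be done by the caller).
        // Step 5: fall back to UTF-8.
        return fromContentType.orElseGet(
                () -> fromHtmlSniffing.orElse(StandardCharsets.UTF_8));
    }
}
```

Note that Charset.forName throws an IllegalArgumentException subclass for a malformed or unsupported name, so a wrong parameter value would fail loudly here rather than fall through to the later steps.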

Michael

On 2014-11-14 at 19:09, Hervé BOUTEMY wrote:
I prefer the alternative
and if no parameter is set, just keep it stupid simple: assume UTF-8

IMHO, this will give good results and will be easy to explain

anything more complex is harder to maintain and to explain in case magic does
not do what was dreamt of

Regards,

Hervé

On Friday, 14 November 2014 at 18:43:02, Michael Osipov wrote:
On 2014-11-14 at 18:07, Hervé BOUTEMY wrote:
[..]

The parameter won't help if there are several licenses with several
encodings used.

looks like the parameter can be either simple or complex: need a syntax

or just ignore: is it theory or reality?

Pure theory.

My approach would be this:

Provide a license parameter: licenseEncoding

1. Obtain the content type
2. Check whether it contains a charset qualifier; if yes, use that
3. If not, check whether this is an HTML file and pass it to JSoup (do magic)
4. If nothing else can be determined, use the parameter
5. If the parameter is not set, assume UTF-8
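
To make the five steps concrete, here is a minimal sketch using only the JDK. It is an assumption-laden illustration, not an existing implementation: the class and method names are made up, the Content-Type parsing is deliberately simplified (it does not handle whitespace around the "="), and the JSoup step is represented by a sniffedFromHtml argument that the caller would have to fill in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for illustration; not taken from any Maven plugin.
final class LicenseEncodingSketch {

    // Steps 1-2: return the charset named in a Content-Type value,
    // or null if there is no usable charset qualifier.
    static Charset charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            if (part.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = part.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return null; // malformed or unsupported charset name
                }
            }
        }
        return null;
    }

    static boolean isHtml(String contentType) {
        return contentType != null
                && contentType.trim().toLowerCase(Locale.ROOT).startsWith("text/html");
    }

    // sniffedFromHtml stands in for the result of the JSoup step (may be null).
    static Charset resolve(String contentType, Charset sniffedFromHtml, String licenseEncoding) {
        Charset fromHeader = charsetFromContentType(contentType);
        if (fromHeader != null) {
            return fromHeader;                          // step 2
        }
        if (isHtml(contentType) && sniffedFromHtml != null) {
            return sniffedFromHtml;                     // step 3
        }
        if (licenseEncoding != null && !licenseEncoding.isEmpty()) {
            return Charset.forName(licenseEncoding);    // step 4
        }
        return StandardCharsets.UTF_8;                  // step 5
    }
}
```

With this order, a call such as resolve("text/plain", null, "ISO-8859-1") only reaches the parameter because the content type carries no charset qualifier and the file is not HTML.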

Alternative:

1. If the parameter is set, use it regardless of the rest
2. If not, continue with first approach and omit 4

WDYT?

Michael




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@maven.apache.org
For additional commands, e-mail: dev-h...@maven.apache.org

