[
https://issues.apache.org/jira/browse/RAT-96?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085076#comment-13085076
]
Sebb commented on RAT-96:
-------------------------
Sounds like RAT has some of the code already; but rather than comparing against
the default encoding, I'm suggesting checking against basic ASCII.
This is because ASCII is a proper subset of many other encodings.
It's a lot harder to mangle ASCII characters inadvertently; it's quite easy to
mangle characters outside the ASCII range.
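
A rough sketch of what I mean by checking against basic ASCII, in Python for
brevity (RAT itself is Java, and the function name here is made up for
illustration, not anything RAT provides). It just reports every byte above
0x7F with its line and column:

```python
def find_non_ascii(data: bytes):
    """Return (line, column, byte) for each byte outside the ASCII range.

    Hypothetical helper for illustration only; line and column are 1-based
    and the column counts bytes, not characters.
    """
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:  # anything above 0x7F is not basic ASCII
                hits.append((line_no, col, byte))
    return hits


# Javadoc-style comment containing an en dash (U+2013) instead of a hyphen.
sample = "/** Range 1\u20135 */\nint x = 1;\n".encode("utf-8")

for line_no, col, byte in find_non_ascii(sample):
    print(f"line {line_no}, col {col}: byte 0x{byte:02X}")
```

Working on raw bytes means the check does not need to know the file's
encoding at all, which is the main attraction.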
BTW, checking against the platform encoding seems wrong - surely the checks
should be against the actual encoding used by the project?
This is generally ISO-8859-1 or UTF-8. Probably very difficult to determine
automatically, so it might need to be provided, perhaps with UTF-8 as default?
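
To illustrate checking against a supplied project encoding rather than the
platform default, here is a minimal sketch, again in Python and again with a
hypothetical function name. It assumes the encoding is passed in by the user
and falls back to UTF-8:

```python
def decodes_cleanly(data: bytes, encoding: str = "utf-8") -> bool:
    """Return True if data is valid in the given encoding.

    Hypothetical helper; the encoding is supplied by the caller and
    defaults to UTF-8, as suggested above.
    """
    try:
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# 0x96 is an en dash in windows-1252; on its own it is not valid UTF-8.
mangled = b"Range 1\x965"

print(decodes_cleanly(mangled))                # checked as UTF-8
print(decodes_cleanly(mangled, "iso-8859-1"))  # checked as ISO-8859-1
```

One caveat: in Python's ISO-8859-1 codec every byte value decodes to some
character, so a strict decode against it can never fail. A UTF-8 check is
much more discriminating, which is another argument for making it the default.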
> Check source files for unexpected encodings
> -------------------------------------------
>
> Key: RAT-96
> URL: https://issues.apache.org/jira/browse/RAT-96
> Project: RAT
> Issue Type: New Feature
> Reporter: Sebb
>
> Idea for possible enhancement:
> Source files with characters in encodings other than ASCII can easily get
> mangled, so it might be worth offering a tool to report these.
> For example, I have come across Javadoc which uses dashes instead of hyphens,
> and at some point the encoded dash got corrupted.