[ 
https://issues.apache.org/jira/browse/RAT-96?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085076#comment-13085076
 ] 

Sebb commented on RAT-96:
-------------------------

Sounds like RAT has some of the code already; but rather than comparing against 
the default encoding, I'm suggesting checking against basic ASCII.

This is because ASCII is a proper subset of many other encodings.
It's a lot harder to mangle ASCII characters inadvertently; it's quite easy to 
mangle characters outside the ASCII range.
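
For illustration, a minimal sketch of the kind of ASCII check I have in mind. This is not existing RAT code; the class name AsciiCheck and the method firstNonAscii are made up for the example. It just reports the offset of the first byte with the high bit set.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of RAT: flags files containing non-ASCII bytes.
final class AsciiCheck {

    /**
     * Returns the zero-based offset of the first byte outside the
     * 7-bit ASCII range (0x00-0x7F), or -1 if every byte is ASCII.
     */
    static int firstNonAscii(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            if ((data[i] & 0x80) != 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as a path to a source file.
        for (String name : args) {
            int offset = firstNonAscii(Files.readAllBytes(Paths.get(name)));
            if (offset >= 0) {
                System.out.println(name + ": non-ASCII byte at offset " + offset);
            }
        }
    }
}
```

Because the check works on raw bytes, it does not need to know which encoding the file was written in.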

BTW, checking against the platform encoding seems wrong - surely the checks 
should be against the actual encoding used by the project?
This is generally ISO-8859-1 or UTF-8. Probably very difficult to determine 
automatically, so it might need to be provided, perhaps with UTF-8 as the default?
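
A rough sketch of what checking against a supplied project encoding could look like, assuming the encoding is passed in and falls back to UTF-8. The class name EncodingCheck and the method isValid are hypothetical, not RAT API. It uses a strict java.nio.charset.CharsetDecoder that reports malformed input instead of replacing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not part of RAT: strict decode against an expected charset.
final class EncodingCheck {

    /** Returns true if the bytes decode cleanly in the expected charset. */
    static boolean isValid(byte[] data, Charset expected) {
        try {
            expected.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // The bytes are not a valid sequence in the expected charset.
            return false;
        }
    }
}
```

Note that this only helps for encodings such as UTF-8 where some byte sequences are invalid. In ISO-8859-1 every byte value decodes to a character, so a strict decode cannot detect mangling there, which is another reason to prefer the plain ASCII check.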

> Check source files for unexpected encodings
> -------------------------------------------
>
>                 Key: RAT-96
>                 URL: https://issues.apache.org/jira/browse/RAT-96
>             Project: RAT
>          Issue Type: New Feature
>            Reporter: Sebb
>
> Idea for possible enhancement:
> Source files with characters in encodings other than ASCII can easily get 
> mangled, so it might be worth offering a tool to report these.
> For example, I have come across Javadoc which uses dashes instead of hyphens, 
> and at some point the encoded dash got corrupted.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
