[ 
https://issues.apache.org/jira/browse/RAT-96?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17899204#comment-17899204
 ] 

ASF subversion and git services commented on RAT-96:
----------------------------------------------------

Commit 0a9559e1d8726ba16c933ec560f7e5d42400360e in creadur-rat's branch 
refs/heads/master from Claude Warren
[ https://gitbox.apache.org/repos/asf?p=creadur-rat.git;h=0a9559e1 ]

RAT-81: Fixed encoding issue causing text files to not be read properly (#395)

* Fixed encoding issue where text files not in UTF-8 encoding would not be 
read properly.

The change adds the charset to the metadata when it can be discovered. If it 
cannot be discovered, UTF-8 is returned.

Added integration test RAT-81 to show that reading UTF-8 and IBM037 encoded 
files works.

* Minor fixes

* RAT-81: Add changelog about encoding bugfix

* added logging and removed dead code

* fix for RAT-96

Added mediaType and encoding attributes to the XML output.
Updated DefaultAnalyserFactoryTests to account for the change.
Added integration tests for RAT-147 and RAT-211 based on code in 
DefaultAnalyserFactoryTests.
Updated ReportTest to add dependencies and the package jar to the classpath 
for the test.
Fixed testing issues in the Ant unit tests caused by the addition of the 
mediaType and encoding attributes.
Renamed reportTest directories to use '_' rather than '-' to comply with 
Java package naming rules.

* RAT-81: groovify the test code, minor fixes

* RAT-81: Add mediaType and encoding to RAT report, minor fixes
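
The fallback rule from the first item above (use the charset when it can be 
discovered, otherwise return UTF-8) might be sketched roughly as follows. 
This is only an illustration under stated assumptions: the class 
CharsetSketch and its methods are hypothetical and are not taken from the 
commit, and byte-order-mark sniffing stands in for whatever detection 
Apache Rat actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of Apache Rat. */
public class CharsetSketch {

    /**
     * Tries to discover the charset from a byte-order mark (BOM).
     * Assumption: BOM sniffing is a stand-in for a real detector.
     * Note that FF FE is also the start of a UTF-32LE BOM; this sketch
     * does not distinguish that case.
     */
    public static Optional<Charset> discover(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            return Optional.of(StandardCharsets.UTF_8);
        }
        if (startsWith(data, 0xFE, 0xFF)) {
            return Optional.of(StandardCharsets.UTF_16BE);
        }
        if (startsWith(data, 0xFF, 0xFE)) {
            return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty();
    }

    /** Returns the discovered charset, or UTF-8 when none can be discovered. */
    public static Charset charsetOrDefault(byte[] data) {
        return discover(data).orElse(StandardCharsets.UTF_8);
    }

    /** Compares the leading bytes of data against the given unsigned values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] plain = "Licensed to the ASF".getBytes(StandardCharsets.US_ASCII);
        byte[] utf16 = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};
        System.out.println(charsetOrDefault(plain)); // no BOM, so the default applies
        System.out.println(charsetOrDefault(utf16)); // BOM identifies the charset
    }
}
```

Returning an Optional from the discovery step keeps "could not be 
discovered" separate from "discovered as UTF-8", so the default is applied 
in exactly one place.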

---------

Co-authored-by: P. Ottlinger <pottlin...@apache.org>
Co-authored-by: P. Ottlinger <ottlin...@users.noreply.github.com>

> Check source files for unexpected encodings
> -------------------------------------------
>
>                 Key: RAT-96
>                 URL: https://issues.apache.org/jira/browse/RAT-96
>             Project: Apache Rat
>          Issue Type: Sub-task
>            Reporter: Sebb
>            Assignee: Claude Warren
>            Priority: Major
>             Fix For: 0.17
>
>
> Idea for possible enhancement:
> Source files with characters in encodings other than ASCII can easily get 
> mangled, so it might be worth offering a tool to report these.
> For example, I have come across Javadoc which uses dashes instead of hyphens, 
> and at some point the encoded dash got corrupted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
