Hi *,

After finalizing the analysis of
https://issues.apache.org/jira/browse/RAT-190
it seems that RAT is not explicit enough when it comes to encoding.

CAUSE/BUG BACKGROUND
If mvn is configured to run with a non-UTF-8 default encoding, readers
that are created without an explicit charset decode UTF-8 content
incorrectly, which causes problems when matching it against licenses.
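
To illustrate the cause, here is a minimal sketch that is not part of the
patch. The class name EncodingMismatchDemo and the sample header text are
made up for illustration, ISO-8859-1 merely stands in for whatever non-UTF-8
platform default a build might run with, and it assumes Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of RAT.
final class EncodingMismatchDemo {

    // Decodes the bytes through an InputStreamReader using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample text with one non-ASCII character (the copyright sign).
        String header = "Copyright \u00A9 The Apache Software Foundation";
        byte[] utf8 = header.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in round-trips.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(header));      // prints true

        // Decoding with a different charset turns the copyright sign into two
        // characters, so an exact text comparison no longer succeeds.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(header)); // prints false
    }
}
```

Pure ASCII content decodes identically in both cases, which may be why the
problem only shows up for files containing non-ASCII characters.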

PATCH PROPOSAL
I've gone through parts of the code and passed "UTF-8" explicitly where
readers are created, to make it clear that UTF-8 should be the default.
What do you think of that proposal?

YOUR FEEDBACK WANTED
1) Is it sufficient?
2a) Should we have a RAT configuration option that allows setting the
encoding explicitly, with UTF-8 as the default if nothing is configured?
2b) Or should we just hardcode UTF-8 as the default and not give the
user a chance to set the encoding to use?
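
For option 2a, the lookup could be as small as the following sketch. The
class EncodingOption and its resolve method are hypothetical names for
illustration; RAT has no such option today, and how the value would reach
this code from the Maven plugin or the Ant task is left open.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an existing RAT class.
final class EncodingOption {

    // Returns the configured charset, or UTF-8 when nothing is configured.
    // Charset.forName throws an IllegalArgumentException subclass for
    // illegal or unsupported names, so a typo fails fast instead of
    // silently falling back to the platform default.
    static Charset resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return StandardCharsets.UTF_8;
        }
        return Charset.forName(configured.trim());
    }
}
```

A reader would then be created with
new InputStreamReader(in, EncodingOption.resolve(value)) instead of the
hardcoded "UTF-8" string used in the patch below.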

IMPROVE TESTABILITY?
Since we seem to run with UTF-8 encoding in Jenkins we did not see these
problems before. Does anyone have a good idea on how to test this?
Should a UTF-8 encoded file be analysed while running mvn with
-Dfile.encoding set to something other than UTF-8?
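
One possible shape for such a check is sketched below. It is not taken from
the RAT test suite; the class Utf8ReaderCheck, its method names and the temp
file prefix are made up for illustration, and it assumes Java 8 or later. It
writes a UTF-8 fixture to disk and reads it back the way the patched code
does. The second helper reports whether the current JVM default could expose
a regression at all.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical test helper, not part of RAT.
final class Utf8ReaderCheck {

    // Reads everything the reader provides into a String.
    static String readFully(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[1024];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n);
        }
        return sb.toString();
    }

    // Writes the content as UTF-8 to a temp file and reads it back with an
    // explicit "UTF-8" reader, as in the patched reader() methods.
    static boolean roundTrips(String content) {
        try {
            Path file = Files.createTempFile("rat-encoding", ".txt");
            try {
                Files.write(file, content.getBytes(StandardCharsets.UTF_8));
                try (Reader reader = new InputStreamReader(Files.newInputStream(file), "UTF-8")) {
                    return content.equals(readFully(reader));
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when the JVM default differs from UTF-8, i.e. when code that
    // relies on the default charset would actually misread the fixture.
    static boolean defaultCharsetCanExposeBug() {
        return !StandardCharsets.UTF_8.equals(Charset.defaultCharset());
    }
}
```

With the explicit charset, roundTrips should hold regardless of
file.encoding. Running the build once with, for example,
-Dfile.encoding=ISO-8859-1 would then show whether any remaining reader
still depends on the platform default.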

Cheers & thanks for any opinions :-)

Phil
Index: apache-rat-core/src/main/java/org/apache/rat/document/impl/ArchiveEntryDocument.java
===================================================================
--- apache-rat-core/src/main/java/org/apache/rat/document/impl/ArchiveEntryDocument.java	(revision 1659933)
+++ apache-rat-core/src/main/java/org/apache/rat/document/impl/ArchiveEntryDocument.java	(working copy)
@@ -19,6 +19,10 @@
 
 package org.apache.rat.document.impl;
 
+import org.apache.rat.api.Document;
+import org.apache.rat.api.MetaData;
+import org.apache.rat.api.RatException;
+
 import java.io.ByteArrayInputStream;
 import java.io.File;
 import java.io.IOException;
@@ -26,10 +30,6 @@
 import java.io.InputStreamReader;
 import java.io.Reader;
 
-import org.apache.rat.api.Document;
-import org.apache.rat.api.MetaData;
-import org.apache.rat.api.RatException;
-
 public class ArchiveEntryDocument implements Document {
 
     private byte[] contents;
Index: apache-rat-core/src/main/java/org/apache/rat/document/impl/MonolithicFileDocument.java
===================================================================
--- apache-rat-core/src/main/java/org/apache/rat/document/impl/MonolithicFileDocument.java	(revision 1659933)
+++ apache-rat-core/src/main/java/org/apache/rat/document/impl/MonolithicFileDocument.java	(working copy)
@@ -18,6 +18,8 @@
  */
 package org.apache.rat.document.impl;
 
+import org.apache.rat.api.Document;
+
 import java.io.File;
 import java.io.FileInputStream;
 import java.io.FileReader;
@@ -27,9 +29,7 @@
 import java.io.Reader;
 import java.net.URL;
 
-import org.apache.rat.api.Document;
 
-
 public class MonolithicFileDocument extends AbstractMonolithicDocument {
     private static final String UTF_8 = "UTF-8";
 
Index: apache-rat-plugin/src/main/java/org/apache/rat/mp/FilesReportable.java
===================================================================
--- apache-rat-plugin/src/main/java/org/apache/rat/mp/FilesReportable.java	(revision 1659933)
+++ apache-rat-plugin/src/main/java/org/apache/rat/mp/FilesReportable.java	(working copy)
@@ -86,7 +86,7 @@
         public Reader reader() throws IOException
         {
             final InputStream in = new FileInputStream( file );
-            return new InputStreamReader( in );
+            return new InputStreamReader( in, "UTF-8" );
         }
 
         public String getName()
Index: apache-rat-tasks/src/main/java/org/apache/rat/anttasks/ResourceCollectionContainer.java
===================================================================
--- apache-rat-tasks/src/main/java/org/apache/rat/anttasks/ResourceCollectionContainer.java	(revision 1659933)
+++ apache-rat-tasks/src/main/java/org/apache/rat/anttasks/ResourceCollectionContainer.java	(working copy)
@@ -18,13 +18,6 @@
  */ 
 package org.apache.rat.anttasks;
 
-import java.io.File;
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.InputStreamReader;
-import java.io.Reader;
-import java.util.Iterator;
-
 import org.apache.rat.api.Document;
 import org.apache.rat.api.MetaData;
 import org.apache.rat.api.RatException;
@@ -35,6 +28,13 @@
 import org.apache.tools.ant.types.ResourceCollection;
 import org.apache.tools.ant.types.resources.FileResource;
 
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.io.Reader;
+import java.util.Iterator;
+
 /**
  * Implementation of IReportable that traverses over a resource
  * collection internally.
@@ -68,7 +68,7 @@
         
         public Reader reader() throws IOException {
             final InputStream in = resource.getInputStream();
-            final Reader result = new InputStreamReader(in);
+            final Reader result = new InputStreamReader(in, "UTF-8");
             return result;
         }
 
