On 12/7/2010 12:13 PM, Thilo Götz wrote:
> The difference between your machine and mine is probably
> that I have UTF-8 as platform encoding, and my guess is
> that RAT is reading in files it thinks are text files
> with the platform encoding. It's a common mistake. You
> can do this when your platform encoding is something
> forgiving like ISO-8859-*, but not UTF-8.
>
> And no, this is an Intel laptop, this has nothing to
> do with byte order.
>
Your analysis sounds correct to me. Do you want to post a Jira to Rat
(https://issues.apache.org/jira/browse/RAT)?

I posted a Jira for UIMA to mark these UTF-16 files for exclusion from RAT
checking.

-Marshall

> --Thilo
>
> On 07/12/10 17:49, Marshall Schor wrote:
>>
>>
>> On 12/7/2010 5:22 AM, Thilo Götz wrote:
>>> I'm getting a RAT error. Looks to me like RAT
>>> is trying to read in files with the default
>>> encoding. This crashes on one of our UTF-16
>>> test files for obvious reasons. A RAT bug?
>>
>> The file in question
>> (file:///D:/mavenAlign/uimaj-trunk-data/uimaj-core/src/test/resources/pearTests/encodingTests/UTF16_with_signature.xml)
>> hasn't been changed since 2006 original import (according to SVN info).
>>
>> It starts with the line:
>> <?xml version="1.0" encoding="UTF-16"?>
>>
>> This line, itself, is written in bytes, as:
>>
>> FF FE 3C 00 3F 00 78 00 6D 00 ...
>>
>>
>> The first 2 bytes are the byte order mark, in this case indicating that the
>> order of the 2-byte things is the least significant, followed by the more
>> significant. For instance, the next two bytes "3c 00" turn into the integer
>> value 0x003c.
>>
>> The chars corresponding to 3C 00 3F 00 78 00 6D 00 are "<?xm" the start of
>> the version line.
>>
>> My guess is that the byte order of the machine you're running with is
>> reversed from the ones we've been testing on. On my laptop (Intel), RAT
>> classifies that file as a "binary" file, which is not checked further. I
>> think in order for it to do that it has to be able to read that file using
>> the "default encoding" without getting an exception thrown.
>>
>> ----------
>>
>> In any case, this is test data, and could be excluded from the RAT check
>> (even though these files do have the license headers in them :-) ).
>> For the next release, I'll add these files to the RAT exclusion pattern.
>>
>> For now, you can continue your testing by manually adding these as an
>> exclusion in the uimaj-core pom.xml file.
>>
>> To do that, add this line:
>>
>> <exclude>src/test/resources/pearTests/encodingTests/UTF16*</exclude>
>> after line 165, approx.
>>
>> -Marshall
>>
>>>
>>> This is on Ubuntu, encoding is UTF-8, Java is:
>>>
>>> java version "1.5.0"
>>> Java(TM) 2 Runtime Environment, Standard Edition (build pxi32devifx-20100511a (SR11 FP2 ))
>>> IBM J9 VM (build 2.3, J2RE 1.5.0 IBM J9 2.3 Linux x86-32 j9vmxi3223ifx-20100510 (JIT enabled)
>>> J9VM - 20100509_57823_lHdSMr
>>> JIT - 20091016_1845ifx7_r8
>>> GC - 20091026_AA)
>>> JCL - 20100511a
>>>
>>> Maven version is 3.0.1.
>>>
>>> Here's the maven output:
>>>
>>> [ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.6:check (default-cli) on project uimaj-core: Analysis failed: Cannot analyse header: Cannot read header for /home/tgoetz/tmp/uimaj-2.3.1/uimaj-core/src/test/resources/pearTests/encodingTests/UTF16_with_signature.xml: MalformedInputException -> [Help 1]
>>>
>>> Full stack trace:
>>>
>>> [ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.6:check (default-cli) on project uimaj-core: Analysis failed: Cannot analyse header: Cannot read header for /home/tgoetz/tmp/uimaj-2.3.1/uimaj-core/src/test/resources/pearTests/encodingTests/UTF16_with_signature.xml: MalformedInputException -> [Help 1]
>>> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.rat:apache-rat-plugin:0.6:check (default-cli) on project uimaj-core: Analysis failed
>>>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:203)
>>>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:148)
>>>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:140)
>>>     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
>>>     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
>>>     at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
>>>     at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
>>>     at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:316)
>>>     at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:153)
>>>     at org.apache.maven.cli.MavenCli.execute(MavenCli.java:451)
>>>     at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:188)
>>>     at org.apache.maven.cli.MavenCli.main(MavenCli.java:134)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:79)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:618)
>>>     at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
>>>     at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
>>>     at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
>>>     at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
>>> Caused by: org.apache.maven.plugin.MojoExecutionException: Analysis failed
>>>     at org.apache.rat.mp.AbstractRatMojo.createReport(AbstractRatMojo.java:357)
>>>     at org.apache.rat.mp.RatCheckMojo.getRawReport(RatCheckMojo.java:89)
>>>     at org.apache.rat.mp.RatCheckMojo.execute(RatCheckMojo.java:138)
>>>     at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:107)
>>>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:195)
>>>     ... 19 more
>>> Caused by: org.apache.rat.report.RatReportFailedException: Analysis failed
>>>     at org.apache.rat.report.xml.XmlReport.report(XmlReport.java:66)
>>>     at org.apache.rat.mp.FilesReportable.run(FilesReportable.java:69)
>>>     at org.apache.rat.Report.report(Report.java:292)
>>>     at org.apache.rat.Report.report(Report.java:272)
>>>     at org.apache.rat.mp.AbstractRatMojo.createReport(AbstractRatMojo.java:341)
>>>     ... 23 more
>>> Caused by: org.apache.rat.document.RatDocumentAnalysisException: Cannot analyse header
>>>     at org.apache.rat.report.analyser.DocumentHeaderAnalyser.analyse(DocumentHeaderAnalyser.java:54)
>>>     at org.apache.rat.document.impl.util.DocumentAnalyserMultiplexer.analyse(DocumentAnalyserMultiplexer.java:37)
>>>     at org.apache.rat.document.impl.util.ConditionalAnalyser.matches(ConditionalAnalyser.java:44)
>>>     at org.apache.rat.document.impl.util.ConditionalAnalyser.analyse(ConditionalAnalyser.java:50)
>>>     at org.apache.rat.report.xml.XmlReport.report(XmlReport.java:64)
>>>     ... 27 more
>>> Caused by: org.apache.rat.analysis.RatHeaderAnalysisException: Cannot read header for /home/tgoetz/tmp/uimaj-2.3.1/uimaj-core/src/test/resources/pearTests/encodingTests/UTF16_with_signature.xml
>>>     at org.apache.rat.report.analyser.HeaderCheckWorker.read(HeaderCheckWorker.java:96)
>>>     at org.apache.rat.report.analyser.DocumentHeaderAnalyser.analyse(DocumentHeaderAnalyser.java:50)
>>>     ... 31 more
>>> Caused by: sun.io.MalformedInputException
>>>     at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java:294)
>>>     at sun.nio.cs.StreamDecoder$ConverterSD.convertInto(StreamDecoder.java:316)
>>>     at sun.nio.cs.StreamDecoder$ConverterSD.implRead(StreamDecoder.java:366)
>>>     at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:252)
>>>     at java.io.InputStreamReader.read(InputStreamReader.java:212)
>>>     at java.io.BufferedReader.fill(BufferedReader.java:157)
>>>     at java.io.BufferedReader.readLine(BufferedReader.java:320)
>>>     at java.io.BufferedReader.readLine(BufferedReader.java:383)
>>>     at org.apache.rat.report.analyser.HeaderCheckWorker.readLine(HeaderCheckWorker.java:111)
>>>     at org.apache.rat.report.analyser.HeaderCheckWorker.read(HeaderCheckWorker.java:89)
>>>     ... 32 more
>>>
>>>
>>> On 07/12/10 01:01, Marshall Schor wrote:
>>>> The release candidate is located in the Apache Nexus Staging repository,
>>>> here:
>>>> https://repository.apache.org/content/repositories/orgapacheuima-061/
>>>> <https://repository.apache.org/content/repositories/orgapacheuima-059/>
>>>>
>>>> The source-release zip file is located here:
>>>> https://repository.apache.org/content/repositories/orgapacheuima-061/org/apache/uima/uimaj/2.3.1/
>>>> <https://repository.apache.org/content/repositories/orgapacheuima-059/org/apache/uima/uimaj/2.3.1/>
>>>>
>>>> The binary zip and tar files are located here:
>>>> https://repository.apache.org/content/repositories/orgapacheuima-061/org/apache/uima/uimaj-distr/2.3.1/
>>>> <https://repository.apache.org/content/repositories/orgapacheuima-059/org/apache/uima/uimaj-distr/2.3.1/>
>>>>
>>>> The list of issues fixed is included in the source and binary
>>>> distributions in the uimaj-distr project in the top level directory
>>>> "issuesFixed"; this list includes "Closed/Fixed" and "Resolved/Fixed"
>>>> issues.
>>>>
>>>> This is the first release as a top level project; there are many issues
>>>> addressed, including removing incubator-related disclaimers.
>>>>
>>>> The Eclipse update-site for these components is here:
>>>> http://people.apache.org/~schor/uima-release-candidates/uimaj-sdk-2.3.1-rc1-eclipse-update-site/
>>>> <http://people.apache.org/%7Eschor/uima-release-candidates/uimaj-sdk-2.3.1-rc1-eclipse-update-site/>
>>>>
>>>> Note: only the 2.3.1 jars are populated, please test that.
>>>>
>>>> Please inspect these artifacts, see if you can build from
>>>> source-release.zip (above), and verify the license/notice files, then vote
>>>>
>>>> [ ] +1 OK to release
>>>> [ ] 0 Don't care
>>>> [ ] -1 Not ok to release, because ...
>>>>
>>>> -Marshall
>>>>
>>>
>>>
>
>
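
For anyone who wants to reproduce the encoding effect discussed in this thread without running RAT, here is a minimal Java sketch. It is not RAT's code: the class name PlatformEncodingDemo and the helper tryDecode are made up for illustration, and the byte array is just the first ten bytes quoted above (FF FE 3C 00 3F 00 78 00 6D 00). It assumes a decoder configured to report malformed input, which mimics the exception in the stack trace; on current JDKs a plain InputStreamReader usually substitutes a replacement character instead of throwing.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of RAT or UIMA.
class PlatformEncodingDemo {

    // First bytes of the test file as quoted in the thread:
    // byte order mark FF FE, then "<?xm" in little-endian UTF-16.
    static final byte[] HEADER = {
        (byte) 0xFF, (byte) 0xFE, 0x3C, 0x00, 0x3F, 0x00, 0x78, 0x00, 0x6D, 0x00
    };

    // Decodes strictly: returns the text, or null if the bytes are not
    // valid in the given charset (instead of silently replacing them).
    static String tryDecode(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // 0xFF can never appear in UTF-8, so strict decoding fails at the first byte.
        System.out.println("UTF-8:      " + (tryDecode(HEADER, StandardCharsets.UTF_8) == null ? "rejected" : "accepted"));
        // ISO-8859-1 maps every byte to a character, so garbage is accepted without complaint.
        System.out.println("ISO-8859-1: " + (tryDecode(HEADER, StandardCharsets.ISO_8859_1) == null ? "rejected" : "accepted"));
        // The UTF-16 charset reads the byte order mark and decodes the real text.
        System.out.println("UTF-16:     " + tryDecode(HEADER, StandardCharsets.UTF_16));
    }
}
```

The outcome depends only on the charset used for decoding, not on the byte order of the machine, which matches Thilo's analysis above.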