[ https://issues.apache.org/jira/browse/SUREFIRE-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124469#comment-15124469 ]
Michael Osipov commented on SUREFIRE-1220:
------------------------------------------

So, what is happening here? You have two VMs running: your Maven VM and the forked Surefire VM. The Maven VM runs with {{Cp1252}} and the Surefire one with {{UTF-8}}. As you know, you cannot narrow down {{UTF-8}} to {{Cp1252}}. Let's assume for now that they are mappable.

Surefire has two output channels, {{DirectConsoleOutput}} and {{ConsoleOutputFileReporter}}. The first one does {{CharBuffer decode = Charset.defaultCharset().newDecoder().decode( ByteBuffer.wrap( buf, off, len ) ); stream.append( decode );}} and the second one {{fileOutputStream = new FileOutputStream( file ); fileOutputStream.write( buf, off, len );}}. Both use the default VM file encoding.

Now let's get to the consumption of the standard output of the forked VM. {{ForkStarter}} passes {{FORK_STREAM_CHARSET_NAME}} ({{ISO-8859-1}}) to {{executeCommandLineAsCallable}}. A {{StreamPumper}} is created, but that {{Charset}} is never passed on, so the {{InputStreamReader}} is again created with the default encoding. The {{ThreadedStreamConsumer}} consumes the specially encoded output from the Surefire Booter. The booter in turn maps all bytes written to {{stderr}} or {{stdout}} to a 7-bit encoding ({{ASCII}}). {{ForkClient}} decodes them properly, but it then does {{ByteBuffer defaultEncoded = DEFAULT_CHARSET.encode( decodedFromSourceCharset );}} and your output is broken, of course.

To make a long story short, the encoding and decoding of the {{System.out}} streams are done marvelously, but in the end, trying to map chars to an encoding which does not support them simply won't work between two VMs. The result is that there is apparently no bug, but {{chcp}}, Maven's encoding ({{MAVEN_OPTS=-Dfile.encoding=...}}) and the forked one ({{<argLine>-Dfile.encoding=...</argLine>}}) have to match; everything else is undefined.

Why did {{exec}} work?
For one simple reason, the output of {{java}} was passed as-is and {{chcp}} and the forked encoding did match.

> Surefire never outputs UTF-8 under Windows
> ------------------------------------------
>
>                 Key: SUREFIRE-1220
>                 URL: https://issues.apache.org/jira/browse/SUREFIRE-1220
>             Project: Maven Surefire
>          Issue Type: Bug
>          Components: Maven Surefire Plugin
>    Affects Versions: 2.19.1
>         Environment: Windows 10, 64-bit
> DejaVu Sans font
>            Reporter: Gili
>         Attachments: 2016-01-29_113906.png, exec_exec.png, output.exec.txt,
> output.test.txt, surefire-1220.zip, test.png
>
>
> I'm having problems getting Surefire to output UTF-8 fonts under Windows.
> When I run a unit test that outputs a Guava Range ("10‥20") the TWO DOT
> LEADER unicode character always gets rendered as a question mark.
> If I run the exact same code outside of Surefire (using a main() entry point)
> the UTF-8 character renders just fine. The repro steps are quite simple:
> # Create a Maven project.
> # Run {code}System.out.println(Range.closed(10, 30));{code} in a Java class
> with a main() entry point, and from a JUnit test.
> # The main() entry point will output UTF-8 just fine. The JUnit test will
> output a question mark in place of the unicode.
> Here is my pom.xml file:
> {code}
> <?xml version="1.0" encoding="UTF-8"?>
> <project xmlns="http://maven.apache.org/POM/4.0.0"
>          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>          http://maven.apache.org/xsd/maven-4.0.0.xsd">
>   <modelVersion>4.0.0</modelVersion>
>   <groupId>com.mycompany</groupId>
>   <artifactId>mavenproject1</artifactId>
>   <version>1.0-SNAPSHOT</version>
>   <packaging>jar</packaging>
>   <build>
>     <plugins>
>       <plugin>
>         <groupId>org.apache.maven.plugins</groupId>
>         <artifactId>maven-surefire-plugin</artifactId>
>         <version>2.19.1</version>
>         <configuration>
>           <argLine>-Dfile.encoding=UTF-8</argLine>
>         </configuration>
>       </plugin>
>       <plugin>
>         <groupId>org.codehaus.mojo</groupId>
>         <artifactId>exec-maven-plugin</artifactId>
>         <version>1.4.0</version>
>         <executions>
>           <execution>
>             <goals>
>               <goal>java</goal>
>             </goals>
>           </execution>
>         </executions>
>         <configuration>
>           <mainClass>foo.Main</mainClass>
>         </configuration>
>       </plugin>
>     </plugins>
>   </build>
>   <dependencies>
>     <dependency>
>       <groupId>com.google.guava</groupId>
>       <artifactId>guava</artifactId>
>       <version>19.0</version>
>     </dependency>
>     <dependency>
>       <groupId>junit</groupId>
>       <artifactId>junit</artifactId>
>       <version>4.12</version>
>       <scope>test</scope>
>     </dependency>
>   </dependencies>
>   <properties>
>     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
>     <maven.compiler.source>1.8</maven.compiler.source>
>     <maven.compiler.target>1.8</maven.compiler.target>
>   </properties>
> </project>
> {code}
> I tried the same thing using TestNG tests and noticed that although output to
> console was still wrong, the outputted testng-results.xml file contained the
> correct character.
> Can you reproduce this on your end?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
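
Illustration of the comment's point about {{DEFAULT_CHARSET.encode( decodedFromSourceCharset )}}: the following is a minimal standalone sketch, not Surefire code. It decodes bytes produced in {{UTF-8}} and re-encodes them in a narrower charset, which is roughly what happens when the forked VM and the Maven VM disagree on {{file.encoding}}. The class {{EncodingNarrowingDemo}} and the helpers {{reencode}} and {{narrowCharset}} are hypothetical names made up for this example, and {{windows-1252}} is assumed to be the JDK's name for {{Cp1252}} (with {{ISO-8859-1}} as a fallback, which cannot map the character either).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingNarrowingDemo {

    // Hypothetical helper (not part of Surefire): decode the bytes written by
    // the "forked" side, then re-encode them in the "parent" side's charset.
    // Charset.encode() silently replaces unmappable characters.
    static byte[] reencode(byte[] forkedBytes, Charset forkedCharset, Charset parentCharset) {
        CharBuffer decoded = forkedCharset.decode(ByteBuffer.wrap(forkedBytes));
        ByteBuffer encoded = parentCharset.encode(decoded);
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out);
        return out;
    }

    // Assumption: windows-1252 stands in for Cp1252. ISO-8859-1 is only a
    // fallback for JVMs that do not ship that charset.
    static Charset narrowCharset() {
        return Charset.isSupported("windows-1252")
                ? Charset.forName("windows-1252")
                : StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        // "10", TWO DOT LEADER (U+2025), "20", as in the reported Range output.
        String text = "10\u202520";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        Charset narrow = narrowCharset();
        byte[] narrowed = reencode(utf8, StandardCharsets.UTF_8, narrow);
        String seen = new String(narrowed, narrow);

        // Can the narrow charset represent U+2025 at all?
        System.out.println(narrow.newEncoder().canEncode('\u2025'));
        // Did the character degrade to a question mark?
        System.out.println(seen.equals("10?20"));
    }
}
```

Under these assumptions the program should print {{false}} followed by {{true}}: the narrow charset has no mapping for U+2025, so the encoder substitutes {{?}}, which matches the symptom in the report. If both sides use the same charset, {{reencode}} returns the input bytes unchanged.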