[ 
https://issues.apache.org/jira/browse/HADOOP-9287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663324#comment-13663324
 ] 

Ivan Mitic commented on HADOOP-9287:
------------------------------------

Thanks Chris! 

I spent some time reading about Maven parallel test execution and ran into 
the following:
http://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html

It looks like there is a way to give each forked test process distinct property 
values, using the predefined surefire.forkNumber property.

I went ahead and tried this out, and it seems to provide the behavior we need. 
For prototyping purposes, I added the following profile to the hadoop-common 
pom.xml without any other changes:
{noformat}
    <profile>
      <id>parallel-tests</id>
      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-surefire-plugin</artifactId>
            <version>2.14.1</version>
            <configuration>
              <forkMode>perthread</forkMode>
              <threadCount>4</threadCount>
              <reuseForks>false</reuseForks>
              <parallel>classes</parallel>
              <argLine>-Xmx1024m -XX:+HeapDumpOnOutOfMemoryError -DminiClusterDedicatedDirs=true</argLine>
              <systemPropertyVariables>
                <test.build.data>I:\tmp${surefire.forkNumber}</test.build.data>
              </systemPropertyVariables>
            </configuration>
          </plugin>
        </plugins>
      </build>
    </profile>
{noformat}

And the common test runtime decreased on my dev workstation from 17 minutes to 
less than 5 minutes! Nice!

I noticed 4 additional test case failures in parallel mode (out of a total of 
1969); I have not investigated them yet.

An additional parameter you’ll notice I’m passing above is reuseForks=false. 
Without it, the same process is reused across test suites. Reusing forks would 
provide an additional perf benefit; however, it can also cause problems. Default 
{{mvn test}} also does not reuse the process (if I’m not mistaken), so this 
seems like the right thing to do as a first step.
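
As a side note, the Surefire page linked above describes forkMode as deprecated 
as of 2.14 in favor of forkCount and reuseForks. A minimal sketch of what should 
be the equivalent of the forkMode/threadCount pair in the profile, based on the 
migration notes on that page (an assumption, not verified against the hadoop 
build):
{noformat}
            <configuration>
              <!-- Assumed replacement for forkMode=perthread + threadCount=4 -->
              <forkCount>4</forkCount>
              <!-- Each test class gets a fresh JVM; no process reuse -->
              <reuseForks>false</reuseForks>
            </configuration>
{noformat}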

There is one catch though: hadoop tests seem to expect that the test.build.data 
directory exists, so we’ll have to somehow create the test folders before we 
initiate the test run. A “brute force” approach is to always create N predefined 
test.build.data dirs, for example test.build.data1, test.build.data2, etc. We 
might be able to do better though.
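
The brute force variant could be sketched with maven-antrun-plugin inside the 
same profile, roughly as below. This is untested. The execution id, the phase, 
and the four I:\tmpN paths are assumptions for illustration: they assume fork 
numbers run from 1 to 4 and reuse the test.build.data prefix from the prototype. 
Depending on the plugin version, the wrapper element may need to be <tasks> 
rather than <target>.
{noformat}
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-antrun-plugin</artifactId>
            <executions>
              <execution>
                <!-- Hypothetical id; any unique name works -->
                <id>create-parallel-tests-dirs</id>
                <!-- Runs before the test phase, so the dirs exist in time -->
                <phase>process-test-resources</phase>
                <goals>
                  <goal>run</goal>
                </goals>
                <configuration>
                  <target>
                    <!-- One directory per expected fork number -->
                    <mkdir dir="I:\tmp1"/>
                    <mkdir dir="I:\tmp2"/>
                    <mkdir dir="I:\tmp3"/>
                    <mkdir dir="I:\tmp4"/>
                  </target>
                </configuration>
              </execution>
            </executions>
          </plugin>
{noformat}
The obvious downside is that the number of directories is hard-coded and has to 
be kept in sync with the fork count by hand.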

Let me know what you think. 


                
> Parallel testing hadoop-common
> ------------------------------
>
>                 Key: HADOOP-9287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9287
>             Project: Hadoop Common
>          Issue Type: Test
>          Components: test
>    Affects Versions: 3.0.0
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Andrey Klochkov
>         Attachments: HADOOP-9287.1.patch, HADOOP-9287--N3.patch, 
> HADOOP-9287--N3.patch, HADOOP-9287--N4.patch, HADOOP-9287--N5.patch, 
> HADOOP-9287.patch, HADOOP-9287.patch
>
>
> The maven surefire plugin supports a parallel testing feature. By using it, the 
> tests can be run faster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
