[ 
https://issues.apache.org/jira/browse/HADOOP-18335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17574456#comment-17574456
 ] 

Steve Vaughan commented on HADOOP-18335:
----------------------------------------

Another aspect that would speed up the process is the parallelization of the 
compilation phase.

I have an open pull request for 
https://issues.apache.org/jira/browse/MAPREDUCE-7386 that enables parallel 
compilation.  With this change you can build the entire distribution (skipping 
tests which aren't safe for parallel execution at the moment), and follow that 
with distributed test runs.  I'm currently building a full distribution, 
including native libraries, in under 10 minutes using Maven's "-T 2C".  For 
testing I'm focused on the tests for hadoop-hdfs and hadoop-hdfs-client (and 
dependent modules), which take an additional 13 minutes when distributed 
across a Kubernetes cluster.
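
For anyone who wants to try this, here is a rough sketch of the kind of 
invocation described above. Only "-T 2C" and skipping tests come from this 
comment; the "-Pdist,native" and "-Dtar" options are assumed from Hadoop's 
BUILDING.txt, and hadoop_build_cmd is a hypothetical helper that just prints 
the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: prints the Maven invocation instead of running it.
# hadoop_build_cmd is a hypothetical helper, not part of Hadoop.
hadoop_build_cmd() {
  # $1: value for Maven's -T option; "2C" means two threads per CPU core.
  threads="${1:-2C}"
  # -Pdist,native and -Dtar are assumptions taken from Hadoop's BUILDING.txt.
  # -DskipTests keeps test execution out of the parallel build.
  echo "mvn clean package -Pdist,native -DskipTests -Dtar -T ${threads}"
}

hadoop_build_cmd        # default thread setting (-T 2C)
hadoop_build_cmd 1C     # one thread per core, e.g. on a memory-constrained host
```

Whether the parallel build is reliable likely depends on the MAPREDUCE-7386 
change being applied, so treat the output as a starting point rather than a 
verified recipe.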


> Improve Pre-Commit Time
> -----------------------
>
>                 Key: HADOOP-18335
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18335
>             Project: Hadoop Common
>          Issue Type: Task
>            Reporter: Ayush Saxena
>            Priority: Critical
>              Labels: buid
>
> As of now the complete build time has reached ~24 hours, which makes tracking 
> Jiras with root-level changes very tough.
> Even at the module level it is on the higher side, around 6 hours or more, 
> which is roughly equivalent to one working day.
> Some areas to explore:
>  * Enable Parallel-Test profile for the modules which don't have it.
>  * Explore improvements in the area of our Test setup, increase memory, 
> number of threads, or some parallel executions?
>  * Remove the modules, or at least disable the test suites of modules, which 
> are no longer being used or maintained (there is a separate mail chain for this)
>  * Spin up two Pre-Commit jobs rather than one and split some tasks between 
> them, e.g. one for the Java 11 build & Javadoc, Checkstyle and related checks.
>  * Figure out & remove irrelevant or similar tests.
>  * Improve existing tests, e.g. by reusing MiniDFSCluster instances rather 
> than having a bunch of tests each spin up one for themselves.
>  * If possible, when there are multiple modules to be tested, run those 
> modules' tests in parallel (Exploratory: may lead to OOM)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
