[ https://issues.apache.org/jira/browse/MAPREDUCE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575713#comment-17575713 ]
ASF GitHub Bot commented on MAPREDUCE-7386:
-------------------------------------------

snmvaughan commented on PR #4415:
URL: https://github.com/apache/hadoop/pull/4415#issuecomment-1206331264

I've been focusing on speeding up the build process in order to aid in development. There are several reasons why the tests can't safely run in parallel, which is why I've started looking into short-term opportunities for speedup in the meantime. My plan of attack has been to do the following:

1. Build the distribution in parallel (this pull request). Some time savings.
2. Distribute the tests to run in parallel (each in their own containers, but running sequentially within that container). More time savings, but flaky tests are still an issue. This requires some tweaks to a few tests, which I'll be submitting back in other pull requests.
3. Fix tests that break basic conventions that can lead to impacts between tests. Example: Tests that require write access to shared resources (such as files in the classpath).
4. Fix flaky test classes that fail to run consistently even in their own containers. More time savings since you don't have to wait for tests to time out. This is iterative, but once this is completed you can develop knowing that a failure is actually caused by a change.
5. Based on lessons learned, consider additional improvements. Example: Switching to using JUnit support for temporary directories to help with parallel execution.

In addition, I've been starting small by limiting the test runs to a focused subset. Currently I execute tests for `hadoop-hdfs-client` and `hadoop-hdfs` along with their dependencies with every build.
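
As a rough illustration of item 5 (not code from the pull request), the sketch below shows the idea behind per-test temporary directories using only the JDK: each test gets its own freshly created directory, so two tests writing a file with the same name cannot collide. JUnit's temporary-directory support automates this creation and cleanup. The class and method names here (`IsolatedTempDirSketch`, `newTestDir`, `deleteRecursively`) are hypothetical and chosen only for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class IsolatedTempDirSketch {

    /** Creates a fresh, uniquely named directory for one test; the prefix is only a label. */
    public static Path newTestDir(String testName) throws IOException {
        return Files.createTempDirectory(testName + "-");
    }

    /** Deletes the directory tree, visiting children before their parents. */
    public static void deleteRecursively(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder())
                 .forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws IOException {
        // Two "tests" that both write data.txt, each in its own directory.
        Path dirA = newTestDir("testA");
        Path dirB = newTestDir("testB");
        try {
            Files.write(dirA.resolve("data.txt"), "a".getBytes(StandardCharsets.UTF_8));
            Files.write(dirB.resolve("data.txt"), "b".getBytes(StandardCharsets.UTF_8));

            // Neither write affected the other, even though the file names match.
            String fromA = new String(Files.readAllBytes(dirA.resolve("data.txt")), StandardCharsets.UTF_8);
            String fromB = new String(Files.readAllBytes(dirB.resolve("data.txt")), StandardCharsets.UTF_8);
            System.out.println("testA read: " + fromA);
            System.out.println("testB read: " + fromB);
        } finally {
            // Clean up so later runs start from a known state.
            deleteRecursively(dirA);
            deleteRecursively(dirB);
        }
    }
}
```

The same pattern also addresses the shared-resource problem in item 3: a test that writes under its own directory no longer needs write access to files on the classpath.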

> Maven parallel builds (skipping tests) fail
> -------------------------------------------
>
>                 Key: MAPREDUCE-7386
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7386
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 3.4.0, 3.3.9
>         Environment: The problem occurred while using the Hadoop development
> environment (Ubuntu)
>            Reporter: Steve Vaughan
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Running a parallel build fails during assembly with the following error when
> running either package or install:
> {code:java}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute
> goal org.apache.maven.plugins:maven-assembly-plugin:2.4:single
> (package-mapreduce) on project hadoop-mapreduce: Failed to create assembly:
> Artifact: org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.4.0-SNAPSHOT
> (included by module) does not have an artifact with a file. Please ensure the
> package phase is run before the assembly is generated. {code}
> {code:java}
> Caused by: org.apache.maven.plugin.MojoExecutionException: Failed to create
> assembly: Artifact:
> org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.4.0-SNAPSHOT (included
> by module) does not have an artifact with a file. Please ensure the package
> phase is run before the assembly is generated. {code}
> The command executed was:
> {code:java}
> $ mvn -nsu clean install -Pdist,native -DskipTests -Dtar
> -Dmaven.javadoc.skip=true -T 2C {code}
> Adding dependencies to the assembly plugin configuration addresses the issue

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org