[
https://issues.apache.org/jira/browse/HADOOP-9287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574960#comment-13574960
]
Andrey Klochkov commented on HADOOP-9287:
-----------------------------------------
By coincidence, I've been working on this recently. As Chris points out,
simply turning on parallel testing in Surefire would cause various problems,
because the current tests are not ready to be run this way. So I fixed
hadoop-common-project/hadoop-common and hadoop-hdfs-project/hadoop-hdfs to
allow such execution, and the results seem positive.
The number of changes required to remove contention among tests is not small,
but the changes are straightforward. Parallel execution can be turned on by
activating the "parallel-tests" profile. The number of forks can be tuned with
-DtestsThreadCount (4 is the default).
Most of the changes in hadoop-common are related to FileContextTestHelper and
FileSystemTestHelper: some static methods have been turned into instance
methods so that tests use different directories by default. Tests that depend
on these classes have been changed accordingly.
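
To illustrate the static-to-instance idea, here is a minimal sketch. The class and method names are hypothetical and are not taken from the patch or from the real FileSystemTestHelper; the random directory naming scheme is also an assumption.

```java
import java.io.File;
import java.util.UUID;

/** Hypothetical stand-in for a test helper; not the actual Hadoop class. */
final class PerTestDirHelper {
    // Each helper instance owns its own root directory, so two tests
    // running in parallel forks never write to the same location.
    private final File testRootDir;

    /** Creates a helper whose root is a uniquely named child of {@code base}. */
    PerTestDirHelper(File base) {
        this.testRootDir = new File(base, "test-" + UUID.randomUUID());
    }

    /** Root directory used by this helper instance. */
    File getTestRootDir() {
        return testRootDir;
    }

    /** Resolves a path relative to this instance's root directory. */
    File getTestPath(String relative) {
        return new File(testRootDir, relative);
    }
}
```

A test would create its own helper instance (for example in a field initializer) instead of calling a shared static method, so the directory is chosen per test class rather than globally.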
Most of the changes in hadoop-hdfs are related to MiniDFSCluster. Previously,
most tests used the same dir for MiniDFSCluster data. The modifications make
every MiniDFSCluster instance use a new dir by default. When several
instances need to share a dir, it must be set explicitly using
MiniDFSCluster.Builder.dfsBaseDir(dfsBaseDir).
As far as I know, MiniDFSCluster is used in other projects such as HBase, so
changing its API and default behavior may cause issues there. I therefore left
all existing methods intact, marked some of them as deprecated, and introduced
an environment variable that switches the new behavior on; by default the old
single-dir behavior remains active.
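
The directory selection described above could look roughly like the sketch below. This is not code from the patch: the class name, the method, and in particular the environment variable name are assumptions made for illustration, since the comment does not name the variable.

```java
import java.io.File;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of the base dir choice; not the MiniDFSCluster code. */
final class BaseDirResolver {
    // Assumed name: the actual environment variable is not given in the comment.
    static final String SWITCH_VAR = "HYPOTHETICAL_UNIQUE_DFS_DIRS";

    /**
     * @param explicitDir   dir set explicitly by the caller, or null if unset
     * @param sharedDefault the legacy dir shared by all cluster instances
     * @param env           environment to consult (passed in for testability)
     */
    static File resolve(File explicitDir, File sharedDefault, Map<String, String> env) {
        if (explicitDir != null) {
            // An explicit dir always wins, so several instances can share it.
            return explicitDir;
        }
        if ("true".equalsIgnoreCase(env.get(SWITCH_VAR))) {
            // New behavior: every instance gets a fresh, uniquely named dir.
            return new File(sharedDefault, UUID.randomUUID().toString());
        }
        // Old behavior stays the default so dependent projects are unaffected.
        return sharedDefault;
    }
}
```

In real use the map would come from System.getenv(); passing it in keeps the sketch easy to exercise without touching the process environment.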
Currently it takes 7 min to run the hadoop-common tests with 4 parallel forks
on my 4-core laptop, vs. 15 min in sequential mode. For hdfs it's 42 min vs.
1 hr 39 min. The improvement may be even bigger on a CI node with many cores.
I'm still in the process of testing this. In particular, I'm going to verify
projects that depend on the mini cluster infrastructure, such as HBase, Pig
and Hive.
My existing patch covers both hadoop-common and hadoop-hdfs. The tests in these
modules are coupled, and changing one without changing the other wouldn't work.
Tsuyoshi, do you mind if I change the title of this task to include HDFS and
reassign it to myself?
> Parallel testing hadoop-auth and hadoop-common
> ----------------------------------------------
>
> Key: HADOOP-9287
> URL: https://issues.apache.org/jira/browse/HADOOP-9287
> Project: Hadoop Common
> Issue Type: Bug
> Components: test
> Affects Versions: 3.0.0
> Reporter: Tsuyoshi OZAWA
> Assignee: Tsuyoshi OZAWA
> Attachments: HADOOP-9287.1.patch
>
>
> The Maven Surefire plugin supports parallel testing. By using it, the
> tests can be run faster.