[
https://issues.apache.org/jira/browse/HDFS-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746755#comment-13746755
]
Chris Nauroth commented on HDFS-4491:
-------------------------------------
Hi, Andrey. First, here are a couple of thoughts. These don't necessarily
have to gate commit:
# On HADOOP-9287, there had been discussion of trying to use
{{surefire.forkNumber}} to build a unique {{test.build.data}} per test suite
fork. I'm curious if you've tried this out. It could potentially reduce the
size of this patch significantly, and it would be less brittle for future
tests, because people wouldn't have to remember to use {{PathUtils}} instead of
accessing {{test.build.data}} directly. Here is the original comment:
https://issues.apache.org/jira/browse/HADOOP-9287?focusedCommentId=13663324&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13663324
# I like the idea mentioned by Jason earlier to implement dynamic port
allocation, either using ephemeral ports or coordinating the multiple processes
around a file lock. This could be handled as a follow-up jira though. The
current patch is already big enough. :-)
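As a rough illustration of the ephemeral-port option, here is a minimal sketch in plain Java. It assumes nothing about Hadoop itself: the class name {{EphemeralPortSketch}} and the method {{findFreePort}} are hypothetical, and the only technique shown is binding to port 0 so that the operating system picks an unused port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of the patch or of Hadoop.
public class EphemeralPortSketch {

    // Binds a socket to port 0 on the loopback interface, which asks the OS
    // for any currently unused port, then reports the port that was chosen.
    static int findFreePort() throws IOException {
        try (ServerSocket socket =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            return socket.getLocalPort();
        }
        // The socket is closed here, so another process could grab the same
        // port before the caller binds to it. This is only a best-effort probe.
    }

    public static void main(String[] args) throws IOException {
        System.out.println("free port: " + findFreePort());
    }
}
```

Because of the window between closing the probe socket and rebinding, it is generally more robust to hand port 0 directly to the server under test and read back the bound port afterwards, where the server supports that. The file-lock approach would close the same window by coordinating the forks explicitly.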
Next, here are some things that I think we do need to address before commit.
I'm still tracking down a Jenkins problem that's preventing us from getting a
test-patch run on Linux. Meanwhile, I tried a parallel run on Windows and
spotted a few issues:
# {{TestDecommission}} failed in datanode startup while binding to port 50075,
which was already in use. Perhaps we need to tune configured ports or let it
use ephemeral ports?
# {{TestValidateConfigurationSettings}} fails when the parallel-tests profile
is enabled, due to:
{code}
java.lang.RuntimeException: MiniDFSCluster is in dedicated dirs mode. Switch to
non-deprecated methods.
{code}
# {{TestRenameWithSnapshots}}, {{TestSnapshot}}, and
{{TestFSImageWithSnapshot}} had multiple failures. Is it because they access
{{test.build.data}} directly instead of going through {{PathUtils}}?
# {{TestJournalNodeMXBean}} was failing due to:
{code}
testJournalNodeMXBean(org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeMXBean):
expected:<{[]}> but was:<{["ns1":{"Formatted":"true"}]}>.
{code}
I also saw sporadic failures in {{TestQJMWithFaults}}. I'm not sure about the
root cause yet. Maybe there is still some unintended sharing in the QJM tests.
This is pretty easy to reproduce by running all of the tests under .qjournal
with the parallel-tests profile on.
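To make the per-fork {{test.build.data}} idea concrete, here is a small sketch of how a unique directory per fork could be derived. It is only an assumption about how this might be wired up: the class {{PerForkTestDir}}, the method {{forkDataDir}}, the {{fork-N}} directory naming, and the {{test.fork.number}} system property are all hypothetical, with the property assumed to be populated by the build from {{surefire.forkNumber}}.

```java
import java.io.File;

// Hypothetical helper, not part of the patch or of Hadoop.
public class PerForkTestDir {

    // Returns a fork-specific subdirectory of the base test data directory.
    // Falls back to "1" when no fork number is supplied, for example when
    // the tests run in a single JVM without forking.
    static File forkDataDir(String baseDir, String forkNumber) {
        String suffix =
            (forkNumber == null || forkNumber.isEmpty()) ? "1" : forkNumber;
        return new File(baseDir, "fork-" + suffix);
    }

    public static void main(String[] args) {
        // "test.fork.number" is an assumed property name; the default base
        // path is likewise only an example value.
        String base = System.getProperty("test.build.data", "target/test/data");
        String fork = System.getProperty("test.fork.number");
        System.out.println(forkDataDir(base, fork).getPath());
    }
}
```

If the build itself set {{test.build.data}} to a fork-specific path, tests that read the property directly would get isolation without having to go through a helper at all, which is the appeal of that approach.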
> Parallel testing HDFS
> ---------------------
>
> Key: HDFS-4491
> URL: https://issues.apache.org/jira/browse/HDFS-4491
> Project: Hadoop HDFS
> Issue Type: Test
> Components: test
> Affects Versions: 3.0.0
> Reporter: Tsuyoshi OZAWA
> Assignee: Andrey Klochkov
> Attachments: HDFS-4491--n10.patch, HDFS-4491--n10.patch,
> HDFS-4491--n11.patch, HDFS-4491--n2.patch, HDFS-4491--n3.patch,
> HDFS-4491--n4.patch, HDFS-4491--n5.patch, HDFS-4491--n6.patch,
> HDFS-4491--n7.patch, HDFS-4491--n8.patch, HDFS-4491--n9.patch, HDFS-4491.patch
>
>
> Parallel execution of HDFS tests in multiple forks. See HADOOP-9287 for
> details.