[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mickael Olivier updated HDFS-6515:
----------------------------------
Tags: testPageRounder, FsDatasetCache
Target Version/s: 3.0.0
Affects Version/s: 3.0.0
Release Note: Tested with Hadoop 3.0.0 SNAPSHOT on RHEL 6.5, Ubuntu 14.0, and Fedora 19, using
mvn -Dtest=TestFsDatasetCache#testPageRounder -X test
Status: Patch Available (was: Open)
This patch makes two changes, neither of which should alter the existing
behavior on systems with a PAGE_SIZE of 4096:
1 - Change in TestFsDatasetCache.java
- private static final long CACHE_CAPACITY = 64 * 1024;
+ private static final long CACHE_CAPACITY = 16 * PAGE_SIZE;
2 - Change in NativeIO.java, class NoMlockCacheManipulator
- public long getOperatingSystemPageSize() { return 4096; }
+ public long getOperatingSystemPageSize() { return NativeIO.getOperatingSystemPageSize(); }
The first change is needed because on systems with a page size of, e.g.,
65536 bytes, a capacity of 64 * 1024 bytes lets us reserve only one page in the
cache for testing.
The second is needed because on systems with a page size of, e.g., 65536 bytes,
reporting a page size of 4096 led the method verifyExpectedCacheUsage to fail
even when the expected number of blocks was reserved, which resulted in the timeout.
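
The arithmetic behind both changes can be sketched as follows. This is a minimal illustration, not the Hadoop code: the class PageRoundingSketch and the helpers roundUp and pagesThatFit are hypothetical names, and it assumes cache usage is accounted in whole pages, rounded up.

```java
// Hypothetical sketch; the class and method names are not part of Hadoop.
public class PageRoundingSketch {

    /** Rounds count up to the next multiple of pageSize (pageSize must be > 0). */
    public static long roundUp(long count, long pageSize) {
        return ((count + pageSize - 1) / pageSize) * pageSize;
    }

    /** Number of whole pages that fit in a cache of the given capacity. */
    public static long pagesThatFit(long capacity, long pageSize) {
        return capacity / pageSize;
    }

    public static void main(String[] args) {
        long[] pageSizes = {4096L, 65536L};
        for (long pageSize : pageSizes) {
            long oldCapacity = 64 * 1024;      // fixed capacity before the patch
            long newCapacity = 16 * pageSize;  // capacity scaled to the page size
            System.out.println("page size " + pageSize
                + ": old capacity holds " + pagesThatFit(oldCapacity, pageSize)
                + " page(s), new capacity holds "
                + pagesThatFit(newCapacity, pageSize) + " page(s)");
        }

        // A 512-byte block occupies one full page. If the test assumes 4096
        // while the system actually uses 65536, the expected and actual
        // usage never match.
        long assumedUsage = roundUp(512, 4096);
        long actualUsage = roundUp(512, 65536);
        System.out.println("assumed usage " + assumedUsage
            + " bytes, actual usage " + actualUsage + " bytes");
    }
}
```

With a page size of 4096, 16 * PAGE_SIZE equals 64 * 1024, so the capacity is unchanged there; only systems with larger pages see a different value.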
> testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> -----------------------------------------------------------------------------
>
> Key: HDFS-6515
> URL: https://issues.apache.org/jira/browse/HDFS-6515
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.4.0, 3.0.0
> Environment: Linux on PPC64
> Reporter: Tony Reix
> Priority: Blocker
> Labels: test
>
> I have an issue with the test:
> testPageRounder
> (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> on Linux/PowerPC.
> On Linux/Intel, the test runs fine.
> On Linux/PowerPC, I have:
> testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> Time elapsed: 64.037 sec <<< ERROR!
> java.lang.Exception: test timed out after 60000 milliseconds
> Looking at details, I see that some "Failed to cache " messages appear in the
> traces. Only 10 on Intel, but 186 on PPC64.
> On PPC64, it looks like some thread is waiting for something that never
> happens, which causes the timeout.
> I'm now using the IBM JVM; however, I've just checked that the issue also
> appears with OpenJDK.
> I'm now using the latest Hadoop; however, the issue already appeared with
> Hadoop 2.4.0.
> I need help understanding what the test is doing and what traces are
> expected, in order to find the root cause.