[jira] [Updated] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)

Mickael Olivier (JIRA) Fri, 04 Jul 2014 05:08:26 -0700

     [ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Mickael Olivier updated HDFS-6515:
----------------------------------

                 Tags: testPageRounder, FsDatasetCache
     Target Version/s: 3.0.0
    Affects Version/s: 3.0.0
         Release Note: Tested with Hadoop 3.0.0 SNAPSHOT on RHEL 6.5, 
Ubuntu 14.0, and Fedora 19, using 
mvn -Dtest=TestFsDatasetCache#testPageRounder -X test
               Status: Patch Available  (was: Open)

This patch makes two changes, neither of which should alter the existing 
behavior on systems with a PAGE_SIZE of 4096:

1 - Change in TestFsDatasetCache.java
- private static final long CACHE_CAPACITY = 64 * 1024;
+ private static final long CACHE_CAPACITY = 16 * PAGE_SIZE;

2 - Change in NativeIO.java, class NoMlockCacheManipulator

- public long getOperatingSystemPageSize() { return 4096; }
+ public long getOperatingSystemPageSize() { return NativeIO.getOperatingSystemPageSize(); }

The first change is needed because on systems with a larger page size, e.g. 
65536 bytes, a capacity of 64 * 1024 bytes lets the test reserve only one page 
in the cache.
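
To make the arithmetic behind the first change concrete, here is a minimal 
sketch. It is not code from the patch: the class and method names are 
hypothetical, and the two page sizes are simply the ones mentioned in this 
message.

```java
// Hypothetical illustration only; not part of TestFsDatasetCache or the patch.
final class CacheCapacitySketch {

    /** Number of whole pages that fit in a cache of the given byte capacity. */
    static long wholePages(long capacityBytes, long pageSize) {
        return capacityBytes / pageSize;
    }

    public static void main(String[] args) {
        long[] pageSizes = {4096L, 65536L};
        for (long pageSize : pageSizes) {
            long fixedCapacity = 64 * 1024;       // old CACHE_CAPACITY
            long scaledCapacity = 16 * pageSize;  // new CACHE_CAPACITY
            System.out.println("page size " + pageSize
                    + ": old capacity holds " + wholePages(fixedCapacity, pageSize)
                    + " page(s), new capacity holds "
                    + wholePages(scaledCapacity, pageSize) + " page(s)");
        }
    }
}
```

With a page size of 4096 both definitions give the same capacity in bytes, 
which is why the change should be invisible on such systems.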

The second change is needed because on systems with a page size of, e.g., 
65536 bytes, reporting a page size of 4096 caused the method 
verifyExpectedCacheUsage to fail even when the correct number of blocks had 
been reserved, which led to the timeout.
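
As a rough illustration of why a mismatched page size can make a usage check 
fail, the sketch below rounds the same block length up to a page boundary 
with two different page sizes. It is an assumption-laden example, not the 
Hadoop implementation: the class name, the roundUp helper, and the 512-byte 
block length are all hypothetical.

```java
// Hypothetical illustration only; not the actual Hadoop page-rounding code.
final class PageRoundingSketch {

    /** Rounds a byte count up to the next multiple of pageSize. */
    static long roundUp(long bytes, long pageSize) {
        return ((bytes + pageSize - 1) / pageSize) * pageSize;
    }

    public static void main(String[] args) {
        long blockBytes = 512;         // hypothetical block length
        long hardCodedPage = 4096;     // value the old manipulator returned
        long realPage = 65536;         // e.g. the page size on the PPC64 system

        long usageWithHardCoded = roundUp(blockBytes, hardCodedPage);
        long usageWithReal = roundUp(blockBytes, realPage);

        // If one side accounts with one value and the other side with the
        // other, an equality check on cache usage can never succeed, and a
        // loop waiting for it eventually times out.
        System.out.println("rounded with hard-coded page size: " + usageWithHardCoded);
        System.out.println("rounded with real page size:       " + usageWithReal);
        System.out.println("values agree: " + (usageWithHardCoded == usageWithReal));
    }
}
```

When both sides use the same page size the two rounded values agree, which is 
the effect of delegating to NativeIO.getOperatingSystemPageSize().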

> testPageRounder   (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-6515
>                 URL: https://issues.apache.org/jira/browse/HDFS-6515
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.4.0, 3.0.0
>         Environment: Linux on PPC64
>            Reporter: Tony Reix
>            Priority: Blocker
>              Labels: test
>
> I have an issue with the test
>    testPageRounder
>   (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
> on Linux/PowerPC.
> On Linux/Intel, the test runs fine.
> On Linux/PowerPC, I get:
> testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)  
> Time elapsed: 64.037 sec  <<< ERROR!
> java.lang.Exception: test timed out after 60000 milliseconds
> Looking at the details, I see that some "Failed to cache " messages appear 
> in the traces: only 10 on Intel, but 186 on PPC64.
> On PPC64, it looks like some thread is waiting for something that never 
> happens, which causes the timeout.
> I'm currently using the IBM JVM, but I've just checked that the issue also 
> appears with OpenJDK.
> I'm currently using the latest Hadoop, but the issue already appeared with 
> Hadoop 2.4.0.
> I need help understanding what the test is doing and what traces are 
> expected, in order to find the root cause.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
