[ https://issues.apache.org/jira/browse/HADOOP-12875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231362#comment-15231362 ]
Chris Nauroth commented on HADOOP-12875:
----------------------------------------

[~vishwajeet.dusane], thank you for providing a patch to make use of the contract tests. I have a few comments in addition to the helpful feedback from Tony.

How can I (and other Apache community members) obtain account credentials that we can put into contract-test-options.xml, so that we can run the tests? I have an Azure subscription. I checked manage.windowsazure.com, but I couldn't find an option for provisioning Azure Data Lake access. It's going to be vital for ongoing maintenance that community members have a way to get credentials so that they can test patches against the live service before committing.

For configuration of the credentials, I recommend a technique we've used in hadoop-aws and hadoop-azure: split the credentials into a separate XML file, which then gets XIncluded from the main XML file, and add the name of the credentials file to .gitignore. This helps prevent accidentally committing someone's private credentials to the Apache repo, which would compromise the account. Check out hadoop-aws and hadoop-azure for more details on how to do this.

{code}
System.setProperty("hadoop.home.dir", System.getProperty("user.dir"));
{code}

Why is this necessary?

I'm unclear on the intent of the various "benchmark" tests. They use a mock back-end, so they aren't really providing an accurate benchmark of the true service interaction. There are no assertions, so they aren't verifying functionality beyond making sure things don't throw exceptions. They print timing information to the console, so is the expectation that these tests could be used for manual measurement before and after applying later patches?

Your time measurements use {{System#currentTimeMillis}}, which may be inaccurate if the system clock changes or NTP makes a negative adjustment in the middle of a test run.
Instead, I recommend using {{org.apache.hadoop.util.Time#monotonicNow}}, which is a wrapper over {{System#nanoTime}} and is guaranteed to be monotonically increasing.

{code}
@Override
protected AbstractFSContract createContract(Configuration configuration) {
  try {
    return new AdlStorageContract(configuration);
  } catch (URISyntaxException e) {
    return null;
  } catch (IOException e) {
    return null;
  }
}
{code}

If any of these exceptions happens, returning null is likely to cause a confusing {{NullPointerException}} later. I'd prefer that we fail fast by throwing an unchecked exception, such as {{IllegalStateException}}, with a descriptive error message and the original exception nested as the root cause.

It's unusual to see contract test subclasses adding other test cases specific to the file system, like {{readCombinationTest}}. The abstract contract test classes are meant to fully define the test cases, and the subclasses usually just tweak the contract and skip tests that they aren't able to satisfy yet. For clarity, I suggest refactoring those additional tests into separate suites.

> [Azure Data Lake] Support for contract test and unit test cases
> ---------------------------------------------------------------
>
>                 Key: HADOOP-12875
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12875
>             Project: Hadoop Common
>          Issue Type: Test
>          Components: fs, fs/azure, tools
>            Reporter: Vishwajeet Dusane
>            Assignee: Vishwajeet Dusane
>         Attachments: Hadoop-12875-001.patch
>
>
> This JIRA describes contract test and unit test cases support for azure data
> lake file system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)