[jira] [Updated] (SENTRY-1377) improve handling of failures, both in tests and after-test cleanup, in TestHDFSIntegration.java

Vadim Spector (JIRA) Tue, 13 Dec 2016 07:37:22 -0800

     [ 
https://issues.apache.org/jira/browse/SENTRY-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vadim Spector updated SENTRY-1377:
----------------------------------
    Attachment:     (was: SENTRY-1377.002.patch)

> improve handling of failures, both in tests and after-test cleanup, in 
> TestHDFSIntegration.java
> -----------------------------------------------------------------------------------------------
>
>                 Key: SENTRY-1377
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1377
>             Project: Sentry
>          Issue Type: Sub-task
>          Components: Sentry
>    Affects Versions: 1.8.0
>            Reporter: Vadim Spector
>            Assignee: Vadim Spector
>             Fix For: 1.8.0
>
>         Attachments: SENTRY-1377.001.patch, SENTRY-1377.002.patch
>
>
> There are multiple issues making HDFS sync tests flaky or sometimes failing.
> 1. TestHDFSIntegrationBase.java should provide best-attempt cleanup in 
> cleanAfterTest() method. Currently, if any cleanup operation fails, the rest 
> of cleanup code is not executed. Cleanup logic would not normally fail if 
> corresponding test succeeds. But if some do not, short-circuiting some 
> cleanup logic tends to cascade secondary failures which complicate 
> troubleshooting.
> 2. TestHDFSIntegration*.java classes do not guarantee calling close() method 
> on Connection and Statement objects. It happens because 
> a) no try-finally or try-with-resource is used, so tests can skip close() 
> calls if fail in the middle.
> b) many methods re-open Connection and Statement multiple times, yet provide 
> a single close() at the end. 
> 3. Retry logic uses recursion in some places, as in startHiveServer2() and 
> verifyOnAllSubDirs. Better to implement it via straightforward retry loop. 
> Exception stack trace is more confusing than it needs to be in case of 
> recursive calls. Plus, with NUM_RETRIES == 10, at least theoretically, it 
> creates running out of stack as an unnecessary failure mechanism.
> 4. startHiveServer2() ignores hiveServer2.start() call failure, only logging 
> info message.
> 5. Starting hiveServer2 and Hive metastore in separate threads and then 
> keeping those threads alive seems unnecessary, since both servers' start() 
> methods create servers running on their own threads anyway. It effectively 
> leads to ignoring the start() method failure for both servers. Also, it 
> leaves no guarantee that hiveServer2 will be started after the Hive metastore 
> - both start from independent threads with no cross-thread coordination 
> mechanism in place.
> 6. Thread.sleep() missing in multiple places between HiveServer2 SQL calls 
> changing permissions and verifyOnAllSubDirs() calls verifying that those 
> changes took effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (SENTRY-1377) improve handling of failures, both in tests and after-test cleanup, in TestHDFSIntegration.java

Reply via email to