Michael Ho created IMPALA-7738:
----------------------------------

             Summary: Implement timeouts for HDFS calls
                 Key: IMPALA-7738
                 URL: https://issues.apache.org/jira/browse/IMPALA-7738
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 2.12.0, Impala 3.0, Impala 2.11.0, Impala 2.10.0, 
Impala 2.9.0, Impala 2.8.0, Impala 2.7.0
            Reporter: Michael Ho


Currently, there is no timeout on the various HDFS calls (e.g. hdfsOpen(), 
hdfsRead()) we make into libhdfs.so from either the disk-io-mgr thread or the 
scanner thread context. Various users of Impala have complained in the past 
about hung queries which eventually boiled down to stuck HDFS calls. HDFS 
maintainers have been slow to find the root cause of those hangs. To make this 
kind of stuck-query problem easier to identify in the future, we should enforce 
a timeout on the various HDFS calls so that queries fail when an HDFS call 
takes longer than a designated timeout period.

There are multiple layers at which this timeout can be enforced:
 * at the Impala level, we can have a fixed-size thread pool which handles all 
HDFS calls. The existing HDFS calls will become wrappers with a timeout.
 * in libhdfs.so, enforce a timeout at places in the HDFS client code which may 
block forever.

The second option is probably beyond the charter of the Apache Impala project.
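
As a rough illustration of the first option (a fixed-size thread pool plus a 
wrapper that waits with a timeout), here is a minimal C++17 sketch. The class 
BlockingCallPool and its method CallWithTimeout are hypothetical names made up 
for this example; they are not existing Impala or libhdfs APIs, and the sketch 
says nothing about how Impala would actually implement this. Note that the 
blocking call is not cancelled on timeout: the pool thread stays occupied 
until the call returns, so a truly hung call permanently consumes one thread.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Impala or libhdfs): runs potentially
// blocking calls on a fixed number of threads so callers can give up waiting.
class BlockingCallPool {
 public:
  explicit BlockingCallPool(std::size_t num_threads) {
    for (std::size_t i = 0; i < num_threads; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }

  // Drains the queue and joins all workers. This blocks for as long as any
  // in-flight call blocks, so a hung call would also hang shutdown.
  ~BlockingCallPool() {
    {
      std::lock_guard<std::mutex> l(lock_);
      shutdown_ = true;
    }
    cv_.notify_all();
    for (auto& t : workers_) t.join();
  }

  // Runs 'fn' on a pool thread and waits up to 'timeout' for its result.
  // Returns std::nullopt if the result is not ready in time. 'fn' must
  // return a non-void value. The time spent queued counts toward the timeout.
  template <typename F>
  auto CallWithTimeout(F fn, std::chrono::milliseconds timeout)
      -> std::optional<std::invoke_result_t<F>> {
    using T = std::invoke_result_t<F>;
    // shared_ptr keeps the task alive even if the caller times out first.
    auto task = std::make_shared<std::packaged_task<T()>>(std::move(fn));
    std::future<T> result = task->get_future();
    {
      std::lock_guard<std::mutex> l(lock_);
      queue_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    if (result.wait_for(timeout) != std::future_status::ready) {
      return std::nullopt;  // Caller can now fail the query with an error.
    }
    return result.get();
  }

 private:
  void WorkerLoop() {
    for (;;) {
      std::function<void()> work;
      {
        std::unique_lock<std::mutex> l(lock_);
        cv_.wait(l, [this] { return shutdown_ || !queue_.empty(); });
        if (queue_.empty()) return;  // Shutting down and nothing left to run.
        work = std::move(queue_.front());
        queue_.pop();
      }
      work();  // May block for a long time; runs outside the lock.
    }
  }

  std::mutex lock_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> queue_;
  bool shutdown_ = false;
  std::vector<std::thread> workers_;
};
```

A caller would wrap each blocking call in a lambda, for example one that 
invokes hdfsRead() and returns its result, pass it to CallWithTimeout() with 
the configured timeout, and turn a std::nullopt result into a query error. 
Because timed-out calls keep their threads, a real implementation would also 
need some policy for what to do once every pool thread is stuck.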

cc'ing [[email protected]], [~joemcdonnell]


