IMPALA-7738 (https://issues.apache.org/jira/browse/IMPALA-7738) can mitigate this to a large degree: if the timeout is exceeded, it cancels the query and logs the error. It is not a complete solution, since the thread pool is fixed in size and can in theory be exhausted.
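
To illustrate why a timeout on top of a fixed-size thread pool only mitigates the problem, here is a rough Python sketch. It is not Impala's code; `call_with_timeout`, `HdfsOpTimeout`, `hung_open` and the pool size of 2 are made-up names and values for illustration only. The caller gets its error after the timeout, but the worker thread stays stuck in the hung call, so once every worker is stuck even a healthy call cannot start.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class HdfsOpTimeout(Exception):
    """Hypothetical error raised when a wrapped call exceeds its timeout."""


def call_with_timeout(pool, fn, timeout_sec):
    """Run fn on the pool and stop waiting for it after timeout_sec seconds."""
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_sec)
    except FutureTimeout:
        # The caller is unblocked here, but the worker thread (if one picked
        # the task up) is still stuck inside fn and keeps its pool slot.
        raise HdfsOpTimeout(
            f"call failed to finish before the {timeout_sec} second timeout"
        )


def demo():
    release = threading.Event()
    pool = ThreadPoolExecutor(max_workers=2)  # fixed size, assumed for the demo

    def hung_open():
        # Stands in for a filesystem call that never returns.
        release.wait()
        return "opened"

    outcomes = []
    for _ in range(2):
        try:
            call_with_timeout(pool, hung_open, 0.1)
        except HdfsOpTimeout:
            outcomes.append("timed out")

    # Both workers are now stuck, so a call that would return immediately
    # never gets a thread and also times out.
    try:
        call_with_timeout(pool, lambda: "ok", 0.1)
    except HdfsOpTimeout:
        outcomes.append("healthy call starved")

    release.set()  # unblock the stuck workers so the demo can exit
    pool.shutdown(wait=True)
    return outcomes


if __name__ == "__main__":
    print(demo())
```

In this sketch the first two calls time out as intended, and the third one fails only because no worker is free, which is the exhaustion case mentioned above.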
On Sun, Nov 1, 2020 at 9:18 PM hexianqing <hexianqing...@126.com> wrote:

> Hi all,
> I hit the issue that queries Stuck on Failed HDFS Calls and not Timing out
> several times when the Namenode is heavily loaded.
> In Impala Known Issues, it is described as follows:
> "In Impala 3.2 and higher, if the following error appears multiple times
> in a short duration while running a query, it would mean that the
> connection between the impalad and the HDFS NameNode is in a bad state and
> hence the impalad would have to be restarted:
> "hdfsOpenFile() for <filename> at backend <hostname:port> failed to finish
> before the <hdfs_operation_timeout_sec> second timeout "
> In Impala 3.1 and lower, the same issue would cause Impala to wait for a
> long time or hang without showing the above error message.
> Apache Issue: HADOOP-15720 (https://issues.apache.org/jira/browse/HADOOP-15720)
> Affected Versions: All versions of Impala
> Workaround: Restart the impalad in the bad state."
> I wonder if there is a way to avoid this or is there a plan to fix it,
> thank you!