Joe McDonnell created IMPALA-7946:
-------------------------------------

             Summary: SynchronousThreadPool::SynchronousOffer() can return a 
timeout Status with the wrong time limit
                 Key: IMPALA-7946
                 URL: https://issues.apache.org/jira/browse/IMPALA-7946
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 3.2.0
            Reporter: Joe McDonnell
            Assignee: Joe McDonnell


A recent core build failed on custom_cluster/test_hdfs_timeout.py with this 
test output:
{noformat}
custom_cluster/test_hdfs_timeout.py:82: in test_hdfs_open_timeout
    assert len(re.findall(error_pattern, str(ex))) > 0
E   assert 0 > 0
E    +  where 0 = len([])
E    +    where [] = <function findall at 0x7f09aaa4e938>('hdfsOpenFile\\(\\) 
for.*failed to finish before the 5 second timeout', 'ImpalaBeeswaxException:\n 
Query aborted:hdfsOpenFile() for 
hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=11/091101.txt 
failed to finish before the 4 second timeout\n\n')
E    +      where <function findall at 0x7f09aaa4e938> = re.findall
E    +      and   'ImpalaBeeswaxException:\n Query aborted:hdfsOpenFile() for 
hdfs://localhost:20500/test-warehouse/alltypes/year=2009/month=11/091101.txt 
failed to finish before the 4 second timeout\n\n' = 
str(ImpalaBeeswaxException()){noformat}
When executing SynchronousOffer(), two different operation count towards the 
timeout. The first is submitting the task by calling Offer with the 
SynchronousWorkItem. The second is waiting for the task to complete by calling 
SynchronousWorkItem::Wait(). If the first part task takes any measurable time, 
then SynchronousOffer() modifies the timeout that it passes into 
SynchronousWorkItem::Wait() so that the total timeout is respected. The 
enforcement of the new timeout is correct, but it results in an incorrect error 
message (in this case, showing 4 seconds rather than 5).

This should pass in the original timeout and the current elapsed time. This 
would allow for correct enforcement with a correct error message.

This issue is flaky.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to