Gautam Gopalakrishnan created IMPALA-8708:
---------------------------------------------
Summary: Impala should ignore deleted files
Key: IMPALA-8708
URL: https://issues.apache.org/jira/browse/IMPALA-8708
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 3.2.0
Reporter: Gautam Gopalakrishnan
When querying an S3-backed table that is being modified (e.g. by a distcp of
content from another cluster), queries still fail with a {{FileNotFound}}
exception even when Impala is able to determine that a file in that table has
been deleted (e.g. using the S3Guard feature in CDH).
Performing a metadata refresh after the copy completes does resolve the
problem. However, this doesn't help during the copy phase. Requesting an
enhancement where Impala can ignore files if it knows that they've been deleted.
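To illustrate the requested behaviour, here is a minimal sketch in Python. It is not Impala code: the local filesystem stands in for S3, and the names {{read_known_files}}, {{demo}} and {{cached_listing}} are hypothetical. It only shows the idea of skipping a file that stale metadata still lists but that is known to be gone, rather than failing the whole scan.

```python
import os
import tempfile


def read_known_files(paths):
    """Read each file in `paths`, skipping any that no longer exist.

    Returns (contents, skipped): `contents` maps path -> bytes for the files
    that were read, `skipped` lists the paths that had been deleted.
    """
    contents = {}
    skipped = []
    for path in paths:
        try:
            with open(path, "rb") as f:
                contents[path] = f.read()
        except FileNotFoundError:
            # Listed in the (stale) metadata but deleted since then:
            # skip it instead of failing the whole scan.
            skipped.append(path)
    return contents, skipped


def demo():
    """Simulate a table whose file is deleted after the listing was cached."""
    with tempfile.TemporaryDirectory() as d:
        kept = os.path.join(d, "part-0.txt")
        gone = os.path.join(d, "part-1.txt")
        for p in (kept, gone):
            with open(p, "wb") as f:
                f.write(b"row\n")
        cached_listing = [kept, gone]  # hypothetical stale metadata snapshot
        os.remove(gone)                # file removed while the table is modified
        return read_known_files(cached_listing), kept, gone
```

Note that skipping files this way returns results for only the files that still exist, so a real implementation would presumably want to log or report which files were ignored.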