Quanlong Huang created IMPALA-13303:
---------------------------------------

             Summary: File listing could still be recursive even if 
impala.disable.recursive.listing is true
                 Key: IMPALA-13303
                 URL: https://issues.apache.org/jira/browse/IMPALA-13303
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


During the development of IMPALA-13117, I found that the table property 
"impala.disable.recursive.listing" is not respected during the initial metadata 
loading, i.e. when the load is not a reload triggered by REFRESH or HMS events.

To reproduce the issue, rewrite this test statement from REFRESH to INVALIDATE 
METADATA:
https://github.com/apache/impala/blob/0a45cb5ae6d1345a7d531c22d174c99ea7cedea0/tests/metadata/test_recursive_listing.py#L126
The test should still pass but it actually fails.

A simpler way to reproduce the issue is:
{code:sql}
create table my_tbl (i int) stored as textfile;
describe formatted my_tbl; -- Get the table location, e.g.
-- hdfs://localhost:20500/test-warehouse/my_tbl
{code}
Upload 3 files to that table location: dir1/data.txt, dir2/data.txt, data.txt. 
Then alter the table property:
{code:sql}
alter table my_tbl set tblproperties('impala.disable.recursive.listing'='true');
refresh my_tbl;
show files in my_tbl;{code}
Only the last file, data.txt, should be shown in the results. The other two 
files are in subdirectories, so they should be ignored since recursive listing 
is disabled.
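
To make the expected result concrete, here is a minimal sketch that recreates 
the same three-file layout in a local temporary directory and lists it both 
ways. It only illustrates the expected semantics on a local filesystem; it is 
not Impala's file listing code, and the helper names list_files and 
make_table_layout are hypothetical.

```python
import os
import tempfile


def list_files(root, recursive):
    """Return file paths under root, relative to root, sorted."""
    if not recursive:
        # Only regular files directly under the table location.
        return sorted(e.name for e in os.scandir(root) if e.is_file())
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def make_table_layout(root):
    """Create the three files from the reproduction steps (local stand-in)."""
    for rel in ("dir1/data.txt", "dir2/data.txt", "data.txt"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write("1\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tbl_location:
        make_table_layout(tbl_location)
        # Expected with recursive listing disabled: only data.txt
        print(list_files(tbl_location, recursive=False))
        # Recursive listing also picks up the files in dir1 and dir2
        print(list_files(tbl_location, recursive=True))
```

The non-recursive call corresponds to what "show files" should return once 
the property is set to true.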

This feature was added in IMPALA-8454. Though it is rarely used in production, 
it would be nice to fix it.



