Quanlong Huang created IMPALA-13303:
---------------------------------------
Summary: File listing could still be recursive even if
impala.disable.recursive.listing is true
Key: IMPALA-13303
URL: https://issues.apache.org/jira/browse/IMPALA-13303
Project: IMPALA
Issue Type: Bug
Components: Catalog
Reporter: Quanlong Huang
Assignee: Quanlong Huang
During the development of IMPALA-13117, I found that the table property
"impala.disable.recursive.listing" is not respected during the initial metadata
loading, i.e. a load that is not a reload triggered by REFRESH or HMS events.
To reproduce the issue, rewrite this test statement from REFRESH to INVALIDATE
METADATA:
https://github.com/apache/impala/blob/0a45cb5ae6d1345a7d531c22d174c99ea7cedea0/tests/metadata/test_recursive_listing.py#L126
The test should still pass but it actually fails.
A simpler way to reproduce the issue is:
{code:sql}
create table my_tbl (i int) stored as textfile;
describe formatted my_tbl;
-- Get the table location, e.g. hdfs://localhost:20500/test-warehouse/my_tbl
{code}
Upload 3 files to that table location: dir1/data.txt, dir2/data.txt, data.txt.
Then alter the table property:
{code:sql}
alter table my_tbl set tblproperties('impala.disable.recursive.listing'='true');
refresh my_tbl;
show files in my_tbl;{code}
Only the last file, data.txt, should be shown in the results. The other two
files are in subdirectories, so they should be ignored since recursive listing
is disabled.
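The expected listing semantics can be illustrated with a small standalone sketch. This is not Impala's file-listing code; it only models the intended behaviour on a local temporary directory, and the helper names list_files and make_layout are hypothetical:

```python
import os
import tempfile


def list_files(table_dir, recursive):
    """Return file paths relative to table_dir, sorted.

    With recursive=False only files directly under table_dir are returned,
    which models the expected effect of impala.disable.recursive.listing=true.
    This is an illustrative model, not Impala's implementation.
    """
    found = []
    for root, dirs, files in os.walk(table_dir):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), table_dir))
        if not recursive:
            # Clearing dirs in place stops os.walk from descending further.
            dirs[:] = []
    return sorted(p.replace(os.sep, "/") for p in found)


def make_layout(table_dir):
    """Create the three files from the reproduction steps."""
    for rel in ("dir1/data.txt", "dir2/data.txt", "data.txt"):
        path = os.path.join(table_dir, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write("1\n")


with tempfile.TemporaryDirectory() as tbl:
    make_layout(tbl)
    non_recursive = list_files(tbl, recursive=False)
    recursive = list_files(tbl, recursive=True)
    print(non_recursive)
    print(recursive)
```

In this model, the non-recursive call returns only data.txt, while the recursive call also returns the two files under dir1 and dir2. The bug described here is that the initial metadata load behaves like the recursive case even when the property is set.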
This feature was added in IMPALA-8454. Though it is rarely used in production,
it'd be nice to fix it.