Stamatis Zampetakis created HIVE-27100:
------------------------------------------

             Summary: Remove unused data/files from repo
                 Key: HIVE-27100
                 URL: https://issues.apache.org/jira/browse/HIVE-27100
             Project: Hive
          Issue Type: Task
            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis


Some files under [https://github.com/apache/hive/tree/master/data/files] are not referenced anywhere else in the repo and can be removed.

Removing them makes it easier to see what is actually tested. Other minor benefits:
* faster checkout times;
* smaller source/binary releases.

The script that was used to find which files are not referenced can be found below:
{code:bash}
for f in `ls data/files`; do
  echo -n "$f "
  grep -a -R "$f" --exclude-dir=".git" --exclude-dir=target --exclude=\*.q.out --exclude=\*.class --exclude=\*.jar | wc -l | grep " 0$"
done
{code}

+Output+
{noformat}
cbo_t4.txt 0
cbo_t5.txt 0
cbo_t6.txt 0
compressed_4line_file1.csv.bz2 0
empty2.txt 0
filterCard.txt 0
fullouter_string_big_1a_old.txt 0
fullouter_string_small_1a_old.txt 0
futurama_episodes.avro 0
in9.txt 0
map_null_schema.avro 0
regex-path-2015-12-10_03.txt 0
regex-path-201512-10_03.txt 0
regex-path-2015121003.txt 0
sample.json 0
sample-queryplan-in-history.txt 0
sample-queryplan.txt 0
smbbucket_2.txt 0
smb_bucket_input.txt 0
SortDescCol1Col2.txt 0
SortDescCol2Col1.txt 0
sortdp.txt 0
srcsortbucket1outof4.txt 0
srcsortbucket2outof4.txt 0
srcsortbucket4outof4.txt 0
{noformat}
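
For anyone who wants to repeat the search, the same idea can be sketched as a small reusable function. This is not the script used for this ticket: the name find_unreferenced and its two arguments are hypothetical, and it assumes a grep that supports -R, --exclude and --exclude-dir (GNU and BSD grep both do). It matches file names literally with -F instead of as regular expressions, so its results may differ slightly from the list above.

```bash
#!/usr/bin/env bash
# Sketch only: print the names of files in a data directory that are not
# mentioned in any file under a search root.
# find_unreferenced is a hypothetical helper, not part of the Hive repo.
find_unreferenced() {
  local data_dir="$1"      # directory whose files are checked, e.g. data/files
  local root="${2:-.}"     # tree to search for references (default: current dir)
  local path name
  for path in "$data_dir"/*; do
    [ -f "$path" ] || continue
    name=$(basename "$path")
    # -a: treat binary files as text, -R: recurse, -F: literal match,
    # -q: only report via exit status, -e: safe for names starting with '-'
    if ! grep -a -R -F -q \
         --exclude-dir=.git --exclude-dir=target \
         --exclude='*.q.out' --exclude='*.class' --exclude='*.jar' \
         -e "$name" "$root"; then
      echo "$name"
    fi
  done
}
```

Run from the repository root as `find_unreferenced data/files`, it would print one candidate file name per line. Like the original script, it matches substrings, so a name that is contained in a longer referenced name counts as referenced.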