I have a large number of .txt files that are individually tarballed (i.e. compressed from .txt to .txt.tar.gz). (I received them like this.)
Each .txt file is actually a PSV (pipe-separated) file that, for some reason, was saved as .txt. I have created a custom 'txt' storage configuration, and I can query an uncompressed .txt file, structured as .psv, without a problem:

    select * from dfs.`/<my path here>/<myfile>_08072015.txt`;

When I try to query the compressed file, I get an error:

    0: jdbc:drill:zk=local> select * from dfs.`/<my path here>/<myfile>_08072015.txt.tar.gz`;
    Aug 20, 2015 10:43:01 AM org.apache.calcite.sql.validate.SqlValidatorException <init>
    SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 'dfs./<my path here>/<myfile>_08072015.txt.tar.gz' not found
    Aug 20, 2015 10:43:01 AM org.apache.calcite.runtime.CalciteException <init>
    SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line 1, column 17: Table 'dfs/<my path here>/<myfile>_08072015.txt.tar.gz' not found
    Error: PARSE ERROR: From line 1, column 15 to line 1, column 17: Table 'dfs./<my path here>/<myfile>_08072015.txt.tar.gz' not found
    [Error Id: fdba6efb-d256-437c-8355-ce09f087537c on 192.168.1.16:31010] (state=,code=0)
    0: jdbc:drill:zk=local>

Is it possible to fix this, or is Drill just not able to recognize a custom format inside a tar/gz compressed file? Please advise.
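
If it turns out that the tar wrapper is the obstacle, one possible workaround is to repack each tarball as a plain gzip file outside of Drill. The sketch below is only an illustration and rests on two assumptions that are not confirmed here: that Drill can read plain .gz files, and that it would pick up the 'txt' format from the extension in front of .gz. The function name repack_tarball and the output directory are hypothetical.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def repack_tarball(tar_path, out_dir):
    """Write each regular file inside a .tar.gz as its own plain .gz file.

    Hypothetical helper: returns the list of .gz paths it wrote.
    """
    tar_path = Path(tar_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(tar_path, "r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links and other non-file entries
            source = archive.extractfile(member)
            # Keep only the base name, so myfile.txt becomes myfile.txt.gz
            target = out_dir / (Path(member.name).name + ".gz")
            # Stream the data so large files are never held fully in memory
            with source, gzip.open(target, "wb") as dest:
                shutil.copyfileobj(source, dest)
            written.append(target)
    return written
```

Calling repack_tarball on every *.txt.tar.gz file in a directory would leave one .txt.gz file per original .txt file, which could then be tried in the same select statement as the uncompressed file. Whether Drill accepts the result still depends on the assumptions above.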
