Or, use _08072015.txt.gz instead of _08072015.txt.tar.gz as shown in http://drill.apache.org/docs/querying-plain-text-files/#query-the-gz-file-directly/
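
One caveat, in case the files are genuine tar archives rather than plain gzip files with a misleading name: renaming alone would leave the tar header in the decompressed stream, so each file may need to be repacked as a plain .gz first. A minimal sketch in Python follows, assuming every archive holds exactly one regular file. The function names `repack_tarball_as_gzip` and `repack_directory` are made up for illustration and are not part of Drill.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def repack_tarball_as_gzip(tar_path: Path) -> Path:
    """Rewrite <name>.txt.tar.gz as <name>.txt.gz next to the original.

    Assumption: the archive holds exactly one regular file (the .txt payload).
    """
    name = tar_path.name
    if not name.endswith(".tar.gz"):
        raise ValueError(f"not a .tar.gz file: {tar_path}")
    out_path = tar_path.with_name(name[: -len(".tar.gz")] + ".gz")

    with tarfile.open(tar_path, "r:gz") as archive:
        members = [m for m in archive.getmembers() if m.isfile()]
        if len(members) != 1:
            raise ValueError(
                f"expected one file in {tar_path}, found {len(members)}"
            )
        source = archive.extractfile(members[0])
        # Stream the payload straight into a plain gzip file, so the
        # uncompressed text never has to be written to disk.
        with source, gzip.open(out_path, "wb") as target:
            shutil.copyfileobj(source, target)
    return out_path


def repack_directory(directory: Path) -> list[Path]:
    """Repack every *.txt.tar.gz in a directory; originals are left in place."""
    return [
        repack_tarball_as_gzip(path)
        for path in sorted(directory.glob("*.txt.tar.gz"))
    ]
```

Calling `repack_directory(Path("/<my path here>"))` should leave a `<myfile>_08072015.txt.gz` beside each tarball, which can then be queried the same way as the uncompressed .txt, provided the custom 'txt' format is still registered in the storage configuration.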

Kristine Hahn
Sr. Technical Writer
415-497-8107 @krishahn skype:krishahn

On Thu, Aug 20, 2015 at 8:29 AM, Kristine Hahn <[email protected]> wrote:

> To be more exact re: the pointer to the doc, the renaming example is in
> step 3 of
> http://drill.apache.org/docs/querying-plain-text-files/#download-and-set-up-the-data
> .
>
> Kristine Hahn
> Sr. Technical Writer
> 415-497-8107 @krishahn skype:krishahn
>
>
> On Thu, Aug 20, 2015 at 8:27 AM, Kristine Hahn <[email protected]> wrote:
>
>> You need to rename the tar/gz to use tbl extension to query PSV if you
>> use the default dfs storage configuration. See TSV example on
>> http://drill.apache.org/docs/querying-plain-text-files/#example-of-querying-a-tsv-file
>> .
>>
>> Kristine Hahn
>> Sr. Technical Writer
>> 415-497-8107 @krishahn skype:krishahn
>>
>>
>> On Thu, Aug 20, 2015 at 7:52 AM, Edmon Begoli <[email protected]> wrote:
>>
>>> I have large number of .txt files that are individually tarballed (i.e.
>>> compressed from .txt to .txt.tar.gz).
>>> (I received them like this.)
>>>
>>> Each .txt file is actually a psv file, but, for some reason, it is saved
>>> as
>>> txt.
>>>
>>> I have created a custom 'txt' storage configuration and I can query
>>> uncompressed .txt, structured as .psv without a problem.
>>>
>>> select * from dfs.`/<my path here>/<myfile>_08072015.txt`;
>>>
>>> When I try to query compressed file I get an error:
>>>
>>> 0: jdbc:drill:zk=local> select * from dfs.`/<my path
>>> here>/<myfile>_08072015.txt.tar.gz`;
>>>
>>> Aug 20, 2015 10:43:01 AM
>>> org.apache.calcite.sql.validate.SqlValidatorException <init>
>>>
>>> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table
>>> 'dfs./<my
>>> path here>/<myfile>_08072015.txt.tar.gz' not found
>>>
>>> Aug 20, 2015 10:43:01 AM org.apache.calcite.runtime.CalciteException
>>> <init>
>>>
>>> SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1,
>>> column 15 to line 1, column 17: Table 'dfs/<my path
>>> here>/<myfile>_08072015.txt.tar.gz'
>>> not found
>>>
>>> Error: PARSE ERROR: From line 1, column 15 to line 1, column 17: Table
>>> 'dfs./<my path here>/<myfile>_08072015.txt.tar.gz' not found
>>>
>>> [Error Id: fdba6efb-d256-437c-8355-ce09f087537c on 192.168.1.16:31010]
>>> (state=,code=0)
>>>
>>> 0: jdbc:drill:zk=local>
>>>
>>>
>>> Is it possible to fix this, or is Drill just not able to recognize custom
>>> format inside the tar/gz compressed file?
>>>
>>> Please advise.
>>>
>>
>>
>
