Hi Vishal,

A pull request with the fix for DRILL-5733 <https://issues.apache.org/jira/browse/DRILL-5733> has been opened and will be merged soon.

Kind regards,
Volodymyr Vysotskyi

On Tue, Feb 4, 2020 at 11:11 PM Vishal Jadhav (BLOOMBERG/ 731 LEX) <vjad...@bloomberg.net> wrote:

> It works fine on my local file system, but fails on HDFS.
> Not sure, I am running into the issue mentioned here -
> https://issues.apache.org/jira/browse/DRILL-5733
>
> From: user@drill.apache.org At: 02/04/20 15:48:23 To: Vishal Jadhav
> (BLOOMBERG/ 731 LEX), user@drill.apache.org
> Subject: Re: Drill + parquet
>
> Please look into logs for more details.
> Not sure why you see these errors but Drill can perfectly query single files,
> subset of files and directories.
>
> select * from dfs.tmp.`*.parquet` limit 4;
> select * from dfs.tmp.`0_0_0.parquet`;
>
> Kind regards,
> Arina
>
> > On Feb 4, 2020, at 7:10 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
> >
> > as the error says .. it expects a directory to query
> > also the document has not been modified for more than 3 years so not sure
> > if it is up to date
> >
> > On Tue, Feb 4, 2020 at 10:30 PM Vishal Jadhav (BLOOMBERG/ 731 LEX) <
> > vjad...@bloomberg.net> wrote:
> >
> >> I was following the help pages from here.
> >> https://drill.apache.org/docs/querying-parquet-files/
> >> As per it, I can query an individual parquet file, why is it failing with
> >> the 'not a directory' error.
> >>
> >>
> >> From: user@drill.apache.org At: 02/04/20 11:28:25 To: Vishal Jadhav
> >> (BLOOMBERG/ 731 LEX), user@drill.apache.org
> >> Subject: Re: Drill + parquet
> >>
> >> Parquet is default file format for apache drill
> >> so you do not need to give a parquet file for a drill query. Instead give
> >> the folder path which contains the files.
> >>
> >> eg: select * from <hdfs_storage>.<workspace>.`folder1` will query all the
> >> parquet files in folder1
> >>
> >> On Tue, Feb 4, 2020 at 9:55 PM Vishal Jadhav (BLOOMBERG/ 731 LEX) <
> >> vjad...@bloomberg.net> wrote:
> >>
> >>> Hello Drillers,
> >>>
> >>> Need some help with the hdfs + parquet files.
> >>>
> >>> I have configured the HDFS storage with parquet & csv format plugins.
> >>>
> >>> I can query the - <hdfs_storage>.<csv_ws_name>.`*.csv` correctly. Also, I
> >>> have a similar directory structure for the parquet files (in a different
> >>> directory), But, not able to query it.
> >>>
> >>> Show files works fine.
> >>> (1) The following query works fine -
> >>> show files from <hdfs_storage>.<workspace>
> >>>
> >>> (2) select * from <hdfs_storage>.<workspace>.`*.parquet` limit 4
> >>> Fails with -
> >>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> >>> NoSuchElementException
> >>>
> >>> (3) select * from <hdfs_storage>.<workspace>.`xyz.parquet`;
> >>> fails with -
> >>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> >>> RemoteException:/path/xyz.parquet (is not a directory)
> >>>
> >>> Please let me know, if I am doing something wrong here.
> >>>
> >>> Thank you!
> >>> - Vishal
> >>
> >>
> >> --
> >> Nitin Pawar
> >>
> >
> > --
> > Nitin Pawar
> >