Question on querying parquet files

Johannes Zillmann Thu, 14 Apr 2016 05:53:28 -0700

Hey there,

Two more questions on querying Parquet files.


(1) Are files cached locally?
So far I have tested Drill/Parquet only with a local file system. If Drill loads a 
file from HDFS, how much overhead does that add? Does it load the file from 
HDFS for the first query only and then keep it cached locally, or does it 
touch HDFS for each query?

(2) Parallelization on a single Parquet file?
If I query a single file, does Drill split the workload across its 
drillbits, or does that only happen when querying multiple files?

Johannes
