Performance of Drill SQL for Hadoop when Drill is outside Hadoop cluster

Shashanka Kuntala Sat, 02 Jan 2016 12:39:57 -0800

I have a use case where hundreds of TB of data are stored in HDFS. Installing
Drill on all nodes of the HDFS cluster is not an option. If I have a separate
Apache Drill cluster (external to HDFS), how will Apache Drill SQL perform with
large data sets? Specifically, I would like to know whether Drill submits
MapReduce jobs on HDFS, or whether it extracts all data from the HDFS cluster
into the Drill cluster before applying filters/joins. Will Drill push SQL down
into HDFS?
