I have a use case where hundreds of terabytes of data are stored in HDFS. Installing Drill on every node of the HDFS cluster is not an option. If I run a separate Apache Drill cluster (external to HDFS), how will Drill SQL perform on large data sets? Specifically, I would like to know whether Drill submits MapReduce jobs on the HDFS cluster, or whether it pulls all the data from the HDFS cluster into the Drill cluster before applying filters and joins. Will Drill push SQL operations down into HDFS?
- Performance of Drill SQL for Hadoop when Drill is outsid... Shashanka Kuntala