Re: MS Windows: Hadoop binaries required to run drill?

2016-01-02 Thread Peder Jakobsen | gmail
Hi Jacques, yes, when I copy all these files over manually from my Linux machine, everything works as expected on Windows 7 32-bit. The ODBC drivers and Drill Explorer also work fine. So what do you think is causing some of these files not to be written on startup? I have permission on all folders.

Performance of Drill SQL for Hadoop when Drill is outside Hadoop cluster

2016-01-02 Thread Shashanka Kuntala
I have a use case where hundreds of TB of data are in HDFS. Installing Drill on all nodes of the HDFS cluster is not an option. If I have a separate Apache Drill cluster (external to HDFS), how will Apache Drill SQL perform with large data sets? Specifically I would like to know if Drill submits

Re: Performance of Drill SQL for Hadoop when Drill is outside Hadoop cluster

2016-01-02 Thread Ted Dunning
Tomer's answer was excellent, but he didn't address this issue. HDFS doesn't have enough smarts to allow pushdown of SQL predicates. The closest you can come is to use intelligent partitioning (your intelligence, not that of HDFS, btw). In that case Drill will avoid reading files that it can

Re: Performance of Drill SQL for Hadoop when Drill is outside Hadoop cluster

2016-01-02 Thread Jason Altekruse
Hi Shashanka, Drill does have the ability to avoid reading part of your data by using partitioning. This currently works best with partitioned Parquet files. Drill includes an auto-partitioning feature available for use with the CREATE TABLE AS statement that works when outputting to the parquet