Re: Query Using Stats

2014-05-16 Thread Edward Capriolo
Hive does not know that the values of column `seconds` and partition `range` are related. Hive can only use the WHERE clause to remove partitions that do not match the range criteria. The data inside a partition is not ordered in any way, so the minimum seconds and maximum seconds could be in

Re: Query Using Stats

2014-05-16 Thread Bryan Jeffrey
Prasanth, I had the correct flag enabled (see the query in the original email). The issue is that it does not appear to be using partition stats correctly for the calculation. The table is an ORC table. The log shows that stats are being calculated, but this does not appear to be working when queries are run a

Re: Query Using Stats

2014-05-16 Thread Prasanth Jayachandran
Bryan, the flag you are looking for is hive.compute.query.using.stats. By default this optimization is disabled, so you need to enable it to use it. Also, the min/max/sum metadata are not looked up from the file but from the metastore. Although file formats like ORC contain stats, they a
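
For reference, a minimal sketch of the setup described in this message, assuming a hypothetical ORC table named `events` that is partitioned by `range` and has a `seconds` column (the table name is not from the thread). Whether `FOR COLUMNS` accepts a partial partition spec depends on the Hive release, so treat the second ANALYZE statement as an assumption to verify against your version.

```sql
-- Enable answering simple aggregates from metastore statistics
-- (disabled by default, as noted above).
SET hive.compute.query.using.stats=true;

-- Populate basic statistics (row counts, sizes) for every partition.
-- `events` is a hypothetical table name used only for illustration.
ANALYZE TABLE events PARTITION(`range`) COMPUTE STATISTICS;

-- Populate column statistics (min/max and similar) in the metastore.
-- Assumption: older releases may require a full partition spec here.
ANALYZE TABLE events PARTITION(`range`) COMPUTE STATISTICS FOR COLUMNS;

-- A query of this shape is a candidate for being answered from stats,
-- provided the stats in the metastore are present and up to date.
SELECT min(seconds), max(seconds) FROM events;
```

If the final query still launches a full job, running it under EXPLAIN is one way to check whether the plan was reduced to a metadata-only fetch.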