Why does Hive still read so many records even with filter pushdown enabled, when the returned dataset is very small (about 4k rows out of 30 billion)?

Hive's "RECORDS_IN" counter still shows the full 30 billion count, and the MapReduce log contains output like this:

org.apache.hadoop.hive.ql.exec.MapOperator: MAP[4]: records read - 100000

BTW, I am using Parquet as the storage format, and the filter pushdown did take effect, as I can see this in the log:

AM INFO: parquet.filter2.compat.FilterCompat: Filtering using predicate: 
eq(myid, 223)
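
For context, I enabled pushdown with settings along these lines (a sketch; the exact property names and defaults depend on your Hive version, so please verify against your configuration):

```
-- Settings commonly used to enable predicate pushdown in Hive
SET hive.optimize.ppd=true;           -- enable predicate pushdown in the optimizer
SET hive.optimize.index.filter=true;  -- push predicates down into the storage format reader
```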


