Re: Why the filter push down does not reduce the read data record count

2018-02-24 Thread Sun, Keith
filter is not an easy thing and I will try that according to the deck. Thanks! From: Furcy Pin <pin.fu...@gmail.com> Sent: Friday, February 23, 2018 3:37:52 AM To: user@hive.apache.org Subject: Re: Why the filter push down does not reduce the read

Re: Why the filter push down does not reduce the read data record count

2018-02-23 Thread Furcy Pin
> *To:* user@hive.apache.org
> *Subject:* Re: Why the filter push down does not reduce the read data
> record count
>
> Hi,
>
> Unless your table is partitioned or bucketed by myid, Hive generally
> needs to read through all the records to find the records that match
> yo

Re: Why the filter push down does not reduce the read data record count

2018-02-23 Thread Sun, Keith
filter push down does not reduce the read data record count Hi, Unless your table is partitioned or bucketed by myid, Hive generally needs to read through all the records to find the records that match your predicate. In other words, Hive tables are generally not indexed for single record

Re: Why the filter push down does not reduce the read data record count

2018-02-23 Thread Furcy Pin
Hi, Unless your table is partitioned or bucketed by myid, Hive generally needs to read through all the records to find the records that match your predicate. In other words, Hive tables are generally not indexed for single record retrieval like you would expect RDBMS tables or Vertica tables to
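
The point made in this reply can be sketched with a toy model. The Python below is not Hive code and does not use any Hive API; the table contents, the function names, and the `records_in` counter (loosely modeled on the "RECORDS_IN" counter from the original question) are all illustrative assumptions. It only shows why a filter on an unpartitioned table still touches every record, while a table partitioned by `myid` lets the reader skip everything outside the matching partition.

```python
from collections import defaultdict

# Hypothetical toy table of (myid, payload) rows; sizes are illustrative only.
rows = [(i % 100, f"row-{i}") for i in range(10_000)]


def scan_unpartitioned(rows, wanted_id):
    """Full scan: every record is read, then the predicate is applied."""
    records_in = 0
    matches = []
    for myid, payload in rows:
        records_in += 1  # counted whether or not the row matches
        if myid == wanted_id:
            matches.append(payload)
    return records_in, matches


def partition_by_id(rows):
    """Group rows by myid, mimicking one storage location per partition value."""
    partitions = defaultdict(list)
    for myid, payload in rows:
        partitions[myid].append(payload)
    return partitions


def scan_partitioned(partitions, wanted_id):
    """Partition pruning: only the partition for wanted_id is read."""
    selected = partitions.get(wanted_id, [])
    return len(selected), list(selected)


full_in, full_out = scan_unpartitioned(rows, 42)
part_in, part_out = scan_partitioned(partition_by_id(rows), 42)

print("unpartitioned: read", full_in, "records, returned", len(full_out))
print("partitioned:   read", part_in, "records, returned", len(part_out))
```

Both scans return the same rows; only the number of records read differs, which is the quantity the original question is asking about.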

Why the filter push down does not reduce the read data record count

2018-02-23 Thread Sun, Keith
Hi, Why does Hive still read so many "records" even with filter pushdown enabled, when the returned dataset is very small (4k out of 30 billion records)? The "RECORDS_IN" counter of Hive still showed the 30 billion count, and so did the output in the MapReduce log, like this: