-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28417/
-----------------------------------------------------------
(Updated Nov. 25, 2014, 12:13 a.m.)
Review request for drill.
Changes
-------
If the Hive metadata reports a row count of 0, getScanStats() now uses the
inputSplit sizes to estimate the number of rows.
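
For reviewers, a minimal sketch of the fallback described above. It is not
the code in the diff: the class name, the method name and the
ASSUMED_BYTES_PER_ROW constant are hypothetical, and the split sizes are
passed as plain byte counts rather than read from Hadoop InputSplit objects.

```java
// Illustrative only: names and the bytes-per-row figure are assumptions,
// not taken from HiveScan.java.
final class SplitSizeEstimateSketch {
    // Hypothetical average record size used to turn bytes into rows.
    static final long ASSUMED_BYTES_PER_ROW = 1024L;

    /**
     * Returns the metadata row count when it is positive; otherwise
     * estimates the row count from the total size of the input splits.
     */
    static long estimateRows(long metadataRowCount, long[] splitLengthsInBytes) {
        if (metadataRowCount > 0) {
            return metadataRowCount; // trust the metastore statistics
        }
        long totalBytes = 0;
        for (long length : splitLengthsInBytes) {
            totalBytes += Math.max(0L, length); // ignore negative lengths
        }
        if (totalBytes == 0) {
            return 0; // nothing to scan
        }
        // Report at least one row for any non-empty input.
        return Math.max(1L, totalBytes / ASSUMED_BYTES_PER_ROW);
    }
}
```

With these assumptions, a table whose metadata reports 0 rows but whose
splits total 3072 bytes would be estimated at 3 rows.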
Bugs: DRILL-1742
https://issues.apache.org/jira/browse/DRILL-1742
Repository: drill-git
Description
-------
HiveScan.getSplits() already gets the table and partition metadata using
MetaStoreUtils.
We compute the total number of rows from the numRows property and store the
result in the rowCount attribute, which is later returned by
getScanStats().
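
A rough sketch of the summation described above, for illustration only. It
assumes each partition (or the table itself, when unpartitioned) exposes its
parameters as a Map<String, String>; the class and method names are
hypothetical, and values that are missing, unparsable or non-positive are
simply skipped.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: not the code in HiveScan.java.
final class RowCountSketch {
    // Property name mentioned in the description above.
    static final String NUM_ROWS = "numRows";

    /** Sums the numRows property over the given parameter maps. */
    static long totalRows(List<Map<String, String>> parameterMaps) {
        long total = 0;
        for (Map<String, String> params : parameterMaps) {
            String value = params.get(NUM_ROWS);
            if (value == null) {
                continue; // no statistics recorded for this partition
            }
            try {
                long rows = Long.parseLong(value.trim());
                if (rows > 0) {
                    total += rows; // skip zero or negative placeholders
                }
            } catch (NumberFormatException e) {
                // Unparsable statistic: treat it as unknown and skip it.
            }
        }
        return total;
    }
}
```

Because only the partitions that survive pruning would be passed in, the sum
reflects the partitions actually scanned rather than the whole table.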
Diffs (updated)
-----
contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScan.java
ddbc100
Diff: https://reviews.apache.org/r/28417/diff/
Testing (updated)
-------
Created several partitioned and non-partitioned tables and loaded data in Hive.
Used explain plan to check the number of rows when the whole table is queried
and also when specific partitions are queried (to make sure the row count takes
Hive partition pruning into account).
Thanks,
abdelhakim deneche