Github user gparai commented on a diff in the pull request:
https://github.com/apache/drill/pull/802#discussion_r108822233
--- Diff:
contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/BinaryTableGroupScan.java
---
@@ -115,8 +112,11 @@ private void init() {
try (Admin admin = formatPlugin.getConnection().getAdmin();
RegionLocator locator =
formatPlugin.getConnection().getRegionLocator(tableName)) {
hTableDesc = admin.getTableDescriptor(tableName);
- tableStats = new MapRDBTableStats(getHBaseConf(),
hbaseScanSpec.getTableName());
-
+ // Fetch rowCount only once and cache it in hbaseScanSpec.
+ if (hbaseScanSpec.getRowCount() == hbaseScanSpec.ROW_COUNT_UNKNOWN) {
+ MapRDBTableStats tableStats = new MapRDBTableStats(getHBaseConf(),
hbaseScanSpec.getTableName());
+ hbaseScanSpec.setRowCount(tableStats.getNumRows());
--- End diff --
This looks weird. We create the TableStats relying on some information from
the ScanSpec and then proceed to modify the same ScanSpec with the information
retrieved from TableStats. Please look at below comment as well regarding Spec
mutability.
Should we instead just overload the MapRDBTableStats constructor to allow
passing numRows - since that is what the existing constructor endup doing but
makes a call to DB Client? So instead of populating the ScanSpec we populate
the tableStats using this new constructor?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---