[
https://issues.apache.org/jira/browse/DRILL-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993814#comment-15993814
]
ASF GitHub Bot commented on DRILL-5429:
---------------------------------------
Github user gparai commented on a diff in the pull request:
https://github.com/apache/drill/pull/817#discussion_r114429838
--- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonTableGroupScan.java ---
@@ -100,30 +104,46 @@ public GroupScan clone(List<SchemaPath> columns) {
return newScan;
}
+ public JsonTableGroupScan clone(JsonScanSpec scanSpec) {
+ JsonTableGroupScan newScan = new JsonTableGroupScan(this);
+ newScan.scanSpec = scanSpec;
+ newScan.computeRegionsToScan();
+ return newScan;
+ }
+
+ private void computeRegionsToScan() {
+ boolean foundStartRegion = false;
+
+ regionsToScan = new TreeMap<TabletFragmentInfo, String>();
+ for (TabletInfo tabletInfo : tabletInfos) {
+ TabletInfoImpl tabletInfoImpl = (TabletInfoImpl) tabletInfo;
+ if (!foundStartRegion && !isNullOrEmpty(scanSpec.getStartRow()) && !tabletInfoImpl.containsRow(scanSpec.getStartRow())) {
+ continue;
+ }
+ foundStartRegion = true;
+ regionsToScan.put(new TabletFragmentInfo(tabletInfoImpl), tabletInfo.getLocations()[0]);
+ if (!isNullOrEmpty(scanSpec.getStopRow()) && tabletInfoImpl.containsRow(scanSpec.getStopRow())) {
+ break;
+ }
+ }
+ }
+
private void init() {
logger.debug("Getting tablet locations");
try {
Configuration conf = new Configuration();
- Table t = MapRDB.getTable(scanSpec.getTableName());
- TabletInfo[] tabletInfos = t.getTabletInfos(scanSpec.getCondition());
- tableStats = new MapRDBTableStats(conf, scanSpec.getTableName());
- boolean foundStartRegion = false;
- regionsToScan = new TreeMap<TabletFragmentInfo, String>();
+ // Fetch table and tabletInfo only once and cache.
+ table = MapRDB.getTable(scanSpec.getTableName());
+ tabletInfos = table.getTabletInfos(scanSpec.getCondition());
+
+ // Calculate totalRowCount for the table
for (TabletInfo tabletInfo : tabletInfos) {
- TabletInfoImpl tabletInfoImpl = (TabletInfoImpl) tabletInfo;
- if (!foundStartRegion
- && !isNullOrEmpty(scanSpec.getStartRow())
- && !tabletInfoImpl.containsRow(scanSpec.getStartRow())) {
- continue;
- }
- foundStartRegion = true;
- regionsToScan.put(new TabletFragmentInfo(tabletInfoImpl), tabletInfo.getLocations()[0]);
- if (!isNullOrEmpty(scanSpec.getStopRow())
- && tabletInfoImpl.containsRow(scanSpec.getStopRow())) {
- break;
- }
+ totalRowCount += tabletInfo.getEstimatedNumRows();
}
+
--- End diff --
Please add your explanation as a comment:
> We should recompute regionsToScan as it depends upon scanSpec
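
To illustrate why regionsToScan depends on scanSpec, here is a minimal, self-contained sketch of the selection loop from the diff. It is not the Drill code: the Tablet class, its string key bounds, and the List return type are hypothetical stand-ins for TabletInfoImpl and the TreeMap used in the patch, and the inclusive-start, exclusive-stop key range is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mirrors the start/stop-row loop in computeRegionsToScan(). */
final class RegionSelectionSketch {

  /** Hypothetical stand-in for TabletInfoImpl; covers keys in [startKey, stopKey). */
  static final class Tablet {
    final String startKey; // inclusive; "" means unbounded (assumption)
    final String stopKey;  // exclusive; "" means unbounded (assumption)

    Tablet(String startKey, String stopKey) {
      this.startKey = startKey;
      this.stopKey = stopKey;
    }

    boolean containsRow(String row) {
      boolean afterStart = startKey.isEmpty() || row.compareTo(startKey) >= 0;
      boolean beforeStop = stopKey.isEmpty() || row.compareTo(stopKey) < 0;
      return afterStart && beforeStop;
    }
  }

  static boolean isNullOrEmpty(String s) {
    return s == null || s.isEmpty();
  }

  /** Walks tablets in key order and keeps those between the start and stop rows. */
  static List<Tablet> regionsToScan(List<Tablet> tablets, String startRow, String stopRow) {
    List<Tablet> selected = new ArrayList<>();
    boolean foundStartRegion = false;
    for (Tablet tablet : tablets) {
      // Skip tablets until the one holding the start row is reached.
      if (!foundStartRegion && !isNullOrEmpty(startRow) && !tablet.containsRow(startRow)) {
        continue;
      }
      foundStartRegion = true;
      selected.add(tablet);
      // Stop once the tablet holding the stop row has been added.
      if (!isNullOrEmpty(stopRow) && tablet.containsRow(stopRow)) {
        break;
      }
    }
    return selected;
  }
}
```

With the same tablet list, a different start or stop row yields a different selection, which is why a clone that receives a new scanSpec has to rerun the loop rather than reuse the previous result.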
> Improve query performance for MapR DB JSON Tables
> -------------------------------------------------
>
> Key: DRILL-5429
> URL: https://issues.apache.org/jira/browse/DRILL-5429
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization, Storage - MapRDB
> Affects Versions: 1.10.0
> Reporter: Padma Penumarthy
> Assignee: Padma Penumarthy
> Fix For: 1.11.0
>
>
> For MapR DB JSON Tables, cache (per query) and reuse table and tabletInfo,
> instead of fetching the same information multiple times from the DB server.
> Also, getting tableStats is an expensive operation. We can avoid doing that
> and get the total rowCount from tabletInfo instead.
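
The idea in the description can be sketched as follows. This is a simplified illustration under stated assumptions, not the patch itself: CachedScanSketch, TabletStats, the Supplier-based fetcher, and cloneWith are hypothetical names, and only getEstimatedNumRows() is taken from the diff above.

```java
import java.util.List;
import java.util.function.Supplier;

/** Illustrative only: fetch tablet metadata once, then reuse it across clones. */
final class CachedScanSketch {

  /** Hypothetical stand-in for TabletInfo; only the row estimate is modeled. */
  interface TabletStats {
    long getEstimatedNumRows();
  }

  private final List<TabletStats> tabletInfos; // cached for the life of the query
  private final long totalRowCount;            // derived from the cache, no stats call
  private final String startRow;

  /** The fetcher represents the expensive round trip to the DB server. */
  CachedScanSketch(Supplier<List<TabletStats>> fetcher, String startRow) {
    this.tabletInfos = fetcher.get();
    long total = 0;
    for (TabletStats tablet : tabletInfos) {
      total += tablet.getEstimatedNumRows();
    }
    this.totalRowCount = total;
    this.startRow = startRow;
  }

  /** Copy constructor: shares the cached metadata instead of fetching again. */
  private CachedScanSketch(CachedScanSketch that, String startRow) {
    this.tabletInfos = that.tabletInfos;
    this.totalRowCount = that.totalRowCount;
    this.startRow = startRow;
  }

  CachedScanSketch cloneWith(String newStartRow) {
    return new CachedScanSketch(this, newStartRow);
  }

  long getTotalRowCount() {
    return totalRowCount;
  }

  String getStartRow() {
    return startRow;
  }
}
```

In this sketch, cloning never invokes the fetcher, so the number of server round trips stays at one no matter how many times the planner copies the scan.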