[
https://issues.apache.org/jira/browse/IMPALA-12896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826895#comment-17826895
]
Wenzhe Zhou commented on IMPALA-12896:
--------------------------------------
Planner.checkForSmallQueryOptimization() uses MaxRowsProcessedVisitor to compute
maxRowsProcessed_ from the nodes in the plan tree. For a DataSourceScanNode,
numRows (caller.getInputCardinality()) equals 0, the table has no stats, and
most queries have no simple 'limit', so MaxRowsProcessedVisitor.visit() sets
valid_ to false. This causes the Planner to create a distributed plan for
queries on JDBC tables. The merged patch changes MaxRowsProcessedVisitor.visit()
to estimate numRows as 0 for DataSourceScanNode. If all the scan nodes are
DataSourceScanNodes, maxRowsProcessed_ is then determined by the non-scan
nodes, which makes the Planner more likely to create a single-fragment plan.

Should we create a distributed plan for queries that join multiple JDBC
tables?
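
The visitor behavior described in this comment can be sketched roughly as
follows. This is a simplified illustration, not Impala's actual code:
NodeSketch, MaxRowsSketch, and all of their fields and methods are
hypothetical stand-ins, and the real MaxRowsProcessedVisitor handles more
cases than are modeled here.

```java
// Hypothetical, simplified model of a plan node; not Impala's real PlanNode.
class NodeSketch {
    final boolean isScan;            // true for any scan node
    final boolean isDataSourceScan;  // true for a scan over a JDBC table
    final boolean hasStats;          // table stats are available
    final boolean hasSimpleLimit;    // query has a simple 'limit'
    final long inputCardinality;     // rows the node is expected to process

    NodeSketch(boolean isScan, boolean isDataSourceScan, boolean hasStats,
               boolean hasSimpleLimit, long inputCardinality) {
        this.isScan = isScan;
        this.isDataSourceScan = isDataSourceScan;
        this.hasStats = hasStats;
        this.hasSimpleLimit = hasSimpleLimit;
        this.inputCardinality = inputCardinality;
    }
}

// Hypothetical stand-in for the visitor, modeling only what the comment describes.
class MaxRowsSketch {
    private boolean valid = true;
    private long maxRowsProcessed = 0;

    void visit(NodeSketch node) {
        if (!valid) return;
        long numRows = node.inputCardinality;
        if (node.isDataSourceScan) {
            // Patched behavior as described: count the JDBC scan as 0 rows
            // instead of invalidating the estimate.
            numRows = 0;
        } else if (node.isScan && !node.hasStats && !node.hasSimpleLimit) {
            // No stats and no simple limit: the estimate cannot be trusted.
            valid = false;
            return;
        }
        maxRowsProcessed = Math.max(maxRowsProcessed, numRows);
    }

    boolean isValid() { return valid; }

    long getMaxRowsProcessed() { return maxRowsProcessed; }

    // Visits every node in order and returns the populated sketch.
    static MaxRowsSketch walk(NodeSketch... nodes) {
        MaxRowsSketch visitor = new MaxRowsSketch();
        for (NodeSketch node : nodes) visitor.visit(node);
        return visitor;
    }
}
```

In this sketch, a plan whose only scan is a DataSourceScanNode stays valid and
its maximum comes from the non-scan nodes, while a stats-less scan of another
kind with no simple limit still invalidates the estimate.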
> Avoid JDBC table to be set as transactional table
> -------------------------------------------------
>
> Key: IMPALA-12896
> URL: https://issues.apache.org/jira/browse/IMPALA-12896
> Project: IMPALA
> Issue Type: Sub-task
> Components: Frontend
> Reporter: Wenzhe Zhou
> Assignee: Wenzhe Zhou
> Priority: Major
> Fix For: Impala 4.4.0
>
>
> Found the following issues in downstream integration.
> 1) JDBC tables created in some deployment environments were set as
> transactional tables by default. This caused catalogd to fail to load the
> metadata for JDBC tables. We have to explicitly set the table property
> "transactional=false" for JDBC tables.
> 2) The FileSystemUtil.copyFileFromUriToLocal() function wrote a log message
> only for IOException. We should write a log message for all types of
> exceptions so that we can capture the errors that cause failures to load
> JDBC drivers.
> 3) Operations on JDBC tables are processed only on the coordinator. The
> planner should estimate the processed rows as 0 for DataSourceScanNode so
> that coordinator-only query plans are generated for simple queries on JDBC
> tables and those queries can be executed without invoking executor nodes.
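
Item 2 above asks for logging on every exception type rather than only
IOException. A minimal sketch of that pattern is shown below, assuming a
generic copy action. CopyLoggingSketch and runLogged are hypothetical names,
and this is not the actual FileSystemUtil.copyFileFromUriToLocal()
implementation.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper illustrating "log all exception types, then rethrow".
class CopyLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(CopyLoggingSketch.class.getName());

    // Runs the given copy action. Any exception, not just IOException,
    // is logged with the source URI before being rethrown to the caller.
    static <T> T runLogged(String uri, Callable<T> copyAction) throws Exception {
        try {
            return copyAction.call();
        } catch (Exception e) {
            LOG.log(Level.SEVERE, "Failed to copy " + uri + " to local path", e);
            throw e;
        }
    }
}
```

Catching the broad Exception type here means that runtime failures (for
example an illegal argument or a security error) also leave a log entry,
which is the gap item 2 describes.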