[
https://issues.apache.org/jira/browse/TRAFODION-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109165#comment-16109165
]
Hans Zeller commented on TRAFODION-2700:
----------------------------------------
You can try the test case; it produces a parallel plan. The case you describe
is more realistic, and it matches what we saw in the actual customer
scenario.
> Query that selects only a single salt value gets parallel plan
> --------------------------------------------------------------
>
> Key: TRAFODION-2700
> URL: https://issues.apache.org/jira/browse/TRAFODION-2700
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-cmp
> Affects Versions: 2.1-incubating
> Environment: any
> Reporter: Hans Zeller
> Assignee: Hans Zeller
>
> For some queries we saw parallel plans where the parallelism didn't really
> help, because the WHERE predicate selected only a single salt value. The
> overhead isn't huge, but it can add up when executing many such queries.
> Example:
> create table ts(a integer not null primary key, b char(2000)) salt using 4
> partitions;
> explain select count(*) from ts <<+ cardinality 1e7>> where a = 1;
> The problem, I think, is in method
> SimpleFileScanOptimizer::scmComputeCostVectorsForHbase(), file
> core/sql/optimizer/ScmCostMethod.cpp. This computes separate degrees of
> parallelism for the region server side and the client side and scales the
> costs incurred on each side separately.
> However, if there are more ESPs (clients) than regions, some ESPs have
> nothing to do, limiting the parallelism. On the other hand, if there are more
> regions than ESPs, each ESP reads regions sequentially, so that limits the
> DoP on the region server side.
> Therefore, my suggested fix is to use the minimum of those two DoPs to
> compute the cost.
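
The suggested fix above can be illustrated with a minimal sketch. This is not the
actual code in ScmCostMethod.cpp; the function names effectiveScanDoP and
scaledCost, their parameters, and the simple "divide the serial cost by the DoP"
model are all assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not a Trafodion function): the usable degree of
// parallelism for an HBase scan is bounded by both the number of ESPs
// (clients) and the number of regions the scan actually touches.
inline int effectiveScanDoP(int numEsps, int numActiveRegions)
{
  // Clamp to at least 1 so the result is always a valid divisor.
  return std::max(1, std::min(numEsps, numActiveRegions));
}

// Illustrative cost scaling only: divide a serial cost by the effective DoP
// instead of scaling client-side and region-server-side costs separately.
inline double scaledCost(double serialCost, int numEsps, int numActiveRegions)
{
  assert(serialCost >= 0.0);
  return serialCost / effectiveScanDoP(numEsps, numActiveRegions);
}
```

With 4 ESPs and a predicate that selects a single salt value (one active
region), effectiveScanDoP(4, 1) is 1, so the cost is not reduced by
parallelism and a serial plan is no longer penalized relative to a parallel one.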