Hans Zeller created TRAFODION-2700:
--------------------------------------
Summary: Query that selects only a single salt value gets parallel plan
Key: TRAFODION-2700
URL: https://issues.apache.org/jira/browse/TRAFODION-2700
Project: Apache Trafodion
Issue Type: Bug
Components: sql-cmp
Affects Versions: 2.1-incubating
Environment: any
Reporter: Hans Zeller
Assignee: Hans Zeller
For some queries we saw parallel plans where the parallelism didn't really
help, because the WHERE predicate selected only a single salt value. The
overhead isn't huge, but it can add up when executing many such queries.
Example:
create table ts(a integer not null primary key, b char(2000)) salt using 4
partitions;
explain select count(*) from ts <<+ cardinality 1e7>> where a = 1;
The problem, I think, is in method
SimpleFileScanOptimizer::scmComputeCostVectorsForHbase(), file
core/sql/optimizer/ScmCostMethod.cpp. This computes separate degrees of
parallelism for the region server side and the client side and scales the costs
incurred on each side separately.
However, if there are more ESPs (clients) than regions, some ESPs have nothing
to do, limiting the parallelism. On the other hand, if there are more regions
than ESPs, each ESP reads regions sequentially, so that limits the DoP on the
region server side.
Therefore, my suggested fix is to use the minimum of those two DoPs to compute
the cost.
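
To illustrate the idea, here is a minimal sketch of the suggested scaling. It is not the actual code in ScmCostMethod.cpp; the helper names effectiveDoP and scaledCost and their parameters are hypothetical, and the real cost vectors are more involved than a single number.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not a Trafodion API): the usable degree of
// parallelism is bounded by both the number of regions (or salt
// partitions) actually touched and the number of ESPs reading them.
inline int effectiveDoP(int numActiveRegions, int numEsps) {
    assert(numActiveRegions >= 0 && numEsps >= 0);
    // Never go below 1, so a serial plan is costed as DoP 1.
    return std::max(1, std::min(numActiveRegions, numEsps));
}

// Hypothetical helper: scale a serial cost by the effective DoP on
// both the region server side and the client side, instead of using
// a separate DoP for each side.
inline double scaledCost(double serialCost, int numActiveRegions, int numEsps) {
    return serialCost / effectiveDoP(numActiveRegions, numEsps);
}
```

For the example above, a predicate that selects a single salt value touches one partition, so with 4 ESPs effectiveDoP(1, 4) is 1 and the parallel plan gets no cost reduction over the serial one.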