[ 
https://issues.apache.org/jira/browse/TRAFODION-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109252#comment-16109252
 ] 

Hans Zeller commented on TRAFODION-2700:
----------------------------------------

You are right, the example shown in this case is not good, since it's a unique 
access, and the cardinality hint contradicts that.

The code does an analysis of the partition key predicates and determines 
correctly that only one region is involved (for unique or non-unique queries). 
The DoP on the client side (# of ESPs), however, does not depend on the 
predicate analysis, so it is simply a number derived from the cardinality.

> Query that selects only a single salt value gets parallel plan
> --------------------------------------------------------------
>
>                 Key: TRAFODION-2700
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2700
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: 2.1-incubating
>         Environment: any
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>
> For some queries we saw parallel plans where the parallelism didn't really 
> help, because the WHERE predicate selected only a single salt value. The 
> overhead isn't huge, but it can add up when executing many such queries.
> Example:
> create table ts(a integer not null primary key, b char(2000)) salt using 4 
> partitions;
> explain select count(*) from ts <<+ cardinality 1e7>> where a = 1;
> The problem, I think, is in method 
> SimpleFileScanOptimizer::scmComputeCostVectorsForHbase(), file 
> core/sql/optimizer/ScmCostMethod.cpp. This computes separate degrees of 
> parallelism for the region server side and the client side and scales the 
> costs incurred on each side separately.
> However, if there are more ESPs (clients) than regions, some ESPs have 
> nothing to do, limiting the parallelism. On the other hand, if there are more 
> regions than ESPs, each ESP reads regions sequentially, so that limits the 
> DoP on the region server side.
> Therefore, my suggested fix is to use the minimum of those two DoPs to 
> compute the cost.
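
The suggested fix quoted above can be sketched roughly as follows. This is only 
an illustration of the "minimum of the two DoPs" idea, not the actual code in 
ScmCostMethod.cpp: the function names effectiveDoP and costPerStream, their 
parameters, and the even division of cost across streams are all assumptions 
made for this example.

```cpp
#include <algorithm>

// Hypothetical helper, not taken from ScmCostMethod.cpp.
// numEsps:          client-side DoP (number of ESPs)
// numActiveRegions: region-server-side DoP, i.e. the number of regions
//                   that the partition key predicates actually select
inline int effectiveDoP(int numEsps, int numActiveRegions) {
    // Extra ESPs beyond the number of regions have nothing to do, and
    // extra regions beyond the number of ESPs are read sequentially,
    // so the smaller of the two limits the useful parallelism.
    // Clamp to 1 so that callers never divide by zero.
    return std::max(1, std::min(numEsps, numActiveRegions));
}

// Hypothetical cost scaling: assumes the total cost is spread evenly
// over the effective number of parallel streams.
inline double costPerStream(double totalCost, int numEsps, int numActiveRegions) {
    return totalCost / effectiveDoP(numEsps, numActiveRegions);
}
```

For the example table with 4 salt partitions and the predicate a = 1, only one 
region is involved, so effectiveDoP(4, 1) yields 1 and a parallel plan gets no 
cost advantage over a serial one.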



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
