Hans Zeller created TRAFODION-2700:
--------------------------------------

             Summary: Query that selects only a single salt value gets parallel plan
                 Key: TRAFODION-2700
                 URL: https://issues.apache.org/jira/browse/TRAFODION-2700
             Project: Apache Trafodion
          Issue Type: Bug
          Components: sql-cmp
    Affects Versions: 2.1-incubating
         Environment: any
            Reporter: Hans Zeller
            Assignee: Hans Zeller


For some queries we saw parallel plans where the parallelism didn't really 
help, because the WHERE predicate selected only a single salt value. The 
overhead isn't huge, but it can add up when executing many such queries.

Example:

create table ts(a integer not null primary key, b char(2000)) salt using 4 
partitions;
explain  select count(*) from ts <<+ cardinality 1e7>> where a =1;

The problem, I think, is in method 
SimpleFileScanOptimizer::scmComputeCostVectorsForHbase(), file 
core/sql/optimizer/ScmCostMethod.cpp. This computes separate degrees of 
parallelism for the region server side and the client side and scales the costs 
incurred on each side separately.

However, if there are more ESPs (clients) than regions, some ESPs have nothing 
to do, limiting the parallelism. On the other hand, if there are more regions 
than ESPs, each ESP reads regions sequentially, so that limits the DoP on the 
region server side.

Therefore, my suggested fix is to use the minimum of those two DoPs to compute 
the cost.
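
A minimal sketch of the suggested fix, for illustration only. The helper names (effectiveDop, scaledCost) and their signatures are hypothetical and are not taken from ScmCostMethod.cpp. Dividing a serial cost by the DoP is also an assumption about how the scaling works.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not actual Trafodion code.
// numActiveRegions: regions (salt partitions) the predicate can touch.
// numEsps:          ESPs (client processes) in the candidate plan.
// With more ESPs than regions the extra ESPs sit idle; with more regions
// than ESPs each ESP reads its regions sequentially. Either way the
// usable parallelism is the smaller of the two numbers.
inline int effectiveDop(int numActiveRegions, int numEsps)
{
  assert(numActiveRegions >= 1 && numEsps >= 1);
  return std::min(numActiveRegions, numEsps);
}

// Assumed cost scaling: divide the serial cost by the effective DoP,
// using the same DoP for the region server side and the client side.
inline double scaledCost(double serialCost, int numActiveRegions, int numEsps)
{
  return serialCost / effectiveDop(numActiveRegions, numEsps);
}
```

For the example above, "a = 1" selects one salt value, so effectiveDop(1, 4) is 1 and a 4-way parallel plan gets no cost benefit over a serial one.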



