[ 
https://issues.apache.org/jira/browse/TRAFODION-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706320#comment-14706320
 ] 

Atanu Mishra commented on TRAFODION-237:
----------------------------------------

Ravisha Neelakanthappa (ravisha-neelakanthappa) wrote on 2014-05-05:    #1
IThere are three problems :
1. Optimizer doesn't choose MDAM plan for range predicates on key column.

For Range predicate, default plan for SALT table should be MDAM plan. This plan 
was not being considered because checkMDAMadditionalRestriction() method was 
deciding MDAM plan didn't make sense. I have added a logic if the leading key 
colum is salted column, then ignore missing predicate on _SALT_ and consider 
MDAM plan.

2. Explain info of MDAM plan doesn't display disjuncts information
explain code FileScan::addLocalExpr() of RelExpr.cpp, thinks plan is subset 
scan even though MDAM key is not NULL. The reason being setting up both subset 
scan key and mdam key in HbaseAccess object. If the partitioing function is 
Range, HbaseAccess::preCodeGen() creates searchKey, since the first check is 
"if (getSearchKeyPtr() != NULL)", we think it's a subset scan. The correct way 
should be to check getMdamKeyPtr() first as it is being done in many places in 
generator code – this what I have done.

3. Parallel MDAM plan with range partitioning function return every row twice.
Actually user should get 50 rows, but we get 100 rows, 50 from each range 
partition. If Dop is 4 range parts, then we would get 200 rows.

>>prepare xx from
select count(*)
from shb1
where shb1.uniq < 50
for read uncommitted access;+>+>+>+>

--- SQL command prepared.
>>explain options 'f' xx;

LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------

4 . 5 root 1.00E+000
3 . 4 sort_partial_aggr_ro 1.00E+000
2 . 3 esp_exchange 1:2(range) 1.00E+000
1 . 2 sort_partial_aggr_le 1.00E+000
. . 1 trafodion_scan SHB1 5.00E+001

--- SQL operation complete.
>>set statistics on;
>>execute xx;

(EXPR)
--------------------

                 100

--- 1 row(s) selected.

Start Time 2014/05/05 14:29:01.384941
End Time 2014/05/05 14:29:02.768792
Elapsed Time 00:00:01.383851
Compile Time 00:00:05.954896
Execution Time 00:00:01.383851

Table Name Records Records Hdfs Hdfs I/O Hdfs Access
                      Accessed Used I/Os Bytes Time(usec)
TRAFODION.HBASE.SHB1
                           100 100 0 0 215341

--- SQL operation complete.
>>

Hans is going to fix the third problem

Ravisha Neelakanthappa (ravisha-neelakanthappa) wrote on 2014-05-15:    #2
I have fixed incorrect results issue by adding partition key predicates as 
additional mdam disjuncts. This happens at preCodeGen phase. With this change, 
I get correct results i.e 50 rows
>>execute xx;

(EXPR)
--------------------

                  50

--- 1 row(s) selected.

Start Time 2014/05/06 23:09:07.880913
End Time 2014/05/06 23:09:09.365614
Elapsed Time 00:00:01.484701
Compile Time 00:00:06.210960
Execution Time 00:00:01.484701

Table Name Records Records Hdfs Hdfs I/O Hdfs Access
                      Accessed Used I/Os Bytes Time(usec)
TRAFODION.HBASE.SHB1
                            50 50 0 0 190726

--- SQL operation complete

Changed in trafodion:
status: In Progress → Fix Committed
Stacey Johnson (sjohnson-w) on 2014-06-10
information type:       Proprietary → Public
Julie Thai (julie-y-thai) wrote on 2014-08-12:  #3
Verified on cluster, build 20140730_0830.

Changed in trafodion:
status: Fix Committed → Fix Released


> LP Bug: 1311871 - compiler doesn't set begin and end key for salted tables
> --------------------------------------------------------------------------
>
>                 Key: TRAFODION-237
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-237
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Ravisha Neelakanthappa
>            Assignee: Ravisha Neelakanthappa
>            Priority: Critical
>
> The compiler doesn't set begin and end key on salted table. Because of this, 
> scan on Habase tables with range predicates on key columns reads all the 
> rows. Same problem exist for both serial plan (with single partition 
> function) and parallel plan with range partition function.
> To reproduce :
> set schema trafodion.hbase;
> drop table shb2;
> create table shb2
>   (uniq int not null,
>    c10K int ,
>    c1K   int,
>    c100  int,
>    c10   int,
>    c1    int,
>    primary key (uniq)
>   ) salt using 4 partitions ;
> upsert with no rollback
> into shb2
> select
> 0 + (10000 * x10000) + (1000 * x1000) + (100 * x100) + (10 * x10) +( 1 * x1),
> 0 + (1000 * x1000) + (100 * x100) + (10 * x10) + (1 * x1),
> 0 + (100 * x100) + (10 * x10) + (1 * x1),
> 0 + (10 * x10) + (1 * x1),
> 0 + (1 * x1),
> 0
> from (values(0))t
> transpose 0,1,2,3,4,5,6,7,8,9 as x10000
> transpose 0,1,2,3,4,5,6,7,8,9 as x1000
> transpose 0,1,2,3,4,5,6,7,8,9 as x100
> transpose 0,1,2,3,4,5,6,7,8,9 as x10
> transpose 0,1,2,3,4,5,6,7,8,9 as x1
> ;
> update statistics for table shb2 on every column;
> prepare xx from 
> select count(*), uniq, c10K, c1K, c100, c10, c1
> from shb1 
> where shb1.uniq < 50 
> group by uniq, c10K, c1K, C100, c10, c1
> for read uncommitted access;
> >>explain options 'f' xx;
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
> ---- ---- ---- --------------------  --------  --------------------  ---------
> 3    .    4    root                                                  1.00E+000
> 2    .    3    firstn                                                1.00E+000
> 1    .    2    hash_groupby                                          5.00E+001
> .    .    1    trafodion_scan                  SHB1                  5.00E+001
> --- SQL operation complete.
> TRAFODION_SCAN ============================  SEQ_NO 1        NO CHILDREN
> TABLE_NAME ............... SHB1
> REQUESTS_IN .............. 1
> ROWS_OUT ................ 50
> EST_OPER_COST ............ 0.03
> EST_TOTAL_COST ........... 0.03
> DESCRIPTION
>   max_card_est .......... 50
>   fragment_id ............ 0
>   parent_frag ............ (none)
>   fragment_type .......... master
>   scan_type .............. subset scan of table TRAFODION.HBASE.SHB1
>   columns ................ all
>   begin_keys(incl)
>   end_keys(incl)
>   key_columns ............ _SALT_, UNIQ
>   executor_predicates .... (UNIQ < 50)
>   part_key_predicates .... (UNIQ < 50)
> >> execute xx;
> --- 50 row(s) selected.
> Start Time             2014/04/16 15:32:38.308978
> End Time               2014/04/16 15:32:45.236699
> Elapsed Time                      00:00:06.927721
> Compile Time                      00:00:00.093071
> Execution Time                    00:00:06.927721
> Table Name             Records        Records      Hdfs       Hdfs I/O    
> Hdfs Access
>                       Accessed           Used      I/Os          Bytes     
> Time(usec)
> TRAFODION.HBASE.SHB1
>                         100000             50         0              0        
> 6923951
> --- SQL operation complete.
> -- with Range partition plan
> >>explain options 'f' xx;
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
> ---- ---- ---- --------------------  --------  --------------------  ---------
> 4    .    5    root                                                  5.00E+001
> 3    .    4    esp_exchange                    1:12(hash2)           5.00E+001
> 2    .    3    hash_groupby                                          5.00E+001
> 1    .    2    esp_exchange                    12(hash2):2(range)    5.00E+001
> .    .    1    trafodion_scan                  SHB1                  5.00E+001
> --- SQL operation complete.
> TRAFODION_SCAN ============================  SEQ_NO 1        NO CHILDREN
> TABLE_NAME ............... SHB1
> REQUESTS_IN .............. 1
> ROWS_OUT ................ 50
> EST_OPER_COST ............ 0.03
> EST_TOTAL_COST ........... 0.03
> DESCRIPTION
>   max_card_est .......... 50
>   fragment_id ............ 3
>   parent_frag ............ 2
>   fragment_type .......... esp
>   scan_type .............. subset scan of table TRAFODION.HBASE.SHB1
>   key_columns ............ _SALT_, UNIQ
>   executor_predicates .... (UNIQ < 50)
>   part_key_predicates .... (UNIQ < 50)
>   begin_key .............. (_SALT_ = \:_sys_HostVarLo0),
>                              (UNIQ = \:_sys_HostVarLo1)
>   end_key ................ (_SALT_ = \:_sys_HostVarHi0),
>                              (UNIQ = \:_sys_HostVarHi1)
> --- SQL operation complete.
> >>
> >> execute xx;
> --- 50 row(s) selected.
> Start Time             2014/04/23 20:32:37.949880
> End Time               2014/04/23 20:32:42.787625
> Elapsed Time                      00:00:04.837745
> Compile Time                      00:00:00.112411
> Execution Time                    00:00:04.837745
> Table Name             Records        Records      Hdfs       Hdfs I/O    
> Hdfs Access
>                       Accessed           Used      I/Os          Bytes     
> Time(usec)
> TRAFODION.HBASE.SHB1
>                         100000             50         0              0        
> 4485532
> --- SQL operation complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to