Hi Eric, I don't know. But I have a first guess, and that is that getRepeatCountForOperatorsInDP2() is stubbed in Trafodion, since the DP2 component refers to the predecessor product and is no longer present.
Dave -----Original Message----- From: Eric Owhadi [mailto:[email protected]] Sent: Tuesday, March 15, 2016 3:34 PM To: [email protected] Subject: something weird about MDAM compiler costing: isMultipleProbe() returning false... In an attempt to implement small scanner optimization for MDAM scan, I investigated the compiler costing of MDAM plan, with the hope of finding out there, how many scan splits and probes would a particular MDAM plan result in. The immediate result would have been that I could easily guestimate the size of each scan and determine if it is below the Hbase Block size and therefore turn on or off small scanner accordingly. I have measured a 1.39X boost in performance in MDAM scan for situation where scans are good fit for small scanner… so I am motivated to get this feature in J So I wrote a simple test: create table t132helper (a int not null, primary key(a)); insert into t132helper values(1); create table t132 (k1 int not null, k2 int not null, a int not null, b int not null, c char(1000), primary key (k1,k2)) ATTRIBUTES ALIGNED FORMAT ; upsert using load into t132 select x1000*1000+ x10000*10000 + x100000*100000 , x1+x10*10+x100*100, x1+x10*10+x100*100+ x1000*1000+ x10000*10000 + x100000*100000 , x1+x10*10+x100*100+ x1000*1000+ x10000*10000 + x100000*100000 , 'yo bro' from t132helper transpose 0,1,2,3,4,5,6,7,8,9 as x1 transpose 0,1,2,3,4,5,6,7,8,9 as x10 transpose 0,1,2,3,4,5,6,7,8,9 as x100 transpose 0,1,2,3,4,5,6,7,8,9 as x1000 transpose 0,1,2,3,4,5,6,7,8,9 as x10000 transpose 0,1,2,3,4,5,6,7,8,9 as x100000; update statistics for table t132 on every column; then explain select sum(a) from t132 where k2 between 0 and 100 and k1>500000; this correctly result in an MDAM plan but: ------------------------------------------------------------------ PLAN SUMMARY MODULE_NAME .............. DYNAMICALLY COMPILED STATEMENT_NAME ........... XX PLAN_ID .................. 212324835567955274 ROWS_OUT ................. 1 EST_TOTAL_COST ........... 1.47 STATEMENT ................ select sum(a) from t132 where k2 between 0 and 100 and k1>500000; ------------------------------------------------------------------ NODE LISTING ROOT ====================================== SEQ_NO 3 ONLY CHILD 2 REQUESTS_IN .............. 1 ROWS_OUT ................. 1 EST_OPER_COST ............ 0 EST_TOTAL_COST ........... 1.47 DESCRIPTION max_card_est ........... 1 fragment_id ............ 0 parent_frag ............ (none) fragment_type .......... master statement_index ........ 0 affinity_value ......... 0 max_max_cardinalit 50,399 total_overflow_size .... 0.00 KB xn_access_mode ......... read_only xn_autoabort_interval 0 auto_query_retry ....... enabled plan_version ....... 2,600 embedded_arkcmp ........ used IS_SQLCI ............... ON LDAP_USERNAME GENERATE_EXPLAIN ....... ON ObjectUIDs ............. 3134597362287934668 select_list ............ sum(TRAFODION.SCH.T132.A) SORT_SCALAR_AGGR ========================== SEQ_NO 2 ONLY CHILD 1 REQUESTS_IN .............. 1 ROWS_OUT ................. 1 EST_OPER_COST ............ 0.01 EST_TOTAL_COST ........... 1.47 DESCRIPTION max_card_est ........... 1 fragment_id ............ 0 parent_frag ............ (none) fragment_type .......... master aggregates ............. sum(TRAFODION.SCH.T132.A) TRAFODION_SCAN ============================ SEQ_NO 1 NO CHILDREN TABLE_NAME ............... T132 REQUESTS_IN .............. 1 ROWS_OUT ............ 50,399 EST_OPER_COST ............ 1.47 EST_TOTAL_COST ........... 1.47 DESCRIPTION max_card_est ..... 101,000 fragment_id ............ 0 parent_frag ............ (none) fragment_type .......... master scan_type .............. subset scan limited by mdam of table TRAFODION.SCH.T132 object_type ............ Trafodion cache_size ........ 10,000 probes ................. 1 rows_accessed ..... 50,399 column_retrieved ....... #1:1 key_columns ............ K1, K2 mdam_disjunct .......... (K1 > 500000) and (K2 >= 0) and (K2 <= 100) part_key_predicates .... (K1 > 500000) --- SQL operation complete. >> See how probes…..1 in the explain? This is not correct, this MDAM plan will result in at least 500 probes, and 500 scans of each 101 row. Following its execution with debugger shows indeed that it is doing the expected probing/scanning. I was hoping that this probe….1 at explain was just a “cosmetic” bug, and that I would find in the compiler code at one point an accurate tentative to measure how many probes/slice of scan an mdam would result into… But no, navigating inside the compiler code, look like a whole set of code dealing with evaluating probe count and scan count get bypassed because this function get evaluated to false: NABoolean FileScanOptimizer::isMultipleProbes() const { CostScalar repeatCount = getContext().getPlan()->getPhysicalProperty()-> getDP2CostThatDependsOnSPP()->getRepeatCountForOperatorsInDP2() ; CollIndex numCols = getContext().getInputLogProp()->getColStats().entries(); return (repeatCount.isGreaterThanOne()) OR (numCols > 0); } Not sure what is the logic inside trying to do, but I know that I would love to step into the code that get bypassed because of this false J. Anyone familiar with this? Thanks in advance for the help, Eric
