Re: [Maria-developers] Fwd: [Commits] Rev 2901: BNLH algorithm always used a full table scan over the joined table in file:///home/igor/maria/maria-5.3-mwl128-hashrange/

2011-03-02 Thread Sergey Petrunya
Hello Igor,

Here's the promised proof of deadcode:

On Fri, Feb 11, 2011 at 08:49:37PM -0800, Igor Babaev wrote:
 === modified file 'sql/sql_select.cc'
 --- a/sql/sql_select.cc   2011-02-06 04:57:03 +
 +++ b/sql/sql_select.cc   2011-02-12 04:41:22 +
 @@ -8419,6 +8438,11 @@
sort_by_tab-type= JT_ALL;
sort_by_tab-read_first_record= join_init_read_record;
  }
 +else if (sort_by_tab-type == JT_HASH_NEXT)
 +{
 +  sort_by_tab-type= JT_HASH;
 +  sort_by_tab-read_first_record= join_init_read_record;
 +}
}
break;
  }

Statement: the above else-if branch can never be taken, i.e. equality

   sort_by_tab-type == JT_HASH_NEXT   (1)

can never be true.

Proof: 
First, let's see what JOIN_TABs can be in sort_by_tab. sort_by_tab is assigned 
in this statement:

   JOIN_TAB *sort_by_tab= join-group  join-simple_group 
  join-group_list ?
   join-join_tab+join-const_tables :
   join-get_sort_by_join_tab();

i.e. possible values are
  1.  join-join_tab+join-const_tables
  2.  join-get_sort_by_join_tab()

get_sort_by_join_tab is defined in sql_select.h as:

  JOIN_TAB *get_sort_by_join_tab()
  {
return (need_tmp || !sort_by_table || skip_sort_order ||
((group || tmp_table_param.sum_func_count)  !group_list)) ?
  NULL : join_tab+const_tables;
  }

i.e. it can return:
  1. NULL (which is not interesting)
  2. join-join_tab + const_tables.

this draws us to conclusion that sort_by_tab can be
 - NULL 
 - join-join_tab + const_tables.

so equality (1) can be rewritten as 
   
   join-join_tab[const_tables].type == JT_HASH_NEXT   (2)

Now, let's explore the question, what join_tab elements can have 
type==JT_HASH_NEXT. The value JT_HASH_NEXT
is assigned in only one place:

  sql_select.cc|8399| tab-type= tab-type == JT_ALL ? JT_NEXT : JT_HASH_NEXT;

this line is inside the case block of code:
 
  break;
case JT_ALL:
case JT_HASH:
   
 tab-type= tab-type == JT_ALL ? JT_NEXT : JT_HASH_NEXT;

which means that tab-type can become JT_HASH_NEXT only if it was JT_HASH 
before.

Let's see when a join_tab can get type==JT_HASH. There's only one assignment 
that can assign type=JT_HASH.
It is at sql_select.cc:8279 :

if (tab-cache  tab-cache-get_join_alg() == JOIN_CACHE::BNLH_JOIN_ALG)
  tab-type= JT_HASH;

i.e. tab-type can become JT_HASH only if tab-cache!=NULL.  tab-cache can be 
set to non-NULL value only inside
check_join_cache_usage(). However, the first lines in check_join_cache_usage() 
have this code:

 
  uint i= tab - join-join_tab;

  join-return_tab= 0;

  if (cache_level == 0 || i == join-const_tables || !prev_tab)
return 0;

from which we can conclude that 

  join-join_tab[join-const_tables].cache == NULL 

will always hold. Going back on our chain of reasoning, we get that

  join-join_tab[join-const_tables].cache == NULL   at all times=
  join-join_tab[join-const_tables].type != JT_HASH  at all times  =
  join-join_tab[join-const_tables].type != JT_HASH_NEXT  at all times  =
   equality (2) is false always = original equality (1) is false always,  
q.e.d.




BR
 Sergey
-- 
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] Fwd: [Commits] Rev 2901: BNLH algorithm always used a full table scan over the joined table in file:///home/igor/maria/maria-5.3-mwl128-hashrange/

2011-02-22 Thread Sergey Petrunya
On Mon, Feb 21, 2011 at 03:20:13PM -0800, Igor Babaev wrote:
  === modified file 'sql/sql_select.cc'
  --- a/sql/sql_select.cc2011-02-06 04:57:03 +
  +++ b/sql/sql_select.cc2011-02-12 04:41:22 +
  ...
  @@ -8419,6 +8438,11 @@
 sort_by_tab-type= JT_ALL;
 sort_by_tab-read_first_record= join_init_read_record;
   }
  +else if (sort_by_tab-type == JT_HASH_NEXT)
  +{
  +  sort_by_tab-type= JT_HASH;
  +  sort_by_tab-read_first_record= join_init_read_record;
  +}
 }
 break;
   }
  I have a question only to this part of the patch: I don't understand how the
  above code could work. 
  
  The if statement above the one you've added checks if sort_by_tab was using
  [ordered] index scan to produce records in orderer, and if yes, switches it 
  to
  a full scan (because use of join buffering will break ordering anyway).
  
  I suppose the part you've added does something similar, but I totally fail 
  to
  understand what exactly it does.
 
 The above code says:
 if hash join is used do not use full index scan to look through the
 joined table, rather use full table scan in this case.
 
I've put a DBUG_ASSERT(0) right after the sort_by_tab-type= JT_HASH; line
and ran the testsuite. The assertion didn't fire. I did this because I suspect 
(although can't provide a sound proof right away) that the part inside the 
else if () { ...} is deadcode.

Do you think you could easily prove me wrong by giving a testcase? If
constructing testcase is complicated, let's meet online and discuss my
suspicions about the deadcode.

  
  Now, some comments on the new EXPLAIN output:
  
  
  ISSUE1. 
  
  Consider a query:
  MariaDB [j1] explain select * from t3, t4 where t3.col1=t4.col2;
  ++-+---+--+---+---+-++--+--+
  | id | select_type | table | type | possible_keys | key   | key_len 
  | ref| rows | Extra|
  ++-+---+--+---+---+-++--+--+
  |  1 | SIMPLE  | t3| ALL  | NULL  | NULL  | NULL
  | NULL   | 1000 | Using where  |
  |  1 | SIMPLE  | t4| hash_ALL | NULL  | #hash#$hj | 5   
  | j1.t3.col1 | 1000 | Using where; Using join buffer (flat, BNLH join) |
  ++-+---+--+---+---+-++--+--+
  2 rows in set (0.00 sec)
  
  AFAIU #hash#$hj means the hash key on the hashtable that's used for
  hash join. I think the name is too convoluted.  The hashtable has only one
  key, if we need to tell it from t4's indexes it would be sufficient to use
  #hashkey#, without the cryptic $hj.
  
  As for # characters: EXPLAIN already shows internal tables/indexes:
 
 If you look into #maria-call log you'll see that I followed strictly
 Monty's instructions:
 to use the prefix #hash# before the index name I used to use, i.e.
 used_index should be substituted for #hash#used_index.
 
 The only deviation I afforded myself was: I used $hj instead of hj_key.
 
Ok. I still think that my objections are valid and current EXPLAIN output is 
not the best but we could discuss/address this outside the scope of this
bugfix.

skip
  What is the index that's used for scanning the table t1?  #hash#cu1 shows
  that hash join optimizer used access on index cu1 in its analysis, but
  generally it doesn't mean that full index scan on table t1 was done on index
  cu1, doesn't it? 
  I would very much prefer to see #hash#cu1:USED_INDEX there (or taking into
  account previous suggestions, hashkey:USED_INDEX). It will also be
  consistent with how hash_range and hash_index_merge use that column.
 
 Yes, the scan index is missing here. I will add.

OK.

  ISSUE4
  If key and key_len use hash_part:real_part notation, i.e. use semicolon, why
  use underscore in hash_ALL, hash_index, etc? Won't hash:ALL, hash:index look
  more consistent with other columns?
 
 again: I follow here Monty's recommendations. Nothing more, nor less.
 

BR
 Sergey
-- 
Sergey Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] Fwd: [Commits] Rev 2901: BNLH algorithm always used a full table scan over the joined table in file:///home/igor/maria/maria-5.3-mwl128-hashrange/

2011-02-21 Thread Sergey Petrunya
On Fri, Feb 11, 2011 at 08:49:37PM -0800, Igor Babaev wrote:
 Hi Sergey,
 
 Please review this patch that fixes a defect of the current
 implementation of the mwl #128 that never couples hash join
 with range/index_merge scans.
 The patch also adds prefix #hash# to the names of the used hash indexes.
 (we agreed upon this at the last optimizer meeting).
 
 Regards,
 Igor.
 
  Original Message 
 Subject: [Commits] Rev 2901: BNLH algorithm always used a full table
 scan over the joined table in
 file:///home/igor/maria/maria-5.3-mwl128-hashrange/
 Date: Fri, 11 Feb 2011 20:41:23 -0800 (PST)
 From: Igor Babaev i...@askmonty.org
 Reply-To: maria-developers@lists.launchpad.net
 To: comm...@mariadb.org
 
 At file:///home/igor/maria/maria-5.3-mwl128-hashrange/
 
 
 revno: 2901
 revision-id: i...@askmonty.org-20110212044122-w9n3jdk3d2ps0w3o
 parent: i...@askmonty.org-20110207231903-3rqbfs50d33lk3r9
 committer: Igor Babaev i...@askmonty.org
 branch nick: maria-5.3-mwl128-hashrange
 timestamp: Fri 2011-02-11 20:41:22 -0800
 message:
   BNLH algorithm always used a full table scan over the joined table
   even in the cases when there existed range/index-merge scans that
   were cheaper than the full table scan.
   This was a defect/bug of the implementation of mwl #128.
   Now hash join can work not only with full table scan of the joined
   table, but also with full index scan, range and index-merge scans.
   Accordingly, in the cases when hash join is used the column 'type'
   in the EXPLAINs can contain now 'hash_ALL', 'hash_index', 'hash_range'
   and 'hash_index_merge'. If hash join is coupled with a range/index_merge
   scan then the columns 'key' and 'key_len' contain info not only on
   the used hash index, but also on the indexes used for the scan.
...
 === modified file 'sql/sql_select.cc'
 --- a/sql/sql_select.cc   2011-02-06 04:57:03 +
 +++ b/sql/sql_select.cc   2011-02-12 04:41:22 +
...
 @@ -8419,6 +8438,11 @@
sort_by_tab-type= JT_ALL;
sort_by_tab-read_first_record= join_init_read_record;
  }
 +else if (sort_by_tab-type == JT_HASH_NEXT)
 +{
 +  sort_by_tab-type= JT_HASH;
 +  sort_by_tab-read_first_record= join_init_read_record;
 +}
}
break;
  }
I have a question only to this part of the patch: I don't understand how the
above code could work. 

The if statement above the one you've added checks if sort_by_tab was using
[ordered] index scan to produce records in orderer, and if yes, switches it to
a full scan (because use of join buffering will break ordering anyway).

I suppose the part you've added does something similar, but I totally fail to
understand what exactly it does.

Now, some comments on the new EXPLAIN output:


ISSUE1. 

Consider a query:
MariaDB [j1] explain select * from t3, t4 where t3.col1=t4.col2;
++-+---+--+---+---+-++--+--+
| id | select_type | table | type | possible_keys | key   | key_len | 
ref| rows | Extra|
++-+---+--+---+---+-++--+--+
|  1 | SIMPLE  | t3| ALL  | NULL  | NULL  | NULL| 
NULL   | 1000 | Using where  |
|  1 | SIMPLE  | t4| hash_ALL | NULL  | #hash#$hj | 5   | 
j1.t3.col1 | 1000 | Using where; Using join buffer (flat, BNLH join) |
++-+---+--+---+---+-++--+--+
2 rows in set (0.00 sec)

AFAIU #hash#$hj means the hash key on the hashtable that's used for
hash join. I think the name is too convoluted.  The hashtable has only one
key, if we need to tell it from t4's indexes it would be sufficient to use
#hashkey#, without the cryptic $hj.

As for # characters: EXPLAIN already shows internal tables/indexes:

MariaDB [j1] explain select * from (select * from t1) X;
++-++--+---+--+-+--+--+---+
| id | select_type | table  | type | possible_keys | key  | key_len | ref  
| rows | Extra |
++-++--+---+--+-+--+--+---+
|  1 | PRIMARY | derived2 | ALL  | NULL  | NULL | NULL| NULL 
| 1000 |   |
|  2 | DERIVED | t1 | ALL  | NULL  | NULL | NULL| NULL 
| 1000 |   |
++-++--+---+--+-+--+--+---+

# The following is from MWL#90 tree:
MariaDB [test] explain select * from t10 where a in (select max(b) from t10 
group by a);