On Thu, Dec 01, 2011 at 01:12:00PM +0200, Timour Katchaounov wrote: > Ok, I agree to change to MATERIALIZED (not MATERIALIZE). > Who will do the change? > > In addition, in the EXTRA field there should be info > which materialization is used. If I am not mistaken, > there are three variants: > - SJ-mat, > - SJ-inside-out, > - NON-SJ-mat > (better terms are needed for explain).
I do not fully agree to this. - One can easily tell between semi-join and non-semi-join materialization: in semi-join case you see a separate line with table='<subquery2>' - For semi-join materialization, lookup and scan variants are easily visible. Check the next two explains: the first does eq_ref access to <subquery2> table, this is SJ-Materialization-Lookup. The second does a full scan on <subquery2>, this is SJ-Materialization-Scan. I thought this was apprent enough. Do you think it is not? MariaDB [test]> explain select * from one_k A where a in (select B.a from ten B, ten C); +----+-------------+-------------+--------+---------------+--------------+---------+------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------+--------+---------------+--------------+---------+------+------+-------+ | 1 | PRIMARY | A | ALL | NULL | NULL | NULL | NULL | 1000 | | | 1 | PRIMARY | <subquery2> | eq_ref | distinct_key | distinct_key | 5 | func | 1 | | | 2 | SUBQUERY | B | ALL | NULL | NULL | NULL | NULL | 10 | | | 2 | SUBQUERY | C | ALL | NULL | NULL | NULL | NULL | 10 | | +----+-------------+-------------+--------+---------------+--------------+---------+------+------+-------+ 4 rows in set (0.00 sec) MariaDB [test]> set debug_optimizer_prefer_join_prefix='B,C,A'; Query OK, 0 rows affected (0.00 sec) MariaDB [test]> explain select * from one_k A where a in (select B.a from ten B, ten C); +----+-------------+-------------+------+---------------+------+---------+------+------+-------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------------+------+---------------+------+---------+------+------+-------------------------------------------------+ | 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 10 | | | 1 | PRIMARY | A | ALL | NULL | NULL | NULL | NULL | 1000 | Using where; Using join buffer (flat, BNL join) | | 2 | SUBQUERY | B | ALL | NULL | NULL | NULL | NULL | 10 | | | 2 | SUBQUERY | C | ALL | NULL | NULL | NULL | NULL | 10 | | +----+-------------+-------------+------+---------------+------+---------+------+------+-------------------------------------------------+ 4 rows in set (0.00 sec) > > Timour > > On 1.12.2011 12:54, Sergei Petrunia wrote: >> Reply-To: >> >> Hello, >> >> I think we have already discussed this before (multiple times), and the >> idea was received with some conditional approval of the idea. Nothing has >> been done or put to paper, though, so now I'm posting this in writing. >> >> I'm posting this now, because we're close to RC and I think it's better to do >> this kind of changes before the RC. >> >> The proposal is: if a subquery is materialized (that is, the subquery is run >> and it's output is stored with a temporary table with unique key), EXPLAIN >> should not show "SUBQUERY" (like it does now), it should show "MATERIALIZE". >> >> Consider the following two queries. The EXPLAINs are the same, although one >> of >> them runs materialization and the other does not: >> >> MariaDB [test]> EXPLAIN select * from ten where a< any (select max(a) from >> ten); >> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ >> | id | select_type | table | type | possible_keys | key | key_len | ref | >> rows | Extra | >> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ >> | 1 | PRIMARY | ten | ALL | NULL | NULL | NULL | NULL | >> 10 | Using where | >> | 2 | SUBQUERY | ten | ALL | NULL | NULL | NULL | NULL | >> 10 | | >> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ >> 2 rows in set (0.00 sec) >> >> MariaDB [test]> EXPLAIN select * from ten where a = any (select max(a) from >> ten) or a< 3; >> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ >> | id | select_type | table | type | possible_keys | key | key_len | ref | >> rows | Extra | >> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ >> | 1 | PRIMARY | ten | ALL | NULL | NULL | NULL | NULL | >> 10 | Using where | >> | 2 | SUBQUERY | ten | ALL | NULL | NULL | NULL | NULL | >> 10 | | >> +----+-------------+-------+------+---------------+------+---------+------+------+-------------+ >> 2 rows in set (0.00 sec) >> >> You can only tell that the second one uses Materialization if look at EXPLAIN >> EXTENDED warning text. I think it is extremely bad that radically new >> execution strategy lurks somewhere inside EXPLAIN EXTENDED. >> >> >> I remember you've used to argue that 'SUBUQERY' represents the fact that the >> subquqery is evaluated once, and materialization should be shown elsewhere. I >> object to that argument >> - materialization is currently shown *nowhere* (except EXPLAIN EXTENDED which >> is pain to read except for the most simplest queries) >> - Derived table subqueries, which are also executed once (and stored in a >> temporary table, like with materialization), have their special >> select_type >> value 'DERIVED': >> >> MariaDB [test]> explain select * from one_k, (select max(A.a+B.a) from ten >> A, ten B ) foo; >> +----+-------------+------------+------+---------------+------+---------+------+------+------------------------------------+ >> | id | select_type | table | type | possible_keys | key | key_len | >> ref | rows | Extra | >> +----+-------------+------------+------+---------------+------+---------+------+------+------------------------------------+ >> | 1 | PRIMARY |<derived2> | ALL | NULL | NULL | NULL | >> NULL | 100 | | >> | 1 | PRIMARY | one_k | ALL | NULL | NULL | NULL | >> NULL | 1000 | Using join buffer (flat, BNL join) | >> | 2 | DERIVED | A | ALL | NULL | NULL | NULL | >> NULL | 10 | | >> | 2 | DERIVED | B | ALL | NULL | NULL | NULL | >> NULL | 10 | Using join buffer (flat, BNL join) | >> +----+-------------+------------+------+---------------+------+---------+------+------+------------------------------------+ >> 4 rows in set (0.02 sec) >> >> >> I think the above should be sufficient to make a decision to change >> Materialized subqueries to show 'MATERIALIZED' (or 'MATERIALIZE') in EXPLAIN >> output. Any objections? >> >> >> >> -- BR Sergei -- Sergei Petrunia, Software Developer Monty Program AB, http://askmonty.org Blog: http://s.petrunia.net/blog _______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : [email protected] Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp

