[MariaDB developers] Review of oldest 3 commits of full-outer-join

Michael Widenius via developers Fri, 16 Jan 2026 07:04:36 -0800

Hi!

First, I really like that you have commented (almost) all your code
throughly.  Very good!



Commit: MDEV-37932: Parser support FULL OUTER JOIN syntax

+++ b/sql/sql_lex.cc
@@ -1343,6 +1343,7 @@ void LEX::start(THD *thd_arg)

   memset(&trg_chistics, 0, sizeof(trg_chistics));
   selects_for_hint_resolution.empty();
+  has_full_outer_join= false;
   DBUG_VOID_RETURN;
 }

I noticed you removed this and changed the name in a later commit.
If possible, fix things like this in the earliest commit so that
each commit is clean!

Please move the variable to the place where all other int variables
are initialized (this makes it easier to verify that everything is
initialized).

Regarding test cases.

Please use --echo # instead of just # for different parts of the test
case. This makes it easier to find the place in the .test file if an
mtr test gives a wrong result.


+++ b/sql/sql_lex.h
@@ -3621,8 +3621,8 @@ struct LEX: public Query_tables_list
   vers_select_conds_t vers_conditions;
   vers_select_conds_t period_conditions;

+  /* False by default, this will be true if the query has a full join. */
+  bool has_full_outer_join;

Move the variable to a place in the struct where all other bool (or uint
when looking at your later patch) are initialized.
This ensures we don't get holes in the LEX struct and saves memory.
Also, for this it would be better to use uint16 instead of uint
to save memory.

----------------

Commit: MDEV-37995: FULL OUTER JOIN name resolution


+# TODO fix PS protocol before end of FULL OUTER JOIN developmen

Add the above comment also in the commit comment.
(Easier to ensure we do not miss it)

+++ b/sql/sql_parse.cc
@@ -10314,11 +10314,6 @@ bool parse_sql(THD *thd, Parser_state *parser_state,
   bool mysql_parse_status= thd->variables.sql_mode & MODE_ORACLE
                            ? ORAparse(thd) : MYSQLparse(thd);

-  /* While we accept full join syntax, such joins are not yet supported. */
-  mysql_parse_status|= thd->lex->has_full_outer_join;
-  if (thd->lex->has_full_outer_join)
-    my_error(ER_NOT_SUPPORTED_YET, MYF(0), "full join");
-
   if (mysql_parse_status)
     /*

And then
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -1895,6 +1895,18 @@ JOIN::prepare(TABLE_LIST *tables_init, COND
*conds_init, uint og_num,
     goto err;
   prepared= true;

+  /*
+    This check gates FULL JOIN functionality while it is under
+    development.  This check will be removed once FULL JOIN
+    has been completed and it allows some aspects of FULL JOIN
+    (see below) while exluding others, driven by whatever has
+    been implemented up to this point.
+  */
+  if (thd->lex->has_full_outer_join &&  // FULL JOIN not yet supported...
+      !thd->lex->is_view_context_analysis() &&  // ...but allow VIEW
creation...
+      !thd->lex->describe)  // ...and limited EXPLAIN support during
development
+    my_error(ER_NOT_SUPPORTED_YET, MYF(0), "full join");
+

Please move the above code to
MDEV-37932: Parser support FULL OUTER JOIN syntax
to the place where you added things in sql_select.cc later.

Makes it easier to review patches and ensure we can easier do rebase
on other MariaDB versions.
Note that we may have to add the full-outer-join feature on 'main' and
'oracle-compatibility' branch (based on 11.8)


@@ -2793,7 +2805,13 @@ JOIN::optimize_inner()
   with_two_phase_optimization= check_two_phase_optimization(thd);
   if (with_two_phase_optimization)
     optimization_state= JOIN::OPTIMIZATION_PHASE_1_DONE;
-  else
+  /*
+    Only during the FULL JOIN development cycle, disable second stage
+    optimization for FULL JOIN queries until the implementation is
+    mature enough to correctly execute the queries.  But for now
+    this allows for some EXPLAIN EXTENDED support.
+  */
+  else if (!thd->lex->has_full_outer_join)

Don't add a comment before an else. Makes the code really hard to read.
Instead add the comment after 'else if (!thd->lex->has_full_outer_join)'

-    if (curr->on_expr)
+    /*
+      NATURAL JOINs don't expose explicit join columns, so don't
+      print them as they're considered invalid syntax (this is
+      important for VIEWs as when VIEWs are loaded, their SQL
+      syntax is parsed again and must be valid).
+    */
+    if (curr->on_expr && !(curr->outer_join & JOIN_TYPE_NATURAL))

I was wondering why we did not have this problem before with natural
join.

Doing some testing, I noticed that this works fine:

create view v1 as select t1.a as t1a, t2.a as t2a from t1 natural left join t2;
select * from v1;
explain extended select * from v1;
drop view v1;

In this case we have in the view:
query=select `test`.`t1`.`a` AS `t1a`,`test`.`t2`.`a` AS `t2a` from
(`test`.`t1` left join `test`.`t2` on(`test`.`t1`.`a` =
`test`.`t2`.`a`))

However when we use

create view v1 as select t1.a as t1a, t2.a as t2a from t1 natural full join t2;

we have in the view:

query=select `test`.`t1`.`a` AS `t1a`,`test`.`t2`.`a` AS `t2a` from
(`test`.`t1` natural full join `test`.`t2` on(`test`.`t2`.`a` =
`test`.`t1`.`a`))

The difference is that when using left join we don't print 'natural' in the
view.
I think that natural full outer join should work like left join and generate
the 'on' clause and remove 'natural'.
This is just to make things consistent between left and full join.

Alternative change right/left natural joins to not write the on clauses.
The following patch makes left natural join and full natural join work
similar for views (tested with main.join):

diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index 3ce8bbcb1f9..f0d84d3366d 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -32214,8 +32214,6 @@ static void print_table_array(THD *thd,

     if (curr->outer_join & JOIN_TYPE_FULL)
     {
-      if (curr->outer_join & JOIN_TYPE_NATURAL)
-        str->append(STRING_WITH_LEN(" natural"));
       str->append(STRING_WITH_LEN(" full join "));
     }
     else if (curr->outer_join & (JOIN_TYPE_LEFT|JOIN_TYPE_RIGHT))
@@ -32237,7 +32235,7 @@ static void print_table_array(THD *thd,
       important for VIEWs as when VIEWs are loaded, their SQL
       syntax is parsed again and must be valid).
     */
-    if (curr->on_expr && !(curr->outer_join & JOIN_TYPE_NATURAL))
+    if (curr->on_expr)

diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
index 09bdf7243dc..1a7b9308a51 100644
--- a/sql/sql_yacc.yy
+++ b/sql/sql_yacc.yy
@@ -12387,8 +12387,9 @@ join_table:
           }
           expr
           {
-            add_join_on(thd, $1, $8);
+            add_join_on(thd, $5, $8);
             $1->on_context= Lex->pop_context();
+            $5->on_context= $1->on_context;
             Select->parsing_place= NO_MATTER;
             $$= $1;
             Lex->has_full_outer_join= true;
@@ -12415,11 +12416,13 @@ join_table:

             Select->add_joined_table($1);
             $1->outer_join|= (JOIN_TYPE_LEFT |
-                              JOIN_TYPE_FULL);
+                              JOIN_TYPE_FULL |
+                              JOIN_TYPE_NATURAL);

             Select->add_joined_table($6);
             $6->outer_join|= (JOIN_TYPE_RIGHT |
-                              JOIN_TYPE_FULL);
+                              JOIN_TYPE_FULL |
+                              JOIN_TYPE_NATURAL);

             add_join_natural($6,$1,NULL,Select);
             Lex->has_full_outer_join= true;

Move this change to MDEV-37932: Parser support FULL OUTER JOIN syntax

+++ b/sql/table.h
@@ -2216,10 +2216,11 @@ class IS_table_read_plan;
 #define VIEW_ALGORITHM_MERGE_FRM      1U
 #define VIEW_ALGORITHM_TMPTABLE_FRM   2U

-#define JOIN_TYPE_LEFT 1U
-#define JOIN_TYPE_RIGHT        2U
-#define JOIN_TYPE_FULL 4U
-#define JOIN_TYPE_OUTER 8U     /* Marker that this is an outer join */
+#define JOIN_TYPE_LEFT   1U
+#define JOIN_TYPE_RIGHT          2U
+#define JOIN_TYPE_FULL   4U
+#define JOIN_TYPE_OUTER   8U   /* Marker that this is an outer join */
+#define JOIN_TYPE_NATURAL 16U

 /* view WITH CHECK OPTION parameter options */
 #define VIEW_CHECK_NONE       0

Move this change to MDEV-37932: Parser support FULL OUTER JOIN syntax

Please also remove tab's in the above region to make things align nicely.

-----------------

Commit: MDEV-37933: Rewrite [NATURAL] FULL OUTER to LEFT, RIGHT, or INNER JOIN

In the code commit:
      SELECT * FROM t1 FULL JOIN t2 ON t1.v = t2.v WHERE t1.v IS NOT NULL;
      SELECT * FROM t1 LEFT JOIN t2 ON t1.v = t2.v;

Add also the example:
      SELECT * FROM t1 FULL JOIN t2 ON t1.v = t2.v WHERE t1.a=t2.a.
      SELECT * FROM t1 INNER JOIN t2 ON t1.v = t2.v WHERE t1.a=t2.a

---

Review.sh output for MDEV-37933:

I tested this patch with the review.sh script.
It had some good comments but said (wrongly)
"A significant limitation of this implementation is that `FULL JOIN`
is only supported for base tables, not for derived tables or subqueries"

I checked and sub queries are supported. I don't know if derivied tables
is a problem here.

Pleae add the following queries to the test to prove this.
(I could not find test for the following cases in the join.test file)

+select * from (select t1.a from t1 full join t2 on t1.a = t2.a where
t1.a > 1 and t2.a > 1) as t3;
+select t1.a from t1 full join t2 on t1.a = t2.a where t1.a > 1 and
t2.a > 1 union select * from t1;

The changes in `sql_base.cc` (`is_full_outer_join`) and `table.cc`
(`first_leaf_for_name_resolution`) are subtle and critical for
correctness, but they lack explanation. Deferring the setup of
`first_name_resolution_table` and altering how join leaves are found
suggests `FULL JOIN` has special requirements during name resolution.
Without comments, this logic is difficult to understand and maintain.
(I did also request some comment for this code, until I found out that part of
this is noot needed, see more below).

#### 3. Potential Brittleness of Unconvertible `FULL JOIN` Detection

NOTE from Monty: I have not yet at this point in time checked if this
is relevant.  Will try to do that as part of my review.

The current mechanism for detecting a non-rewritable `FULL JOIN` is
clever but implicit. It relies on the right-to-left iteration order in
`simplify_joins`, processing the right table of the join first. If no
rewrite occurs, an error is triggered later when the left table is
processed. This dependency on iteration order could be fragile.

**Recommendation:**
Consider a more explicit check. After the main loop in
`simplify_joins`, you could perform a final pass over the join list to
check for any remaining `JOIN_TYPE_FULL` flags. This would decouple
the error detection from the iteration logic.

*Example Alternative (conceptual):*
```cpp
// After the main while loop in simplify_joins
List_iterator<TABLE_LIST> final_check_li(*join_list);
TABLE_LIST *final_table;
while ((final_table = final_check_li++))
{
  if (final_table->outer_join & JOIN_TYPE_FULL)
  {
    if (!join->thd->lex->describe) // Allow EXPLAIN
    {
        my_error(ER_NOT_SUPPORTED_YET, MYF(0), ...);
        DBUG_RETURN(nullptr);
    }
  }
}
DBUG_RETURN(conds);


##### 4. Missing `EXPLAIN` Output in Tests

The tests do an excellent job of verifying the correctness of query
results. However, they don't confirm that the optimizer plan for a
rewritten `FULL JOIN` is identical to its explicit `LEFT/RIGHT/INNER
JOIN` counterpart.
*Example Test Suggestion (`mysql-test/main/join.test`):*
```diff
 --echo # FULL to LEFT JOIN, these two queries should be equal:
 select * from x full join xsq on x.x >= 0 and x.x <= 1 and xsq.x >= 0
and xsq.x <= 1 where x.pk is not null;
+explain select * from x full join xsq on x.x >= 0 and x.x <= 1 and
xsq.x >= 0 and xsq.x <= 1 where x.pk is not null;
 select * from x left join xsq on x.x >= 0 and x.x <= 1 and xsq.x >= 0
and xsq.x <= 1;
+explain select * from x left join xsq on x.x >= 0 and x.x <= 1 and
xsq.x >= 0 and xsq.x <= 1;
```

---


join.test

+# FULL to RIGHT JOIN, these two queries should be equal:
+select * from x  full join xsq on x.y = xsq.y where xsq.pk is not null;
+select * from x right join xsq on x.y = xsq.y

It would be nice if we could have the server check that the results are equal
(Not sure how to do that).

++ b/sql/sql_base.cc

+/*
+  If the tables involved participate in a FULL OUTER JOIN then don't
+  update their name resolution contexts at this point, or it will
+  break simplify_joins.
+ */

Please update comment to describe what the function is doing.
The above comment could be moved to whare is_full_outer_join is called.

static bool is_full_outer_join(Name_resolution_context *context,
+                               TABLE_LIST *right_neighbor)
+{
+  const uint ctx_outer_join= context->first_name_resolution_table->outer_join;
+  const uint rtn_outer_join=
+    right_neighbor->first_leaf_for_name_resolution()->outer_join;
+
+  return (ctx_outer_join & JOIN_TYPE_FULL) ||
+         (rtn_outer_join & JOIN_TYPE_FULL);

Last statement could be replaced with:

+  return ((ctx_outer_join | rtn_outer_join) & JOIN_TYPE_FULL);


   DBUG_ASSERT(right_neighbor);
-  context->first_name_resolution_table=
-    right_neighbor->first_leaf_for_name_resolution();
+  if (!is_full_outer_join(context, right_neighbor))
+    context->first_name_resolution_table=
+      right_neighbor->first_leaf_for_name_resolution();

Here we are calling in most case first_leaf_for_name_resolution() twice.
This is an expensive function, so plese try to remove the double call.

Maybe moving

 context->first_name_resolution_table=
right_neighbor->first_leaf_for_name_resolution();
into is_full_outer_join() (and rename it?)

What is the consequnce of not doing ?

+ context->first_name_resolution_table=
   right_neighbor->first_leaf_for_name_resolution();
Could you please explain?

I checked what happens if we remove the test for
if (!is_full_outer_join(context, right_neighbor))

In essence the diff is:
 select * from x natural full join xsq where x.pk is not null and
xsq.pk is not null;
-pk     x       y       pk      x       y
-6      0       0       6       0       0
-7      1       1       7       1       1
+pk     x       y
+6      0       0
+7      1       1

The new result is actually correct as with natural full outer join
we should only have the common column names once.
What is confusing is that the column value is COALESCE() of all the values.

>From postgresql:

test=# select * from t1;
 a | b
---+---
 1 | 1
 2 | 2
(2 rows)

test=# select * from t2;
 a | b
---+---
 1 | 1
 3 | 3

test=# select *, 'XXXX',  t1.a, t2.a from t1 natural full outer join t2;
 a | b | ?column? | a | a
---+---+----------+---+---
 1 | 1 | XXXX     | 1 | 1
 2 | 2 | XXXX     | 2 |
 3 | 3 | XXXX     |   | 3

Another issue is that at the point when this code is called, we do not
know yet if the full outer join can be changed to left, right or inner
join.

In essense for the current patch, we should remove the is_full_outer_join()
function.  I verified this with Sergei.

PosgreSQL for full outer join:

 case JOIN_FULL:
            {
                /*
                 * Here we must build a COALESCE expression to ensure that the
                 * join output is non-null if either input is.
                 */
                CoalesceExpr *c = makeNode(CoalesceExpr);

                c->coalescetype = outcoltype;
                /* coalescecollid will get set below */
                c->args = list_make2(l_node, r_node);
                c->location = -1;
                res_node = (Node *) c;
                break;
            }

/sql/sql_list.h


+  T* swap_next()
+  {
+    if (!ref() || !*ref() || !peek())
+      return nullptr;

I would like to suggest we add an DBUG_ASSERT() here as we should
never try to do a swap when there is nothing to swap.
Also, the above is not good enough to detect errors as if the iterator
is just initialized, calling ref() will crash as current_info is 0.

This is very likely a bug in the code if any of the above checks fails.
It is also very likely that the code will crash after calling swap
and getting NULL and then trying to dereference it.


+    T* cur= *ref();
+    List_iterator<T> itr= *this;
+    T* nxt= itr++;
+    replace(nxt);
+    return itr.replace(cur);

The above can be done a bit simpler with (full patch on top of yours):

+++ b/sql/sql_list.h
@@ -432,6 +432,10 @@ class base_list_iterator
   {
     return (*el)->info;
   }
+  inline void *peek_ref()
+  {
+    return &(*el)->info;
+   }
   inline void *next_fast(void)
   {
     list_node *tmp;
@@ -597,6 +601,8 @@ template <class T> class List_iterator :public
base_list_iterator
   inline void remove()      { base_list_iterator::remove(); }
   inline void after(T *a)   { base_list_iterator::after(a); }
   inline T** ref(void)     { return (T**) base_list_iterator::ref(); }
+  inline T** peek_ref()     { return (T**) base_list_iterator::peek_ref(); }
+

   /*
     Swap the current element with the next one in the list.
@@ -617,13 +623,10 @@ template <class T> class List_iterator :public
base_list_iterator
   */
   T* swap_next()
   {
-    if (!ref() || !*ref() || !peek())
-      return nullptr;
-    T* cur= *ref();
-    List_iterator<T> itr= *this;
-    T* nxt= itr++;
-    replace(nxt);
-    return itr.replace(cur);
+    T* next= peek();
+    T* cur= replace(next);
+    *peek_ref()= cur;
+    return next;
   }
 };


sql_select.cc:

+
+/**
+  rewrite_full_outer_joins function prototype.
+
+  Required to support recursive calls from this function to simplify_joins, the
+  entry point for both FULL OUTER and other JOIN rewrites.
+
+  Complete function documentation at top of implementation, below.
+*/
+
+static COND *rewrite_full_outer_joins(JOIN *join,
+                                      COND *conds,
+                                      bool top,
+                                      bool in_sj,
+                                      TABLE_LIST **table_ptr,
+                                      List_iterator<TABLE_LIST> *li_ptr,
+                                      table_map *used_tables_ptr,
+                                      table_map *not_null_tables_ptr);

Move the prototype to the start of the function, without any comments, like
we do for all other prototypes. Anyone wanting to know more should got to
the implementation.

+
+static COND *simplify_nested_join(JOIN *join, TABLE_LIST *table,
+                                  COND *conds, bool top, bool in_sj,
+                                  table_map *used_tables_ptr,
+                                  table_map *not_null_tables_ptr)
+{
+  DBUG_ASSERT(used_tables_ptr);
+  DBUG_ASSERT(not_null_tables_ptr);
+  table_map &used_tables= *used_tables_ptr;
+  table_map &not_null_tables= *not_null_tables_ptr;

Please do not hide that the variables above are pointers!
This makes it just harder to know the effiency of the code.
This is especially important when the variables are used as parameters
to a function as then one does not know if we do 'struct/class' copying
or not.
Please remove the variables completely and rename used_tables_ptr to
used_tables. Same with not_null_tables_ptr.

The code is also extremely hard to read because of the above constructs.
It is not obvious that 'used_tables= something' is setting the value
of what used_tables_ptr points to! (This confused me for some minutes
in trying to figure out what is going on).

*used_tables= nested_join->used_tables;

The above uses less code and is clear and self explanatory.

Please fix this also in rewrite_full_outer_joins()

+
+    /*
+      Attempt to rewrite any FULL JOINs as LEFT or RIGHT JOINs.  Any subsequent
+      JOINs that could be further rewritten to INNER JOINs are done below.
+    */
+    conds= rewrite_full_outer_joins(join, conds, top, in_sj, &table,
+                                    &li, &used_tables,
+                                    &not_null_tables);

Shouldn't we check if thd->lex->has_full_outer_join is set here to
avoid the call if it is not needed?
Or, even better, move part of the code from rewrite_full_outer_joins here:

 if (table->outer_join & JOIN_TYPE_FULL)
   conds= rewrite...


+static void rewrite_full_to_left(TABLE_LIST *left_table,
+                                 TABLE_LIST *right_table)

<cut>

+DBUG_ASSERT(right_table->on_expr);
<cut>
+  // The grammar 'search_condition: ' rule marks this.
+  if (!(right_table->outer_join & JOIN_TYPE_NATURAL))
+    right_table->on_expr->base_flags|= item_base_t::IS_COND;
+}

You can remove the above code. If right_table->on_expr is set then
IS_COND should also be set. If not, it is a bug (which it was).
Here follows a patch that fixes this:


+++ b/sql/sql_parse.cc
@@ -8950,9 +8950,7 @@ Item *normalize_cond(THD *thd, Item *cond)


 /**
-  Add an ON condition to the second operand of a JOIN ... ON.
-
-    Add an ON condition to the right operand of a JOIN ... ON clause.
+  Add an ON condition to the second (right) operand of a JOIN ... ON.

   @param b     the second operand of a JOIN ... ON
   @param expr  the condition to be added to the ON clause
@@ -8980,6 +8978,7 @@ void add_join_on(THD *thd, TABLE_LIST *b, Item *expr)
       b->on_expr= new (thd->mem_root) Item_cond_and(thd, b->on_expr,expr);
     }
     b->on_expr->top_level_item();
+    b->on_expr->base_flags|= item_base_t::IS_COND;
   }
 }

I verified the above with the following patch. All mtr test executed
without hitting the assert.

diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index 3ce8bbcb1f9..f723a80097e 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc

@@ -20342,7 +20342,10 @@ static void rewrite_full_to_left(TABLE_LIST
*left_table,

   // The grammar 'search_condition: ' rule marks this.
   if (!(right_table->outer_join & JOIN_TYPE_NATURAL))
+  {
+    DBUG_ASSERT((right_table->on_expr->base_flags &
item_base_t::IS_COND) == item_base_t::IS_COND);
     /* right_table->on_expr->base_flags|= item_base_t::IS_COND; */
+  }
 }

<cut>

+  /*
+    The right table must have an ON clause.  NATURAL JOINs get
+    this not from the grammar but they're built before simplify_joins
+    is called.
+

... this not from the grammar but it is built by setup_natural_join_row_types()
    before simplify_joins() is called.


+   conds= rewrite_full_outer_joins(join, conds, top, in_sj, &table,
+                                    &li, &used_tables,
+                                   &not_null_tables);

I conds is 0, check if we got an error (thd->is_error()) and return
nullptr from this function.
We should do the same for all calls to simplify_nested_join() and
simplify_joins()



<cut>
+ rewrite_full_outer_joins()
<cut>


+  @param conds       conditions to add on expressions for converted joins
->
+  @param conds       WHERE expressions. Will be be anded with ON expressions
                      if rewrite happens.


+  DBUG_ASSERT(table_ptr);
+  DBUG_ASSERT(*table_ptr);
+  DBUG_ASSERT(li_ptr);
+  DBUG_ASSERT(used_tables_ptr);
+  DBUG_ASSERT(not_null_tables_ptr);

Please don't add asserts for a lot of parameters without a good reason.

We call this function from one place and most of the above parameters are
guaranteed to be sane:

   conds= rewrite_full_outer_joins(join, conds, top, in_sj, &table,
                                    &li, &used_tables,
                                    &not_null_tables);

It is good to have assert on things that may or may not be wrong or
in places where we have crashes because of wrong arguments.
The above asserts are not of this type.

Please also add DBUG_ENTER/DBUG_EXIT to this function.

+  List_iterator<TABLE_LIST> &li= *li_ptr;
+  table_map &used_tables= *used_tables_ptr;
+  table_map &not_null_tables= *not_null_tables_ptr;

Remove above (see patch later)

+    my_error(ER_NOT_SUPPORTED_YET, MYF(0),
+             "FULL JOINs that cannot be converted to LEFT, RIGHT, or "
+             "INNER JOINs");

MYF(0) -> MYF(ME_FATAL)

+  DBUG_ASSERT(right_table->outer_join & JOIN_TYPE_RIGHT);
+
+

Remove one of the empty lines above.

+  TABLE_LIST *left_table= li.peek();

For the above you could a DBUG_ASSERT()


+  DBUG_ASSERT(left_table->outer_join & JOIN_TYPE_LEFT);
+  DBUG_ASSERT(left_table->outer_join & JOIN_TYPE_RIGHT);
->
DBUG_ASSERT(test_all_bits(left_table->outer_join,
                          JOIN_TYPE_LEFT | JOIN_TYPE_RIGHT));


    used_tables = left_has_nested ? nested_used_tables
                                  : left_table->get_map();
->

    used_tables= (left_has_nested ? nested_used_tables :
                  left_table->get_map());

Changes:
- No space before =
- : at end of line
- () around expression to get proper indentation.

There was some duplicate code testing left_has_nested and taking actions
depending on it's value.

Here is patch that fixes it by setting nested_... variables depending
on if left nested_join was used or not.
(Makes the code notable shorter).

diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index 3ce8bbcb1f9..2ddc520e834 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -19855,9 +19855,9 @@ static COND *rewrite_full_outer_joins(JOIN *join,
                                       bool top,
                                       bool in_sj,
                                       TABLE_LIST **table_ptr,
-                                      List_iterator<TABLE_LIST> *li_ptr,
-                                      table_map *used_tables_ptr,
-                                      table_map *not_null_tables_ptr);
+                                      List_iterator<TABLE_LIST> *li,
+                                      table_map *used_tables,
+                                      table_map *not_null_tables);


 /**
@@ -20439,23 +20442,16 @@ static COND *rewrite_full_outer_joins(JOIN *join,
                                       bool top,
                                       bool in_sj,
                                       TABLE_LIST **table_ptr,
-                                      List_iterator<TABLE_LIST> *li_ptr,
-                                      table_map *used_tables_ptr,
-                                      table_map *not_null_tables_ptr)
+                                      List_iterator<TABLE_LIST> * const li,
+                                      table_map *used_tables,
+                                      table_map *not_null_tables)
 {
-  DBUG_ASSERT(table_ptr);
-  DBUG_ASSERT(*table_ptr);
-  DBUG_ASSERT(li_ptr);
-  DBUG_ASSERT(used_tables_ptr);
-  DBUG_ASSERT(not_null_tables_ptr);
   TABLE_LIST *right_table= *table_ptr;
-  List_iterator<TABLE_LIST> &li= *li_ptr;
-  table_map &used_tables= *used_tables_ptr;
-  table_map &not_null_tables= *not_null_tables_ptr;
+  DBUG_ENTER("rewrite_full_outer_joins");

   // There's no FULL OUTER JOIN to attempt to rewrite, so do nothing.
   if (!(right_table->outer_join & JOIN_TYPE_FULL))
-    return conds;
+    DBUG_RETURN(conds);

   /*
     The join_list enumerates the tables from t_n, ..., t_0 so we always
@@ -20465,7 +20461,7 @@ static COND *rewrite_full_outer_joins(JOIN *join,
     INNER JOIN, so pass along the unmodified FULL JOIN.
   */
   if (right_table->outer_join & JOIN_TYPE_LEFT)
-    return conds;
+    DBUG_RETURN(conds);

   /*
     Must always see the right table before the left.  Down below, we deal
@@ -20480,34 +20476,36 @@ static COND *rewrite_full_outer_joins(JOIN *join,
     If the left table is a nested join, then recursively rewrite any
     FULL JOINs within it.
    */
-  TABLE_LIST *left_table= li.peek();
+  TABLE_LIST *left_table= li->peek();
   table_map nested_used_tables= 0;
   table_map nested_not_null_tables= 0;
-  DBUG_ASSERT(left_table->outer_join & JOIN_TYPE_FULL);
-  DBUG_ASSERT(left_table->outer_join & JOIN_TYPE_LEFT);
-  const bool left_has_nested= left_table->nested_join;
-  if (left_has_nested)
+  DBUG_ASSERT(test_all_bits(left_table->outer_join,
+                            JOIN_TYPE_FULL | JOIN_TYPE_LEFT));
+
+  if (left_table->nested_join)
   {
     conds= simplify_nested_join(join, left_table, conds, top, in_sj,
                                 &nested_used_tables, &nested_not_null_tables);
   }
+  else
+  {
+    nested_used_tables=     left_table->get_map();
+    nested_not_null_tables= *not_null_tables;
+  }

   /*
     If the right hand table is not null under the WHERE clause then we can
     rewrite it as a RIGHT JOIN, mutating the data structures to make it
     appear as though the user wrote the query as a RIGHT JOIN originally.
   */
-  if (used_tables & not_null_tables)
+  if (*used_tables & *not_null_tables)
   {
     /*
       RIGHT JOINs don't actually exist in MariaDB!  This will do what
       the grammar and convert_right_join together do when given a RIGHT JOIN.
     */
     rewrite_full_to_right(left_table, right_table);
-
-    // This will be reflected to the caller, too.
-    used_tables = left_has_nested ? nested_used_tables
-                                  : left_table->get_map();
+    *used_tables= nested_used_tables;

     /*
       Swap myself with the left as though we did convert_right_join().
@@ -20515,7 +20513,7 @@ static COND *rewrite_full_outer_joins(JOIN *join,
         FULL -> RIGHT -> LEFT.
       RIGHT JOINs don't actually exist in MariaDB!
     */
-    *table_ptr= li.swap_next();
+    *table_ptr= li->swap_next();
     --join->thd->lex->full_join_count;
   }
   else
@@ -20527,38 +20525,20 @@ static COND *rewrite_full_outer_joins(JOIN *join,
       the datastructures at this point to make the FULL JOIN look like it was
       written as a LEFT JOIN by the user (WHERE condition permitting).
     */
-    table_map peeked_map= 0;
-    if (left_has_nested)
-    {
-      // The left table was a nested join, so peek at that map.
-      peeked_map= nested_used_tables;
-      not_null_tables= nested_not_null_tables;
-    }
-    else
-    {
-      /*
-        The left table was not a nested join, so peek at its map.
-        We can't just set peeked_map to left_table->get_map() in
-        all cases because, in the case of a nested_join, the underlying
-        TABLE_LIST::table member (from which the map is derived)
-        is nullptr.
-      */
-      peeked_map= left_table->get_map();
-    }
+    *not_null_tables= nested_not_null_tables;

     /*
       If the left table, be it a nested join or not, rejects nulls for
       the WHERE condition, then rewrite.
     */
-    if (peeked_map & not_null_tables)
+    if (nested_used_tables & *not_null_tables)
     {
       rewrite_full_to_left(left_table, right_table);
       --join->thd->lex->full_join_count;
     }
     // else the FULL JOIN cannot be rewritten, pass it along.
   }
-
-  return conds;
+  DBUG_RETURN(conds);
 }

table.cc:

   /*
      If the current nested join is a RIGHT JOIN, the operands in
      'join_list' are in reverse order, thus the first operand is
      already at the front of the list. Otherwise the first operand
      is in the end of the list of join operands.
    */
-    if (!(cur_table_ref->outer_join & JOIN_TYPE_RIGHT))
+    if (!(cur_table_ref->outer_join & JOIN_TYPE_RIGHT) ||
+        (cur_table_ref->outer_join & JOIN_TYPE_FULL))

Is this because of the code change in sql_base.cc where we do not set
context->first_name_resolution_table in case of full join?

Other things:

I noticed that the 'top' argument to simplify_joins() is not used.
Could you please make a commit where you remove this and put it lowest
in your commit stack (so that you can use this code as a base for your
future commits).

please also add this example to the comment for simplify_joins() as an
example where we have to take 'on' into considering:


select * from t1 LEFT JOIN (t2 LEFT JOIN t3 on t3.a=t2.a) ON t3.a=t1.a;
->
select * from t1 LEFT JOIN (t2 INNER JOIN t3) ON t3.a=t2.a and t3.a=t1.a;

Regards,
Monty
_______________________________________________
developers mailing list -- [email protected]
To unsubscribe send an email to [email protected]

[MariaDB developers] Review of oldest 3 commits of full-outer-join

Reply via email to