Re: [Maria-developers] Some more input on optimizer_trace

2019-02-10 Thread Sergey Petrunia
On Sun, Feb 10, 2019 at 02:14:51PM +0200, Sergey Petrunia wrote:
> Hi Varun,
> 
> I've did some adjustments MDEV-18489 and pushed the patch into
> 10.4-optimizer-trace, please check it out.
> 
> Also I have filed MDEV-18527 and MDEV-18528.
> 
> Some input on the code:
> 
> > --- 10.4-optimizer-trace-orig/sql/sql_select.cc
> > +++ 10.4-optimizer-trace-cl/sql/sql_select.cc
> 
One more thing - Json_value_context is not really a context anymore.

(the original intent was that "one gets a value context object if the
Json_writer's current state is such that it currently expects a value (and not
a name)"  but apparently it is not used this way. Which is fine, but then I
guess the object should be renamed. (to _helper? or something like that?)

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog



___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


[Maria-developers] Some more input on optimizer_trace

2019-02-10 Thread Sergey Petrunia
Hi Varun,

I've did some adjustments MDEV-18489 and pushed the patch into
10.4-optimizer-trace, please check it out.

Also I have filed MDEV-18527 and MDEV-18528.

Some input on the code:

> --- 10.4-optimizer-trace-orig/sql/sql_select.cc
> +++ 10.4-optimizer-trace-cl/sql/sql_select.cc

> @@ -15983,12 +16250,26 @@ optimize_cond(JOIN *join, COND *conds,
>that occurs in a function set a pointer to the multiple equality
>predicate. Substitute a constant instead of this field if the
>multiple equality contains a constant.
> -*/ 
> -DBUG_EXECUTE("where", print_where(conds, "original", QT_ORDINARY););
> +*/
> +
> +Opt_trace_context *const trace = >opt_trace;
> +Json_writer *writer= trace->get_current_json();
> +Json_writer_object trace_wrapper(writer);
> +Json_writer_object trace_cond(writer, "condition_processing");
> +trace_cond.add("condition", join->conds == conds ? "WHERE" : "HAVING")
> +  .add("original_condition", conds);
> +
> +Json_writer_array trace_steps(writer, "steps");
> +  DBUG_EXECUTE("where", print_where(conds, "original", QT_ORDINARY););

Small question: Why was DBUG_EXECUTE shifted right? 

A bigger question: the code seems unnecesarily verbose:

1. > +Opt_trace_context *const trace = >opt_trace;
2. > +Json_writer *writer= trace->get_current_json();
3. > +Json_writer_object trace_wrapper(writer);
4. > +Json_writer_object trace_cond(writer, "condition_processing");

Can we save the space by just calling the constructors:

  Json_writer_object trace_wrapper(thd);
  Json_writer_object trace_cond(thd, "condition_processing");

?
This applies here and in many other places.

Alternative, we could use:

  Json_writer_object trace_wrapper(thd);
  Json_writer_object trace_cond(trace_wrapper, "condition_processing");

.. which makes the nesting clearer (and we could also add debug safety check: it
is invalid to operate on trace_wrapper until trace_cond hasn't been end()'ed)


> --- 10.4-optimizer-trace-orig/sql/opt_range.cc
> +++ 10.4-optimizer-trace-cl/sql/opt_range.cc  
> 
> +void TRP_ROR_UNION::trace_basic_info(const PARAM *param,
> + Json_writer_object *trace_object) const
> +{
> +  Opt_trace_context *const trace = >thd->opt_trace;
> +  Json_writer* writer= trace->get_current_json();
> +  trace_object->add("type", "index_roworder_union");
> +  Json_writer_array ota(writer, "union_of");

The name 'ota' makes sense in MySQL codebase (where it is a contraction of
Optimizer_trace_array), but is really confusing in MariaDB codebase. Please
change everywhere to "smth_trace" or something like that (jwa in the worst
case).

> @@ -2654,12 +2833,18 @@ int SQL_SELECT::test_quick_select(THD *t
> 
>  if (cond)
>  {
> -  if ((tree= cond->get_mm_tree(, )))
> +  {
> +Json_writer_array trace_range_summary(writer,
> +   "setup_range_conditions");
> +tree= cond->get_mm_tree(, );
> +  }

Does this ever produce anything meaningful, other than empty:

"setup_range_conditions": [],

In MySQL, the possible contents of this array are:

  "impossible_condition": {
"cause": "comparison_with_null_always_false"
  } /* impossible_condition */

   "impossible_condition": {
 "cause": "null_field_in_non_null_column"
   } /* impossible_condition */

But in MariaDB we don't have this (should we?)

Also,  why_use_underscores_in_value_of_cause?  It is a quoted string where
spaces are allowed. MySQL seems to have figured this out at some point and have
a few cause strings using spaces.

> --- 10.4-optimizer-trace-orig/sql/opt_table_elimination.cc
> +++ 10.4-optimizer-trace-cl/sql/opt_table_elimination.cc  
> @@ -522,7 +524,8 @@ eliminate_tables_for_list(JOIN *join,
>List *join_list,
>table_map tables_in_list,
>Item *on_expr,
> -  table_map tables_used_elsewhere);
> +  table_map tables_used_elsewhere,
> +  Json_writer_array* eliminate_tables);

Please change the name to something indicating it's just a trace.


BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog



___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


[Maria-developers] Review for ANALYZE TABLE with Sampling

2019-02-10 Thread Vicențiu Ciorbaru
Hi Sergei!

Can you review that you are happy with the storage engine API changes?
I'veustructured the commit to be as small as possible to achieve the
desired outcome. In my tests, we are now twice as fast as MySQL for a
10 mil row table with 13 columns.

Vicențiu

 Forwarded Message From: vicentiu@mariadb.orgTo: 
commits@mariadb.orgSubject: 53730224efd: Improve histogram collection
performance by samplingDate: Sun, 10 Feb 2019 20:09:49 +0200 (EET)
revision-id: 53730224efd987f97a6cc968ff5214ee499d84e0 (mariadb-10.4.1-
163-g53730224efd)parent(s):
3c305d3f1951f1667f84e48ddd98674c6318c39dauthor: Vicențiu
Ciorbarucommitter: Vicențiu Ciorbarutimestamp: 2019-02-10 19:54:50
+0200message:
Improve histogram collection performance by sampling
Histogram collection is done by sampling a percentage of rows from the
table,not looking at all individual ones.
The default implementation, to keep the server's Engine
IndepenentStatistics component working uses Bernoulli sampling. It does
a simpletable scan, but only accepts rows that pass a dice roll.
Thisimplementation is done as a storage engine interface method, so as
toallow faster and/or more efficient implementations for storage
enginesinternally.
The number of rows collected is capped to a minimum of 5
andincreases logarithmically with a coffecient of 4096. The coffecient
ischosen so that we expect an error of less than 3% in our
estimationsaccording to the paper:"Random Sampling for Histogram
Construction: How much is enough?”– Surajit Chaudhuri, Rajeev Motwani,
Vivek Narasayya, ACM SIGMOD, 1998.
This interface is also a precursor to allowing SELECT ... FROM
tablewith sampling to work.
Performance wise, for a table of 10 million rows and 13 columns, 6
int,6 doubles, one string, the previous analyze table statistics took1
minute 20 seconds to collect all data. Post implementation, thetime is
less than 6 seconds. This was run on an InnoDB table, NVME SSD
withapproximately 2GB/s read speed and 500MB/s write speed.
--- mysql-test/main/selectivity.result|  8 +++ mysql-
test/main/selectivity_innodb.result |  8 +++--
-- sql/handler.cc| 14
+++ sql/handler.h | 40
++- sql/sql_statistics.cc  
   | 32 +++-- 5 files changed, 86 insertions(+), 16
deletions(-)
diff --git a/mysql-test/main/selectivity.result b/mysql-
test/main/selectivity.resultindex 00907235ecc..6d09c1c9b62 100644---
a/mysql-test/main/selectivity.result+++ b/mysql-
test/main/selectivity.result@@ -1290,9 +1290,9 @@ explain
extended select * from t1, t2, t1 as t3 where t1.b=t2.c and t2.d=t3.a
and t3.b<5 and t1.a < 2000; id  select_type table   typepossibl
e_keys  key key_len ref rowsfilteredExtra-1 SIMPLE  
t1  ALL NULLNULLNULLNULL262144  100.00  Using
where+1 SIMPLE  t1  ALL NULLNULLNULLNULL262117  
100.00  Using where 1   SIMPLE  t2  ref c,d c   5   
test.t1.b   5   100.00  -1  SIMPLE  t3  ALL NULL
NULLNULLNULL262144  100.00  Using where; Using join buffer
(flat, BNL join)+1  SIMPLE  t3  ALL NULLNULLNULL
NULL262117  100.00  Using where; Using join buffer (flat, BNL
join) Warnings: Note1003select `test`.`t1`.`a` AS
`a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`c` AS `c`,`test`.`t2`.`d` AS
`d`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS `b` from `test`.`t1` join
`test`.`t2` join `test`.`t1` `t3` where `test`.`t2`.`c` =
`test`.`t1`.`b` and `test`.`t3`.`a` = `test`.`t2`.`d` and
`test`.`t3`.`b` < 5 and `test`.`t1`.`a` < 2000 select * from t1, t2, t1
as t3@@ -1307,9 +1307,9 @@ explain extended select * from t1, t2, t1 as
t3 where t1.b=t2.c and t2.d=t3.a and t3.b<5 and t1.a < 2000; id select_
typetable   typepossible_keys   key key_len ref rows
filteredExtra-1 SIMPLE  t3  ALL NULLNULLNULL
NULL262144  0.00Using where+1   SIMPLE  t3  ALL NULL
NULLNULLNULL262117  0.00Using where 1   SIMPLE  t2  
ref c,d d   5   test.t3.a   7   100.00  -1  
SIMPLE  t1  ALL NULLNULLNULLNULL262144  2.00
Using where; Using join buffer (flat, BNL join)+1   SIMPLE  t1  
ALL NULLNULLNULLNULL262117  2.00Using where;
Using join buffer (flat, BNL join) Warnings: Note   1003select
`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`c` AS
`c`,`test`.`t2`.`d` AS `d`,`test`.`t3`.`a` AS `a`,`test`.`t3`.`b` AS
`b` from `test`.`t1` join `test`.`t2` join `test`.`t1` `t3` where
`test`.`t1`.`b` = `test`.`t2`.`c` and `test`.`t2`.`d` = `test`.`t3`.`a`
and `test`.`t3`.`b` < 5 and `test`.`t1`.`a` < 2000 select * from t1,
t2, t1 as t3diff --git a/mysql-test/main/selectivity_innodb.result
b/mysql-test/main/selectivity_innodb.resultindex
93917065722..0b20a40f69f 

[Maria-developers] MariaDB Developers Unconference, New York

2019-02-10 Thread Ian Gilfillan
A final reminder about the MariaDB Developers Unconference taking place 
23 - 24 February in New York, just before OpenWorks.


We're almost at capacity, so if you haven't signed up yet, please do so 
soon. Details about where to sign up and how to suggest sessions are at 
https://mariadb.org/2019-developers-unconference-new-york/


Hope to see you there!


___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp