[Maria-developers] GSoC [MDEV-6017]

2019-05-27 Thread Alexey Mogilyovkin
Hello. This is the thread where I would like to publish my weekly reports
every Monday.
During preparatory weeks I set up development environment. I briefly
examined testing framework, debugging process and now I am ready to start
work.
Work progress will be available on my github repository in branch
"MDEV-6017" https://github.com/ZeroICQ/server/tree/MDEV-6017
___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


[Maria-developers] Proposal for automatic code formatting (clang-format)

2019-05-27 Thread Eugene Kosov
Hi.

I'm proposing to add some tool that automatically formats source code.

Main purpose is overall productivity. It's possible to save time on manual
formatting. Also, reading a uniformly formatted code is easier.

Some new languages has one correct coding style and code formatters.
Examples are Go (https://golang.org/cmd/gofmt/) and (to be done) Rust
(https://github.com/rust-dev-tools/fmt-rfcs/blob/master/guide/guide.md).
I suggest to read 'Motivation - why use a formatting tool' from second link.

There is no single correct coding style in C and C++ worlds: different projects
use different styles if any at all. But there is a tool which does formatting
and supports different styles: clang-format.

It's a binary available for every major platform. It's configurable: just put
.clang-format file with formatting options in source directory and formatter
will use it for current and nested directories.

I actually do not propose here to use exactly clang-format but as I know it's
the best of its kind. Here is a demo https://youtu.be/uZI_Qla4pNA?t=1893

Tool is configurable but it will not act exactly as you might expect and this is
a pay for uniformity.

So, let's add .clang-format to root directory, start using and polishing style.
Here is some info on how to add it to some text editors 
https://clang.llvm.org/docs/ClangFormat.html

I do not(!) suggesting to reformat whole project. At least for now.

Here is my PR to add .clang-format files to server and InnoDB: 
https://github.com/MariaDB/server/pull/1270

I think that eventually .clang-format usage will lead to reformatting
everything and having some project-wide precommit hook which will properly
format your diffs.

Do you have any objections or questions on having such a tool?
Let's discuss, please.

-- 
Eugene


___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp


Re: [Maria-developers] Regarding approach used for Early NULL filtering in the range optimizer(MDEV-15777)

2019-05-27 Thread Sergey Petrunia
On Fri, May 24, 2019 at 04:39:51PM -0700, Igor Babaev wrote:
> On 05/24/2019 01:41 PM, Sergey Petrunia wrote:
> > On Fri, May 24, 2019 at 10:08:04AM -0700, Igor Babaev wrote:
> > > On 05/24/2019 04:03 AM, Varun Gupta wrote:
> > > > Hi Igor,
> > > > After discussing with Sergey, we came up with these conclusions as to
> > > > why we used the approach of going through all the keyuses in the KEYUSE
> > > > array
> > > > 
> > > > Cases to consider:
> > > > we have an index on column a
> > > > 
> > > > 1) a OP const
> > > > where OP can be ( > > > This case is already handled by the range optimizer. We do not create a
> > > > NULL rejecting predicate for such conditions.
> > > > 
> > > > 2) eq_join conditions (tbl1.a = tbl2.col2)
> > > > This is the specific case we have tried to implement in MDEV-15777, we
> > > > create NULL rejecting predicates for the keyuses for each table and then
> > > > feed these predicates to range optimizer
> > > There is no logical explanation for checking null rejection of fields used
> > > in equalities with the help of array keyuse.
> > The logic here is as follows:
> > 
> > We already collect the set of KEYUSE::null_rejecting attributes. Its
> > meaning is very close to what the patch needs.
> The meaning of this argument is to be used a key value for ref access by an
> index i1.
> You need range access by a different index i2. There is no logical
> connection
> between these two indexes. The usage of i1 can be blocked by IGNORE clause
> and this case
> there won't be any keyuses for it. But the equality from which the
> null-rejection can be
> deduced is still there. You need range scan to reduce the number of records
> produced
> by the left operand of a join operation. And you should not care weather it
> should be the
> argument of any ref access.
> Here is the situation:
> The field of interest for range access is t1.a and you have two conditions
>t1.a=t2.a with index i1 on t2.a
>t1.b=t3.b with index i2 on t3.b
> The use prohibited to use i1. t1.a apparently remains null-rejecting. Yet it
> cannot be not detected
> as such by a look through array of keyuses.

I don't follow the logic. What MDEV-15777 does is to basically look at KEYUSE
object, which represent

  tbl.key_col = other_column

and feed this into the range optimizer:
  
  tbl.key_col IS NOT NULL

If the query CAN use a certain index:
- we will have a KEYUSE object for it
- we will feed the IS NOT NULL into the range optimizer.

if the query CANNOT use a certain index (because of IGNORE INDEX), then
- we won't have KEYUSE object for that index (and so won't construct an
  IS NOT NULL condition for it)
- range optimizer will not build ranges for it (so it doesn't really matter 
  if IS NOT NULL was constructed or not)

Can you provide a full example?

> > 
> > Because of this, the part of the patch for MDEV-15777 which analyzes the 
> > KEYUSE
> > array is very small: 122 lines including the comment (I counted
> > make_null_rejecting_conds() and add_cond()).
> > 
> > I think, the CPU overhead is small, too.
> > 
> > > > 3) a < func(b,c)
> > > > we do not handle this case, because:
> > > > 
> > > > a) It is harder to process. We would need to process the whole WHERE
> > > > clause to infer
> > > > non-nullability of the condition. Example:
> > > > 
> > > >(t1.f2+1 < t1.f1+t1.f3) OR ... OR t1.f1 is null
> > > > 
> > > > here the left part of the OR condition doesn't allow f1 to be NULL, but
> > > > the
> > > > right condition allows it. We would also need to take into account 
> > > > tables
> > > > inside outer joins. We would need to go through Item classes and add 
> > > > code
> > > > which say "Item_xxx() will compute to false when its argument Y is null"
> > > > (there is item->not_null_tables() currently, but not 
> > > > not_null_columns()).
> > > Have you actually looked at the implementations of the virtual function
> > > not_null_tables()?
> > > Do you need me to provide the implementation of the virtual function
> > > not_null_columns()?
> > It will require multiple implementations. At the moment we have:
> > 
> > nm mysqld --demangle | grep 'Item.*::not_null_tables'  | wc -l
> > 21
> > 
> > Another question: do you intend to collect not_null_columns() for all 
> > columns,
> > or just columns that are a part of some index?
> > 
> > What would a good data structure to store a set of tablename.column_name ?
> It will be just a bitmap created for each table that can be used for many
> other purposes.
> If you want to filter out those that are not part of any index it can be
> easily done
> (even once for any table share)

My point was that constructing such a bitmap is a lot heavier than the
not_null_table_map().

table.h declares a field bitmap class:

  typedef Bitmap Field_map;

but it's huge:

(gdb) p sizeof(Field_map)
  $52 = 544

and note that we will not be able to just put one Field_map into the TABLE
object:

If we think about how to compute not_null_columns() for Item_cond_and or

Re: [Maria-developers] [Commits] 53d0e921794: MDEV-11094: Blackhole table updates on slave fail when row annotation is enabled

2019-05-27 Thread andrei . elkin
Sujatha, howdy.

Thank you for spelling it out for me!

> Hello Andrei,
...
>>>
>>> Analysis:
>>> =
>>> Enabling "replicate_annotate_row_events" on slave, Tells the slave to write
>>> annotate rows events received from the master to its own binary log. The
>>> received annotate events are applied after the Gtid event as shown below.
>>> thd->query() will be set to the actual query received from the master, 
>>> through
>>> annotate event. Annotate_rows event should not be deleted after the event is
>>> applied as the thd->query will be used to generate new Annotate_rows event
>>> during applying the subsequent Rows events.
>> [ here ]
>>The Table_map and Rows_log_event that follow do not touch thd->query().
>
> The Table_map and Rows_log_event that follow "Annotate_rows" event do
> make use of "thd->query()".

Frankly I did not expect that. Perhaps it would have not been a bad idea
to make Annotate_rows_log_event::do_apply_event() logging of
itself, which still makes sense even if the slave applier will filter
out its group's Rows_log_event:s so in this case the slave side event group
would consists of mere Annotate.

The above is a mere remark though, I have to accept what we have.

>>
>> I also suggest to check out mysqlbinlog run. There seems to be no test
>> for that though, so one needs to be written. The goal would be to prove
>> a created binlog output is as expected (e.g to include Annotate).
>
>
> Sure, I will add a test case.

Super! The patch is approved then.

Cheers,

Andrei

___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp