Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-10 Thread Dmitry Yemanov

10.07.2017 14:20, Paul Reeves wrote:


I can understand that this plan might appear to be invalid from the
perspective of the optimiser. But surely the whole point of adding the
PLAN clause is because I think I know better than the optimiser what I
want.


Unless the engine physically cannot execute your plan. This was exactly 
the case.



So to return to the original query:

   where e.EMP_NO = COALESCE( ?, e.emp_no )

surely I ought to be able to add

   PLAN (E INDEX (RDB$PRIMARY7))

Then if the parameter resolves to a value at run-time the index will be
used. Obviously if the parameter is null performance would be much
worse than a natural scan because the engine must walk the index pages
and then walk the data pages. But the customer is always right,
surely :-)


You seem to believe that the engine can execute the INDEX plan without 
lower/upper bounds (which require e.emp_no to be known in advance). In 
theory, it could. In practice, it's pointless and thus prohibited by the 
optimizer.



Why not create a second stream in that case?

that seems to be what happens when I create this query:

SELECT  *
FROM employee
WHERE
  EMP_NO = 2
OR
  EMP_NO is null

this plan is chosen...

   PLAN (EMPLOYEE INDEX (RDB$PRIMARY7, RDB$PRIMARY7))


This is still a single stream and bitmap based on two index scans.

Solution with two streams (chosen conditionally at runtime) is 
implemented in FB3 but only for some specific syntax. Perhaps your 
COALESCE trick could join this special category, but it doesn't yet.



Because there are two streams the fetches and the indexed reads more or
less double because of the OR clause.

(Again, a subject for another day is why does the optimiser even bother
with the second stream when EMP_NO can never be null ? )


The optimizer don't take NOT NULL constraints into account. They don't 
have existence locks and thus can affect query results if dropped in the 
meantime.



Dmitry

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-10 Thread Paul Reeves
On Mon, 10 Jul 2017 12:31:43 +0300 Dmitry Yemanov wrote

> 
> > The other question I asked was...
> > 
> >  why can't I force the plan...
> >
> >PLAN (E INDEX (RDB$PRIMARY7))  
> 
> Because it's invalid in this case.
> 

I can understand that this plan might appear to be invalid from the
perspective of the optimiser. But surely the whole point of adding the
PLAN clause is because I think I know better than the optimiser what I
want. So to return to the original query:

  where e.EMP_NO = COALESCE( ?, e.emp_no )

surely I ought to be able to add 

  PLAN (E INDEX (RDB$PRIMARY7))  

Then if the parameter resolves to a value at run-time the index will be
used. Obviously if the parameter is null performance would be much
worse than a natural scan because the engine must walk the index pages
and then walk the data pages. But the customer is always right,
surely :-)

( I also understand there is a downside to the customer always being
right argument - one bad customer could screw everything up for all the
others. But is that argument being used here or is it just because of
the way the optimiser works? )


> > What are the rules for deciding when a PLAN statement will be
> > rejected?  
> >>From one logical point of view if have this sort of query  
> > 
> >select * from mytable m where afield = whatever;
> > 
> > and I add
> > 
> >plan (m index(myindex ))
> > 
> > the engine should accept that. Shouldn't it?  
> 
> It depends on "whatever". If it's literal or independent expression
> or priorly evaluated field, index scan can be used. If it includes a
> field from the same stream, it cannot.
> 

Why not create a second stream in that case?

that seems to be what happens when I create this query:

   SELECT  *
   FROM employee
   WHERE
 EMP_NO = 2  
   OR
 EMP_NO is null 

this plan is chosen...

  PLAN (EMPLOYEE INDEX (RDB$PRIMARY7, RDB$PRIMARY7))

Because there are two streams the fetches and the indexed reads more or
less double because of the OR clause. 

(Again, a subject for another day is why does the optimiser even bother
with the second stream when EMP_NO can never be null ? )

Anyway, none of the above is any sort of criticism. I'm just trying to
get a better understanding of the rules and maybe find out where things
are broken that could be fixed easily (or hacked around).


Paul
-- 
Paul Reeves
http://www.ibphoenix.com
Supporting users of Firebird
 

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-10 Thread Dmitry Yemanov

10.07.2017 12:17, Paul Reeves пишет:

On Fri, 7 Jul 2017 18:07:55 +0300 Dmitry Yemanov wrote


07.07.2017 17:51, Paul Reeves wrote:


I understand that evaluating COALESCE(?, e.emp_no ) at prepare time
may require a circular logic and is thus impractical


It cannot be done at runtime either (without deducting that e.emp_no is 
the same in both paths of the condition).



but surely
the more important piece of information is in the

   where e.emp_no =

A unique index exists on this column so surely the optimiser should
choose it instead of a natural scan ?


No, it cannot. Imagine "where T.A = T.B", it cannot use an index for 
either A or B. Only full table scan is possible.



The other question I asked was...

 why can't I force the plan...
   
   PLAN (E INDEX (RDB$PRIMARY7))


Because it's invalid in this case.


What are the rules for deciding when a PLAN statement will be rejected?

From one logical point of view if have this sort of query


   select * from mytable m where afield = whatever;

and I add

   plan (m index(myindex ))

the engine should accept that. Shouldn't it?


It depends on "whatever". If it's literal or independent expression or 
priorly evaluated field, index scan can be used. If it includes a field 
from the same stream, it cannot.



Dmitry

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-10 Thread Paul Reeves
On Fri, 7 Jul 2017 18:07:55 +0300 Dmitry Yemanov wrote

> 07.07.2017 17:51, Paul Reeves wrote:
> 
> > But that doesn't answer all my questions...
> > 
> > Given
> > 
> >where e.EMP_NO = COALESCE(?, e.emp_no )
> > 
> > and that there is an index on EMP_NO, why doesn't the optimiser
> > default to the index. After all, it is logically more likely that a
> > value will be passed in the where condition, rather than a NULL.  
> 
> To evaluate COALESCE, e.emp_no must be known. How it can be known
> before we start reading the table (via index scan)? Chicken and egg
> problem.
> 

I understand that evaluating COALESCE(?, e.emp_no ) at prepare time
may require a circular logic and is thus impractical but surely
the more important piece of information is in the 

  where e.emp_no = 

A unique index exists on this column so surely the optimiser should
choose it instead of a natural scan ?


The other question I asked was...

why can't I force the plan...
  
  PLAN (E INDEX (RDB$PRIMARY7))

This error is raised...

  'cannot be used in the specified plan'


And a supplementary question (perhaps it merits a separate thread )

What are the rules for deciding when a PLAN statement will be rejected?
>From one logical point of view if have this sort of query

  select * from mytable m where afield = whatever;

and I add 

  plan (m index(myindex ))

the engine should accept that. Shouldn't it?



Paul
-- 
Paul Reeves
http://www.ibphoenix.com
Supporting users of Firebird
 

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-07 Thread Dmitry Yemanov

07.07.2017 18:26, Adriano dos Santos Fernandes wrote:


BTW, isn't ConditionalStream used for something in this field?


Yep, but the optimizer so far handles just one specific case (A = ? or ? 
is null). It could be extended though.



Dmitry

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-07 Thread Leyne, Sean


> To evaluate COALESCE, e.emp_no must be known. 

Why?

Is it not really the case that for all practical purposes the COALESCE will 
always return a value

How else could COALESCE(?, NULL ) [Example #3] use an index?
{Nothing says that the input parameter won't be NULL}


Sean


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-07 Thread Adriano dos Santos Fernandes
On 07/07/2017 12:19, Dmitry Yemanov wrote:
> 07.07.2017 18:12, Dimitry Sibiryakov wrote:
>>
>> In this particular case it is enough to know parameter value to
>> choose plan. Parameters are known before reading table.
>
> True, but our optimizer is developed for generic cases, not such
> specific ones. It could be improved, but I'd say we have more
> important issues to solve there.
>
BTW, isn't ConditionalStream used for something in this field?


Adriano


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-07 Thread Dmitry Yemanov

07.07.2017 18:12, Dimitry Sibiryakov wrote:


In this particular case it is enough to know parameter value to 
choose plan. Parameters are known before reading table.


True, but our optimizer is developed for generic cases, not such 
specific ones. It could be improved, but I'd say we have more important 
issues to solve there.



Dmitry

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-07 Thread Dimitry Sibiryakov

07.07.2017 17:07, Dmitry Yemanov wrote:

Given

   where e.EMP_NO = COALESCE(?, e.emp_no )

and that there is an index on EMP_NO, why doesn't the optimiser default
to the index. After all, it is logically more likely that a value will
be passed in the where condition, rather than a NULL.


To evaluate COALESCE, e.emp_no must be known. How it can be known before we start reading 
the table (via index scan)? Chicken and egg problem.


  In this particular case it is enough to know parameter value to choose plan. Parameters 
are known before reading table.


--
  WBR, SD.

PS: Is it only my Thunderbird has started to send answers to authors instead of 
list?

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-07 Thread Dmitry Yemanov

07.07.2017 17:51, Paul Reeves wrote:


But that doesn't answer all my questions...

Given

   where e.EMP_NO = COALESCE(?, e.emp_no )

and that there is an index on EMP_NO, why doesn't the optimiser default
to the index. After all, it is logically more likely that a value will
be passed in the where condition, rather than a NULL.


To evaluate COALESCE, e.emp_no must be known. How it can be known before 
we start reading the table (via index scan)? Chicken and egg problem.



Dmitry

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-07 Thread Paul Reeves
On Fri, 7 Jul 2017 11:27:26 -0300 Adriano dos Santos Fernandes wrote

> >  
> It does not evaluate COALESCE at prepare time.
> 
> It's just a expression which may or may not contain fields.
> 
> If it doesn't contain fields, it will be the same as a simple "?" for
> the plan calculation purposes.
> 
> 

OK. That is not the answer I was hoping for :-)

But that doesn't answer all my questions...

Given 

  where e.EMP_NO = COALESCE(?, e.emp_no )

and that there is an index on EMP_NO, why doesn't the optimiser default
to the index. After all, it is logically more likely that a value will
be passed in the where condition, rather than a NULL.

In fact, considering the index is a primary key 

 where e.EMP_NO = NULL 

would make no sense at all. Surely the logic ought to be

  if unique index on field then use index
 

Also, even though a field is specified in the coalesce why can't I
force the plan...

  PLAN (E INDEX (RDB$PRIMARY7))

This error is raised...

  'cannot be used in the specified plan'



Paul
-- 
Paul Reeves
http://www.ibphoenix.com
Supporting users of Firebird
 

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Optimisation of where afield = coalesce(value1, value2 )

2017-07-07 Thread Adriano dos Santos Fernandes
On 07/07/2017 11:19, Paul Reeves wrote:
> Given this simple SQL statement
>
>   select e.emp_no, e.full_name
>   from employee e
>   where e.emp_no = coalesce(value1,value2 )
>
> the plan varies depending on the following :
>
> 1.  where e.EMP_NO = COALESCE(?, ? ) -- PLAN (E INDEX (RDB$PRIMARY7))
>
> 2.  where e.EMP_NO = COALESCE(2, ? ) -- PLAN (E INDEX (RDB$PRIMARY7))
>
> 3.  where e.EMP_NO = COALESCE(?, NULL ) -- PLAN (E INDEX (RDB$PRIMARY7))
>
> 4.  where e.EMP_NO = COALESCE(2, NULL ) -- PLAN (E INDEX (RDB$PRIMARY7))
>
> 5.  where e.EMP_NO = COALESCE(NULL, 2 ) -- PLAN (E INDEX (RDB$PRIMARY7))
>
> 6.  where e.EMP_NO = COALESCE(NULL, e.emp_no ) -- PLAN (E NATURAL)
>
> 7.  where e.EMP_NO = COALESCE(2, e.emp_no ) -- PLAN (E NATURAL)
>
> 8.  where e.EMP_NO = COALESCE(?, e.emp_no ) -- PLAN (E NATURAL)
>
>

...

> And examples 2, and 4 seem to prove that the optimiser is capable of
> evaluating the COALESCE during prepare and choosing an index. 
...
> However we have seen in
> examples 1 and 3 that it is equally capable of choosing an index when
> evaluating the coalesce even if the final evaluation may be null.
>
> And so here are the questions 
>
> - is the plan produced in example 8 wrong or right ?
> - the above examples seem to lack coherence so how much is COALESCE
>   evaluated during a prepare ?
> - should the optimiser assume that the first parameter will
>   always resolve to a value
> - how does the optimiser consider COALESCE when this construct is used
>   in a stored procedure?
>
>
It does not evaluate COALESCE at prepare time.

It's just a expression which may or may not contain fields.

If it doesn't contain fields, it will be the same as a simple "?" for
the plan calculation purposes.


Adriano


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel