Re: [Firebird-devel] Compiled statement cache

2022-03-03 Thread Adriano dos Santos Fernandes
On 03/03/2022 14:19, Alex Peshkoff via Firebird-devel wrote:
> On 2/28/22 20:30, Adriano dos Santos Fernandes wrote:
>> On 28/02/2022 13:18, Alex Peshkoff via Firebird-devel wrote:
>>> I suppose it's with mentoned 2Mb cache.
>> Yes. But it's the same even with 100K as the test uses very few and
>> small statements.
> 
> Taking into an account that bigmost was 21k there are really few
> statements :)
> Did you try with smaller cache size?
> 

No, only with it disabled.

I still have some problems to detect statements size with more precision.


> At the first glance that's changing index active/inactive

This should be already invalidating cache, as it's a DDL statement.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-03-03 Thread Dmitry Yemanov

03.03.2022 20:19, Alex Peshkoff via Firebird-devel wrote:


I think there are two possible ways:

- Timeout of cached statement (counting from its first appearance in
cache, not last usage)


Yes, re-preparing all statements once per relatively big timeout should 
not cause visible performance problem.


Better it should be "re-optimizing" (just CMP_post_rse) rather than 
"re-preparing". The problem is that the optimizer uses the same request 
pool and all its allocations are "delete-by-pool", so re-optimization is 
gonna to become a memory leak. Perhaps we'll need to add an explicit 
request's child pool for optimization purposes which can be re-created 
during re-optimization.



- When engine detects a condition which could change a plan, it may ask
cache for invalidation.


At the first glance that's changing index active/inactive and in the 
future bulk insert. That's first things that come to my mind.


It's more complex than that, but I think this is a 2nd priority task and 
somewhat later I'll jump in to help with the optimizer part.



Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-03-03 Thread Alex Peshkoff via Firebird-devel

On 2/28/22 20:30, Adriano dos Santos Fernandes wrote:

On 28/02/2022 13:18, Alex Peshkoff via Firebird-devel wrote:

I suppose it's with mentoned 2Mb cache.

Yes. But it's the same even with 100K as the test uses very few and
small statements.


Taking into an account that bigmost was 21k there are really few 
statements :)

Did you try with smaller cache size?




One more question. Suppose some statement remains in a cache for very
long time cause it's reused again and again. That's fine - but with DB
grow optimal plan can change. Is it solved somehow? Suppose in first
implementation not,

Correct.

And it's a problem also now when application holds prepared statement
manually, but that probably is not going to be changed.


Yes, it's out of scope.




but are there plans to solve it?


I think there are two possible ways:

- Timeout of cached statement (counting from its first appearance in
cache, not last usage)


Yes, re-preparing all statements once per relatively big timeout should 
not cause visible performance problem.



- When engine detects a condition which could change a plan, it may ask
cache for invalidation.


At the first glance that's changing index active/inactive and in the 
future bulk insert. That's first things that come to my mind.






Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-28 Thread Adriano dos Santos Fernandes
On 28/02/2022 13:18, Alex Peshkoff via Firebird-devel wrote:
> I suppose it's with mentoned 2Mb cache.

Yes. But it's the same even with 100K as the test uses very few and
small statements.


> 
> One more question. Suppose some statement remains in a cache for very
> long time cause it's reused again and again. That's fine - but with DB
> grow optimal plan can change. Is it solved somehow? Suppose in first
> implementation not,

Correct.

And it's a problem also now when application holds prepared statement
manually, but that probably is not going to be changed.


> but are there plans to solve it?
> 

I think there are two possible ways:

- Timeout of cached statement (counting from its first appearance in
cache, not last usage)

- When engine detects a condition which could change a plan, it may ask
cache for invalidation.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-28 Thread Alex Peshkoff via Firebird-devel

On 2/28/22 16:54, Adriano dos Santos Fernandes wrote:

On 26/02/2022 22:05, Adriano dos Santos Fernandes wrote:

I will report back some numbers (memory, prepare times).


Here is a performance test I did: https://pastebin.com/3UnB5BNU

It's relative simple (not much indices) and common metadata.

It's tested as whole (create database, create metadata, populate, run
queries and get results, drop database).

The last select has a statement size of 21.5KB.

For reference, "select 1 from rdb$database" has size 6KB.

Test timings:
- Without statement cache: 19800ms
- With statement cache: 16700ms


More than 10% is great!
I suppose it's with mentoned 2Mb cache.

One more question. Suppose some statement remains in a cache for very 
long time cause it's reused again and again. That's fine - but with DB 
grow optimal plan can change. Is it solved somehow? Suppose in first 
implementation not, but are there plans to solve it?





Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-28 Thread Adriano dos Santos Fernandes
On 26/02/2022 22:05, Adriano dos Santos Fernandes wrote:
> 
> I will report back some numbers (memory, prepare times).
> 

Here is a performance test I did: https://pastebin.com/3UnB5BNU

It's relative simple (not much indices) and common metadata.

It's tested as whole (create database, create metadata, populate, run
queries and get results, drop database).

The last select has a statement size of 21.5KB.

For reference, "select 1 from rdb$database" has size 6KB.

Test timings:
- Without statement cache: 19800ms
- With statement cache: 16700ms


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-26 Thread Adriano dos Santos Fernandes
On 26/02/2022 14:43, Dmitry Yemanov wrote:
> 26.02.2022 17:14, Adriano dos Santos Fernandes wrote:
>>
>> I do want to define default cache size.
>>
>> I'm thinking in 2M.
>>
>> Comments?
> 
> We need to start with something, so why not. However, it would be
> helpful to know what are the "common" statement sizes for tables,
> procedures, etc. Of course, table with one column much differs from
> table with 100 computed columns, as well as a one-liner procedure
> differs from a 1MB-BLR one. But maybe you have more or less real
> databases to get these estimations from.
> 

Procedure body does not count on this cache, as only the user issued
statements counts, not recursively counting others called routines.

But amount of parameters count. So as complex EXECUTE BLOCK. And EXECUTE
BLOCK' sub routines.

But memory size of statement is not always related to compile time, and
cache is important for time-consuming compilation too. A situation I'd
think here is when there are many indices to analyze.

I have seem that very simple selects would have a size of few KBs.

So my reason to think in 2M is that even a database with many
connections (say 100) will not have a very memory-consuming cache if
default is not changed.

I will report back some numbers (memory, prepare times).


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-26 Thread Carlos H. Cantu
Maybe IBSurgeon can help with this, since HQBird monitor statements and they 
have lots of customers using it... probably an average can be calculated based 
on them.

[]s
Carlos
http://www.firebirdnews.org
FireBase - http://www.FireBase.com.br

DY> 26.02.2022 17:14, Adriano dos Santos Fernandes wrote:
>> > I do want to define default cache size.
>> > I'm thinking in 2M.
>> > Comments?

DY> We need to start with something, so why not. However, it would be helpful 
to know what are the "common" statement sizes for tables, procedures, etc. Of 
course, table with one column much differs from table with 100 computed 
columns, as well as a one-liner procedure differs from a 1MB-BLR one. But maybe 
you have more or less real databases to get these estimations from.


DY> Dmitry


DY> Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-26 Thread Dmitry Yemanov

26.02.2022 17:14, Adriano dos Santos Fernandes wrote:


I do want to define default cache size.

I'm thinking in 2M.

Comments?


We need to start with something, so why not. However, it would be 
helpful to know what are the "common" statement sizes for tables, 
procedures, etc. Of course, table with one column much differs from 
table with 100 computed columns, as well as a one-liner procedure 
differs from a 1MB-BLR one. But maybe you have more or less real 
databases to get these estimations from.



Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-26 Thread Alex Peshkoff via Firebird-devel

On 2/26/22 17:14, Adriano dos Santos Fernandes wrote:

Hi!

I do want to define default cache size.

I'm thinking in 2M.

Comments?


As long as it's one_place_change constant - any value is OK for begining.




Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-26 Thread Adriano dos Santos Fernandes
Hi!

I do want to define default cache size.

I'm thinking in 2M.

Comments?


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 14:44, Dmitry Yemanov wrote:
> 10.02.2022 20:18, Adriano dos Santos Fernandes wrote:
>>
>> We should have better strategy for request cache inside the statement.
>> If they are cheap to create, it would make no sense to never destroy
>> them like now.
> 
> Looking at Statement::getRequest() I see it as dirty cheap, just a
> matter of few allocations.
> 

Yes, but the impure space stores more allocations as execution is
happening and this also serve as cache for later executions.

So I think we have at least maintain a few ready instead of destroy all
of them directly.

And it seems we currently do not track later allocations stored in
requests (for example vlu_string). And requests uses the statement pool.

So maybe requests should have they own pool. In non-shared cache they
could be sub-pool of the statement pool, but with shared cache they
probably would need to be child of the attachment pool.


>> Please note that requests (both DSQL and JRD) are also created from the
>> statement pools.
>>
>> But if there are active requests, I think the statement should not even
>> be considered to be taken out of cache. It's necessary in this case, so
>> in reality it does not use cache space.
>>
>> So I think cache size (to remove least recent used) should not consider
>> active (necessary) statements.
> 
> Getting rid of long-running but rare statements may be useful. If we
> speak LRU, then the cached statement should be stamped when a new child
> request is created. So we may defer "uncaching" of active request until
> it's freed by user (if its statement weren't re-stamped in the
> meantime). It's size should not be taken into account, as you say.
> 

Yes.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dmitry Yemanov

10.02.2022 20:18, Adriano dos Santos Fernandes wrote:


We should have better strategy for request cache inside the statement.
If they are cheap to create, it would make no sense to never destroy
them like now.


Looking at Statement::getRequest() I see it as dirty cheap, just a 
matter of few allocations.



I would calculate size of statement as sum of DSQL statement pool + JRD
statement pool after it is prepared.


Sounds reasonable.


Please note that requests (both DSQL and JRD) are also created from the
statement pools.

But if there are active requests, I think the statement should not even
be considered to be taken out of cache. It's necessary in this case, so
in reality it does not use cache space.

So I think cache size (to remove least recent used) should not consider
active (necessary) statements.


Getting rid of long-running but rare statements may be useful. If we 
speak LRU, then the cached statement should be stamped when a new child 
request is created. So we may defer "uncaching" of active request until 
it's freed by user (if its statement weren't re-stamped in the 
meantime). It's size should not be taken into account, as you say.



Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 12:30, Vlad Khorsun wrote:
>   So, main benefit visible to end-user is to save prepare and check access
> time, correct ?

Yes.


> Also, good written apps (that re-uses prepared statements)
> will not see much changes - at least until impl of shared metadata cache.
> 

I consider ORM written applications as good ones.

And I would not consider them good if they cache statements in the
client requiring extra efforts to avoid locks in DDL changes in another
connections.

Also good written applications uses connection pools and will benefit
from the cache.

It's like saying the current sharing of JRD statements from
procedures/functions/triggers inside the attachment has no value because
it's not shared.

So I do not agree with your points.


> 
>> Also reuse of cached statements reduces memory consumption of individual
>> uncached identical statements.
> 
>   I.e. when application uses more than one instance of the same
> statement in
> the same connection ? Hard to imagine, but everything possible...
> 

ORMs tends to execute parameterized queries that is always executed
again and again in the course of the application execution.

This is very common pattern.


>   The main idea is to allow to disable user cache but not system cache.
> Are you going to move IRQ_REQUESTS and DYN_REQUESTS into such system cache,
> or am I too optimistic ?
> 

As I said in another answer, I consider this a different topic.


>> If they are not, them when roles are different than one present in
>> cached statement, a verifyAccess would need to be called on the
>> statement get from the cache before it's usage is allowed.
> 
>   I ask because such restrictions could make stmt cache less useful.
> 
> For example, we could keep list of already verified set of credentials
> (user name + roles list) with cached stmt to not include it into key.
> 

Yes, this is good.


>   I want to clarify - what happens when app prepares (or execute) two
> identical stmts ? First instance could be taken from cache, ok. What
> happens with second instance ?
> 

First statement is prepared like before (parse, DSQL passes, GEN, JRD
compile, JRD passes, DSQL messages) and then inserted into the cache.
Will then create DSQL request with dsqlStatement->createRequest().

Second prepare will see the statement key in the cache and get
DsqlStatement from it, check access and do dsqlStatement->createRequest().

Statement cache is basically a map>.

DSqlStatement has the (Jrd)Statement inside it.

So cached statements avoid all DSQL and JRD compilation passes.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 14:13, Dimitry Sibiryakov wrote:
> Dmitry Yemanov wrote 10.02.2022 18:07:
>>> But as I said and Vlad also said, we can remove roles from key and
>>> verify (with verification cache) after get statement from cache. This
>>> would be better.
>>
>> Yes, this gets my vote too.
> 
>   On the other hand ACL verification used to be a known bottleneck in
> the past so caching it may gain more that everything else in total.
> 

It's why I said: *with verification cache*.

ACL changes would them invalidate verification cache only.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 13:13, Dmitry Yemanov wrote:
> 08.02.2022 16:36, Adriano dos Santos Fernandes wrote:
>>
>> First what should be the statement key in the cache?
>>
>> I've peek these:
>> - statement text
>> - clientDialect
>> - isInternalRequest
>> - current client charset (as external engines may change it)
> 
> Cannot the UTF8-translated SQL text (which is currently inside
> Statement::sqlText) be the key, to avoid dependency on the charset? I
> don't think that semantically different statements may have the same
> UTF8 representation. However, it would make sense to have a single
> cached statement for the same statement executed from different client
> charsets.
> 

Answered in another topic.


>> I see three approaches:
>> - 1. Timeout after its put in the cache, updated when it is get from it
>> - 2. LRU based on memory consumption and max cache size
>> - 3. Both 1. and 2.
> 
> I'd start with 2.

Ok.


> How are you going to calculate the memory consumption?
> Size of the statement pool + impureSize?
> 

I don't think the impureSize should be taken. It's related to request
executions.

We should have better strategy for request cache inside the statement.
If they are cheap to create, it would make no sense to never destroy
them like now.

I would calculate size of statement as sum of DSQL statement pool + JRD
statement pool after it is prepared.

Please note that requests (both DSQL and JRD) are also created from the
statement pools.

But if there are active requests, I think the statement should not even
be considered to be taken out of cache. It's necessary in this case, so
in reality it does not use cache space.

So I think cache size (to remove least recent used) should not consider
active (necessary) statements.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Dmitry Yemanov wrote 10.02.2022 18:07:

But as I said and Vlad also said, we can remove roles from key and
verify (with verification cache) after get statement from cache. This
would be better.


Yes, this gets my vote too.


  On the other hand ACL verification used to be a known bottleneck in the past 
so caching it may gain more that everything else in total.


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 13:34, Dimitry Sibiryakov wrote:
> Dmitry Yemanov wrote 10.02.2022 17:28:
>>> Only if such translation is made right. Remember charset introducers.
>>
>> They're a problem, but it may only cause a cache miss, not a false
>> match, right?
> 
>   Yes, they can only cause miss but I'm not sure about literals as
> whole. Are they stored in execution tree already transliterated or with
> charset mark?
>   Can the query "insert into t values ('абв')" sent from attachment with
> different charsets (win1251 and utf-8 for example) hit the cache at all?
> Should it?..
> 

Please also note that transform queries to parameterized ones may be
problematic for future optimizer improvements (histograms).


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 13:28, Dmitry Yemanov wrote:
> 
> In the ideal world, maybe. But this is *much* more complicated than it
> seems at the first glance.
> 

I'd even go further and say we should not slow down things to catch this.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dmitry Yemanov

10.02.2022 19:54, Adriano dos Santos Fernandes wrote:


I come with this requirement because verifyAccess is currently part of
compilation.

But as I said and Vlad also said, we can remove roles from key and
verify (with verification cache) after get statement from cache. This
would be better.


Yes, this gets my vote too.


Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 13:21, Dimitry Sibiryakov wrote:
> Dmitry Yemanov wrote 10.02.2022 17:13:
>> Cannot the UTF8-translated SQL text (which is currently inside
>> Statement::sqlText) be the key, to avoid dependency on the charset? I
>> don't think that semantically different statements may have the same
>> UTF8 representation. However, it would make sense to have a single
>> cached statement for the same statement executed from different client
>> charsets.
> 
>   Only if such translation is made right. Remember charset introducers.
>   But apparently to transform the query before using it as a cache key
> is a right idea. Two queries different only by some whitespaces,
> comments or case (unless in literals) should not miss the cache.
> 

I think the main cache objective is not to lose time processing text and
caching these cases, like ad hoc different queries may have. Only basic
left/right trim would be more than enough.

Applications uses queries and they do not create extra random spaces in
statement text everytime they prepare/execute a query.

Applications also tends to use the same character set.

But main problem is because messages (both DSQL and JRD) are completely
different when different charsets are used.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 13:20, Dmitry Yemanov wrote:
> 10.02.2022 16:01, Adriano dos Santos Fernandes wrote:
>>
>> (Jrd)Statement is reused - new jrd_req are get from same statement.
> 
> IIRC, the existing cache of internal requests preserves jrd_req's. Am I
> right that after the jrd_req->Statement refactoring the cost of creation
> of new jrd_req is trivial, so it does not make sense to preserve them?
> 

Also, in the refactoring already made, the only change about
Statement/jrd_req is that now Statement could be created without an
initial jrd_req. Previously the internal API always created an initial
jrd_req.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 13:20, Dmitry Yemanov wrote:
> 10.02.2022 16:01, Adriano dos Santos Fernandes wrote:
>>
>> (Jrd)Statement is reused - new jrd_req are get from same statement.
> 
> IIRC, the existing cache of internal requests preserves jrd_req's. Am I
> right that after the jrd_req->Statement refactoring the cost of creation
> of new jrd_req is trivial, so it does not make sense to preserve them?
> 

I'm talking about compiled cache at DSQL level.

We currently have one or two internal DSQL request, but I don't think
it's a way to go now (or directly related to this) and expand it. I have
some ideas for it which would replace GDML by something better, without
preprocessor but also without lose type safety, but I think this is
another topic.

So in this thread I'm not considering changes in internal (GDML) requests.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 12:43, Dmitry Yemanov wrote:
> 10.02.2022 15:57, Adriano dos Santos Fernandes wrote:
>>
>>>
>>> If we need to take roles into an account - only for attachment with same
>>> USER.
>>
>> Even without shared cache, user can change its roles with SET ROLES and
>> new prepared statements should work as before even when they were
>> previously cached with different roles.
> 
> I'm not sure I get why security credentials should affect the cache at
> all. From the runtime POV, all BLR/SQL operations
> (current_user/current_role/rdb$*_roles) are redirected to Attachment,
> AFAIK we don't store anything role-specific inside the statement tree.
> From the security POV, we just need to execute verifyAccess() for the
> request retrieved from the cache.
> 
> What am I missing?
> 

I come with this requirement because verifyAccess is currently part of
compilation.

But as I said and Vlad also said, we can remove roles from key and
verify (with verification cache) after get statement from cache. This
would be better.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Alex Peshkoff via Firebird-devel wrote 10.02.2022 17:35:
If we see solid effect from new cache - certainloy, it will be worth to make it 
better prepare search key. I.e. let's start from something simple and see does 
it make sense.


  That's why I suggested to ask kdv for stats: to prevent a lot of work without 
visible gain.


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Alex Peshkoff via Firebird-devel

On 2/10/22 19:21, Dimitry Sibiryakov wrote:

Dmitry Yemanov wrote 10.02.2022 17:13:
Cannot the UTF8-translated SQL text (which is currently inside 
Statement::sqlText) be the key, to avoid dependency on the charset? I 
don't think that semantically different statements may have the same 
UTF8 representation. However, it would make sense to have a single 
cached statement for the same statement executed from different 
client charsets.


  Only if such translation is made right. Remember charset introducers.
  But apparently to transform the query before using it as a cache key 
is a right idea. Two queries different only by some whitespaces, 
comments or case (unless in literals) should not miss the cache.




I think that for the first time it\s enough to implement simple most 
cache. IMHO more precise detection (like different whitespaces, etc.) of 
same statements hardly makes big dufference in most cases. If we see 
solid effect from new cache - certainloy, it will be worth to make it 
better prepare search key. I.e. let's start from something simple and 
see does it make sense.





Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Dmitry Yemanov wrote 10.02.2022 17:28:

Only if such translation is made right. Remember charset introducers.


They're a problem, but it may only cause a cache miss, not a false match, right?


  Yes, they can only cause miss but I'm not sure about literals as whole. Are 
they stored in execution tree already transliterated or with charset mark?
  Can the query "insert into t values ('абв')" sent from attachment with 
different charsets (win1251 and utf-8 for example) hit the cache at all? Should 
it?..


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dmitry Yemanov

10.02.2022 19:21, Dimitry Sibiryakov wrote:


Only if such translation is made right. Remember charset introducers.


They're a problem, but it may only cause a cache miss, not a false 
match, right?


But apparently to transform the query before using it as a cache key 
is a right idea. Two queries different only by some whitespaces, 
comments or case (unless in literals) should not miss the cache.


In the ideal world, maybe. But this is *much* more complicated than it 
seems at the first glance.



Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Dmitry Yemanov wrote 10.02.2022 17:13:
Cannot the UTF8-translated SQL text (which is currently inside 
Statement::sqlText) be the key, to avoid dependency on the charset? I don't 
think that semantically different statements may have the same UTF8 
representation. However, it would make sense to have a single cached statement 
for the same statement executed from different client charsets.


  Only if such translation is made right. Remember charset introducers.
  But apparently to transform the query before using it as a cache key is a 
right idea. Two queries different only by some whitespaces, comments or case 
(unless in literals) should not miss the cache.


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dmitry Yemanov

10.02.2022 16:01, Adriano dos Santos Fernandes wrote:


(Jrd)Statement is reused - new jrd_req are get from same statement.


IIRC, the existing cache of internal requests preserves jrd_req's. Am I 
right that after the jrd_req->Statement refactoring the cost of creation 
of new jrd_req is trivial, so it does not make sense to preserve them?



Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dmitry Yemanov

08.02.2022 16:36, Adriano dos Santos Fernandes wrote:


First what should be the statement key in the cache?

I've peek these:
- statement text
- clientDialect
- isInternalRequest
- current client charset (as external engines may change it)


Cannot the UTF8-translated SQL text (which is currently inside 
Statement::sqlText) be the key, to avoid dependency on the charset? I 
don't think that semantically different statements may have the same 
UTF8 representation. However, it would make sense to have a single 
cached statement for the same statement executed from different client 
charsets.



- active roles


Already wrote about that.


Then there is when statements should go out of cache?

I see three approaches:
- 1. Timeout after its put in the cache, updated when it is get from it
- 2. LRU based on memory consumption and max cache size
- 3. Both 1. and 2.


I'd start with 2. How are you going to calculate the memory consumption? 
Size of the statement pool + impureSize?



Should it be enabled by default?


Maybe, with a reasonable size. As Mark says, it doesn't make sense to 
disable new features by default ;-)



Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Dmitry Yemanov wrote 10.02.2022 16:43:
I'm not sure I get why security credentials should affect the cache at all. From 
the runtime POV, all BLR/SQL operations (current_user/current_role/rdb$*_roles) 
are redirected to Attachment, AFAIK we don't store anything role-specific inside 
the statement tree. From the security POV, we just need to execute 
verifyAccess() for the request retrieved from the cache.


What am I missing?


  I guess Adriano is going to invalidate the cache on ACL changes.

--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Vlad Khorsun

10.02.2022 17:35, Dimitry Sibiryakov wrote:

Vlad Khorsun wrote 10.02.2022 16:30:

   More, I already implemented simple cache of prepared statements near 5 years
ago and I know when it is useful :) It was not ported into Firebird because of
limited usage and its simplicity. But customer which uses it, was very happy.


   Isn't it inside of EXECUTE STATEMENT implementation?


 No.

Regards,
Vlad


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dmitry Yemanov

10.02.2022 15:57, Adriano dos Santos Fernandes wrote:




If we need to take roles into an account - only for attachment with same
USER.


Even without shared cache, user can change its roles with SET ROLES and
new prepared statements should work as before even when they were
previously cached with different roles.


I'm not sure I get why security credentials should affect the cache at 
all. From the runtime POV, all BLR/SQL operations 
(current_user/current_role/rdb$*_roles) are redirected to Attachment, 
AFAIK we don't store anything role-specific inside the statement tree. 
From the security POV, we just need to execute verifyAccess() for the 
request retrieved from the cache.


What am I missing?


Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Vlad Khorsun wrote 10.02.2022 16:30:

   More, I already implemented simple cache of prepared statements near 5 years
ago and I know when it is useful :) It was not ported into Firebird because of
limited usage and its simplicity. But customer which uses it, was very happy.


  Isn't it inside of EXECUTE STATEMENT implementation?

--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Vlad Khorsun

10.02.2022 14:56, Adriano dos Santos Fernandes wrote:

On 10/02/2022 06:44, Vlad Khorsun wrote:

08.02.2022 15:36, Adriano dos Santos Fernandes wrote:

Hi!

I have refactored DSQL statements/requests (continued worked of many
time ago) to separate shared (statement) and specific (request) parts.

It seems this ground work for compiled statement cache is complete and I
have even did a (very very) initial version of a compiled statement
cache working.


   First, please, describe - what goals\benefits of this cache ? Especially,
taking  into account current per-attachment metadata.



In ideal world applications prepare their statement and execute them
when that is going to be repeated.

I would even consider a bug if they do not do that correctly in
situations where an application process starts and do things in a loop.

But it's not surprise that many applications are not well coded and
sometimes does not do this. Competitor DBMSs has cache and works well in
this case too.

But more important to make bugged applications better, it's to make more
well coded applications better.

For example if ORM frameworks (you like it or not - they are used a lot)
cache prepared statements for better performance, that will have side
effects:
- It cannot control server memory usage.
- It locks not currently used objects preventing database changes.

I have seem in test a not very complex statement having it's prepare
time reduced by 50% when it's cached.


  So, main benefit visible to end-user is to save prepare and check access
time, correct ? Also, good written apps (that re-uses prepared statements)
will not see much changes - at least until impl of shared metadata cache.

  I'm not against of caching statements by the engine, I just want to explore
both sides of coin.

  More, I already implemented simple cache of prepared statements near 5 years
ago and I know when it is useful :) It was not ported into Firebird because of
limited usage and its simplicity. But customer which uses it, was very happy.


Also reuse of cached statements reduces memory consumption of individual
uncached identical statements.


  I.e. when application uses more than one instance of the same statement in
the same connection ? Hard to imagine, but everything possible...


Now I think it's better to discuss its semantics.

First what should be the statement key in the cache?


   All what affects statement compilation process, at least.


I've peek these:
- statement text
- clientDialect
- isInternalRequest


   Why it is important ? Do we have internal DSQL requests now ?


Yes.



Is it makes sense to have two caches - for internal and for user
statements ?



In long term I think yes, but it's not two cached, it's just a piece of
the key. And not used cached internal requests would go out of cache.

Anyway, it's very easy to disable it if it's considered as not important
now.


  The main idea is to allow to disable user cache but not system cache.
Are you going to move IRQ_REQUESTS and DYN_REQUESTS into such system cache,
or am I too optimistic ?


- current client charset (as external engines may change it)
- active roles


   If\when shared metadata will be implemented - will it be possible to
use cached compiled statement created in one attachment to use by another
attachment ?


It should be the easier part in the process of shared metadata changes.



If yes, does it means that another attachment should use same
client charset


Yes.



and active roles to be able to use cached compiled
statement ?



About roles, they may not necessary be part of the cache key.

If they are not, them when roles are different than one present in
cached statement, a verifyAccess would need to be called on the
statement get from the cache before it's usage is allowed.


  I ask because such restrictions could make stmt cache less useful.

For example, we could keep list of already verified set of credentials
(user name + roles list) with cached stmt to not include it into key.

...


   And you not explained cache usage - when and how cached statement
should be used.



I'm not sure you want more additional information than one I replied here.


  I want to clarify - what happens when app prepares (or execute) two
identical stmts ? First instance could be taken from cache, ok. What
happens with second instance ?

Regards,
Vlad


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Alex Peshkoff via Firebird-devel

On 2/10/22 15:57, Adriano dos Santos Fernandes wrote:

On 10/02/2022 06:59, Alex Peshkoff via Firebird-devel wrote:

If we need to take roles into an account - only for attachment with same
USER.


Even without shared cache, user can change its roles with SET ROLES and
new prepared statements should work as before even when they were
previously cached with different roles.


Some of them should not wotk at all - provided access to some objects in 
them was granted to previous role only.





Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Adriano dos Santos Fernandes wrote 10.02.2022 14:08:

Some times yes, some times no (it's identical statement reprepared or
executed).


  In this case you could ask IBSurgeon for their statistics about query 
prepare/execute ratio beforehand to estimate possible gain from the cache. IIRC 
during conferences they shown really awful numbers.


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 10:04, Dimitry Sibiryakov wrote:
> Adriano dos Santos Fernandes wrote 10.02.2022 13:56:
>> But it's not surprise that many applications are not well coded and
>> sometimes does not do this. Competitor DBMSs has cache and works well in
>> this case too.
> 
>   Such applications used to build the queries ad-hock with data as
> literals.

Some times yes, some times no (it's identical statement reprepared or
executed).


> The competitors has a way to forcefully parameterize such
> queries and without it the cache is useless because every query is
> different.
> 

Yes, this is possible and some do it.

But I do not see it in the scope of a first version.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Adriano dos Santos Fernandes wrote 10.02.2022 13:56:

But it's not surprise that many applications are not well coded and
sometimes does not do this. Competitor DBMSs has cache and works well in
this case too.


  Such applications used to build the queries ad-hock with data as literals. 
The competitors has a way to forcefully parameterize such queries and without it 
the cache is useless because every query is different.


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 07:36, Dimitry Sibiryakov wrote:
> Alex Peshkoff via Firebird-devel wrote 10.02.2022 10:59:
>>>   If\when shared metadata will be implemented - will it be possible to
>>> use cached compiled statement created in one attachment to use by
>>> another
>>> attachment ? 
>>
>> If we need to take roles into an account - only for attachment with
>> same USER.
> 
>   It depends on what exactly is going to be cached. If only execution
> plan is cached - it is not affected any by current user, any by role.
> 

Every DSQL structure (messages structures, for example) is cached.

(Jrd)Statement is reused - new jrd_req are get from same statement.

Things related with the DSQL request (message buffers) are created for
each request.


Adriano


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 06:59, Alex Peshkoff via Firebird-devel wrote:
> 
> If we need to take roles into an account - only for attachment with same
> USER.
> 

Even without shared cache, user can change its roles with SET ROLES and
new prepared statements should work as before even when they were
previously cached with different roles.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Adriano dos Santos Fernandes
On 10/02/2022 06:44, Vlad Khorsun wrote:
> 08.02.2022 15:36, Adriano dos Santos Fernandes wrote:
>> Hi!
>>
>> I have refactored DSQL statements/requests (continued worked of many
>> time ago) to separate shared (statement) and specific (request) parts.
>>
>> It seems this ground work for compiled statement cache is complete and I
>> have even did a (very very) initial version of a compiled statement
>> cache working.
> 
>   First, please, describe - what goals\benefits of this cache ? Especially,
> taking  into account current per-attachment metadata.
> 

In ideal world applications prepare their statement and execute them
when that is going to be repeated.

I would even consider a bug if they do not do that correctly in
situations where an application process starts and do things in a loop.

But it's not surprise that many applications are not well coded and
sometimes does not do this. Competitor DBMSs has cache and works well in
this case too.

But more important to make bugged applications better, it's to make more
well coded applications better.

For example if ORM frameworks (you like it or not - they are used a lot)
cache prepared statements for better performance, that will have side
effects:
- It cannot control server memory usage.
- It locks not currently used objects preventing database changes.

I have seem in test a not very complex statement having it's prepare
time reduced by 50% when it's cached.

Also reuse of cached statements reduces memory consumption of individual
uncached identical statements.


>> Now I think it's better to discuss its semantics.
>>
>> First what should be the statement key in the cache?
> 
>   All what affects statement compilation process, at least.
> 
>> I've peek these:
>> - statement text
>> - clientDialect
>> - isInternalRequest
> 
>   Why it is important ? Do we have internal DSQL requests now ?

Yes.


> Is it makes sense to have two caches - for internal and for user
> statements ?
> 

In long term I think yes, but it's not two cached, it's just a piece of
the key. And not used cached internal requests would go out of cache.

Anyway, it's very easy to disable it if it's considered as not important
now.


>> - current client charset (as external engines may change it)
>> - active roles
> 
>   If\when shared metadata will be implemented - will it be possible to
> use cached compiled statement created in one attachment to use by another
> attachment ?

It should be the easier part in the process of shared metadata changes.


> If yes, does it means that another attachment should use same
> client charset

Yes.


> and active roles to be able to use cached compiled
> statement ?
> 

About roles, they may not necessary be part of the cache key.

If they are not, them when roles are different than one present in
cached statement, a verifyAccess would need to be called on the
statement get from the cache before it's usage is allowed.


>> Do you see any thing more?
>>
>> Then there is when statements should go out of cache?
>>
>> I see three approaches:
>> - 1. Timeout after its put in the cache, updated when it is get from it
>> - 2. LRU based on memory consumption and max cache size
>> - 3. Both 1. and 2.
>>
>> I think we can start with 1.
> 
>   For me 2 is a must, 1 good to have but less important.
> 

Ok.


>> Should it be enabled by default?
>>
>> I think yes. 
> 
>   Yes for internal requests. Not sure about user requests.
> 

I would better put small max. memory usage for it than disable.

It would impact a lot some usage patterns - for example - reports with
sub-reports where sub-report always prepare its identical parameterized
query.


>   And you not explained cache usage - when and how cached statement
> should be used.
> 

I'm not sure you want more additional information than one I replied here.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Dimitry Sibiryakov

Alex Peshkoff via Firebird-devel wrote 10.02.2022 10:59:

  If\when shared metadata will be implemented - will it be possible to
use cached compiled statement created in one attachment to use by another
attachment ? 


If we need to take roles into an account - only for attachment with same USER.


  It depends on what exactly is going to be cached. If only execution plan is 
cached - it is not affected any by current user, any by role.


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Alex Peshkoff via Firebird-devel

On 2/10/22 12:44, Vlad Khorsun wrote:


I've peek these:
- statement text
- clientDialect
- isInternalRequest


  Why it is important ? Do we have internal DSQL requests now ?


Yes we have - but not too much.


Is it makes sense to have two caches - for internal and for user
statements ?


- current client charset (as external engines may change it)
- active roles


  If\when shared metadata will be implemented - will it be possible to
use cached compiled statement created in one attachment to use by another
attachment ? 


If we need to take roles into an account - only for attachment with same 
USER.





Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-10 Thread Vlad Khorsun

08.02.2022 15:36, Adriano dos Santos Fernandes wrote:

Hi!

I have refactored DSQL statements/requests (continued worked of many
time ago) to separate shared (statement) and specific (request) parts.

It seems this ground work for compiled statement cache is complete and I
have even did a (very very) initial version of a compiled statement
cache working.


  First, please, describe - what goals\benefits of this cache ? Especially,
taking  into account current per-attachment metadata.


Now I think it's better to discuss its semantics.

First what should be the statement key in the cache?


  All what affects statement compilation process, at least.


I've peek these:
- statement text
- clientDialect
- isInternalRequest


  Why it is important ? Do we have internal DSQL requests now ?
Is it makes sense to have two caches - for internal and for user
statements ?


- current client charset (as external engines may change it)
- active roles


  If\when shared metadata will be implemented - will it be possible to
use cached compiled statement created in one attachment to use by another
attachment ? If yes, does it means that another attachment should use same
client charset and active roles to be able to use cached compiled statement ?


Do you see any thing more?

Then there is when statements should go out of cache?

I see three approaches:
- 1. Timeout after its put in the cache, updated when it is get from it
- 2. LRU based on memory consumption and max cache size
- 3. Both 1. and 2.

I think we can start with 1.


  For me 2 is a must, 1 good to have but less important.


Should it be enabled by default?

I think yes. 


  Yes for internal requests. Not sure about user requests.

> And it may have per database configuration of the timeout
> value.

  Sure.


Cache invalidation:

Cached (and unused) statements should not lock objects preventing DDL
changes.

Assuming that cached statements never lock anything exclusively (there
should be bug if that happens), then when someone tries to lock
something exclusively it should ask caches to release the statements
having that objects.

Or, I think it's completely ok for an initial implementation, any try to
exclusively lock an object may ask caches for complete invalidation.


  Enough for start, IMHO.

  And you not explained cache usage - when and how cached statement should be 
used.

Regards,
Vlad


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] Compiled statement cache

2022-02-08 Thread Alex Peshkoff via Firebird-devel

On 2/8/22 16:36, Adriano dos Santos Fernandes wrote:

Hi!

I have refactored DSQL statements/requests (continued worked of many
time ago) to separate shared (statement) and specific (request) parts.

It seems this ground work for compiled statement cache is complete and I
have even did a (very very) initial version of a compiled statement
cache working.

Now I think it's better to discuss its semantics.

First what should be the statement key in the cache?

I've peek these:
- statement text
- clientDialect
- isInternalRequest
- current client charset (as external engines may change it)
- active roles

Do you see any thing more?


Even if you missed something adding an item later should not be too 
awful problem.




Then there is when statements should go out of cache?

I see three approaches:
- 1. Timeout after its put in the cache, updated when it is get from it
- 2. LRU based on memory consumption and max cache size
- 3. Both 1. and 2.

I think we can start with 1.


I like 2 better - we have much less chances to go OOM with it.


Should it be enabled by default?

I think yes. And it may have per database configuration of the timeout
value.


Or cache size value...


Cache invalidation:

Cached (and unused) statements should not lock objects preventing DDL
changes.

Assuming that cached statements never lock anything exclusively (there
should be bug if that happens), then when someone tries to lock
something exclusively it should ask caches to release the statements
having that objects.

Or, I think it's completely ok for an initial implementation, any try to
exclusively lock an object may ask caches for complete invalidation.


Agreed - that's ok, for first time definitely.




Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel