Re: Improve lazy mode SQL query execution (IGNITE-9171)

2018-11-23 Thread Vladimir Ozerov
I would say that we have multiple conditions which may require re-try from
user side - MVCC write conflict, lock timeout, node crash. Now concurrent
DDL operation will be added to the list. I would think about proper retry
exception semantics in a separate task.

On Thu, Nov 22, 2018 at 5:37 PM Taras Ledkov  wrote:

> Hi,
>
> Lets discuss changes of Ignite public interface relates to retry SQL
> queries.
>
> Should we add new public exception (.e.g: `QueryRetryException`) for
> cache SQL API
> and special vendor error code to SQLException for JDBC?
>
> 21.11.2018 16:32, Vladimir Ozerov пишет:
> > Hi Taras,
> >
> > Unfortunately, this will not be easy to implement. We already have
> > AffinityTopologyVersion which is updated on certain schema change
> > operations, such as CREATE/DROP TABLE. Moreover, it is already passed in
> > query request messages (GridH2QueryRequest.topVer). Unfortunately, it is
> > not updated for other DDL scenarios, such as ALTER TABLE. We can
> introduce
> > our own global counter. We can try to integrate with
> AffinityTopologyVersion.
> > Both solutions will require tight integration with PME mechanism to
> > function properly. But good news is that this is not a new problem.
> > Currently it is already possible for two map requests to be executed on
> > different schemas:
> >
> > client_1: SEND_QUERY
> > client_2: SEND_ALTER
> > node_1  :EXEC_QUERY EXEC_ALTER
> > node_2  :   EXEC_ALTER EXEC_QUERY
> >
> > So I would say that nothing changes for us even after proposed changes,
> and
> > way may leave this problem aside for now.
> >
> > Makes sense?
> >
> > On Wed, Nov 21, 2018 at 4:15 PM Taras Ledkov 
> wrote:
> >
> >> Vladimir,
> >>
> >> The query protocol will be changed in both solution.
> >>
> >> The tables versions must be added to the 'GridQueryNextPageResponse'
> >> message
> >> and must be compared on the reduce node too. Because the DLL / DML race
> >> may be happens on different nodes.
> >>
> >> 21.11.2018 15:24, Vladimir Ozerov пишет:
> >>> Taras,
> >>>
> >>> Thank you for analysis. I'd like to add that there is another solution
> -
> >>> PME :-) In this case DDL will be blocked until current SELECTs and
> >> updates
> >>> are finished, and do not allow new operations to be executed. This
> >> approach
> >>> solves all issues - it avoids deadlocks between query threads and DDL
> >>> threads when they are reordered on different nodes, it avoids
> starvation
> >> of
> >>> DDL operations, and it doesn't cancel any queries. But there is serious
> >>> drawback - performance. The drawback is that it is more complex to
> >>> implement (query protocol changes might be required), it blocks the
> >> cluster
> >>> even when it is needed, and it may destabilize PME mechanism, which is
> >>> already on his last legs.
> >>>
> >>> For this reason killing queries which interleave with DDL looks like a
> >>> balanced solution for now - it is reasonably simple, allows us to we
> >> avoid
> >>> OOME in many cases, and do not introduce any additional complexity for
> >>> users, as they are already prepared for re-tries.
> >>>
> >>> But I would like to stress one thing - we will need integration with
> PME
> >> at
> >>> some point in time anyway. Some DDL operations are blocking in their
> >> nature
> >>> (e.g. DROP COLUMN). Other DDL operations may be non-blocking, but
> >> blocking
> >>> implementation may give them serious performance benefits (e.g. CREATE
> >>> INDEX).
> >>>
> >>> So I propose to go with your solution for now, and start thinking about
> >> SQL
> >>> -> PME integration in the background.
> >>>
> >>> Vladimir.
> >>>
> >>> On Wed, Nov 21, 2018 at 2:23 PM Taras Ledkov 
> >> wrote:
>  Hi community,
> 
>  We will enhance lazy mode for SQL query execution.
> 
>  Lazy mode overview:
>  Lazy mode is related to H2 lazy mode when the all query results are
> not
>  copied to the RAM in some cases.
>  The mode it is applicable for SELECTs that doesn't not require
>  materialize all results in memory, e.g.  simple scan plans, IDX
> lookup,
>  merge join etc.
>  And not applicable for SORT by not indexed fields, aggregates, nested
>  loops joins etc.
> 
>  When mode is applicable it produces result with iterator-like behavior
>  on server side and not consume RAM.
>  So the huge result set may be selected without OOME.
> 
>  The current implementation.
>  The current implementation is start separate thread for each query
> with
>  'lazy=true'.
>  This is caused by the our implementation of 'GridH2Table'. In details:
>  the table's locks.
>  The table must be locked while result set  is completed.
> 
>  When lazy is disabled a complete result is generated on the first step
>  of a query execution (then tables unlock)
>  and result is stored on the node and sent to other node (or client)
> page
>  by page.
> 
>  When lazy is enabled 

Re: Improve lazy mode SQL query execution (IGNITE-9171)

2018-11-22 Thread Taras Ledkov

Hi,

Lets discuss changes of Ignite public interface relates to retry SQL 
queries.


Should we add new public exception (.e.g: `QueryRetryException`) for 
cache SQL API

and special vendor error code to SQLException for JDBC?

21.11.2018 16:32, Vladimir Ozerov пишет:

Hi Taras,

Unfortunately, this will not be easy to implement. We already have
AffinityTopologyVersion which is updated on certain schema change
operations, such as CREATE/DROP TABLE. Moreover, it is already passed in
query request messages (GridH2QueryRequest.topVer). Unfortunately, it is
not updated for other DDL scenarios, such as ALTER TABLE. We can introduce
our own global counter. We can try to integrate with AffinityTopologyVersion.
Both solutions will require tight integration with PME mechanism to
function properly. But good news is that this is not a new problem.
Currently it is already possible for two map requests to be executed on
different schemas:

client_1: SEND_QUERY
client_2: SEND_ALTER
node_1  :EXEC_QUERY EXEC_ALTER
node_2  :   EXEC_ALTER EXEC_QUERY

So I would say that nothing changes for us even after proposed changes, and
way may leave this problem aside for now.

Makes sense?

On Wed, Nov 21, 2018 at 4:15 PM Taras Ledkov  wrote:


Vladimir,

The query protocol will be changed in both solution.

The tables versions must be added to the 'GridQueryNextPageResponse'
message
and must be compared on the reduce node too. Because the DLL / DML race
may be happens on different nodes.

21.11.2018 15:24, Vladimir Ozerov пишет:

Taras,

Thank you for analysis. I'd like to add that there is another solution -
PME :-) In this case DDL will be blocked until current SELECTs and

updates

are finished, and do not allow new operations to be executed. This

approach

solves all issues - it avoids deadlocks between query threads and DDL
threads when they are reordered on different nodes, it avoids starvation

of

DDL operations, and it doesn't cancel any queries. But there is serious
drawback - performance. The drawback is that it is more complex to
implement (query protocol changes might be required), it blocks the

cluster

even when it is needed, and it may destabilize PME mechanism, which is
already on his last legs.

For this reason killing queries which interleave with DDL looks like a
balanced solution for now - it is reasonably simple, allows us to we

avoid

OOME in many cases, and do not introduce any additional complexity for
users, as they are already prepared for re-tries.

But I would like to stress one thing - we will need integration with PME

at

some point in time anyway. Some DDL operations are blocking in their

nature

(e.g. DROP COLUMN). Other DDL operations may be non-blocking, but

blocking

implementation may give them serious performance benefits (e.g. CREATE
INDEX).

So I propose to go with your solution for now, and start thinking about

SQL

-> PME integration in the background.

Vladimir.

On Wed, Nov 21, 2018 at 2:23 PM Taras Ledkov 

wrote:

Hi community,

We will enhance lazy mode for SQL query execution.

Lazy mode overview:
Lazy mode is related to H2 lazy mode when the all query results are not
copied to the RAM in some cases.
The mode it is applicable for SELECTs that doesn't not require
materialize all results in memory, e.g.  simple scan plans, IDX lookup,
merge join etc.
And not applicable for SORT by not indexed fields, aggregates, nested
loops joins etc.

When mode is applicable it produces result with iterator-like behavior
on server side and not consume RAM.
So the huge result set may be selected without OOME.

The current implementation.
The current implementation is start separate thread for each query with
'lazy=true'.
This is caused by the our implementation of 'GridH2Table'. In details:
the table's locks.
The table must be locked while result set  is completed.

When lazy is disabled a complete result is generated on the first step
of a query execution (then tables unlock)
and result is stored on the node and sent to other node (or client) page
by page.

When lazy is enabled tables are locked until result set delivery to

client.

The start new thread causes overhead for requests that returns small
result set.
But current table lock is used `ReentrantReadWriteLock` and we cannot
lock tables from one thread
of QUERY thread pool and unlock in the other thread (when query is
complete or cancel).

The trivial solve is using the 'StampedLock' it solve the lock behavior,
but not solve the table DDL starvation / deadlock.
Example:
Lets the QUERY thread pool contains only one thread. The case is scaled
for any thread pool size.
Write operation that require to exclusive table lock is DDL operation.

1. The query Q0 acquires the shared lock for the table T, send first
page result and leave thread 'threadQP0' control.
2. DDL0 blocks on write lock the table T at the 'threadWP0 '
3. The query Q1 blocks on read lock  the 'threadQP0' (because the writer
in the queue).
The 

Re: Improve lazy mode SQL query execution (IGNITE-9171)

2018-11-21 Thread Vladimir Ozerov
Hi Taras,

Unfortunately, this will not be easy to implement. We already have
AffinityTopologyVersion which is updated on certain schema change
operations, such as CREATE/DROP TABLE. Moreover, it is already passed in
query request messages (GridH2QueryRequest.topVer). Unfortunately, it is
not updated for other DDL scenarios, such as ALTER TABLE. We can introduce
our own global counter. We can try to integrate with AffinityTopologyVersion.
Both solutions will require tight integration with PME mechanism to
function properly. But good news is that this is not a new problem.
Currently it is already possible for two map requests to be executed on
different schemas:

client_1: SEND_QUERY
client_2: SEND_ALTER
node_1  :EXEC_QUERY EXEC_ALTER
node_2  :   EXEC_ALTER EXEC_QUERY

So I would say that nothing changes for us even after proposed changes, and
way may leave this problem aside for now.

Makes sense?

On Wed, Nov 21, 2018 at 4:15 PM Taras Ledkov  wrote:

> Vladimir,
>
> The query protocol will be changed in both solution.
>
> The tables versions must be added to the 'GridQueryNextPageResponse'
> message
> and must be compared on the reduce node too. Because the DLL / DML race
> may be happens on different nodes.
>
> 21.11.2018 15:24, Vladimir Ozerov пишет:
> > Taras,
> >
> > Thank you for analysis. I'd like to add that there is another solution -
> > PME :-) In this case DDL will be blocked until current SELECTs and
> updates
> > are finished, and do not allow new operations to be executed. This
> approach
> > solves all issues - it avoids deadlocks between query threads and DDL
> > threads when they are reordered on different nodes, it avoids starvation
> of
> > DDL operations, and it doesn't cancel any queries. But there is serious
> > drawback - performance. The drawback is that it is more complex to
> > implement (query protocol changes might be required), it blocks the
> cluster
> > even when it is needed, and it may destabilize PME mechanism, which is
> > already on his last legs.
> >
> > For this reason killing queries which interleave with DDL looks like a
> > balanced solution for now - it is reasonably simple, allows us to we
> avoid
> > OOME in many cases, and do not introduce any additional complexity for
> > users, as they are already prepared for re-tries.
> >
> > But I would like to stress one thing - we will need integration with PME
> at
> > some point in time anyway. Some DDL operations are blocking in their
> nature
> > (e.g. DROP COLUMN). Other DDL operations may be non-blocking, but
> blocking
> > implementation may give them serious performance benefits (e.g. CREATE
> > INDEX).
> >
> > So I propose to go with your solution for now, and start thinking about
> SQL
> > -> PME integration in the background.
> >
> > Vladimir.
> >
> > On Wed, Nov 21, 2018 at 2:23 PM Taras Ledkov 
> wrote:
> >
> >> Hi community,
> >>
> >> We will enhance lazy mode for SQL query execution.
> >>
> >> Lazy mode overview:
> >> Lazy mode is related to H2 lazy mode when the all query results are not
> >> copied to the RAM in some cases.
> >> The mode it is applicable for SELECTs that doesn't not require
> >> materialize all results in memory, e.g.  simple scan plans, IDX lookup,
> >> merge join etc.
> >> And not applicable for SORT by not indexed fields, aggregates, nested
> >> loops joins etc.
> >>
> >> When mode is applicable it produces result with iterator-like behavior
> >> on server side and not consume RAM.
> >> So the huge result set may be selected without OOME.
> >>
> >> The current implementation.
> >> The current implementation is start separate thread for each query with
> >> 'lazy=true'.
> >> This is caused by the our implementation of 'GridH2Table'. In details:
> >> the table's locks.
> >> The table must be locked while result set  is completed.
> >>
> >> When lazy is disabled a complete result is generated on the first step
> >> of a query execution (then tables unlock)
> >> and result is stored on the node and sent to other node (or client) page
> >> by page.
> >>
> >> When lazy is enabled tables are locked until result set delivery to
> client.
> >>
> >> The start new thread causes overhead for requests that returns small
> >> result set.
> >> But current table lock is used `ReentrantReadWriteLock` and we cannot
> >> lock tables from one thread
> >> of QUERY thread pool and unlock in the other thread (when query is
> >> complete or cancel).
> >>
> >> The trivial solve is using the 'StampedLock' it solve the lock behavior,
> >> but not solve the table DDL starvation / deadlock.
> >> Example:
> >> Lets the QUERY thread pool contains only one thread. The case is scaled
> >> for any thread pool size.
> >> Write operation that require to exclusive table lock is DDL operation.
> >>
> >> 1. The query Q0 acquires the shared lock for the table T, send first
> >> page result and leave thread 'threadQP0' control.
> >> 2. DDL0 blocks on write lock the table T at the 'threadWP0 '

Re: Improve lazy mode SQL query execution (IGNITE-9171)

2018-11-21 Thread Taras Ledkov

Vladimir,

The query protocol will be changed in both solution.

The tables versions must be added to the 'GridQueryNextPageResponse' 
message

and must be compared on the reduce node too. Because the DLL / DML race
may be happens on different nodes.

21.11.2018 15:24, Vladimir Ozerov пишет:

Taras,

Thank you for analysis. I'd like to add that there is another solution -
PME :-) In this case DDL will be blocked until current SELECTs and updates
are finished, and do not allow new operations to be executed. This approach
solves all issues - it avoids deadlocks between query threads and DDL
threads when they are reordered on different nodes, it avoids starvation of
DDL operations, and it doesn't cancel any queries. But there is serious
drawback - performance. The drawback is that it is more complex to
implement (query protocol changes might be required), it blocks the cluster
even when it is needed, and it may destabilize PME mechanism, which is
already on his last legs.

For this reason killing queries which interleave with DDL looks like a
balanced solution for now - it is reasonably simple, allows us to we avoid
OOME in many cases, and do not introduce any additional complexity for
users, as they are already prepared for re-tries.

But I would like to stress one thing - we will need integration with PME at
some point in time anyway. Some DDL operations are blocking in their nature
(e.g. DROP COLUMN). Other DDL operations may be non-blocking, but blocking
implementation may give them serious performance benefits (e.g. CREATE
INDEX).

So I propose to go with your solution for now, and start thinking about SQL
-> PME integration in the background.

Vladimir.

On Wed, Nov 21, 2018 at 2:23 PM Taras Ledkov  wrote:


Hi community,

We will enhance lazy mode for SQL query execution.

Lazy mode overview:
Lazy mode is related to H2 lazy mode when the all query results are not
copied to the RAM in some cases.
The mode it is applicable for SELECTs that doesn't not require
materialize all results in memory, e.g.  simple scan plans, IDX lookup,
merge join etc.
And not applicable for SORT by not indexed fields, aggregates, nested
loops joins etc.

When mode is applicable it produces result with iterator-like behavior
on server side and not consume RAM.
So the huge result set may be selected without OOME.

The current implementation.
The current implementation is start separate thread for each query with
'lazy=true'.
This is caused by the our implementation of 'GridH2Table'. In details:
the table's locks.
The table must be locked while result set  is completed.

When lazy is disabled a complete result is generated on the first step
of a query execution (then tables unlock)
and result is stored on the node and sent to other node (or client) page
by page.

When lazy is enabled tables are locked until result set delivery to client.

The start new thread causes overhead for requests that returns small
result set.
But current table lock is used `ReentrantReadWriteLock` and we cannot
lock tables from one thread
of QUERY thread pool and unlock in the other thread (when query is
complete or cancel).

The trivial solve is using the 'StampedLock' it solve the lock behavior,
but not solve the table DDL starvation / deadlock.
Example:
Lets the QUERY thread pool contains only one thread. The case is scaled
for any thread pool size.
Write operation that require to exclusive table lock is DDL operation.

1. The query Q0 acquires the shared lock for the table T, send first
page result and leave thread 'threadQP0' control.
2. DDL0 blocks on write lock the table T at the 'threadWP0 '
3. The query Q1 blocks on read lock  the 'threadQP0' (because the writer
in the queue).
The deadlock happens. Q0 never can finish and unlock because query pool
hasn't free thread.

The possible solution:

1. Don't use readlock at all. The lock is used only for write /
exclusive (DDL) operations.
2. The DDL (exclusive) operation change the table version.
3. Each read operation (query execution, result page fetch) store the
table version before start and compare with the table version on the
end. If the version is changed the special retry exception is thrown.

CONS:
- The retry logic is less user-friendly. But the distributed SQL cannot
protect the user from implement retry logic totally: e.g. cluster
topology change must handled on user side by retry query implemented by
user, because some data have been delivered to user and we don't track
which data is delivered.

PROS:
- no deadlocks;
- no contention on table lock for SQL query.

What do you think?

--

Taras Ledkov
Mail-To: tled...@gridgain.com



--
Taras Ledkov
Mail-To: tled...@gridgain.com



Re: Improve lazy mode SQL query execution (IGNITE-9171)

2018-11-21 Thread Vladimir Ozerov
Taras,

Thank you for analysis. I'd like to add that there is another solution -
PME :-) In this case DDL will be blocked until current SELECTs and updates
are finished, and do not allow new operations to be executed. This approach
solves all issues - it avoids deadlocks between query threads and DDL
threads when they are reordered on different nodes, it avoids starvation of
DDL operations, and it doesn't cancel any queries. But there is serious
drawback - performance. The drawback is that it is more complex to
implement (query protocol changes might be required), it blocks the cluster
even when it is needed, and it may destabilize PME mechanism, which is
already on his last legs.

For this reason killing queries which interleave with DDL looks like a
balanced solution for now - it is reasonably simple, allows us to we avoid
OOME in many cases, and do not introduce any additional complexity for
users, as they are already prepared for re-tries.

But I would like to stress one thing - we will need integration with PME at
some point in time anyway. Some DDL operations are blocking in their nature
(e.g. DROP COLUMN). Other DDL operations may be non-blocking, but blocking
implementation may give them serious performance benefits (e.g. CREATE
INDEX).

So I propose to go with your solution for now, and start thinking about SQL
-> PME integration in the background.

Vladimir.

On Wed, Nov 21, 2018 at 2:23 PM Taras Ledkov  wrote:

> Hi community,
>
> We will enhance lazy mode for SQL query execution.
>
> Lazy mode overview:
> Lazy mode is related to H2 lazy mode when the all query results are not
> copied to the RAM in some cases.
> The mode it is applicable for SELECTs that doesn't not require
> materialize all results in memory, e.g.  simple scan plans, IDX lookup,
> merge join etc.
> And not applicable for SORT by not indexed fields, aggregates, nested
> loops joins etc.
>
> When mode is applicable it produces result with iterator-like behavior
> on server side and not consume RAM.
> So the huge result set may be selected without OOME.
>
> The current implementation.
> The current implementation is start separate thread for each query with
> 'lazy=true'.
> This is caused by the our implementation of 'GridH2Table'. In details:
> the table's locks.
> The table must be locked while result set  is completed.
>
> When lazy is disabled a complete result is generated on the first step
> of a query execution (then tables unlock)
> and result is stored on the node and sent to other node (or client) page
> by page.
>
> When lazy is enabled tables are locked until result set delivery to client.
>
> The start new thread causes overhead for requests that returns small
> result set.
> But current table lock is used `ReentrantReadWriteLock` and we cannot
> lock tables from one thread
> of QUERY thread pool and unlock in the other thread (when query is
> complete or cancel).
>
> The trivial solve is using the 'StampedLock' it solve the lock behavior,
> but not solve the table DDL starvation / deadlock.
> Example:
> Lets the QUERY thread pool contains only one thread. The case is scaled
> for any thread pool size.
> Write operation that require to exclusive table lock is DDL operation.
>
> 1. The query Q0 acquires the shared lock for the table T, send first
> page result and leave thread 'threadQP0' control.
> 2. DDL0 blocks on write lock the table T at the 'threadWP0 '
> 3. The query Q1 blocks on read lock  the 'threadQP0' (because the writer
> in the queue).
> The deadlock happens. Q0 never can finish and unlock because query pool
> hasn't free thread.
>
> The possible solution:
>
> 1. Don't use readlock at all. The lock is used only for write /
> exclusive (DDL) operations.
> 2. The DDL (exclusive) operation change the table version.
> 3. Each read operation (query execution, result page fetch) store the
> table version before start and compare with the table version on the
> end. If the version is changed the special retry exception is thrown.
>
> CONS:
> - The retry logic is less user-friendly. But the distributed SQL cannot
> protect the user from implement retry logic totally: e.g. cluster
> topology change must handled on user side by retry query implemented by
> user, because some data have been delivered to user and we don't track
> which data is delivered.
>
> PROS:
> - no deadlocks;
> - no contention on table lock for SQL query.
>
> What do you think?
>
> --
>
> Taras Ledkov
> Mail-To: tled...@gridgain.com
>
>


Improve lazy mode SQL query execution (IGNITE-9171)

2018-11-21 Thread Taras Ledkov

Hi community,

We will enhance lazy mode for SQL query execution.

Lazy mode overview:
Lazy mode is related to H2 lazy mode when the all query results are not 
copied to the RAM in some cases.
The mode it is applicable for SELECTs that doesn't not require 
materialize all results in memory, e.g.  simple scan plans, IDX lookup, 
merge join etc.
And not applicable for SORT by not indexed fields, aggregates, nested 
loops joins etc.


When mode is applicable it produces result with iterator-like behavior 
on server side and not consume RAM.

So the huge result set may be selected without OOME.

The current implementation.
The current implementation is start separate thread for each query with 
'lazy=true'.
This is caused by the our implementation of 'GridH2Table'. In details: 
the table's locks.

The table must be locked while result set  is completed.

When lazy is disabled a complete result is generated on the first step 
of a query execution (then tables unlock)
and result is stored on the node and sent to other node (or client) page 
by page.


When lazy is enabled tables are locked until result set delivery to client.

The start new thread causes overhead for requests that returns small 
result set.
But current table lock is used `ReentrantReadWriteLock` and we cannot 
lock tables from one thread
of QUERY thread pool and unlock in the other thread (when query is 
complete or cancel).


The trivial solve is using the 'StampedLock' it solve the lock behavior, 
but not solve the table DDL starvation / deadlock.

Example:
Lets the QUERY thread pool contains only one thread. The case is scaled 
for any thread pool size.

Write operation that require to exclusive table lock is DDL operation.

1. The query Q0 acquires the shared lock for the table T, send first 
page result and leave thread 'threadQP0' control.

2. DDL0 blocks on write lock the table T at the 'threadWP0 '
3. The query Q1 blocks on read lock  the 'threadQP0' (because the writer 
in the queue).
The deadlock happens. Q0 never can finish and unlock because query pool 
hasn't free thread.


The possible solution:

1. Don't use readlock at all. The lock is used only for write / 
exclusive (DDL) operations.

2. The DDL (exclusive) operation change the table version.
3. Each read operation (query execution, result page fetch) store the 
table version before start and compare with the table version on the 
end. If the version is changed the special retry exception is thrown.


CONS:
- The retry logic is less user-friendly. But the distributed SQL cannot 
protect the user from implement retry logic totally: e.g. cluster 
topology change must handled on user side by retry query implemented by 
user, because some data have been delivered to user and we don't track 
which data is delivered.


PROS:
- no deadlocks;
- no contention on table lock for SQL query.

What do you think?

--

Taras Ledkov
Mail-To: tled...@gridgain.com