I would say that we have multiple conditions which may require re-try from user side - MVCC write conflict, lock timeout, node crash. Now concurrent DDL operation will be added to the list. I would think about proper retry exception semantics in a separate task.
On Thu, Nov 22, 2018 at 5:37 PM Taras Ledkov <tled...@gridgain.com> wrote: > Hi, > > Lets discuss changes of Ignite public interface relates to retry SQL > queries. > > Should we add new public exception (.e.g: `QueryRetryException`) for > cache SQL API > and special vendor error code to SQLException for JDBC? > > 21.11.2018 16:32, Vladimir Ozerov пишет: > > Hi Taras, > > > > Unfortunately, this will not be easy to implement. We already have > > AffinityTopologyVersion which is updated on certain schema change > > operations, such as CREATE/DROP TABLE. Moreover, it is already passed in > > query request messages (GridH2QueryRequest.topVer). Unfortunately, it is > > not updated for other DDL scenarios, such as ALTER TABLE. We can > introduce > > our own global counter. We can try to integrate with > AffinityTopologyVersion. > > Both solutions will require tight integration with PME mechanism to > > function properly. But good news is that this is not a new problem. > > Currently it is already possible for two map requests to be executed on > > different schemas: > > > > client_1: SEND_QUERY > > client_2: SEND_ALTER > > node_1 : EXEC_QUERY EXEC_ALTER > > node_2 : EXEC_ALTER EXEC_QUERY > > > > So I would say that nothing changes for us even after proposed changes, > and > > way may leave this problem aside for now. > > > > Makes sense? > > > > On Wed, Nov 21, 2018 at 4:15 PM Taras Ledkov <tled...@gridgain.com> > wrote: > > > >> Vladimir, > >> > >> The query protocol will be changed in both solution. > >> > >> The tables versions must be added to the 'GridQueryNextPageResponse' > >> message > >> and must be compared on the reduce node too. Because the DLL / DML race > >> may be happens on different nodes. > >> > >> 21.11.2018 15:24, Vladimir Ozerov пишет: > >>> Taras, > >>> > >>> Thank you for analysis. I'd like to add that there is another solution > - > >>> PME :-) In this case DDL will be blocked until current SELECTs and > >> updates > >>> are finished, and do not allow new operations to be executed. This > >> approach > >>> solves all issues - it avoids deadlocks between query threads and DDL > >>> threads when they are reordered on different nodes, it avoids > starvation > >> of > >>> DDL operations, and it doesn't cancel any queries. But there is serious > >>> drawback - performance. The drawback is that it is more complex to > >>> implement (query protocol changes might be required), it blocks the > >> cluster > >>> even when it is needed, and it may destabilize PME mechanism, which is > >>> already on his last legs. > >>> > >>> For this reason killing queries which interleave with DDL looks like a > >>> balanced solution for now - it is reasonably simple, allows us to we > >> avoid > >>> OOME in many cases, and do not introduce any additional complexity for > >>> users, as they are already prepared for re-tries. > >>> > >>> But I would like to stress one thing - we will need integration with > PME > >> at > >>> some point in time anyway. Some DDL operations are blocking in their > >> nature > >>> (e.g. DROP COLUMN). Other DDL operations may be non-blocking, but > >> blocking > >>> implementation may give them serious performance benefits (e.g. CREATE > >>> INDEX). > >>> > >>> So I propose to go with your solution for now, and start thinking about > >> SQL > >>> -> PME integration in the background. > >>> > >>> Vladimir. > >>> > >>> On Wed, Nov 21, 2018 at 2:23 PM Taras Ledkov <tled...@gridgain.com> > >> wrote: > >>>> Hi community, > >>>> > >>>> We will enhance lazy mode for SQL query execution. > >>>> > >>>> Lazy mode overview: > >>>> Lazy mode is related to H2 lazy mode when the all query results are > not > >>>> copied to the RAM in some cases. > >>>> The mode it is applicable for SELECTs that doesn't not require > >>>> materialize all results in memory, e.g. simple scan plans, IDX > lookup, > >>>> merge join etc. > >>>> And not applicable for SORT by not indexed fields, aggregates, nested > >>>> loops joins etc. > >>>> > >>>> When mode is applicable it produces result with iterator-like behavior > >>>> on server side and not consume RAM. > >>>> So the huge result set may be selected without OOME. > >>>> > >>>> The current implementation. > >>>> The current implementation is start separate thread for each query > with > >>>> 'lazy=true'. > >>>> This is caused by the our implementation of 'GridH2Table'. In details: > >>>> the table's locks. > >>>> The table must be locked while result set is completed. > >>>> > >>>> When lazy is disabled a complete result is generated on the first step > >>>> of a query execution (then tables unlock) > >>>> and result is stored on the node and sent to other node (or client) > page > >>>> by page. > >>>> > >>>> When lazy is enabled tables are locked until result set delivery to > >> client. > >>>> The start new thread causes overhead for requests that returns small > >>>> result set. > >>>> But current table lock is used `ReentrantReadWriteLock` and we cannot > >>>> lock tables from one thread > >>>> of QUERY thread pool and unlock in the other thread (when query is > >>>> complete or cancel). > >>>> > >>>> The trivial solve is using the 'StampedLock' it solve the lock > behavior, > >>>> but not solve the table DDL starvation / deadlock. > >>>> Example: > >>>> Lets the QUERY thread pool contains only one thread. The case is > scaled > >>>> for any thread pool size. > >>>> Write operation that require to exclusive table lock is DDL operation. > >>>> > >>>> 1. The query Q0 acquires the shared lock for the table T, send first > >>>> page result and leave thread 'threadQP0' control. > >>>> 2. DDL0 blocks on write lock the table T at the 'threadWP0 ' > >>>> 3. The query Q1 blocks on read lock the 'threadQP0' (because the > writer > >>>> in the queue). > >>>> The deadlock happens. Q0 never can finish and unlock because query > pool > >>>> hasn't free thread. > >>>> > >>>> The possible solution: > >>>> > >>>> 1. Don't use readlock at all. The lock is used only for write / > >>>> exclusive (DDL) operations. > >>>> 2. The DDL (exclusive) operation change the table version. > >>>> 3. Each read operation (query execution, result page fetch) store the > >>>> table version before start and compare with the table version on the > >>>> end. If the version is changed the special retry exception is thrown. > >>>> > >>>> CONS: > >>>> - The retry logic is less user-friendly. But the distributed SQL > cannot > >>>> protect the user from implement retry logic totally: e.g. cluster > >>>> topology change must handled on user side by retry query implemented > by > >>>> user, because some data have been delivered to user and we don't track > >>>> which data is delivered. > >>>> > >>>> PROS: > >>>> - no deadlocks; > >>>> - no contention on table lock for SQL query. > >>>> > >>>> What do you think? > >>>> > >>>> -- > >>>> > >>>> Taras Ledkov > >>>> Mail-To: tled...@gridgain.com > >>>> > >>>> > >> -- > >> Taras Ledkov > >> Mail-To: tled...@gridgain.com > >> > >> > -- > Taras Ledkov > Mail-To: tled...@gridgain.com > >