Re: [DISCUSSION] Reserve partitions for CacheQueries

2021-12-08 Thread Pavel Tupitsyn
Agree with Ivan regarding compatibility.
Partition reservation should be opt-in if we decide to proceed.

> is there a real need/benefit from using
> scan queries over primitive SQL queries today

SQL requires additional configuration (QueryEntity) and involves memory and
CPU overhead to maintain the indexes.
Especially if the query is used rarely (maybe some maintenance tasks), scan
is a better option.

On Thu, Dec 9, 2021 at 10:12 AM Ivan Pavlukhin  wrote:

> Hi Maksim,
>
> Thank you for looking into this. While fixing wrong/surprising
> behavior is very important, I also have some concerns, let's say, from
> different angle of view:
> 1. From a first glance it seems that similar behavior of scan and SQL
> queries is a good idea. But is there a real need/benefit from using
> scan queries over primitive SQL queries today (e.g. SELECT * FROM
> table1)?
> 2. Also we need to carefully think about compatibility. It is
> desirable that successful (but may be strange) scenarios continue to
> work after fix (and not fail e.g. due to some introduced exception).
>
> Regarding partition reservation for long-time open cursors I guess
> some ideas might be found in LAZY mode for SQL queries [1].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-9171
>
> 2021-12-08 20:11 GMT+03:00, Maksim Timonin :
> > Hi, Igniters!
> >
> > There is a known issue that ScanQuery on unstable topology returns
> > incorrect results: duplicates data [1] or fails with an exception [2].
> The
> > reason is ScanQuery doesn't reserve partitions.
> >
> > IndexQuery shares the same query processing as ScanQuery, and then it is
> > also affected by unstable topology. I want to fix it for IndexQuery.
> > IndexQuery should provide the same stability as SQL queries do - no
> > occasional failures or data duplication.
> >
> > I dived into the SQL query processing and found that we can unify logic
> for
> > SQL queries and cache queries (ScanQuery, IndexQuery, TextQuery):
> >
> > 1. Currently, cache queries use `GridCacheQueryAdapter#nodes` to find
> nodes
> > to run a query. From the other side, SQL processing uses for the same
> goal
> > `ReducePartitionMapper#nodesForPartitions`. It looks like those methods
> > have some slight differences, but in the common, they do the same. So, I
> > propose to make a single method for both cases.
> >
> > 2. SQL processing uses `PartitionReservationManager` for reserve
> partitions
> > on the map side. I propose to move it to the ignite-core module and start
> > using it for the cache queries.
> >
> > 3. Implement retries for the cache queries in case we failed to reserve
> > partitions on the map side.
> >
> > Currently, I see a downside of reserving partitions for cache queries:
> > cache queries are lazy. And the time of partition reservation depends on
> a
> > user's application code (how fast a cursor is iterated and closed).
> AFAIU,
> > it's not very good to have a partition in reserve too long. Please,
> correct
> > me if I'm wrong here.
> >
> > But from the other side, Ignite reserves partitions for ScanQuery when a
> > partition has been specified as a ScanQuery parameter, and Ignite
> reserves
> > partitions for SQL with the flag lazy=true. Also:
> > - IndexQuery: I expect simple queries that will return a relatively small
> > amount of data. Then partitions wouldn't be reserved too much time.
> > - the same is for TextQuery - it returns a limited amount of data (due to
> > the Lucene logic).
> > - full ScanQuery it's not in use much, AFAIK. So, it's by default a
> pretty
> > heavy operation.
> >
> > So, I think it's safe to reserve partitions in any case. But there could
> be
> > an alternative optimistic approach, smth like that:
> > 1. Check that topology is stable and waiting while it's not stabilized.
> > 2. Run a query on a stable cluster.
> > 3. In cases when a cluster becomes unstable during query execution - try
> to
> > reserve partitions at runtime (query should be aware of topology changes)
> > and fail in case of reservation failure (if a user already fetched some
> > data).
> >
> > I don't like the idea of this optimistic approach because in case a user
> > got some data, we don't have a better solution than to fail a query in
> case
> > of cluster instability and reservation failure.
> >
> > Igniters, WDYT?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-12591
> > [2] https://issues.apache.org/jira/browse/IGNITE-16031
> >
>
>
> --
>
> Best regards,
> Ivan Pavlukhin
>


Re: [DISCUSSION] Reserve partitions for CacheQueries

2021-12-08 Thread Ivan Pavlukhin
Hi Maksim,

Thank you for looking into this. While fixing wrong/surprising
behavior is very important, I also have some concerns, let's say, from
different angle of view:
1. From a first glance it seems that similar behavior of scan and SQL
queries is a good idea. But is there a real need/benefit from using
scan queries over primitive SQL queries today (e.g. SELECT * FROM
table1)?
2. Also we need to carefully think about compatibility. It is
desirable that successful (but may be strange) scenarios continue to
work after fix (and not fail e.g. due to some introduced exception).

Regarding partition reservation for long-time open cursors I guess
some ideas might be found in LAZY mode for SQL queries [1].

[1] https://issues.apache.org/jira/browse/IGNITE-9171

2021-12-08 20:11 GMT+03:00, Maksim Timonin :
> Hi, Igniters!
>
> There is a known issue that ScanQuery on unstable topology returns
> incorrect results: duplicates data [1] or fails with an exception [2]. The
> reason is ScanQuery doesn't reserve partitions.
>
> IndexQuery shares the same query processing as ScanQuery, and then it is
> also affected by unstable topology. I want to fix it for IndexQuery.
> IndexQuery should provide the same stability as SQL queries do - no
> occasional failures or data duplication.
>
> I dived into the SQL query processing and found that we can unify logic for
> SQL queries and cache queries (ScanQuery, IndexQuery, TextQuery):
>
> 1. Currently, cache queries use `GridCacheQueryAdapter#nodes` to find nodes
> to run a query. From the other side, SQL processing uses for the same goal
> `ReducePartitionMapper#nodesForPartitions`. It looks like those methods
> have some slight differences, but in the common, they do the same. So, I
> propose to make a single method for both cases.
>
> 2. SQL processing uses `PartitionReservationManager` for reserve partitions
> on the map side. I propose to move it to the ignite-core module and start
> using it for the cache queries.
>
> 3. Implement retries for the cache queries in case we failed to reserve
> partitions on the map side.
>
> Currently, I see a downside of reserving partitions for cache queries:
> cache queries are lazy. And the time of partition reservation depends on a
> user's application code (how fast a cursor is iterated and closed). AFAIU,
> it's not very good to have a partition in reserve too long. Please, correct
> me if I'm wrong here.
>
> But from the other side, Ignite reserves partitions for ScanQuery when a
> partition has been specified as a ScanQuery parameter, and Ignite reserves
> partitions for SQL with the flag lazy=true. Also:
> - IndexQuery: I expect simple queries that will return a relatively small
> amount of data. Then partitions wouldn't be reserved too much time.
> - the same is for TextQuery - it returns a limited amount of data (due to
> the Lucene logic).
> - full ScanQuery it's not in use much, AFAIK. So, it's by default a pretty
> heavy operation.
>
> So, I think it's safe to reserve partitions in any case. But there could be
> an alternative optimistic approach, smth like that:
> 1. Check that topology is stable and waiting while it's not stabilized.
> 2. Run a query on a stable cluster.
> 3. In cases when a cluster becomes unstable during query execution - try to
> reserve partitions at runtime (query should be aware of topology changes)
> and fail in case of reservation failure (if a user already fetched some
> data).
>
> I don't like the idea of this optimistic approach because in case a user
> got some data, we don't have a better solution than to fail a query in case
> of cluster instability and reservation failure.
>
> Igniters, WDYT?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-12591
> [2] https://issues.apache.org/jira/browse/IGNITE-16031
>


-- 

Best regards,
Ivan Pavlukhin


[DISCUSSION] Reserve partitions for CacheQueries

2021-12-08 Thread Maksim Timonin
Hi, Igniters!

There is a known issue that ScanQuery on unstable topology returns
incorrect results: duplicates data [1] or fails with an exception [2]. The
reason is ScanQuery doesn't reserve partitions.

IndexQuery shares the same query processing as ScanQuery, and then it is
also affected by unstable topology. I want to fix it for IndexQuery.
IndexQuery should provide the same stability as SQL queries do - no
occasional failures or data duplication.

I dived into the SQL query processing and found that we can unify logic for
SQL queries and cache queries (ScanQuery, IndexQuery, TextQuery):

1. Currently, cache queries use `GridCacheQueryAdapter#nodes` to find nodes
to run a query. From the other side, SQL processing uses for the same goal
`ReducePartitionMapper#nodesForPartitions`. It looks like those methods
have some slight differences, but in the common, they do the same. So, I
propose to make a single method for both cases.

2. SQL processing uses `PartitionReservationManager` for reserve partitions
on the map side. I propose to move it to the ignite-core module and start
using it for the cache queries.

3. Implement retries for the cache queries in case we failed to reserve
partitions on the map side.

Currently, I see a downside of reserving partitions for cache queries:
cache queries are lazy. And the time of partition reservation depends on a
user's application code (how fast a cursor is iterated and closed). AFAIU,
it's not very good to have a partition in reserve too long. Please, correct
me if I'm wrong here.

But from the other side, Ignite reserves partitions for ScanQuery when a
partition has been specified as a ScanQuery parameter, and Ignite reserves
partitions for SQL with the flag lazy=true. Also:
- IndexQuery: I expect simple queries that will return a relatively small
amount of data. Then partitions wouldn't be reserved too much time.
- the same is for TextQuery - it returns a limited amount of data (due to
the Lucene logic).
- full ScanQuery it's not in use much, AFAIK. So, it's by default a pretty
heavy operation.

So, I think it's safe to reserve partitions in any case. But there could be
an alternative optimistic approach, smth like that:
1. Check that topology is stable and waiting while it's not stabilized.
2. Run a query on a stable cluster.
3. In cases when a cluster becomes unstable during query execution - try to
reserve partitions at runtime (query should be aware of topology changes)
and fail in case of reservation failure (if a user already fetched some
data).

I don't like the idea of this optimistic approach because in case a user
got some data, we don't have a better solution than to fail a query in case
of cluster instability and reservation failure.

Igniters, WDYT?

[1] https://issues.apache.org/jira/browse/IGNITE-12591
[2] https://issues.apache.org/jira/browse/IGNITE-16031


Re: Apache Ignite 2.12 RELEASE [Time, Scope, Manager]

2021-12-08 Thread Nikolay Izhikov
Hello.

Let’s include [1] in 2.12 scope.

After migrations authentication processor to security API we have an issue.
Persistent data region should exists on client node if authentication is 
enabled.

It seems very annoying for the users.

[1] https://issues.apache.org/jira/browse/IGNITE-15969

> 2 дек. 2021 г., в 19:50, Nikita Amelchev  написал(а):
> 
> I would like to formally announce code freeze for 2.12.0. Only
> blockers are accepted to be included to the scope. [1]
> 
> We have one blocker issue [2] that will be fixed soon.
> 
> [1] 
> https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-P3.Stabilization
> [2] https://issues.apache.org/jira/browse/IGNITE-15966
> 
> чт, 2 дек. 2021 г. в 16:01, Nikita Amelchev :
>> 
>> Hi, Igniters.
>> 
>> The release is blocked by the auth issue [1] discussed above. The
>> patch will be prepared at the nearest time.
>> 
>> I want to include the useful issue that adds the ability to configure
>> snapshot thread pool size if nobody minds.
>> 
>> [1] https://issues.apache.org/jira/browse/IGNITE-15966
>> [2] https://issues.apache.org/jira/browse/IGNITE-16014
>> 
>> пт, 26 нояб. 2021 г. в 11:44, Nikita Amelchev :
>>> 
>>> Hello,
>>> 
>>> I want to include the issue [1] to the 2.12 scope. It fixes some
>>> metrics according to the JSR107.
>>> 
>>> Any objections?
>>> 
>>> [1] https://issues.apache.org/jira/browse/IGNITE-16002 The remove
>>> cache method should update statistics if the method returns true
>>> 
>>> чт, 25 нояб. 2021 г. в 17:12, Nikita Amelchev :
 
 Hello Ivan,
 
 +1, the issue can be cherry-picked to the 2.12.
 
 чт, 25 нояб. 2021 г. в 17:07, Ivan Bessonov :
> 
> Hello everyone,
> 
> is there a chance to add this issue to the release scope? [1]
> Currently, defragmentation of certain types of indexes can corrupt them.
> Fix is straightforward and should not cause any problems.
> Thank you!
> 
> [1] https://issues.apache.org/jira/browse/IGNITE-15968
> 
> вт, 23 нояб. 2021 г. в 20:44, Nikita Amelchev :
> 
>> Ivan,
>> 
>> I have cherry-picked the issue:
>> https://issues.apache.org/jira/browse/IGNITE-15767
>> 
>> вт, 23 нояб. 2021 г. в 20:31, Nikolay Izhikov :
>>> 
 [1] https://issues.apache.org/jira/browse/IGNITE-15951
>>> 
>>> Cherry-picked to 2.12
>>> 
 23 нояб. 2021 г., в 16:38, Nikita Amelchev 
>> написал(а):
 
 Hi, +1 to cherry-pick.
 
 вт, 23 нояб. 2021 г. в 15:25, Ivan Daschinsky :
> 
> Hi! Lets add one simple bug fix, but very useful
> 
> https://issues.apache.org/jira/browse/IGNITE-15767
> 
> пн, 22 нояб. 2021 г. в 15:22, Pavel Tupitsyn :
> 
>> Yes, let's fix those auth issues, there are so many complaints on
>> the user
>> list.
>> 
>> On Mon, Nov 22, 2021 at 1:03 PM Mikhail Petrov <
>> pmgheap@gmail.com>
>> wrote:
>> 
>>> Igniters,
>>> it seems that issues [1] and [2] that were discovered recently are
>>> blockers for the 2.12 release.
>>> 
>>> [1] https://issues.apache.org/jira/browse/IGNITE-15951
>>> [2] https://issues.apache.org/jira/browse/IGNITE-15966
>>> 
>>> On 22.11.2021 11:04, Nikolay Izhikov wrote:
 Hello.
 
 One more tiny fix that will improve java thin client performance in
>>> certain cases [1]
 I think it worth to cherry-pick it to 2.12 release.
 
 Any objections?
 
 [1] https://issues.apache.org/jira/browse/IGNITE-15924
 
> 19 нояб. 2021 г., в 14:56, Nikita Amelchev 
>>> написал(а):
> 
> Vladimir, +1.
> 
> I have cherry-picked the issue.
> 
> пт, 19 нояб. 2021 г. в 14:53, Steshin Vladimir <
>> vlads...@gmail.com>:
>>Hi all.
>> 
>> 
>>Let's add simple and useful thin-client-pool monitoring in
>> 2.12:
>> 
>> https://issues.apache.org/jira/browse/IGNITE-15183
>> 
>> https://github.com/apache/ignite/pull/9525
>> 
>> 
>> 19.11.2021 09:50, Maksim Timonin пишет:
>>> Hi, I want to add to 2.12 a fix [1] of a bug [2]. It affects
>> users
>> of
>>> metrics in case there are too many partitions (> 2), we
>> observed
>>> extensive heap usage.
>>> 
>>> [1] https://github.com/apache/ignite/pull/9518/files
>>> [2] https://issues.apache.org/jira/browse/IGNITE-15781
>>> 
>>> On Thu, Nov 18, 2021 at 12:59 PM Maksim Timonin <
>>> timonin.ma...@gmail.com>
>>> wrote:
>>> 
 Hi,