Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Courtney Robinson
I prefer option 1 from Taras' response. Specifying the index name is preferred.
I've seen customers create idx(A,B) and idx(B,A), where the semantics change
between the two.

Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: ++44 208 123 2413 (GMT+0) 


https://hypi.io


On Thu, Aug 26, 2021 at 4:28 PM Taras Ledkov  wrote:

> Hi,
>
> My proposal:
> 1. Don't search index by criteria, specify the index name always
> (preferred).
>
> OR
>
> 2. Search index by criteria without check the order of criteriones.
> Use the Set of criterions instead of the ordered collection.
> In the strange case when the both index exist (a, b) and (b, a) - use
> the any index
> when index name isn't specified.
>
> On 26.08.2021 16:49, Maksim Timonin wrote:
> > There are some thoughts about strict field order:
> > 1. Index (A, B) is not equivalent to index (B, A). Some queries may have
> > different performance on such indexes, and users have to specify the
> right
> > index. What if both indexes exist?
> > 2. We should avoid cases when a user uses in query only field B for index
> > (A, B). We have to force the user to specify range for (A) too, or
> > explicitly set it (null, null). Otherwise it looks like a mistake.
> >
> >
> >
> >
> > On Thu, Aug 26, 2021 at 4:39 PM Ivan Daschinsky 
> wrote:
> >
> >> 1. I suppose, that the next step is to implement the api for manually
> >> creating index. I think that user wants to create index that will speed
> up
> >> his criteria base queries, so he or she will use the same criteria to
> >> define the index. So no problem at all
> >> 2. We should print warning or throws exception if there is not any index
> >> that match specific criteria.
> >>
> >> BTW, Mongo DB doesn't make user to write index name in query. It just
> >> works.
> >>
> >> Thu, Aug 26, 2021, 15:52 Taras Ledkov:
> >>
> >>> Hi,
> >>>
> >>>> It is a usability nightmare to make the user write the index name in
> >>>> all cases.
> >>> I don't see any difference between specifying the index name and
> >>> specifying the index fields in the right order.
> >>> Do you see?
> >>>
> >>> Let's there is the index:
> >>> idx_A_B ON TBL (A, B)
> >>>
> >>> Is it OK that the query like below doesn't math the index 'idx_A_B'?
> >>> new IndexQuery<>(..)
> >>>   .setCriteria(lt("b", 1), lt("a", 2));
> >>>
> >>> On 26.08.2021 15:23, Ivan Daschinsky wrote:
> >>>> I am against making the user write the index name. Matching an index
> >>>> to field names is quite a simple and straightforward algorithm, so it
> >>>> is strange to compare it to the SQL engine optimizer.
> >>>>
> >>>> It is a usability nightmare to make the user write the index name in
> >>>> all cases.
> >>>>
> >>>> Thu, Aug 26, 2021, 14:42 Maksim Timonin:
> >>>>
> > Hi, Igniters!
> >
> > There is a discussion about how to specify an index to query with an
> > IndexQuery [1]. Currently my PR provides 2 ways to specify index:
> > 1. With a table and index name;
> > 2. With a table and list of index fields (without index name). In
> this
> >>> case
> > IndexQueryProcessor tries to find an index that matches table and
> >> index
> > fields in strict order (order of fields in criteria has to match the
> >>> order
> > of fields in index).
> >
> > Discussion is whether is the second approach valid?
> >
> > Pros:
> > 1. Currently index name is an optional field for QueryIndex and
> > QuerySqlField. Then users can create an index with a table and list
> of
> > fields. Then, we should provide an opportunity to define an index for
> > querying the same way as we do for creating.
> > 2. It's required to know the index name to query it (in case the
> index
> >>> was
> > created without an explicit name). Users can find it and then use it
> >> as
> >>> a
> > constant in code, but I see some troubles there:
> > 2.1. Get index name by querying the system view INDEXES. Note, that
> >>> system
> > views are marked as an experimental feature [2].
> > 2.2. There is a workaround to know an index name with EXPLAIN clause
> >> for
> > sql query that uses the required index (but it depends on SQL
> >>> optimizer).
> > 2.3. Users can use the index name builder, but it is in the
> > internal package
> > (org.apache.ignite.internal.processors.query.QueryUtils#indexName).
> >>> Then it
> > can be changed from version to version without warning, and then the
> >>> user
> > can't rely on it in code.
> > 3. Name of the Primary Key index (_key_PK) is predefined and
> hardcoded
> >>> in
> > Ignite. Users can't set it while creating it, and the name of PK
> index
> >>> is
> > hardcoded in the internal package too (QueryUtils.PRIMARY_KEY_INDEX).
> >
> > Cons:
> > 1. It's declared that IndexQuery avoids some SQL steps (like
> planning,
> > optimizer) in favor of speed. It looks like that looking for an index
> >> by
> > list of fields is the work of an optimizer.
> > 2. It 

Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Taras Ledkov

Hi,

My proposal:
1. Don't search for the index by criteria; always specify the index name
(preferred).

OR

2. Search for the index by criteria without checking the order of the criteria.
Use a Set of criteria instead of an ordered collection.
In the strange case when both indexes (a, b) and (b, a) exist, use either
index when the index name isn't specified.
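
For illustration only, option 2 could be sketched roughly as below. This is a minimal sketch, not Ignite code: the class UnorderedIndexMatcher, its match method, and the index names are hypothetical, and each index is modeled simply as a name mapped to its ordered field list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of Ignite: order-insensitive index lookup. */
public class UnorderedIndexMatcher {
    /**
     * Returns the names of all indexes whose set of fields equals the set of
     * criteria fields, ignoring the order of fields in both.
     */
    public static List<String> match(Map<String, List<String>> indexes, Set<String> criteriaFields) {
        List<String> res = new ArrayList<>();

        for (Map.Entry<String, List<String>> e : indexes.entrySet()) {
            // Compare as sets, so (A, B) and (B, A) are indistinguishable here.
            if (new HashSet<>(e.getValue()).equals(criteriaFields))
                res.add(e.getKey());
        }

        return res;
    }

    public static void main(String[] args) {
        // Assumed index definitions, kept in insertion order.
        Map<String, List<String>> indexes = new LinkedHashMap<>();
        indexes.put("idx_A_B", List.of("A", "B"));
        indexes.put("idx_B_A", List.of("B", "A"));
        indexes.put("idx_C", List.of("C"));

        // Criteria on B and A match both composite indexes.
        System.out.println(match(indexes, Set.of("B", "A")));
    }
}
```

When both (A, B) and (B, A) exist, the lookup returns two candidates. That is the ambiguous case in which option 2 would use either index.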

On 26.08.2021 16:49, Maksim Timonin wrote:

There are some thoughts about strict field order:
1. Index (A, B) is not equivalent to index (B, A). Some queries may have
different performance on such indexes, and users have to specify the right
index. What if both indexes exist?
2. We should avoid cases when a user uses in query only field B for index
(A, B). We have to force the user to specify range for (A) too, or
explicitly set it (null, null). Otherwise it looks like a mistake.




On Thu, Aug 26, 2021 at 4:39 PM Ivan Daschinsky  wrote:


1. I suppose, that the next step is to implement the api for manually
creating index. I think that user wants to create index that will speed up
his criteria base queries, so he or she will use the same criteria to
define the index. So no problem at all
2. We should print warning or throws exception if there is not any index
that match specific criteria.

BTW, Mongo DB doesn't make user to write index name in query. It just
works.

Thu, Aug 26, 2021, 15:52 Taras Ledkov:


Hi,


It is an usability nightmare to make user write index name in all cases.

I don't see any difference between specifying the index name and
specifying the index fields in the right order.
Do you see?

Let's there is the index:
idx_A_B ON TBL (A, B)

Is it OK that the query like below doesn't math the index 'idx_A_B'?
new IndexQuery<>(..)
  .setCriteria(lt("b", 1), lt("a", 2));

On 26.08.2021 15:23, Ivan Daschinsky wrote:

I am against to make user write index name. It is quite simple and
straightforward algorithm to match index to field names, so it is strange
to compare it to sql engine optimizer.

It is an usability nightmare to make user write index name in all cases.

Thu, Aug 26, 2021, 14:42 Maksim Timonin:


Hi, Igniters!

There is a discussion about how to specify an index to query with an
IndexQuery [1]. Currently my PR provides 2 ways to specify index:
1. With a table and index name;
2. With a table and list of index fields (without index name). In this case
IndexQueryProcessor tries to find an index that matches table and index
fields in strict order (order of fields in criteria has to match the order
of fields in index).

Discussion is whether is the second approach valid?

Pros:
1. Currently index name is an optional field for QueryIndex and
QuerySqlField. Then users can create an index with a table and list of
fields. Then, we should provide an opportunity to define an index for
querying the same way as we do for creating.
2. It's required to know the index name to query it (in case the index was
created without an explicit name). Users can find it and then use it as a
constant in code, but I see some troubles there:
2.1. Get index name by querying the system view INDEXES. Note, that system
views are marked as an experimental feature [2].
2.2. There is a workaround to know an index name with EXPLAIN clause for
sql query that uses the required index (but it depends on SQL optimizer).
2.3. Users can use the index name builder, but it is in the
internal package
(org.apache.ignite.internal.processors.query.QueryUtils#indexName). Then it
can be changed from version to version without warning, and then the user
can't rely on it in code.
3. Name of the Primary Key index (_key_PK) is predefined and hardcoded in
Ignite. Users can't set it while creating it, and the name of PK index is
hardcoded in the internal package too (QueryUtils.PRIMARY_KEY_INDEX).

Cons:
1. It's declared that IndexQuery avoids some SQL steps (like planning,
optimizer) in favor of speed. It looks like that looking for an index by
list of fields is the work of an optimizer.
2. It can be not obvious that the order of fields in a query has to match
the order of fields in the related index. We should have it in mind when
building a query - there should be a check for order of fields before
querying.

From my side, I think that arguments for enforcing usage of an index name
for queries are strong enough. But for me it's strange that it's possible
to create an index without a name, but it's required to use name to query
it. Also taking in consideration that there is no guaranteed way to get an
index name (or I don't know it).

Igniters, what do you think?
[1] https://github.com/apache/ignite/pull/9118#discussion_r642557531
[2] https://ignite.apache.org/docs/latest/monitoring-metrics/system-views


On Fri, Aug 6, 2021 at 4:04 PM Maksim Timonin <timonin.ma...@gmail.com> wrote:


Hi, all!

It's a gentle reminder. There is a PR for the new Index API [1]. It was
approved by Alex Plekhanov. Does anybody want to review this

Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Maksim Timonin
> But lt("b", 1) AND lt("a", 2) is equivalent to lt("a", 2) AND lt("b", 1)
> according to the criteria API and the javadoc of the
> 'IndexQuery#setCriteria' method.

I updated the javadoc as Andrey suggested, so there is now a note about
field order.

> If the user uses logical criteria instead of index bounds, why must he
> remember the order of the index's fields?

The choice between index (A, B) and index (B, A) depends on the field order
in the user's query in any case (whether we make the order strict or
non-strict), because IndexQueryProcessor doesn't analyze which of the two
indexes is better. So Ignite may silently choose the wrong index. I
therefore propose to enforce the field order to avoid such confusing
situations.

This case can also be avoided if we make the index name a required parameter.
I think that as a first step we can make the index name required (and the
field order too), and relax those restrictions later. WDYT?

But currently I don't see a *100% legal* way to find the name of an index
that was created in Ignite without specifying one. Am I missing something
there?




On Thu, Aug 26, 2021 at 5:55 PM Taras Ledkov  wrote:

> Hi,
>
>  > 1. Index (A, B) is not equivalent to index (B, A).
>
> But
> lt("b", 1) AND lt("a", 2) is equivalent to lt("a", 2) AND lt("b", 1)
> according to the criteria API and the javadoc of the 'IndexQuery#setCriteria'
> method.
>
> If the user uses logical criteria instead of index bounds
> why he must remember about the order of the index's fields?
>
>  > 2. We should avoid cases when a user uses in query only field B for
> index (A, B).
> Sure. This case must be checked. Is checking of this case related to
> search index by criteria conditions?
>
> On 26.08.2021 16:49, Maksim Timonin wrote:
>
> > There are some thoughts about strict field order:
> > 1. Index (A, B) is not equivalent to index (B, A). Some queries may have
> > different performance on such indexes, and users have to specify the
> right
> > index. What if both indexes exist?
> > 2. We should avoid cases when a user uses in query only field B for index
> > (A, B). We have to force the user to specify range for (A) too, or
> > explicitly set it (null, null). Otherwise it looks like a mistake.
> >
> >
> >
> >
> > On Thu, Aug 26, 2021 at 4:39 PM Ivan Daschinsky 
> wrote:
> >
> >> 1. I suppose, that the next step is to implement the api for manually
> >> creating index. I think that user wants to create index that will speed
> up
> >> his criteria base queries, so he or she will use the same criteria to
> >> define the index. So no problem at all
> >> 2. We should print warning or throws exception if there is not any index
> >> that match specific criteria.
> >>
> >> BTW, Mongo DB doesn't make user to write index name in query. It just
> >> works.
> >>
> >> Thu, Aug 26, 2021, 15:52 Taras Ledkov:
> >>
> >>> Hi,
> >>>
> >>>> It is a usability nightmare to make the user write the index name in
> >>>> all cases.
> >>> I don't see any difference between specifying the index name and
> >>> specifying the index fields in the right order.
> >>> Do you see?
> >>>
> >>> Let's there is the index:
> >>> idx_A_B ON TBL (A, B)
> >>>
> >>> Is it OK that the query like below doesn't math the index 'idx_A_B'?
> >>> new IndexQuery<>(..)
> >>>   .setCriteria(lt("b", 1), lt("a", 2));
> >>>
> >>> On 26.08.2021 15:23, Ivan Daschinsky wrote:
> >>>> I am against making the user write the index name. Matching an index
> >>>> to field names is quite a simple and straightforward algorithm, so it
> >>>> is strange to compare it to the SQL engine optimizer.
> >>>>
> >>>> It is a usability nightmare to make the user write the index name in
> >>>> all cases.
> >>>>
> >>>> Thu, Aug 26, 2021, 14:42 Maksim Timonin:
> >>>>
> > Hi, Igniters!
> >
> > There is a discussion about how to specify an index to query with an
> > IndexQuery [1]. Currently my PR provides 2 ways to specify index:
> > 1. With a table and index name;
> > 2. With a table and list of index fields (without index name). In
> this
> >>> case
> > IndexQueryProcessor tries to find an index that matches table and
> >> index
> > fields in strict order (order of fields in criteria has to match the
> >>> order
> > of fields in index).
> >
> > Discussion is whether is the second approach valid?
> >
> > Pros:
> > 1. Currently index name is an optional field for QueryIndex and
> > QuerySqlField. Then users can create an index with a table and list
> of
> > fields. Then, we should provide an opportunity to define an index for
> > querying the same way as we do for creating.
> > 2. It's required to know the index name to query it (in case the
> index
> >>> was
> > created without an explicit name). Users can find it and then use it
> >> as
> >>> a
> > constant in code, but I see some troubles there:
> > 2.1. Get index name by querying the system view INDEXES. Note, that
> >>> system
> > views are marked as 

Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Maksim Timonin
Here are some thoughts about strict field order:
1. Index (A, B) is not equivalent to index (B, A). Some queries may perform
differently on these indexes, so users have to specify the right index.
What if both indexes exist?
2. We should avoid cases where a user's query uses only field B of index
(A, B). We have to force the user to specify a range for A too, or to set
it explicitly to (null, null). Otherwise it looks like a mistake.
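
To make point 2 concrete, a strict-order check could look roughly like the sketch below. It is not the code from the PR: the class StrictOrderCheck and the method isLeadingPrefix are hypothetical names, and the rule that criteria may cover only a leading prefix of the index fields is an assumption made for illustration.

```java
import java.util.List;

/** Hypothetical helper, not part of Ignite: strict field-order validation. */
public class StrictOrderCheck {
    /**
     * Returns true if the criteria fields are a non-empty leading prefix of
     * the index fields, in the same order. Accepting a prefix (rather than
     * requiring all index fields) is an assumption of this sketch.
     */
    public static boolean isLeadingPrefix(List<String> indexFields, List<String> criteriaFields) {
        if (criteriaFields.isEmpty() || criteriaFields.size() > indexFields.size())
            return false;

        return indexFields.subList(0, criteriaFields.size()).equals(criteriaFields);
    }

    public static void main(String[] args) {
        List<String> idxAB = List.of("A", "B");

        // Same order as the index: accepted.
        System.out.println(isLeadingPrefix(idxAB, List.of("A", "B")));
        // Reversed order: rejected, the user has to reorder the criteria.
        System.out.println(isLeadingPrefix(idxAB, List.of("B", "A")));
        // Only field B without a range for A: rejected as a likely mistake.
        System.out.println(isLeadingPrefix(idxAB, List.of("B")));
    }
}
```

With such a check, criteria on (B, A) never match index (A, B), so the choice between the two indexes is made by the order the user writes, not silently by the engine.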




On Thu, Aug 26, 2021 at 4:39 PM Ivan Daschinsky  wrote:

> 1. I suppose, that the next step is to implement the api for manually
> creating index. I think that user wants to create index that will speed up
> his criteria base queries, so he or she will use the same criteria to
> define the index. So no problem at all
> 2. We should print warning or throws exception if there is not any index
> that match specific criteria.
>
> BTW, Mongo DB doesn't make user to write index name in query. It just
> works.
>
> Thu, Aug 26, 2021, 15:52 Taras Ledkov:
>
> > Hi,
> >
> > > It is an usability nightmare to make user write index name in all
> cases.
> > I don't see any difference between specifying the index name and
> > specifying the index fields in the right order.
> > Do you see?
> >
> > Let's there is the index:
> > idx_A_B ON TBL (A, B)
> >
> > Is it OK that the query like below doesn't math the index 'idx_A_B'?
> > new IndexQuery<>(..)
> >  .setCriteria(lt("b", 1), lt("a", 2));
> >
> > On 26.08.2021 15:23, Ivan Daschinsky wrote:
> > > I am against to make user write index name. It is quite simple and
> > > straightforward algorithm to match index to field names, so it is
> strange
> > > to compare it to sql engine optimizer.
> > >
> > > It is an usability nightmare to make user write index name in all
> cases.
> > >
> > > Thu, Aug 26, 2021, 14:42 Maksim Timonin:
> > >
> > >> Hi, Igniters!
> > >>
> > >> There is a discussion about how to specify an index to query with an
> > >> IndexQuery [1]. Currently my PR provides 2 ways to specify index:
> > >> 1. With a table and index name;
> > >> 2. With a table and list of index fields (without index name). In this
> > case
> > >> IndexQueryProcessor tries to find an index that matches table and
> index
> > >> fields in strict order (order of fields in criteria has to match the
> > order
> > >> of fields in index).
> > >>
> > >> Discussion is whether is the second approach valid?
> > >>
> > >> Pros:
> > >> 1. Currently index name is an optional field for QueryIndex and
> > >> QuerySqlField. Then users can create an index with a table and list of
> > >> fields. Then, we should provide an opportunity to define an index for
> > >> querying the same way as we do for creating.
> > >> 2. It's required to know the index name to query it (in case the index
> > was
> > >> created without an explicit name). Users can find it and then use it
> as
> > a
> > >> constant in code, but I see some troubles there:
> > >> 2.1. Get index name by querying the system view INDEXES. Note, that
> > system
> > >> views are marked as an experimental feature [2].
> > >> 2.2. There is a workaround to know an index name with EXPLAIN clause
> for
> > >> sql query that uses the required index (but it depends on SQL
> > optimizer).
> > >> 2.3. Users can use the index name builder, but it is in the
> > >> internal package
> > >> (org.apache.ignite.internal.processors.query.QueryUtils#indexName).
> > Then it
> > >> can be changed from version to version without warning, and then the
> > user
> > >> can't rely on it in code.
> > >> 3. Name of the Primary Key index (_key_PK) is predefined and hardcoded
> > in
> > >> Ignite. Users can't set it while creating it, and the name of PK index
> > is
> > >> hardcoded in the internal package too (QueryUtils.PRIMARY_KEY_INDEX).
> > >>
> > >> Cons:
> > >> 1. It's declared that IndexQuery avoids some SQL steps (like planning,
> > >> optimizer) in favor of speed. It looks like that looking for an index
> by
> > >> list of fields is the work of an optimizer.
> > >> 2. It can be not obvious that the order of fields in a query has to
> > match
> > >> the order of fields in the related index. We should have it in mind
> when
> > >> building a query - there should be a check for order of fields before
> > >> querying.
> > >>
> > >>  From my side, I think that arguments for enforcing usage of an index
> > name
> > >> for queries are strong enough. But for me it's strange that it's
> > possible
> > >> to create an index without a name, but it's required to use name to
> > query
> > >> it. Also taking in consideration that there is no guaranteed way to
> get
> > an
> > >> index name (or I don't know it).
> > >>
> > >> Igniters, what do you think?
> > >> [1] https://github.com/apache/ignite/pull/9118#discussion_r642557531
> > >> [2]
> > https://ignite.apache.org/docs/latest/monitoring-metrics/system-views
> > >>
> > >>
> > >> On Fri, Aug 6, 2021 at 4:04 PM Maksim Timonin <
> timonin.ma...@gmail.com>
> > >> wrote:
> > >>
> > >>> 

Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Ivan Daschinsky
1. I suppose that the next step is to implement the API for manually
creating an index. I think a user wants to create an index that will speed
up their criteria-based queries, so they will use the same criteria to
define the index. So there is no problem at all.
2. We should print a warning or throw an exception if no index matches the
specified criteria.

BTW, MongoDB doesn't make the user write the index name in a query. It just
works.
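
As a rough sketch of point 2, a lookup that fails loudly when nothing matches might look like this. It is only an illustration: IndexResolver, resolve, and the name-to-fields map are hypothetical and do not come from the PR, and the exception type is an arbitrary choice.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Ignite: fail fast when no index matches. */
public class IndexResolver {
    /**
     * Returns the name of the first index whose fields equal the criteria
     * fields in the same order, or throws if there is no such index.
     */
    public static String resolve(Map<String, List<String>> indexes, List<String> criteriaFields) {
        for (Map.Entry<String, List<String>> e : indexes.entrySet()) {
            if (e.getValue().equals(criteriaFields))
                return e.getKey();
        }

        // No silent fallback: tell the user that nothing matched.
        throw new IllegalArgumentException("No index matches criteria fields " + criteriaFields);
    }

    public static void main(String[] args) {
        // Assumed index definition.
        Map<String, List<String>> indexes = new LinkedHashMap<>();
        indexes.put("idx_A_B", List.of("A", "B"));

        System.out.println(resolve(indexes, List.of("A", "B")));

        try {
            resolve(indexes, List.of("B", "A"));
        }
        catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```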

Thu, Aug 26, 2021, 15:52 Taras Ledkov:

> Hi,
>
> > It is an usability nightmare to make user write index name in all cases.
> I don't see any difference between specifying the index name and
> specifying the index fields in the right order.
> Do you see?
>
> Let's there is the index:
> idx_A_B ON TBL (A, B)
>
> Is it OK that the query like below doesn't math the index 'idx_A_B'?
> new IndexQuery<>(..)
>  .setCriteria(lt("b", 1), lt("a", 2));
>
> On 26.08.2021 15:23, Ivan Daschinsky wrote:
> > I am against to make user write index name. It is quite simple and
> > straightforward algorithm to match index to field names, so it is strange
> > to compare it to sql engine optimizer.
> >
> > It is an usability nightmare to make user write index name in all cases.
> >
> > Thu, Aug 26, 2021, 14:42 Maksim Timonin:
> >
> >> Hi, Igniters!
> >>
> >> There is a discussion about how to specify an index to query with an
> >> IndexQuery [1]. Currently my PR provides 2 ways to specify index:
> >> 1. With a table and index name;
> >> 2. With a table and list of index fields (without index name). In this
> case
> >> IndexQueryProcessor tries to find an index that matches table and index
> >> fields in strict order (order of fields in criteria has to match the
> order
> >> of fields in index).
> >>
> >> Discussion is whether is the second approach valid?
> >>
> >> Pros:
> >> 1. Currently index name is an optional field for QueryIndex and
> >> QuerySqlField. Then users can create an index with a table and list of
> >> fields. Then, we should provide an opportunity to define an index for
> >> querying the same way as we do for creating.
> >> 2. It's required to know the index name to query it (in case the index
> was
> >> created without an explicit name). Users can find it and then use it as
> a
> >> constant in code, but I see some troubles there:
> >> 2.1. Get index name by querying the system view INDEXES. Note, that
> system
> >> views are marked as an experimental feature [2].
> >> 2.2. There is a workaround to know an index name with EXPLAIN clause for
> >> sql query that uses the required index (but it depends on SQL
> optimizer).
> >> 2.3. Users can use the index name builder, but it is in the
> >> internal package
> >> (org.apache.ignite.internal.processors.query.QueryUtils#indexName).
> Then it
> >> can be changed from version to version without warning, and then the
> user
> >> can't rely on it in code.
> >> 3. Name of the Primary Key index (_key_PK) is predefined and hardcoded
> in
> >> Ignite. Users can't set it while creating it, and the name of PK index
> is
> >> hardcoded in the internal package too (QueryUtils.PRIMARY_KEY_INDEX).
> >>
> >> Cons:
> >> 1. It's declared that IndexQuery avoids some SQL steps (like planning,
> >> optimizer) in favor of speed. It looks like that looking for an index by
> >> list of fields is the work of an optimizer.
> >> 2. It can be not obvious that the order of fields in a query has to
> match
> >> the order of fields in the related index. We should have it in mind when
> >> building a query - there should be a check for order of fields before
> >> querying.
> >>
> >>  From my side, I think that arguments for enforcing usage of an index
> name
> >> for queries are strong enough. But for me it's strange that it's
> possible
> >> to create an index without a name, but it's required to use name to
> query
> >> it. Also taking in consideration that there is no guaranteed way to get
> an
> >> index name (or I don't know it).
> >>
> >> Igniters, what do you think?
> >> [1] https://github.com/apache/ignite/pull/9118#discussion_r642557531
> >> [2]
> https://ignite.apache.org/docs/latest/monitoring-metrics/system-views
> >>
> >>
> >> On Fri, Aug 6, 2021 at 4:04 PM Maksim Timonin 
> >> wrote:
> >>
> >>> Hi, all!
> >>>
> >>> It's a gentle reminder. There is a PR for the new Index API [1]. It was
> >>> approved by Alex Plekhanov. Does anybody want to review this API too?
> If
> >>> there won't be objections we're going to merge it Monday, 16th of
> August.
> >>>
> >>> Thanks!
> >>>
> >>> [1] https://github.com/apache/ignite/pull/9118
> >>>
> >>> On Fri, May 21, 2021 at 10:43 PM Maksim Timonin <
> timonin.ma...@gmail.com
> >>>
> >>> wrote:
> >>>
>  Andrey, hi!
> 
>  Some updates, there.
> 
>  I've submitted a PR for IndexQuery [1]. There is an issue about lazy
> >> page
>  loading, that is also related to Text query ticket IGNITE-12291.
> 
>  CacheQueries already have pending pages functionality, it's done with
>  

Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Taras Ledkov

Hi,


> It is an usability nightmare to make user write index name in all cases.

I don't see any difference between specifying the index name and specifying the
index fields in the right order.
Do you?

Suppose there is an index:
idx_A_B ON TBL (A, B)

Is it OK that a query like the one below doesn't match the index 'idx_A_B'?
new IndexQuery<>(..)
    .setCriteria(lt("b", 1), lt("a", 2));

On 26.08.2021 15:23, Ivan Daschinsky wrote:

I am against to make user write index name. It is quite simple and
straightforward algorithm to match index to field names, so it is strange
to compare it to sql engine optimizer.

It is an usability nightmare to make user write index name in all cases.

Thu, Aug 26, 2021, 14:42 Maksim Timonin:


Hi, Igniters!

There is a discussion about how to specify an index to query with an
IndexQuery [1]. Currently my PR provides 2 ways to specify index:
1. With a table and index name;
2. With a table and list of index fields (without index name). In this case
IndexQueryProcessor tries to find an index that matches table and index
fields in strict order (order of fields in criteria has to match the order
of fields in index).

Discussion is whether is the second approach valid?

Pros:
1. Currently index name is an optional field for QueryIndex and
QuerySqlField. Then users can create an index with a table and list of
fields. Then, we should provide an opportunity to define an index for
querying the same way as we do for creating.
2. It's required to know the index name to query it (in case the index was
created without an explicit name). Users can find it and then use it as a
constant in code, but I see some troubles there:
2.1. Get index name by querying the system view INDEXES. Note, that system
views are marked as an experimental feature [2].
2.2. There is a workaround to know an index name with EXPLAIN clause for
sql query that uses the required index (but it depends on SQL optimizer).
2.3. Users can use the index name builder, but it is in the
internal package
(org.apache.ignite.internal.processors.query.QueryUtils#indexName). Then it
can be changed from version to version without warning, and then the user
can't rely on it in code.
3. Name of the Primary Key index (_key_PK) is predefined and hardcoded in
Ignite. Users can't set it while creating it, and the name of PK index is
hardcoded in the internal package too (QueryUtils.PRIMARY_KEY_INDEX).

Cons:
1. It's declared that IndexQuery avoids some SQL steps (like planning,
optimizer) in favor of speed. It looks like that looking for an index by
list of fields is the work of an optimizer.
2. It can be not obvious that the order of fields in a query has to match
the order of fields in the related index. We should have it in mind when
building a query - there should be a check for order of fields before
querying.

 From my side, I think that arguments for enforcing usage of an index name
for queries are strong enough. But for me it's strange that it's possible
to create an index without a name, but it's required to use name to query
it. Also taking in consideration that there is no guaranteed way to get an
index name (or I don't know it).

Igniters, what do you think?
[1] https://github.com/apache/ignite/pull/9118#discussion_r642557531
[2] https://ignite.apache.org/docs/latest/monitoring-metrics/system-views


On Fri, Aug 6, 2021 at 4:04 PM Maksim Timonin 
wrote:


Hi, all!

It's a gentle reminder. There is a PR for the new Index API [1]. It was
approved by Alex Plekhanov. Does anybody want to review this API too? If
there won't be objections we're going to merge it Monday, 16th of August.

Thanks!

[1] https://github.com/apache/ignite/pull/9118

On Fri, May 21, 2021 at 10:43 PM Maksim Timonin wrote:

Andrey, hi!

Some updates, there.

I've submitted a PR for IndexQuery [1]. There is an issue about lazy page
loading, that is also related to Text query ticket IGNITE-12291.

CacheQueries already have pending pages functionality, it's done with
multiple sending GridCacheQueryRequest. There was an issue with TextQuery
and limit, after exceeding a limit we still send requests, so I submitted a
patch to fix this [2].

But currently, TextQuery, as SqlFieldsQuery also does, prepares whole
data on query request, holds it, and provides a cursor over this
collection.

As I understand you correctly, you propose to run TextQuery over index
with every poll page request. We can do this with Lucene
IndexSearcher.searchAfter. So from one side, it will save resources. But
from the other side, no queries (no TextQuery, no SqlFieldsQuery) lock
index for querying. So there can be data inconsistency, as there can be
concurrent operations on an index while a user iterates over the cursor. It
also could be for queries now, due to no index lock being there, but the
window of time of such inconsistency is much shorter.

The same dilemma I have for IndexQuery. In my patch [1] I provide lazy
iteration over BPlusTree. There is no lock on an index too while


Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Andrey Mashenkov
@Ivan Daschinsky

By the way, users are forced to specify index conditions in an order that
matches the column order in the index.

On Thu, Aug 26, 2021 at 3:24 PM Ivan Daschinsky  wrote:

> I am against to make user write index name. It is quite simple and
> straightforward algorithm to match index to field names, so it is strange
> to compare it to sql engine optimizer.
>
> It is an usability nightmare to make user write index name in all cases.
>
> Thu, Aug 26, 2021, 14:42 Maksim Timonin:
>
> > Hi, Igniters!
> >
> > There is a discussion about how to specify an index to query with an
> > IndexQuery [1]. Currently my PR provides 2 ways to specify index:
> > 1. With a table and index name;
> > 2. With a table and list of index fields (without index name). In this
> case
> > IndexQueryProcessor tries to find an index that matches table and index
> > fields in strict order (order of fields in criteria has to match the
> order
> > of fields in index).
> >
> > Discussion is whether is the second approach valid?
> >
> > Pros:
> > 1. Currently index name is an optional field for QueryIndex and
> > QuerySqlField. Then users can create an index with a table and list of
> > fields. Then, we should provide an opportunity to define an index for
> > querying the same way as we do for creating.
> > 2. It's required to know the index name to query it (in case the index
> was
> > created without an explicit name). Users can find it and then use it as a
> > constant in code, but I see some troubles there:
> > 2.1. Get index name by querying the system view INDEXES. Note, that
> system
> > views are marked as an experimental feature [2].
> > 2.2. There is a workaround to know an index name with EXPLAIN clause for
> > sql query that uses the required index (but it depends on SQL optimizer).
> > 2.3. Users can use the index name builder, but it is in the
> > internal package
> > (org.apache.ignite.internal.processors.query.QueryUtils#indexName). Then
> it
> > can be changed from version to version without warning, and then the user
> > can't rely on it in code.
> > 3. Name of the Primary Key index (_key_PK) is predefined and hardcoded in
> > Ignite. Users can't set it while creating it, and the name of PK index is
> > hardcoded in the internal package too (QueryUtils.PRIMARY_KEY_INDEX).
> >
> > Cons:
> > 1. It's declared that IndexQuery avoids some SQL steps (like planning,
> > optimizer) in favor of speed. It looks like that looking for an index by
> > list of fields is the work of an optimizer.
> > 2. It can be not obvious that the order of fields in a query has to match
> > the order of fields in the related index. We should have it in mind when
> > building a query - there should be a check for order of fields before
> > querying.
> >
> > From my side, I think that arguments for enforcing usage of an index name
> > for queries are strong enough. But for me it's strange that it's possible
> > to create an index without a name, but it's required to use name to query
> > it. Also taking in consideration that there is no guaranteed way to get
> an
> > index name (or I don't know it).
> >
> > Igniters, what do you think?
> > [1] https://github.com/apache/ignite/pull/9118#discussion_r642557531
> > [2]
> https://ignite.apache.org/docs/latest/monitoring-metrics/system-views
> >
> >
> > On Fri, Aug 6, 2021 at 4:04 PM Maksim Timonin 
> > wrote:
> >
> > > Hi, all!
> > >
> > > It's a gentle reminder. There is a PR for the new Index API [1]. It was
> > > approved by Alex Plekhanov. Does anybody want to review this API too?
> If
> > > there won't be objections we're going to merge it Monday, 16th of
> August.
> > >
> > > Thanks!
> > >
> > > [1] https://github.com/apache/ignite/pull/9118
> > >
> > > On Fri, May 21, 2021 at 10:43 PM Maksim Timonin <
> timonin.ma...@gmail.com
> > >
> > > wrote:
> > >
> > >> Andrey, hi!
> > >>
> > >> Some updates, there.
> > >>
> > >> I've submitted a PR for IndexQuery [1]. There is an issue about lazy
> > page
> > >> loading, that is also related to Text query ticket IGNITE-12291.
> > >>
> > >> CacheQueries already have pending pages functionality, it's done with
> > >> multiple sending GridCacheQueryRequest. There was an issue with
> > TextQuery
> > >> and limit, after exceeding a limit we still send requests, so I
> > submitted a
> > >> patch to fix this [2].
> > >>
> > >> But currently, TextQuery, as SqlFieldsQuery also does, prepares whole
> > >> data on query request, holds it, and provides a cursor over this
> > >> collection.
> > >>
> > >> As I understand you correctly, you propose to run TextQuery over index
> > >> with every poll page request. We can do this with Lucene
> > >> IndexSearcher.searchAfter. So from one side, it will save resources.
> But
> > >> from the other side, no queries (no TextQuery, no SqlFieldsQuery) lock
> > >> index for querying. So there can be data inconsistency, as there can
> be
> > >> concurrent operations on an index while a user iterates over the
> > cursor. It
> > >> 

Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Ivan Daschinsky
I am against making users write the index name. Matching an index to field
names is a simple and straightforward algorithm, so it is strange to
compare it to an SQL engine optimizer.

It is a usability nightmare to make users write the index name in all cases.


Re: [DISCUSS] IEP-71 Public API for secondary index search

2021-08-26 Thread Maksim Timonin
Hi, Igniters!

There is a discussion about how to specify the index to query with an
IndexQuery [1]. Currently my PR provides two ways to specify an index:
1. With a table and an index name;
2. With a table and a list of index fields (without an index name). In this
case IndexQueryProcessor tries to find an index that matches the table and
the index fields in strict order (the order of fields in the criteria has to
match the order of fields in the index).
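
To make the second option concrete, here is a minimal sketch of strict-order
matching. It is not the IndexQueryProcessor code from the PR: the class name,
the method name and the index names are hypothetical, and it assumes that
"strict order" means the criteria fields must equal the full index field list
(whether a prefix of the index fields should also match is not covered here).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; hypothetical names, not the code from the PR. */
public class IndexMatchSketch {
    /**
     * Returns the name of the first index whose field list equals the
     * criteria fields in the same order, or an empty Optional if none does.
     */
    public static Optional<String> findIndex(Map<String, List<String>> indexes, List<String> criteriaFields) {
        for (Map.Entry<String, List<String>> e : indexes.entrySet()) {
            // List.equals compares elements pairwise, so the order matters.
            if (e.getValue().equals(criteriaFields))
                return Optional.of(e.getKey());
        }

        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical indexes of one table: index name -> ordered field list.
        Map<String, List<String>> indexes = new LinkedHashMap<>();
        indexes.put("IDX_A_B", List.of("A", "B"));
        indexes.put("IDX_B_A", List.of("B", "A"));

        System.out.println(findIndex(indexes, List.of("A", "B"))); // Optional[IDX_A_B]
        System.out.println(findIndex(indexes, List.of("B", "A"))); // Optional[IDX_B_A]
        System.out.println(findIndex(indexes, List.of("B")));      // Optional.empty
    }
}
```

With an ordered comparison, indexes (A, B) and (B, A) resolve to different
names, so the order of the criteria alone selects the index.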

The question is whether the second approach is valid.

Pros:
1. Currently the index name is an optional field for QueryIndex and
QuerySqlField, so users can create an index with just a table and a list of
fields. We should therefore let users specify an index for querying the same
way they specify it for creation.
2. Otherwise users must know the index name to query it, even if the index
was created without an explicit name. Users can find the name and then use it
as a constant in code, but I see some problems there:
2.1. Users can get the index name by querying the system view INDEXES. Note
that system views are marked as an experimental feature [2].
2.2. As a workaround, users can learn the index name from the EXPLAIN output
of an SQL query that uses the required index (but this depends on the SQL
optimizer).
2.3. Users can use the index name builder, but it is in an internal package
(org.apache.ignite.internal.processors.query.QueryUtils#indexName). It can
therefore change from version to version without warning, so users can't
rely on it in code.
3. The name of the Primary Key index (_key_PK) is predefined and hardcoded in
Ignite. Users can't set it when creating the index, and the name of the PK
index is hardcoded in an internal package too (QueryUtils.PRIMARY_KEY_INDEX).

Cons:
1. IndexQuery is declared to skip some SQL steps (like planning and
optimization) in favor of speed. Looking up an index by a list of fields
looks like the work of an optimizer.
2. It may not be obvious that the order of fields in a query has to match
the order of fields in the related index. We should keep this in mind when
building a query: the order of fields should be checked before querying.
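
The order check from the second point could look roughly like the sketch
below. It is only an illustration under assumptions: the class, the method and
the error messages are hypothetical and do not come from the PR. The idea is
to fail before the query runs and to tell the user whether the fields differ
or only their order does.

```java
import java.util.HashSet;
import java.util.List;

/** Illustrative sketch only; hypothetical names, not the code from the PR. */
public class FieldOrderCheckSketch {
    /**
     * Throws IllegalArgumentException unless the criteria fields equal the
     * index fields in the same order.
     */
    public static void checkFieldOrder(String idxName, List<String> idxFields, List<String> criteriaFields) {
        if (idxFields.equals(criteriaFields))
            return;

        // Same set of fields but a different sequence: most likely a user mistake.
        boolean sameFields = idxFields.size() == criteriaFields.size()
            && new HashSet<>(idxFields).equals(new HashSet<>(criteriaFields));

        String reason = sameFields ? "same fields but in a different order" : "different fields";

        throw new IllegalArgumentException("Criteria " + criteriaFields + " do not match index "
            + idxName + " " + idxFields + ": " + reason);
    }

    public static void main(String[] args) {
        // Passes silently: the order matches the index definition.
        checkFieldOrder("IDX_A_B", List.of("A", "B"), List.of("A", "B"));

        try {
            checkFieldOrder("IDX_A_B", List.of("A", "B"), List.of("B", "A"));
        }
        catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```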

From my side, I think the arguments for enforcing the use of an index name
in queries are strong enough. But it seems strange to me that it's possible
to create an index without a name, while a name is required to query it,
especially considering that there is no guaranteed way to get an index name
(or I don't know of one).

Igniters, what do you think?
[1] https://github.com/apache/ignite/pull/9118#discussion_r642557531
[2] https://ignite.apache.org/docs/latest/monitoring-metrics/system-views


On Fri, Aug 6, 2021 at 4:04 PM Maksim Timonin 
wrote:

> Hi, all!
>
> This is a gentle reminder. There is a PR for the new Index API [1]. It was
> approved by Alex Plekhanov. Does anybody else want to review this API? If
> there are no objections, we're going to merge it on Monday, August 16.
>
> Thanks!
>
> [1] https://github.com/apache/ignite/pull/9118
>
> On Fri, May 21, 2021 at 10:43 PM Maksim Timonin 
> wrote:
>
>> Andrey, hi!
>>
>> Some updates here.
>>
>> I've submitted a PR for IndexQuery [1]. There is an issue with lazy page
>> loading that is also related to the TextQuery ticket IGNITE-12291.
>>
>> CacheQueries already have pending-pages functionality; it's done by
>> sending multiple GridCacheQueryRequest messages. There was an issue with
>> TextQuery and limit: after exceeding the limit we still sent requests, so
>> I submitted a patch to fix this [2].
>>
>> But currently TextQuery, like SqlFieldsQuery, prepares the whole result
>> set on a query request, holds it, and provides a cursor over this
>> collection.
>>
>> If I understand you correctly, you propose to run TextQuery over the index
>> on every page poll request. We can do this with Lucene
>> IndexSearcher.searchAfter. On the one hand, it will save resources. On the
>> other hand, no queries (neither TextQuery nor SqlFieldsQuery) lock the
>> index while querying, so there can be data inconsistency, as there can be
>> concurrent operations on the index while a user iterates over the cursor.
>> This is also possible for queries now, since there is no index lock, but
>> the time window for such inconsistency is much shorter.
>>
>> I have the same dilemma for IndexQuery. In my patch [1] I provide lazy
>> iteration over BPlusTree. There is no lock on the index while querying
>> either, and I want to discuss the right way. I have the following in mind:
>> 1. Indexes currently don't support transactions, and SQL queries don't
>> lock the index while querying, so Ignite doesn't guarantee data consistency;
>> 2. As I understand it, preparing the whole data set for SQL queries is
>> required due to relations between tables. The more complex the query and
>> relations are, the more consistency issues we get in the result in case of
>> parallel operations;
>> 3. Querying a single index only (by TextQuery or IndexQuery) doesn't
>> affect any relations, so we can allow concurrent updates, as it could
>> affect a query result but it