Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-03-14 Thread Ivan Pavlukhin
Yuriy,

> Let me summarize the approaches:
I agree with your reasoning; option 2 sounds like the best one to me as well.

I will look into the merge-sort strategy some time later.

Best regards,
Ivan Pavlukhin


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-03-13 Thread Yuriy Shuliga
Ivan,

I have made changes in the fork that reflect the merge-sort strategy. The
query future iterator now unblocks as soon as the first pages have been
delivered from all nodes; then it waits for the next portion of pages, and so on.
https://github.com/shuliga/ignite/commit/c84f04c18f67e99ab7bc0a7893b75f1dc83a76bd

Please validate the design if you wish.
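
As an illustration of the merge step described above, here is a minimal sketch of a k-way merge over per-node results. It assumes each node returns its entries already sorted by descending score. `ScoredEntry` and `RankedMerge` are hypothetical names used only for this example; they are not Ignite API and are not taken from the linked commit. Page fetching and blocking are left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical result holder: a key plus the score reported by a node. */
final class ScoredEntry {
    final String key;
    final float score;

    ScoredEntry(String key, float score) {
        this.key = key;
        this.score = score;
    }
}

/** Hypothetical helper, not Ignite API: k-way merge of per-node result streams. */
final class RankedMerge {
    /** Current head of one node's stream together with the rest of that stream. */
    private static final class Cursor {
        ScoredEntry head;
        final Iterator<ScoredEntry> rest;

        Cursor(Iterator<ScoredEntry> it) {
            rest = it;
            head = it.next();
        }
    }

    /**
     * Merges streams that are each sorted by descending score into one list
     * sorted by descending score, stopping after {@code limit} entries.
     */
    static List<ScoredEntry> merge(List<Iterator<ScoredEntry>> nodeStreams, int limit) {
        // Max-heap ordered by the score of each cursor's current head.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            Comparator.comparingDouble((Cursor c) -> c.head.score).reversed());

        for (Iterator<ScoredEntry> it : nodeStreams) {
            if (it.hasNext())
                heap.add(new Cursor(it));
        }

        List<ScoredEntry> out = new ArrayList<>();

        while (!heap.isEmpty() && out.size() < limit) {
            Cursor c = heap.poll();
            out.add(c.head);

            // Re-insert the cursor if its node still has results.
            if (c.rest.hasNext()) {
                c.head = c.rest.next();
                heap.add(c);
            }
        }

        return out;
    }
}
```

In a paged setting, `rest.hasNext()` is the point where an iterator would have to wait for the next page from its node, which is where the blocking behaviour mentioned above would come in.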

Regarding the ranking field in the entity.

Entities for text queries in the search domain are usually treated as
documents with some metadata.
This can be an id, an issued/expired date, and the document score returned for
a given query.
It is common to include such fields in the entity design.

Answer to your question about omitting QueryRankField:
- Then the response records will just come in arbitrary order. This
should not fail TextQuery execution.

Another point, about rank values among different indices:
- Ranks are to be used for comparison between entities in a particular query
response; they are not intended to be absolute across the system.

Let me summarize the approaches:
1. Subclassing from Ranked.class.
pros: the simplest and Ignite-natural approach
cons: implicit nature, limits entity inheritance

2. Explicitly introducing a dedicated field annotated with @QueryRankField.
pros: Ignite-natural approach, easy to introduce, explicitly controlled by
the developer
cons: adds extra metadata to the entity

3. Wrapping the entity response with rank data, used for merge sort, without
exposing it to the client.
pros: leaves the entity design clean
cons: rank is not available to the client; development will require complex
changes in query execution / entity marshaling mechanisms

I'd stay with option 2 as the most balanced solution of these.
What do you think?
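
To make the second approach concrete, here is a minimal sketch of what an annotated entity and the score-filling step could look like. `@QueryRankField` is only the name proposed in this thread; no such annotation exists in Ignite, and `Article` and `RankFieldSupport` are hypothetical names for illustration. The sketch also shows the behaviour described above for entities without the annotation: nothing is written and nothing fails.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

/** Proposed in this thread only; not an existing Ignite annotation. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface QueryRankField {
}

/** Example entity: a document with metadata and a slot for the query score. */
class Article {
    String title;

    @QueryRankField
    float rank;

    Article(String title) {
        this.title = title;
    }
}

/** Hypothetical helper showing how a node could fill the annotated field. */
final class RankFieldSupport {
    /**
     * Writes {@code score} into the first float field annotated with
     * QueryRankField that is declared directly on the entity's class
     * (superclass fields are not searched in this sketch).
     * Returns false and leaves the entity untouched if there is no such field.
     */
    static boolean applyScore(Object entity, float score) {
        for (Field f : entity.getClass().getDeclaredFields()) {
            if (f.isAnnotationPresent(QueryRankField.class) && f.getType() == float.class) {
                try {
                    f.setAccessible(true);
                    f.setFloat(entity, score);
                }
                catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot set rank field: " + f, e);
                }

                return true;
            }
        }

        return false;
    }
}
```

The explicit field keeps the score visible to the client, which is the main difference from the wrapping approach in option 3.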

BR,
Yuriy Shuliha





Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-03-10 Thread Ivan Pavlukhin
Igniters,

The discussion unintentionally continued outside of the dev list, so I am
bringing it back. You can find it below. Do not hesitate to join if you
have thoughts on the questions raised. Maybe you have ideas on how to enrich
text query results with score/rank information.

Tue, 10 Mar 2020 at 09:11, Yuriy Shuliga wrote:

> Yes, please do.
>
> Tue, 10 Mar 2020, 02:26, Ivan Pavlukhin wrote:
>
>> Yuriy,
>>
>> I noticed that at some point our discussion moved off the Ignite dev
>> list. Would you mind if I bring it back to the dev list?
>>
>> Best regards,
>> Ivan Pavlukhin
>>
>> Tue, 10 Mar 2020 at 03:25, Ivan Pavlukhin wrote:
>> >
>> > > PS As far as I see, there is no chance to get on the 2.8 release train.
>> > > What will be the next version/date we can aim for with this update?
>> >
>> > Yes, 2.8 is already available and the community is working on
>> > finalizing activities (e.g. publishing documentation). I do not have any
>> > reliable expectations about the next releases. I suppose that there could
>> > be a couple of maintenance releases like 2.8.1, as several problems have
>> > already been discovered. I do not know whether the next more significant
>> > release is going to be 2.9 or even a major release 3.0. It sounds
>> > realistic to push for 2.9 because there are already several "almost
>> > ready" features in master. In my mind it is a good idea to start a
>> > discussion about the next releases on the dev list.
>> >
>> > Best regards,
>> > Ivan Pavlukhin
>> >
>> > Tue, 10 Mar 2020 at 00:58, Ivan Pavlukhin wrote:
>> > >
>> > > Hi Yuriy,
>> > >
>> > > Sorry for a late response.
>> > >
>> > > > Suitable solution without subclassing might be:
>> > > > 1. Explicitly add float field to entity
>> > > > 2. Annotate it with special @QueryRankField, (for instance)
>> > > > 3. Fill in this field with docScore in GridLuceneIndex, pass back
>> > > > to initiating node
>> > > > 4. Possibly still need to proxify entity with adding Comparable
>> > > > interface.
>> > > > 5. Perform merge sort on initiating node
>> > >
>> > > Possibly I missed it, but one point is not clear to me. What will
>> > > happen if an entity class does not have a field annotated with
>> > > QueryRankField?
>> > >
>> > > And I am still not sure that it is a proper (enough) approach. The
>> > > thing which bothers me is the transient and dynamic nature of the "rank"
>> > > field. It does belong to the entity, but it can have different values
>> > > for the same entity (e.g. when different indices are used).
>> > >
>> > > I would like to experiment with the code a little bit. But most likely I
>> > > will have a chance only at the end of this week.
>> > >
>> > > Best regards,
>> > > Ivan Pavlukhin
>> > >
>> > > Mon, 2 Mar 2020 at 20:09, Yuriy Shuliga wrote:
>> > > >
>> > > > Hi Ivan,
>> > > >
>> > > > I have concerns about the entity annotation variant.
>> > > > Wrapping into a dynamic proxy for passing back will be quite a
>> > > > complex thing that requires changes in IgniteCacheObjectProcessor
>> > > > and entity marshaling.
>> > > >
>> > > > Suitable solution without subclassing might be:
>> > > > 1. Explicitly add float field to entity
>> > > > 2. Annotate it with special @QueryRankField, (for instance)
>> > > > 3. Fill in this field with docScore in GridLuceneIndex, pass back
>> > > > to initiating node
>> > > > 4. Possibly still need to proxify entity with adding Comparable
>> > > > interface.
>> > > > 5. Perform merge sort on initiating node
>> > > >
>> > > > Would you consider this approach, or return to using the Ranked
>> > > > superclass?
>> > > >
>> > > > Regarding your proposal to implement merge sort - definitely yes.
>> > > > I will implement this.
>> > > > Sorry, I didn't understand you earlier )
>> > > >
>> > > > BR,
>> > > > Yuriy Shuliha
>> > > >
>> > > > PS As far as I see, there is no chance to get on the 2.8 release train.
>> > > > What will be the next version/date we can aim for with this update?
>> > > >
>> > > >
>> > > > Fri, 28 Feb 2020 at 23:08, Ivan Pavlukhin wrote:
>> > > >>
>> > > >> Hi Yuriy,
>> > > >>
>> > > >> Sorry for a late response and thank you for your comments.
>> > > >>
>> > > >> The approach with the @Ranked annotation looks cleaner to me from an
>> > > >> API point of view.
>> > > >>
>> > > >> Regarding merging responses from multiple nodes, I suppose that a good
>> > > >> enough solution is possible:
>> > > >> 1. Request one page of entries from each node.
>> > > >> 2. Return one page to a user (as there is definitely a page of the
>> > > >> best results already).
>> > > >> 3. Request next result pages from nodes corresponding to pages we
>> > > >> exposed to the user (actually nodes having fewer than 1 page of
>> > > >> pending results). Repeat from step 2.
>> > > >>
>> > > >> Some kind of sort merge plus backpressure. The backpressure part might
>> > > >> be left as an improvement.
>> > > >>
>> > > >> What do you think?
>> > > >>
>> > > >> Best regards,
>> > > >> Ivan Pavlukhin
>> > > >>
>> > > >> Tue, 18 Feb 2020 at 18:27, Yuriy Shuliga wrote:
>> > > >>
>> > > >> >
>> > > >> > Hi Ivan,
>> > > >> >
>> > > >> > Thank you for keeping 

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-01-23 Thread Yuriy Shuliga
Hi Ivan,

Actually, I have engaged another developer to help bring TextQueries to
a correctly working state.
For now we have a solution that adds ordering functionality to distributed
TextQueries.
It is developed and tested locally. I can share details here, then we can
discuss and decide whether to create a corresponding ticket.

The starting point is that Lucene's result documents are by nature ordered
by docScore (a float).
So we created an abstract class Ranked, which implements Comparable and
Serializable and contains a float rank value.

Each entity expected to be ordered on TextQuery merge should be
derived from this class.
All subsequent actions are done under the hood automatically by the
new CacheQueryFutureRankedDecorator, which contains a special
BlockingIterator used for correct merging of distributed responses.
Text queries with Ranked entities are automatically wrapped with this new
decorator.
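
For illustration, a minimal sketch of what such a base class could look like. The accessor names, the descending comparison order and the `RankedDoc` entity are assumptions made for this example; the actual class lives in the author's fork and may differ.

```java
import java.io.Serializable;

/**
 * Sketch of the base class described above. Field and method names are
 * assumptions, not the code from the fork.
 */
abstract class Ranked implements Comparable<Ranked>, Serializable {
    private static final long serialVersionUID = 1L;

    /** Lucene document score for the query that returned this entity. */
    private float rank;

    float rank() {
        return rank;
    }

    void rank(float rank) {
        this.rank = rank;
    }

    /** Orders higher scores first (assumed here to match relevance order). */
    @Override public int compareTo(Ranked other) {
        return Float.compare(other.rank, rank);
    }
}

/** Hypothetical entity that opts in to ranked merging by extending Ranked. */
class RankedDoc extends Ranked {
    final String text;

    RankedDoc(String text) {
        this.text = text;
    }
}
```

Note that the entity gives up its single superclass slot to `Ranked`, and that this `compareTo` looks only at the rank, so it is not consistent with `equals`.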

This is an outline of the solution. Please ask if you have any questions.
Alternatively, I can create a ticket and link a PR with the solution
(already tested, though only locally) for detailed review.

BR,
Yuriy




Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-01-20 Thread Ivan Pavlukhin
Hi Yuriy,

I would just like to understand the current state. Are you still working on
Ignite text queries? If not, are you going to continue with it?




-- 
Best regards,
Ivan Pavlukhin


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-13 Thread Ivan Pavlukhin
Yuriy,

Sure, I will be glad to help.

> - incorrect nodes/partition selection during querying?
Apparently this is the problem. If you find it really complicated to
understand and debug, I can dig deeper and share my view of how the
problem can be fixed.




-- 
Best regards,
Ivan Pavlukhin


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-11 Thread Yuriy Shuliga
I will look into the MOVING partition issue,
but I will also need guidance there.

Ivan, would you mind being that person?

The question is whether we have an issue with:
- wrong storing targets during indexing, OR
- incorrect nodes/partition selection during querying?

BR,
Yuriy Shuliha





Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-10 Thread Ilya Kasnacheev
Hello!

Yes, I guess you are right :(

I can surely fix the range issue. It's just that it was so broken that I
could not figure out the correct behavior for this case.

Regards,
-- 
Ilya Kasnacheev




Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-03 Thread Ivan Pavlukhin
*on topologies

Tue, 3 Dec 2019 at 17:15, Ivan Pavlukhin wrote:
>
> Ilya, Yuriy,
>
> It seems that text queries can return incorrect results on tologies
> with MOVING partitions. I left a comment in JIRA [1].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-12401



-- 
Best regards,
Ivan Pavlukhin


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-03 Thread Ivan Pavlukhin
Ilya, Yuriy,

It seems that text queries can return incorrect results on tologies
with MOVING partitions. I left a comment in JIRA [1].

[1] https://issues.apache.org/jira/browse/IGNITE-12401




-- 
Best regards,
Ivan Pavlukhin


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-02 Thread Ivan Pavlukhin
Ilya,

I checked your test on a revision before "limit" and it fails there as
well. Could you please double check?




-- 
Best regards,
Ivan Pavlukhin


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-02 Thread Ilya Kasnacheev
Hello!

The problem is NOT specific to range queries. Range queries were broken
previously and they are broken now, but now even a simple "token in field
with limit" returns duplicates.

Before limits were introduced, any tested use cases were unaffected by
duplicates, but now they are.

Regards,
-- 
Ilya Kasnacheev




Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-02 Thread Ivan Pavlukhin
And is the problem specific to range queries or not?




-- 
Best regards,
Ivan Pavlukhin


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-02 Thread Ivan Pavlukhin
Yuriy,

Thank you for investigating the problem [1]. I still cannot understand how
the problem relates to the newly introduced "limit". Is it right that there
were no duplicates before "limit" support? Now that the support is introduced,
do only limited queries contain duplicates, or unlimited ones, or both?

[1] https://issues.apache.org/jira/browse/IGNITE-12401

Thu, Nov 28, 2019 at 18:30, Ilya Kasnacheev wrote:
>
> Hello!
>
> I have just found what I consider a major regression in Text Queries: it
> seems to me that text queries with limits return the same key-value
> entries multiple times.
>
> Please check the issue https://issues.apache.org/jira/browse/IGNITE-12401
> and corresponding build
> https://ci.ignite.apache.org/viewQueued.html?itemId=4799634
>
> Regards,
> --
> Ilya Kasnacheev



-- 
Best regards,
Ivan Pavlukhin


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-28 Thread Ilya Kasnacheev
Hello!

I have just found what I consider a major regression in Text Queries: it
seems to me that text queries with limits return the same key-value
entries multiple times.

Please check the issue https://issues.apache.org/jira/browse/IGNITE-12401
and corresponding build
https://ci.ignite.apache.org/viewQueued.html?itemId=4799634
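
The symptom can be checked without any Ignite API by collecting the keys of
the returned entries and reporting every key that shows up more than once.
The sketch below is only an illustration: `DuplicateKeyCheck` and
`duplicateKeys` are hypothetical names, and in a real test the keys would be
taken from the entries returned by a limited text query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Ignite or its test suite. */
final class DuplicateKeyCheck {
    /** Returns every key that occurs more than once, in order of first repeat. */
    static <K> List<K> duplicateKeys(Iterable<K> keys) {
        Set<K> seen = new HashSet<>();
        Set<K> dups = new LinkedHashSet<>();

        for (K key : keys) {
            // Set.add returns false when the key was already present.
            if (!seen.add(key))
                dups.add(key);
        }

        return new ArrayList<>(dups);
    }

    public static void main(String[] args) {
        // Keys as they might arrive from a limited query that repeats entries.
        List<Integer> keys = Arrays.asList(1, 2, 3, 2, 1, 2);

        System.out.println(duplicateKeys(keys)); // prints [2, 1]
    }
}
```

An empty result means that no entry was delivered twice.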

Regards,
-- 
Ilya Kasnacheev


Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-28 Thread Yuriy Shuliga
Nice to hear, Ivan.

It is good practice to present an extension of existing functionality
properly, as we expect of Text Queries.
Let's make it work correctly first.

I'm OK to prepare a ticket for adding reduction of sorted responses to
GridCacheDistributedQueryFuture or nearby.
Also, the TextQuery response entity will be extended to carry Lucene's
'docScore' per record.
No open questions remain then.
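
A minimal sketch of such a reduction, assuming every node returns its records
already ordered by descending score. `ScoreMergeSketch`, `Scored` and `merge`
are hypothetical names used only for illustration; they are not Ignite
classes, and the real change would live in or near
GridCacheDistributedQueryFuture.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a score-based reducer; none of these types exist in Ignite. */
final class ScoreMergeSketch {
    /** A record key paired with the relevance score computed on its node. */
    static final class Scored {
        final String key;
        final float score;

        Scored(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    /** Cursor over one node's response, which is assumed sorted by descending score. */
    private static final class Cursor {
        final Iterator<Scored> it;
        Scored head;

        Cursor(Iterator<Scored> it) {
            this.it = it;
            this.head = it.next();
        }

        /** Moves to the next record; returns false when the node is exhausted. */
        boolean advance() {
            if (!it.hasNext())
                return false;

            head = it.next();

            return true;
        }
    }

    /** K-way merge of per-node responses; a limit of 0 or less means "no limit". */
    static List<String> merge(List<List<Scored>> perNode, int limit) {
        // The cursor with the highest current score is polled first.
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> Float.compare(b.head.score, a.head.score));

        for (List<Scored> node : perNode) {
            if (!node.isEmpty())
                heap.add(new Cursor(node.iterator()));
        }

        List<String> out = new ArrayList<>();

        while (!heap.isEmpty() && (limit <= 0 || out.size() < limit)) {
            Cursor c = heap.poll();

            out.add(c.head.key);

            // Re-insert the cursor only if its node still has records.
            if (c.advance())
                heap.add(c);
        }

        return out;
    }

    public static void main(String[] args) {
        List<List<Scored>> pages = Arrays.asList(
            Arrays.asList(new Scored("a", 0.9f), new Scored("d", 0.4f)),
            Arrays.asList(new Scored("b", 0.8f), new Scored("c", 0.5f)),
            Collections.<Scored>emptyList());

        System.out.println(merge(pages, 3)); // prints [a, b, c]
    }
}
```

The heap holds one cursor per node, so the reducer compares only the current
head of each node's response and can stop as soon as the limit is reached.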

BR,
Yuriy Shuliha

Thu, Nov 28, 2019 at 15:23, Ivan Pavlukhin wrote:

> Folks, Yuriy,
>
> I suppose that we are going to proceed with
>
> >>>
> Reducing on Ignite
>
> The obvious point of distributed response reduction is class
> GridCacheDistributedQueryFuture.
> Though, @Ivan Pavlukhin mentioned class with similar functionality:
> ReduceIndexSorted
> What I see here, that it is tangled with H2 related classes
> (org.h2.result.Row) and might not be unified with TextQuery reduction.
> >>
>
> From my side, there is no strong opinion that we should unify
> reduction. Having a separate reduction implementation for text queries
> sounds like a reasonable option to me as well.
>
> Are there still any open questions?
>
> Wed, Nov 27, 2019 at 02:27, Denis Magda wrote:
> >
> > I don't see anything wrong if Yuriy is willing to carry on and keep
> > enhancing our full-text search support that lacks basic capabilities.
> >
> > The basics should be available. If anybody needs an advanced feature they
> > can introduce Solr or Elasticsearch into the final architecture of the
> > app.
> >
> > Folks, who of us can help Yuriy with the questions asked? Most likely the
> > SQL experts are the best candidates here.
> >
> >
> > -
> > Denis
> >
> >
> > On Tue, Nov 26, 2019 at 8:52 AM Ivan Pavlukhin 
> wrote:
> >
> > > Folks,
> > >
> > > IEP is an Ignite-specific thing. In fact, I suppose that we are
> > > already doing it in ASF way by having this dev-list discussion =)
> > >
> > > As for me, implementing "limit" feature for text queries is not so big
> > > to make an IEP. But we might need to create one for next features.
> > >
> > > Tue, Nov 26, 2019 at 15:06, Ilya Kasnacheev <ilya.kasnach...@gmail.com>:
> > > >
> > > > Hello!
> > > >
> > > > ASF way should probably start with an IEP :)
> > > >
> > > > Regards,
> > > > --
> > > > Ilya Kasnacheev
> > > >
> > > >
> > > > Tue, Nov 26, 2019 at 14:12, Zhenya Stanilovsky wrote:
> > > >
> > > > >
> > > > > Ok, lets forgot Solr and go through ASF way, if Yuriy prove this
> > > > > functionality is helpful and PR it, why not ?
> > > > >
> > > > > isn`t it ?
> > > > >
> > > > > >Вторник, 26 ноября 2019, 14:06 +03:00 от Ilya Kasnacheev <
> > > > > ilya.kasnach...@gmail.com>:
> > > > > >
> > > > > >Hello!
> > > > > >
> > > > > >The problem here is that Solr is a multi-year effort by a lot of
> > > people.
> > > > > We
> > > > > >can't match that.
> > > > > >
> > > > > >Maybe we could integrate with Solr/Solr Cloud instead, by feeding
> our
> > > > > cache
> > > > > >information into their storage for indexing and relying on their
> own
> > > > > >mechanisms for distributed IR sorting?
> > > > > >
> > > > > >Regards,
> > > > > >--
> > > > > >Ilya Kasnacheev
> > > > > >
> > > > > >
> > > > > >вт, 26 нояб. 2019 г. в 13:59, Zhenya Stanilovsky <
> > > > > arzamas...@mail.ru.invalid
> > > > > >>:
> > > > > >
> > > > > >>
> > > > > >> Ilya Kasnacheev, what a problem in Solr with Ignite
> functionality ?
> > > > > >>
> > > > > >> thanks !
> > > > > >>
> > > > > >> >Вторник, 26 ноября 2019, 13:50 +03:00 от Ilya Kasnacheev <
> > > > > >>  ilya.kasnach...@gmail.com >:
> > > > > >> >
> > > > > >> >Hello!
> > > > > >> >
> > > > > >> >I have a hunch that we are trying to build Apache Solr (or Solr
> > > Cloud)
> > > > > >> into
> > > > > >> >Apache Ignite. I think that's a lot of effort that is not very
> > > > > justified.
> > > > > >> >
> > > > > >> >I don't think we should try to implement sorting in Apache
> Ignite,
> > > > > because
> > > > > >> >it is a lot of work, and a lot of code in our code base which
> we
> > > don't
> > > > > >> >really want.
> > > > > >> >
> > > > > >> >Regards,
> > > > > >> >--
> > > > > >> >Ilya Kasnacheev
> > > > > >> >
> > > > > >> >
> > > > > >> >пт, 22 нояб. 2019 г. в 20:59, Yuriy Shuliga <
> shul...@gmail.com
> > > >:
> > > > > >> >
> > > > > >> >> Dear Igniters,
> > > > > >> >>
> > > > > >> >> The first part of TextQuery improvement - a result limit -
> was
> > > > > developed
> > > > > >> >> and merged.
> > > > > >> >> Now we have to develop most important functionality here -
> proper
> > > > > >> sorting
> > > > > >> >> of Lucene index response and correct reducing of them for
> > > distributed
> > > > > >> >> queries.
> > > > > >> >>
> > > > > >> >> *There are two Lucene based aspects*
> > > > > >> >>
> > > > > >> >> 1. In case of using no sorting fields, the documents in
> response
> > > are
> > > > > >> still
> > > > > >> >> ordered by relevance.
> > > > > >> >> Actually this is ScoreDoc.score value.
> > > > > >> >> In order to reduce the 

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-28 Thread Ivan Pavlukhin
Folks, Yuriy,

I suppose that we are going to proceed with

>>>
Reducing on Ignite

The obvious point of distributed response reduction is class
GridCacheDistributedQueryFuture.
Though, @Ivan Pavlukhin mentioned class with similar functionality:
ReduceIndexSorted
What I see here, that it is tangled with H2 related classes
(org.h2.result.Row) and might not be unified with TextQuery reduction.
>>

From my side, there is no strong opinion that we should unify
reduction. Having a separate reduction implementation for text queries
sounds like a reasonable option to me as well.

Are there still any open questions?

Wed, Nov 27, 2019 at 02:27, Denis Magda wrote:
>
> I don't see anything wrong if Yuriy is willing to carry on and keep
> enhancing our full-text search support that lacks basic capabilities.
>
> The basics should be available. If anybody needs an advanced feature they
> can introduce Solr or Elasticsearch into the final architecture of the app.
>
> Folks, who of us can help Yuriy with the questions asked? Most likely the SQL
> experts are the best candidates here.
>
>
> -
> Denis
>
>
> On Tue, Nov 26, 2019 at 8:52 AM Ivan Pavlukhin  wrote:
>
> > Folks,
> >
> > IEP is an Ignite-specific thing. In fact, I suppose that we are
> > already doing it in ASF way by having this dev-list discussion =)
> >
> > As for me, implementing "limit" feature for text queries is not so big
> > to make an IEP. But we might need to create one for next features.
> >
> > Tue, Nov 26, 2019 at 15:06, Ilya Kasnacheev wrote:
> > >
> > > Hello!
> > >
> > > ASF way should probably start with an IEP :)
> > >
> > > Regards,
> > > --
> > > Ilya Kasnacheev
> > >
> > >
> > > Tue, Nov 26, 2019 at 14:12, Zhenya Stanilovsky wrote:
> > >
> > > >
> > > > Ok, lets forgot Solr and go through ASF way, if Yuriy prove this
> > > > functionality is helpful and PR it, why not ?
> > > >
> > > > isn`t it ?
> > > >
> > > > >Вторник, 26 ноября 2019, 14:06 +03:00 от Ilya Kasnacheev <
> > > > ilya.kasnach...@gmail.com>:
> > > > >
> > > > >Hello!
> > > > >
> > > > >The problem here is that Solr is a multi-year effort by a lot of
> > people.
> > > > We
> > > > >can't match that.
> > > > >
> > > > >Maybe we could integrate with Solr/Solr Cloud instead, by feeding our
> > > > cache
> > > > >information into their storage for indexing and relying on their own
> > > > >mechanisms for distributed IR sorting?
> > > > >
> > > > >Regards,
> > > > >--
> > > > >Ilya Kasnacheev
> > > > >
> > > > >
> > > > >вт, 26 нояб. 2019 г. в 13:59, Zhenya Stanilovsky <
> > > > arzamas...@mail.ru.invalid
> > > > >>:
> > > > >
> > > > >>
> > > > >> Ilya Kasnacheev, what a problem in Solr with Ignite functionality ?
> > > > >>
> > > > >> thanks !
> > > > >>
> > > > >> >Вторник, 26 ноября 2019, 13:50 +03:00 от Ilya Kasnacheev <
> > > > >>  ilya.kasnach...@gmail.com >:
> > > > >> >
> > > > >> >Hello!
> > > > >> >
> > > > >> >I have a hunch that we are trying to build Apache Solr (or Solr
> > Cloud)
> > > > >> into
> > > > >> >Apache Ignite. I think that's a lot of effort that is not very
> > > > justified.
> > > > >> >
> > > > >> >I don't think we should try to implement sorting in Apache Ignite,
> > > > because
> > > > >> >it is a lot of work, and a lot of code in our code base which we
> > don't
> > > > >> >really want.
> > > > >> >
> > > > >> >Regards,
> > > > >> >--
> > > > >> >Ilya Kasnacheev
> > > > >> >
> > > > >> >
> > > > >> >пт, 22 нояб. 2019 г. в 20:59, Yuriy Shuliga <  shul...@gmail.com
> > >:
> > > > >> >
> > > > >> >> Dear Igniters,
> > > > >> >>
> > > > >> >> The first part of TextQuery improvement - a result limit - was
> > > > developed
> > > > >> >> and merged.
> > > > >> >> Now we have to develop most important functionality here - proper
> > > > >> sorting
> > > > >> >> of Lucene index response and correct reducing of them for
> > distributed
> > > > >> >> queries.
> > > > >> >>
> > > > >> >> *There are two Lucene based aspects*
> > > > >> >>
> > > > >> >> 1. In case of using no sorting fields, the documents in response
> > are
> > > > >> still
> > > > >> >> ordered by relevance.
> > > > >> >> Actually this is ScoreDoc.score value.
> > > > >> >> In order to reduce the distributed results correctly, the score
> > > > should
> > > > >> be
> > > > >> >> passed with response.
> > > > >> >>
> > > > >> >> 2. When sorting by conventional fields, then Lucene should have
> > these
> > > > >> >> fields properly indexed and
> > > > >> >> corresponding Sort object should be applied to Lucene's search
> > call.
> > > > >> >> In order to mark those fields a new annotation like '@SortField'
> > may
> > > > be
> > > > >> >> introduced.
> > > > >> >>
> > > > >> >> *Reducing on Ignite *
> > > > >> >>
> > > > >> >> The obvious point of distributed response reduction is class
> > > > >> >> GridCacheDistributedQueryFuture.
> > > > >> >> Though, @Ivan Pavlukhin mentioned class with similar
> > functionality:
> > > > >> >> ReduceIndexSorted
> > > > >> >> What I see here, that it is tangled with 

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Denis Magda
I don't see anything wrong if Yuriy is willing to carry on and keep
enhancing our full-text search support that lacks basic capabilities.

The basics should be available. If anybody needs an advanced feature they
can introduce Solr or Elasticsearch into the final architecture of the app.

Folks, who of us can help Yuriy with the questions asked? Most likely the SQL
experts are the best candidates here.


-
Denis


On Tue, Nov 26, 2019 at 8:52 AM Ivan Pavlukhin  wrote:

> Folks,
>
> IEP is an Ignite-specific thing. In fact, I suppose that we are
> already doing it in ASF way by having this dev-list discussion =)
>
> As for me, implementing "limit" feature for text queries is not so big
> to make an IEP. But we might need to create one for next features.
>
> Tue, Nov 26, 2019 at 15:06, Ilya Kasnacheev wrote:
> >
> > Hello!
> >
> > ASF way should probably start with an IEP :)
> >
> > Regards,
> > --
> > Ilya Kasnacheev
> >
> >
> > Tue, Nov 26, 2019 at 14:12, Zhenya Stanilovsky wrote:
> >
> > >
> > > Ok, lets forgot Solr and go through ASF way, if Yuriy prove this
> > > functionality is helpful and PR it, why not ?
> > >
> > > isn`t it ?
> > >
> > > >Вторник, 26 ноября 2019, 14:06 +03:00 от Ilya Kasnacheev <
> > > ilya.kasnach...@gmail.com>:
> > > >
> > > >Hello!
> > > >
> > > >The problem here is that Solr is a multi-year effort by a lot of
> people.
> > > We
> > > >can't match that.
> > > >
> > > >Maybe we could integrate with Solr/Solr Cloud instead, by feeding our
> > > cache
> > > >information into their storage for indexing and relying on their own
> > > >mechanisms for distributed IR sorting?
> > > >
> > > >Regards,
> > > >--
> > > >Ilya Kasnacheev
> > > >
> > > >
> > > >вт, 26 нояб. 2019 г. в 13:59, Zhenya Stanilovsky <
> > > arzamas...@mail.ru.invalid
> > > >>:
> > > >
> > > >>
> > > >> Ilya Kasnacheev, what a problem in Solr with Ignite functionality ?
> > > >>
> > > >> thanks !
> > > >>
> > > >> >Вторник, 26 ноября 2019, 13:50 +03:00 от Ilya Kasnacheev <
> > > >>  ilya.kasnach...@gmail.com >:
> > > >> >
> > > >> >Hello!
> > > >> >
> > > >> >I have a hunch that we are trying to build Apache Solr (or Solr
> Cloud)
> > > >> into
> > > >> >Apache Ignite. I think that's a lot of effort that is not very
> > > justified.
> > > >> >
> > > >> >I don't think we should try to implement sorting in Apache Ignite,
> > > because
> > > >> >it is a lot of work, and a lot of code in our code base which we
> don't
> > > >> >really want.
> > > >> >
> > > >> >Regards,
> > > >> >--
> > > >> >Ilya Kasnacheev
> > > >> >
> > > >> >
> > > >> >пт, 22 нояб. 2019 г. в 20:59, Yuriy Shuliga <  shul...@gmail.com
> >:
> > > >> >
> > > >> >> Dear Igniters,
> > > >> >>
> > > >> >> The first part of TextQuery improvement - a result limit - was
> > > developed
> > > >> >> and merged.
> > > >> >> Now we have to develop most important functionality here - proper
> > > >> sorting
> > > >> >> of Lucene index response and correct reducing of them for
> distributed
> > > >> >> queries.
> > > >> >>
> > > >> >> *There are two Lucene based aspects*
> > > >> >>
> > > >> >> 1. In case of using no sorting fields, the documents in response
> are
> > > >> still
> > > >> >> ordered by relevance.
> > > >> >> Actually this is ScoreDoc.score value.
> > > >> >> In order to reduce the distributed results correctly, the score
> > > should
> > > >> be
> > > >> >> passed with response.
> > > >> >>
> > > >> >> 2. When sorting by conventional fields, then Lucene should have
> these
> > > >> >> fields properly indexed and
> > > >> >> corresponding Sort object should be applied to Lucene's search
> call.
> > > >> >> In order to mark those fields a new annotation like '@SortField'
> may
> > > be
> > > >> >> introduced.
> > > >> >>
> > > >> >> *Reducing on Ignite *
> > > >> >>
> > > >> >> The obvious point of distributed response reduction is class
> > > >> >> GridCacheDistributedQueryFuture.
> > > >> >> Though, @Ivan Pavlukhin mentioned class with similar
> functionality:
> > > >> >> ReduceIndexSorted
> > > >> >> What I see here, that it is tangled with H2 related classes (
> > > >> >> org.h2.result.Row) and might not be unified with TextQuery
> reduction.
> > > >> >>
> > > >> >> Still need a support here.
> > > >> >>
> > > >> >> Overall, the goal of this letter is to initiate discussion on
> > > TextQuery
> > > >> >> Sorting implementation and come closer to ticket creation.
> > > >> >>
> > > >> >> BR,
> > > >> >> Yuriy Shuliha
> > > >> >>
> > > >> >> вт, 22 жовт. 2019 о 13:31 Andrey Mashenkov <
> > > andrey.mashen...@gmail.com
> > > >> >
> > > >> >> пише:
> > > >> >>
> > > >> >> > Hi Dmitry, Yuriy.
> > > >> >> >
> > > >> >> > I've found GridCacheQueryFutureAdapter has newly added
> > > AtomicInteger
> > > >> >> > 'total' field and 'limit; field as primitive int.
> > > >> >> >
> > > >> >> > Both fields are used inside synchronized block only.
> > > >> >> > So, we can make both private and downgrade AtomicInteger to
> > > primitive
> > > >> >> int.
> > > >> >> >
> 

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Ivan Pavlukhin
Folks,

IEP is an Ignite-specific thing. In fact, I suppose that we are
already doing it the ASF way by having this dev-list discussion =)

As for me, implementing the "limit" feature for text queries is not big
enough to warrant an IEP. But we might need to create one for the next
features.

Tue, Nov 26, 2019 at 15:06, Ilya Kasnacheev wrote:
>
> Hello!
>
> ASF way should probably start with an IEP :)
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Tue, Nov 26, 2019 at 14:12, Zhenya Stanilovsky wrote:
>
> >
> > Ok, lets forgot Solr and go through ASF way, if Yuriy prove this
> > functionality is helpful and PR it, why not ?
> >
> > isn`t it ?
> >
> > >Вторник, 26 ноября 2019, 14:06 +03:00 от Ilya Kasnacheev <
> > ilya.kasnach...@gmail.com>:
> > >
> > >Hello!
> > >
> > >The problem here is that Solr is a multi-year effort by a lot of people.
> > We
> > >can't match that.
> > >
> > >Maybe we could integrate with Solr/Solr Cloud instead, by feeding our
> > cache
> > >information into their storage for indexing and relying on their own
> > >mechanisms for distributed IR sorting?
> > >
> > >Regards,
> > >--
> > >Ilya Kasnacheev
> > >
> > >
> > >вт, 26 нояб. 2019 г. в 13:59, Zhenya Stanilovsky <
> > arzamas...@mail.ru.invalid
> > >>:
> > >
> > >>
> > >> Ilya Kasnacheev, what a problem in Solr with Ignite functionality ?
> > >>
> > >> thanks !
> > >>
> > >> >Вторник, 26 ноября 2019, 13:50 +03:00 от Ilya Kasnacheev <
> > >>  ilya.kasnach...@gmail.com >:
> > >> >
> > >> >Hello!
> > >> >
> > >> >I have a hunch that we are trying to build Apache Solr (or Solr Cloud)
> > >> into
> > >> >Apache Ignite. I think that's a lot of effort that is not very
> > justified.
> > >> >
> > >> >I don't think we should try to implement sorting in Apache Ignite,
> > because
> > >> >it is a lot of work, and a lot of code in our code base which we don't
> > >> >really want.
> > >> >
> > >> >Regards,
> > >> >--
> > >> >Ilya Kasnacheev
> > >> >
> > >> >
> > >> >пт, 22 нояб. 2019 г. в 20:59, Yuriy Shuliga <  shul...@gmail.com >:
> > >> >
> > >> >> Dear Igniters,
> > >> >>
> > >> >> The first part of TextQuery improvement - a result limit - was
> > developed
> > >> >> and merged.
> > >> >> Now we have to develop most important functionality here - proper
> > >> sorting
> > >> >> of Lucene index response and correct reducing of them for distributed
> > >> >> queries.
> > >> >>
> > >> >> *There are two Lucene based aspects*
> > >> >>
> > >> >> 1. In case of using no sorting fields, the documents in response are
> > >> still
> > >> >> ordered by relevance.
> > >> >> Actually this is ScoreDoc.score value.
> > >> >> In order to reduce the distributed results correctly, the score
> > should
> > >> be
> > >> >> passed with response.
> > >> >>
> > >> >> 2. When sorting by conventional fields, then Lucene should have these
> > >> >> fields properly indexed and
> > >> >> corresponding Sort object should be applied to Lucene's search call.
> > >> >> In order to mark those fields a new annotation like '@SortField' may
> > be
> > >> >> introduced.
> > >> >>
> > >> >> *Reducing on Ignite *
> > >> >>
> > >> >> The obvious point of distributed response reduction is class
> > >> >> GridCacheDistributedQueryFuture.
> > >> >> Though, @Ivan Pavlukhin mentioned class with similar functionality:
> > >> >> ReduceIndexSorted
> > >> >> What I see here, that it is tangled with H2 related classes (
> > >> >> org.h2.result.Row) and might not be unified with TextQuery reduction.
> > >> >>
> > >> >> Still need a support here.
> > >> >>
> > >> >> Overall, the goal of this letter is to initiate discussion on
> > TextQuery
> > >> >> Sorting implementation and come closer to ticket creation.
> > >> >>
> > >> >> BR,
> > >> >> Yuriy Shuliha
> > >> >>
> > >> >> вт, 22 жовт. 2019 о 13:31 Andrey Mashenkov <
> > andrey.mashen...@gmail.com
> > >> >
> > >> >> пише:
> > >> >>
> > >> >> > Hi Dmitry, Yuriy.
> > >> >> >
> > >> >> > I've found GridCacheQueryFutureAdapter has newly added
> > AtomicInteger
> > >> >> > 'total' field and 'limit; field as primitive int.
> > >> >> >
> > >> >> > Both fields are used inside synchronized block only.
> > >> >> > So, we can make both private and downgrade AtomicInteger to
> > primitive
> > >> >> int.
> > >> >> >
> > >> >> > Most likely, these fields can be replaced with one field.
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> > On Mon, Oct 21, 2019 at 10:01 PM Dmitriy Pavlov <
> > dpav...@apache.org
> > >> >
> > >> >> > wrote:
> > >> >> >
> > >> >> > > Hi Andrey,
> > >> >> > >
> > >> >> > > I've checked this ticket comments, and there is a TC Bot visa
> > (with
> > >> no
> > >> >> > > blockers).
> > >> >> > >
> > >> >> > > Do you have any concerns related to this patch?
> > >> >> > >
> > >> >> > > Sincerely,
> > >> >> > > Dmitriy Pavlov
> > >> >> > >
> > >> >> > > чт, 17 окт. 2019 г. в 16:43, Yuriy Shuliga <  shul...@gmail.com
> > >:
> > >> >> > >
> > >> >> > >> Andrey,
> > >> >> > >>
> > >> >> > >> Per you request, I created ticket
> > >> >> > >>  

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Ilya Kasnacheev
Hello!

The ASF way should probably start with an IEP :)

Regards,
-- 
Ilya Kasnacheev


Tue, Nov 26, 2019 at 14:12, Zhenya Stanilovsky wrote:

>
> OK, let's forget Solr and go the ASF way: if Yuriy proves this
> functionality is helpful and submits a PR for it, why not?
>
> Isn't that so?
>
> >Tuesday, November 26, 2019, 14:06 +03:00, from Ilya Kasnacheev <ilya.kasnach...@gmail.com>:
> >
> >Hello!
> >
> >The problem here is that Solr is a multi-year effort by a lot of people.
> We
> >can't match that.
> >
> >Maybe we could integrate with Solr/Solr Cloud instead, by feeding our
> cache
> >information into their storage for indexing and relying on their own
> >mechanisms for distributed IR sorting?
> >
> >Regards,
> >--
> >Ilya Kasnacheev
> >
> >
> >вт, 26 нояб. 2019 г. в 13:59, Zhenya Stanilovsky <
> arzamas...@mail.ru.invalid
> >>:
> >
> >>
> >> Ilya Kasnacheev, what a problem in Solr with Ignite functionality ?
> >>
> >> thanks !
> >>
> >> >Вторник, 26 ноября 2019, 13:50 +03:00 от Ilya Kasnacheev <
> >>  ilya.kasnach...@gmail.com >:
> >> >
> >> >Hello!
> >> >
> >> >I have a hunch that we are trying to build Apache Solr (or Solr Cloud)
> >> into
> >> >Apache Ignite. I think that's a lot of effort that is not very
> justified.
> >> >
> >> >I don't think we should try to implement sorting in Apache Ignite,
> because
> >> >it is a lot of work, and a lot of code in our code base which we don't
> >> >really want.
> >> >
> >> >Regards,
> >> >--
> >> >Ilya Kasnacheev
> >> >
> >> >
> >> >пт, 22 нояб. 2019 г. в 20:59, Yuriy Shuliga <  shul...@gmail.com >:
> >> >
> >> >> Dear Igniters,
> >> >>
> >> >> The first part of TextQuery improvement - a result limit - was
> developed
> >> >> and merged.
> >> >> Now we have to develop most important functionality here - proper
> >> sorting
> >> >> of Lucene index response and correct reducing of them for distributed
> >> >> queries.
> >> >>
> >> >> *There are two Lucene based aspects*
> >> >>
> >> >> 1. In case of using no sorting fields, the documents in response are
> >> still
> >> >> ordered by relevance.
> >> >> Actually this is ScoreDoc.score value.
> >> >> In order to reduce the distributed results correctly, the score
> should
> >> be
> >> >> passed with response.
> >> >>
> >> >> 2. When sorting by conventional fields, then Lucene should have these
> >> >> fields properly indexed and
> >> >> corresponding Sort object should be applied to Lucene's search call.
> >> >> In order to mark those fields a new annotation like '@SortField' may
> be
> >> >> introduced.
> >> >>
> >> >> *Reducing on Ignite *
> >> >>
> >> >> The obvious point of distributed response reduction is class
> >> >> GridCacheDistributedQueryFuture.
> >> >> Though, @Ivan Pavlukhin mentioned class with similar functionality:
> >> >> ReduceIndexSorted
> >> >> What I see here, that it is tangled with H2 related classes (
> >> >> org.h2.result.Row) and might not be unified with TextQuery reduction.
> >> >>
> >> >> Still need a support here.
> >> >>
> >> >> Overall, the goal of this letter is to initiate discussion on
> TextQuery
> >> >> Sorting implementation and come closer to ticket creation.
> >> >>
> >> >> BR,
> >> >> Yuriy Shuliha
> >> >>
> >> >> вт, 22 жовт. 2019 о 13:31 Andrey Mashenkov <
> andrey.mashen...@gmail.com
> >> >
> >> >> пише:
> >> >>
> >> >> > Hi Dmitry, Yuriy.
> >> >> >
> >> >> > I've found GridCacheQueryFutureAdapter has newly added
> AtomicInteger
> >> >> > 'total' field and 'limit; field as primitive int.
> >> >> >
> >> >> > Both fields are used inside synchronized block only.
> >> >> > So, we can make both private and downgrade AtomicInteger to
> primitive
> >> >> int.
> >> >> >
> >> >> > Most likely, these fields can be replaced with one field.
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Mon, Oct 21, 2019 at 10:01 PM Dmitriy Pavlov <
> dpav...@apache.org
> >> >
> >> >> > wrote:
> >> >> >
> >> >> > > Hi Andrey,
> >> >> > >
> >> >> > > I've checked this ticket comments, and there is a TC Bot visa
> (with
> >> no
> >> >> > > blockers).
> >> >> > >
> >> >> > > Do you have any concerns related to this patch?
> >> >> > >
> >> >> > > Sincerely,
> >> >> > > Dmitriy Pavlov
> >> >> > >
> >> >> > > чт, 17 окт. 2019 г. в 16:43, Yuriy Shuliga <  shul...@gmail.com
> >:
> >> >> > >
> >> >> > >> Andrey,
> >> >> > >>
> >> >> > >> Per you request, I created ticket
> >> >> > >>  https://issues.apache.org/jira/browse/IGNITE-12291 linked to
> >> >> > >>
> >>  https://issues.apache.org/jira/projects/IGNITE/issues/IGNITE-12189
> >> >> > >>
> >> >> > >> Could you please proceed with PR merge ?
> >> >> > >>
> >> >> > >> BR,
> >> >> > >> Yuriy Shuliha
> >> >> > >>
> >> >> > >> ср, 9 жовт. 2019 о 12:52 Andrey Mashenkov <
> >>  andrey.mashen...@gmail.com
> >> >> >
> >> >> > >> пише:
> >> >> > >>
> >> >> > >> > Hi Yuri,
> >> >> > >> >
> >> >> > >> > To get access to TC Bot you should register as TeamCity user
> >> [1], if
> >> >> > you
> >> >> > >> > didn't do this already.
> >> >> > >> > Then you will be able to authorize on Ignite TC 

Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Zhenya Stanilovsky

OK, let's forget Solr and go the ASF way: if Yuriy proves this functionality
is helpful and submits a PR for it, why not?

Isn't that so?
  
>Tuesday, November 26, 2019, 14:06 +03:00, from Ilya Kasnacheev:
> 
>Hello!
>
>The problem here is that Solr is a multi-year effort by a lot of people. We
>can't match that.
>
>Maybe we could integrate with Solr/Solr Cloud instead, by feeding our cache
>information into their storage for indexing and relying on their own
>mechanisms for distributed IR sorting?
>
>Regards,
>--
>Ilya Kasnacheev
>
>
>Tue, Nov 26, 2019 at 13:59, Zhenya Stanilovsky <arzamas...@mail.ru.invalid>:
>
>>
>> Ilya Kasnacheev, what a problem in Solr with Ignite functionality ?
>>
>> thanks !
>>
>> >Tuesday, November 26, 2019, 13:50 +03:00, from Ilya Kasnacheev <ilya.kasnach...@gmail.com>:
>> >
>> >Hello!
>> >
>> >I have a hunch that we are trying to build Apache Solr (or Solr Cloud)
>> into
>> >Apache Ignite. I think that's a lot of effort that is not very justified.
>> >
>> >I don't think we should try to implement sorting in Apache Ignite, because
>> >it is a lot of work, and a lot of code in our code base which we don't
>> >really want.
>> >
>> >Regards,
>> >--
>> >Ilya Kasnacheev
>> >
>> >
>> >Fri, Nov 22, 2019 at 20:59, Yuriy Shuliga <shul...@gmail.com>:
>> >
>> >> Dear Igniters,
>> >>
>> >> The first part of TextQuery improvement - a result limit - was developed
>> >> and merged.
>> >> Now we have to develop most important functionality here - proper sorting
>> >> of Lucene index response and correct reducing of them for distributed
>> >> queries.
>> >>
>> >> *There are two Lucene based aspects*
>> >>
>> >> 1. In case of using no sorting fields, the documents in response are still
>> >> ordered by relevance.
>> >> Actually this is ScoreDoc.score value.
>> >> In order to reduce the distributed results correctly, the score should be
>> >> passed with response.
>> >>
>> >> 2. When sorting by conventional fields, then Lucene should have these
>> >> fields properly indexed and
>> >> corresponding Sort object should be applied to Lucene's search call.
>> >> In order to mark those fields a new annotation like '@SortField' may be
>> >> introduced.
>> >>
>> >> *Reducing on Ignite *
>> >>
>> >> The obvious point of distributed response reduction is class
>> >> GridCacheDistributedQueryFuture.
>> >> Though, @Ivan Pavlukhin mentioned class with similar functionality:
>> >> ReduceIndexSorted
>> >> What I see here, that it is tangled with H2 related classes (
>> >> org.h2.result.Row) and might not be unified with TextQuery reduction.
>> >>
>> >> Still need a support here.
>> >>
>> >> Overall, the goal of this letter is to initiate discussion on TextQuery
>> >> Sorting implementation and come closer to ticket creation.
>> >>
>> >> BR,
>> >> Yuriy Shuliha
>> >>
>> >> Tue, Oct 22, 2019 at 13:31, Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:
>> >>
>> >> > Hi Dmitry, Yuriy.
>> >> >
>> >> > I've found GridCacheQueryFutureAdapter has a newly added AtomicInteger
>> >> > 'total' field and a 'limit' field as primitive int.
>> >> >
>> >> > Both fields are used inside synchronized block only.
>> >> > So, we can make both private and downgrade AtomicInteger to primitive
>> >> int.
>> >> >
>> >> > Most likely, these fields can be replaced with one field.
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Oct 21, 2019 at 10:01 PM Dmitriy Pavlov <dpav...@apache.org> wrote:
>> >> >
>> >> > > Hi Andrey,
>> >> > >
>> >> > > I've checked this ticket comments, and there is a TC Bot visa (with
>> no
>> >> > > blockers).
>> >> > >
>> >> > > Do you have any concerns related to this patch?
>> >> > >
>> >> > > Sincerely,
>> >> > > Dmitriy Pavlov
>> >> > >
>> >> > > Thu, Oct 17, 2019 at 16:43, Yuriy Shuliga <shul...@gmail.com>:
>> >> > >
>> >> > >> Andrey,
>> >> > >>
>> >> > >> Per your request, I created ticket
>> >> > >> https://issues.apache.org/jira/browse/IGNITE-12291 linked to
>> >> > >> https://issues.apache.org/jira/projects/IGNITE/issues/IGNITE-12189
>> >> > >>
>> >> > >> Could you please proceed with PR merge ?
>> >> > >>
>> >> > >> BR,
>> >> > >> Yuriy Shuliha
>> >> > >>
>> >> > >> Wed, Oct 9, 2019 at 12:52, Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:
>> >> > >>
>> >> > >> > Hi Yuri,
>> >> > >> >
>> >> > >> > To get access to TC Bot you should register as TeamCity user
>> [1], if
>> >> > you
>> >> > >> > didn't do this already.
>> >> > >> > Then you will be able to authorize on Ignite TC Bot page with
>> same
>> >> > >> > credentials.
>> >> > >> >
>> >> > >> > [1]  https://ci.ignite.apache.org/registerUser.html
>> >> > >> >
>> >> > >> > On Fri, Oct 4, 2019 at 3:10 PM Yuriy Shuliga <shul...@gmail.com> wrote:
>> >> > >> >
>> >> > >> >> Andrew,
>> >> > >> >>
>> >> > >> >> I have corrected PR according to your notes. Please review.
>> >> > >> >> What will be the next steps in order to merge in?
>> >> > >> >>
>> >> > >> >> Y.
>> >> > >> >>
>> >> > >> >>