Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-03-14 Thread Ivan Pavlukhin
Yuriy, > Let me summarize the approaches: I agree with your reasoning; point 2 sounds like the best one to me as well. I will look into the merge-sort strategy some time later. Best regards, Ivan Pavlukhin Fri, 13 Mar 2020 at 19:23, Yuriy Shuliga : > > Ivan, > > I have made changes in the fork that

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-03-13 Thread Yuriy Shuliga
Ivan, I have made changes in the fork that reflect the merge-sort strategy. The query future iterator now unblocks as soon as the first pages have been delivered from all nodes; it then waits for the next portion of pages, and so on. https://github.com/shuliga/ignite/commit/c84f04c18f67e99ab7bc0a7893b75f1dc83a76bd
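
The merge step described in this message can be pictured with a minimal sketch. It is not the code from the linked commit: the class `ScoredMergeSketch`, the type `Scored`, and the method `mergeTop` are hypothetical names, and paging is simplified to in-memory lists that each node has already sorted by descending score.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a k-way merge over per-node results; not Ignite code. */
final class ScoredMergeSketch {
    /** A cache key paired with its relevance score (hypothetical type). */
    static final class Scored {
        final String key;
        final float score;

        Scored(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    /** Cursor over one node's stream; 'head' is the best item not yet emitted. */
    private static final class Cursor {
        Scored head;
        final Iterator<Scored> rest;

        Cursor(Iterator<Scored> it) {
            rest = it;
            head = it.next();
        }
    }

    /**
     * Merges per-node lists, each already sorted by descending score,
     * and returns the keys of the top 'limit' items overall.
     */
    static List<String> mergeTop(List<List<Scored>> perNode, int limit) {
        // Max-heap ordered by the score of each cursor's current head.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            Comparator.comparingDouble((Cursor c) -> c.head.score).reversed());

        for (List<Scored> node : perNode) {
            Iterator<Scored> it = node.iterator();
            if (it.hasNext())
                heap.add(new Cursor(it));
        }

        List<String> out = new ArrayList<>();
        while (!heap.isEmpty() && out.size() < limit) {
            Cursor c = heap.poll();
            out.add(c.head.key);
            // Advance the cursor and re-insert it if its node has more items.
            if (c.rest.hasNext()) {
                c.head = c.rest.next();
                heap.add(c);
            }
        }
        return out;
    }
}
```

The merge can only emit an item once it holds a head from every node that still has data, which is why an iterator built this way would have to wait for the first page from each node before returning anything.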

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-03-10 Thread Ivan Pavlukhin
Igniters, The discussion unintentionally continued outside of the dev list. I am bringing it back. You can find it below. Do not hesitate to join if you have some thoughts on the questions raised. Maybe you have ideas on how to enrich text query results with score/rank information. Tue, 10 Mar 2020

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-01-23 Thread Yuriy Shuliga
Hi Ivan, Actually, I have engaged another developer to help bring TextQueries to a correctly working state. For now we have a solution that adds ordering functionality to distributed TextQueries. This is developed and tested locally. I can share details here, then we can discuss and decide whether to

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2020-01-20 Thread Ivan Pavlukhin
Hi Yuriy, I would just like to understand the current state. Are you still working on Ignite text queries? If not, are you going to continue with it? Fri, 13 Dec 2019 at 11:52, Ivan Pavlukhin : > > Yuriy, > > Sure, I will be glad to help. > > > - incorrect nodes/partition selection during querying? >

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-13 Thread Ivan Pavlukhin
Yuriy, Sure, I will be glad to help. > - incorrect nodes/partition selection during querying? Apparently this is the problem. If you find it really complicated to understand and debug, I can dig deeper and share my view of how the problem can be fixed. Wed, 11 Dec 2019 at 18:46, Yuriy

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-11 Thread Yuriy Shuliga
I will look into the MOVING partition issue, but I also need guidance there. Ivan, would you mind being that person? The question is whether we have an issue with: - wrong storing targets during indexing OR - incorrect nodes/partition selection during querying? BR, Yuriy Shluiha -- Sent

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-10 Thread Ilya Kasnacheev
Hello! Yes, I guess you are right :( I can surely fix the range issue; it's just that it was so broken that I could not figure out the correct behavior for this case. Regards, -- Ilya Kasnacheev Mon, 2 Dec 2019 at 15:01, Ivan Pavlukhin : > Ilya, > > I checked your test on a revision before

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-03 Thread Ivan Pavlukhin
*on topologies Tue, 3 Dec 2019 at 17:15, Ivan Pavlukhin : > > Ilya, Yuriy, > > It seems that text queries can return incorrect results on tologies > with MOVING partitions. I left a comment in JIRA [1]. > > [1] https://issues.apache.org/jira/browse/IGNITE-12401 > > Mon, 2 Dec 2019 at 15:00,

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-03 Thread Ivan Pavlukhin
Ilya, Yuriy, It seems that text queries can return incorrect results on tologies with MOVING partitions. I left a comment in JIRA [1]. [1] https://issues.apache.org/jira/browse/IGNITE-12401 Mon, 2 Dec 2019 at 15:00, Ivan Pavlukhin : > > Ilya, > > I checked your test on a revision before

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-02 Thread Ivan Pavlukhin
Ilya, I checked your test on a revision before "limit" and it fails there as well. Could you please double check? Mon, 2 Dec 2019 at 13:21, Ilya Kasnacheev : > > Hello! > > The problem is NOT specific to range queries. Range queries were broken > previously and they are broken now, but now

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-02 Thread Ilya Kasnacheev
Hello! The problem is NOT specific to range queries. Range queries were broken previously and they are broken now, but now even a simple "token in field with limit" returns duplicates. Before limits were introduced, any tested use cases were unaffected by duplicates, but now they are. Regards,

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-02 Thread Ivan Pavlukhin
And is the problem specific to range queries or not? Mon, 2 Dec 2019 at 11:12, Ivan Pavlukhin : > > Yuriy, > > Thank you for investigating the problem [1]. Still cannot realize how > the problem relates to introduced "limit"? Is it right that there were > no duplicates before "limit" support?

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-12-02 Thread Ivan Pavlukhin
Yuriy, Thank you for investigating the problem [1]. I still cannot see how the problem relates to the newly introduced "limit". Is it right that there were no duplicates before "limit" support? Now that the support has been introduced, do only limited queries contain duplicates, or unlimited ones, or both? [1]

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-28 Thread Ilya Kasnacheev
Hello! I have just found what I consider a major regression in Text Queries: it seems to me that text queries with limits will return the same key-value entries multiple times. Please check the issue https://issues.apache.org/jira/browse/IGNITE-12401 and corresponding build

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-28 Thread Yuriy Shuliga
Nice to hear, Ivan. It's good practice for an extension of existing functionality to be properly presented, as we expect from Text Queries. Let's make it work correctly first. I'm OK to prepare a ticket for adding reduction of sorted responses to GridCacheDistributedQueryFuture or nearby. Also

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-28 Thread Ivan Pavlukhin
Folks, Yuriy, I suppose that we are going to proceed with >>> Reducing on Ignite The obvious point of distributed response reduction is class GridCacheDistributedQueryFuture. Though, @Ivan Pavlukhin mentioned class with similar functionality: ReduceIndexSorted What I see here, that it is

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Denis Magda
I don't see anything wrong if Yuriy is willing to carry on and keep enhancing our full-text search support that lacks basic capabilities. The basics should be available. If anybody needs an advanced feature they can introduce Solr or Elasticsearch into the final architecture of the app. Folks,

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Ivan Pavlukhin
Folks, IEP is an Ignite-specific thing. In fact, I suppose that we are already doing it the ASF way by having this dev-list discussion =) As for me, implementing the "limit" feature for text queries is not big enough to warrant an IEP. But we might need to create one for the next features. Tue, 26 Nov 2019

Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Ilya Kasnacheev
Hello! The ASF way should probably start with an IEP :) Regards, -- Ilya Kasnacheev Tue, 26 Nov 2019 at 14:12, Zhenya Stanilovsky : > > Ok, lets forgot Solr and go through ASF way, if Yuriy prove this > functionality is helpful and PR it, why not ? > > isn`t it ? > > >Tuesday, 26 November 2019,

Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Zhenya Stanilovsky
OK, let's forget Solr and go the ASF way. If Yuriy proves this functionality is helpful and submits a PR for it, why not? Isn't it so? >Tuesday, 26 November 2019, 14:06 +03:00 from Ilya Kasnacheev >: > >Hello! > >The problem here is that Solr is a multi-year effort by a lot of people. We >can't match

Re: Re[2]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Ilya Kasnacheev
Hello! The problem here is that Solr is a multi-year effort by a lot of people. We can't match that. Maybe we could integrate with Solr/Solr Cloud instead, by feeding our cache information into their storage for indexing and relying on their own mechanisms for distributed IR sorting? Regards,

Re[2]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Zhenya Stanilovsky
Ilya Kasnacheev, what is the problem with Solr-like functionality in Ignite? Thanks! >Tuesday, 26 November 2019, 13:50 +03:00 from Ilya Kasnacheev >: > >Hello! > >I have a hunch that we are trying to build Apache Solr (or Solr Cloud) into >Apache Ignite. I think that's a lot of effort that is not very

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-26 Thread Ilya Kasnacheev
Hello! I have a hunch that we are trying to build Apache Solr (or Solr Cloud) into Apache Ignite. I think that's a lot of effort that is not very justified. I don't think we should try to implement sorting in Apache Ignite, because it is a lot of work, and a lot of code in our code base which we

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-11-22 Thread Yuriy Shuliga
Dear Igniters, The first part of the TextQuery improvement, a result limit, was developed and merged. Now we have to develop the most important functionality here: proper sorting of Lucene index responses and correct reduction of them for distributed queries. *There are two Lucene based aspects* 1. In

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-22 Thread Andrey Mashenkov
Hi Dmitry, Yuriy. I've found that GridCacheQueryFutureAdapter has a newly added AtomicInteger 'total' field and a 'limit' field of primitive int type. Both fields are used inside a synchronized block only. So, we can make both private and downgrade the AtomicInteger to a primitive int. Most likely, these fields can
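
The point about AtomicInteger can be illustrated with a minimal sketch. This is not the code of GridCacheQueryFutureAdapter: the class `LimitedCollectorSketch` and its methods are hypothetical, and the convention that a non-positive limit means "no limit" is an assumption made for the example. When every access to a counter already happens under the same monitor, a plain int is enough.

```java
/** Hypothetical sketch: a counter guarded by 'synchronized' needs no AtomicInteger. */
final class LimitedCollectorSketch {
    /** Maximum number of items to accept; zero or less means no limit (assumption). */
    private final int limit;

    /** Number of items accepted so far; guarded by 'this'. */
    private int total;

    LimitedCollectorSketch(int limit) {
        this.limit = limit;
    }

    /** Returns how many items of an incoming page are accepted under the limit. */
    synchronized int accept(int pageSize) {
        if (limit <= 0) {
            total += pageSize;
            return pageSize;
        }
        int accepted = Math.min(pageSize, limit - total);
        total += accepted;
        return accepted;
    }

    /** Reads the counter under the same monitor, so the value is visible across threads. */
    synchronized int total() {
        return total;
    }
}
```

The monitor already provides both mutual exclusion and visibility here, so an atomic type would add cost without adding safety, as long as no access happens outside the synchronized methods.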

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-21 Thread Dmitriy Pavlov
Hi Andrey, I've checked this ticket's comments, and there is a TC Bot visa (with no blockers). Do you have any concerns related to this patch? Sincerely, Dmitriy Pavlov Thu, 17 Oct 2019 at 16:43, Yuriy Shuliga : > Andrey, > > Per you request, I created ticket >

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-17 Thread Yuriy Shuliga
Andrey, Per your request, I created ticket https://issues.apache.org/jira/browse/IGNITE-12291 linked to https://issues.apache.org/jira/projects/IGNITE/issues/IGNITE-12189 Could you please proceed with the PR merge? BR, Yuriy Shuliha Wed, 9 Oct 2019 at 12:52, Andrey Mashenkov wrote: > Hi Yuri,

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-04 Thread Andrey Mashenkov
Yuriy, Just FYI, we have a review checklist [1] and coding guidelines [2]. To test a PR, one can use TeamCity [3] or the TeamCityBot project [4]. The latter (using TCBot) makes test validation much easier and spares you from dealing with flaky tests. Long story short, you can trigger tests for the PR from

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-04 Thread Yuriy Shuliga
Andrew, I have corrected the PR according to your notes. Please review. What will be the next steps in order to merge it in? Y. Thu, 3 Oct 2019 at 17:47, Andrey Mashenkov wrote: > Yuri, > > I've done with review. > No crime found, but trivial compatibility bug. > > On Thu, Oct 3, 2019 at 3:54 PM Yuriy

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-04 Thread Ivan Pavlukhin
Yuriy, Thank you, fine with it. Fri, 4 Oct 2019 at 11:01, Yuriy Shuliga : > > Ivan, > > Yes, your observation is correct. > > This behavior lasts from the very beginning when Lucene indexing was > implemented for distributed queries. > Implementation of the *limit* solves the problem of

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-04 Thread Yuriy Shuliga
Ivan, Yes, your observation is correct. This behavior has existed from the very beginning, when Lucene indexing was implemented for distributed queries. Implementation of the *limit* solves the problem of redundant response size. Without it *ALL* of the records are fetched each time; that is not good,

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-04 Thread Ivan Pavlukhin
Yuriy, Am I getting it right that in your PR, if we have a limit N, then the returned items (at most N) will not be strictly the most relevant ones? E.g. if one node returned N items faster than others but with not so good relevance? Thu, 3 Oct 2019 at 17:47, Andrey Mashenkov : > > Yuri, > > I've
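
The concern raised in this question can be shown with a small sketch. It does not reproduce the PR: the class `LimitRelevanceSketch` and both methods are hypothetical, and each node's page is reduced to a list of relevance scores. Taking the first N items in arrival order can differ from taking the N most relevant items overall.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch comparing "first N arrived" with "global top N by score". */
final class LimitRelevanceSketch {
    /** Takes items in page arrival order until 'n' items are collected. */
    static List<Float> firstArrived(List<List<Float>> pagesInArrivalOrder, int n) {
        List<Float> out = new ArrayList<>();
        for (List<Float> page : pagesInArrivalOrder) {
            for (Float score : page) {
                if (out.size() == n)
                    return out;
                out.add(score);
            }
        }
        return out;
    }

    /** Collects all items from all pages and keeps the 'n' highest scores. */
    static List<Float> globalTop(List<List<Float>> pages, int n) {
        List<Float> all = new ArrayList<>();
        for (List<Float> page : pages)
            all.addAll(page);
        all.sort(Collections.reverseOrder());
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }
}
```

For example, if a fast node delivers scores 0.3 and 0.2 before a slower node delivers 0.9 and 0.8, a limit of 2 applied in arrival order keeps the two weaker matches, while a global reduction keeps 0.9 and 0.8.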

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-03 Thread Andrey Mashenkov
Yuri, I'm done with the review. No crime found, only a trivial compatibility bug. On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga wrote: > Denis, > > Thank you for your attention to this. > as for now, the https://issues.apache.org/jira/browse/IGNITE-12189 ticket > is still pending review. > Do we have

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-03 Thread Yuriy Shuliga
Denis, Thank you for your attention to this. As of now, the https://issues.apache.org/jira/browse/IGNITE-12189 ticket is still pending review. Do we have a chance to move it forward somehow? BR, Yuriy Shuliha Mon, 30 Sep 2019 at 23:35, Denis Magda wrote: > Yuriy, > > I've seen you opening a

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-10-03 Thread Yuriy Shuliga
Ivan, Regarding your question about the Lucene search response. *IndexSearcher.search()* always returns a result sorted at least by *score* (*relevance*) or by a defined *Sort*, which includes ordering fields and rules. This means that even now the *GridLuceneIndex* result will be incorrect in case of

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-09-30 Thread Denis Magda
Yuriy, I've seen that you opened a pull request with the first changes: https://issues.apache.org/jira/browse/IGNITE-12189 Alex Scherbakov and Ivan, are you the right guys to do the review? - Denis On Fri, Sep 27, 2019 at 8:48 AM Павлухин Иван wrote: > Yuriy, > > Thank you for providing details!

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-09-27 Thread Павлухин Иван
Yuriy, Thank you for providing details! Quite interesting. Yes, we already have support for distributed limit and merging of sorted subresults for SQL queries. E.g. ReduceIndexSorted and MergeStreamIterator are used for merging sorted streams. Could you please also clarify about score/relevance? Is

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-09-25 Thread Yuriy Shuliga
Ivan, Thank you for the interesting question! Text searches (or full text searches) are mostly human-oriented. And the point of the user's interest is the topmost part of the response. Then the user can read it, evaluate and use the given records for further purposes. Particularly in our case, we use Ignite for

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-09-19 Thread Павлухин Иван
Yuriy, Greatly appreciate your interest. Could you please elaborate a little bit about sorting? What tasks does it help to solve and how? It would be great to provide an example. Wed, 18 Sep 2019 at 09:39, Alexei Scherbakov : > > Denis, > > I like the idea of throwing an exception for

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-09-18 Thread Alexei Scherbakov
Denis, I like the idea of throwing an exception for enabled text queries on persistent caches. Also I'm fine with the proposed limit for unsorted searches. Yury, please proceed with ticket creation. Tue, 17 Sep 2019, 22:06 Denis Magda : > Igniters, > > I see nothing wrong with Yury's proposal

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-09-17 Thread Denis Magda
Igniters, I see nothing wrong with Yury's proposal regarding full-text search API evolution as long as Yury is ready to push it forward. As for the in-memory mode only, it makes total sense for in-memory data grid deployments when Ignite caches data of an underlying DB like Postgres. As part of

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-09-17 Thread Yuriy Shuliga
Hello to all again, Thank you for the important comments and notes given below! Let me answer and continue the discussion. (I) Overall needs in Lucene indexing Alexei has referred to https://issues.apache.org/jira/browse/IGNITE-5371 where absence of index persistence was declared as an obstacle

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-08-30 Thread Alexei Scherbakov
Yuriy, Note that one of the major blockers for text queries is [1], which makes Lucene indexes unusable with persistence and is the main reason for discontinuation. Probably it should be addressed first to make text queries a valid product feature. Distributed sorting and advanced querying is indeed not a

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-08-29 Thread Denis Magda
Yuriy, If you are ready to take over the full-text search indexes then please go ahead. The primary reasons why the community wants to discontinue them first (and, probably, resurrect later) are the limitations listed by Andrey and minimal support from the community end. - Denis On Thu, Aug 29,

Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-08-29 Thread Andrey Mashenkov
Hi Yuriy, Unfortunately, there is a plan to discontinue TextQueries in Ignite [1]. The motivation here is that text indexes are not persistent, not transactional and can't be used together with SQL or inside SQL, and there is a lack of interest from the community side. You are welcome to take on these issues

Text queries/indexes (GridLuceneIndex, @QueryTextFiled)

2019-08-29 Thread Yuriy Shuliga
Dear community, By starting this thread I'd like to open a discussion that would lead to contributions in the subject area. Ignite has indexing capabilities, backed up by different mechanisms, including Lucene. Currently, Lucene 7.5.0 is used (released last year). This is a widespread and mature