Re: Fulltext matching

Ilya Kasnacheev Thu, 06 Sep 2018 08:09:21 -0700

Hello!

Unfortunately, full-text search doesn't seem to have much traction, so I
recommend investigating on your side, possibly creating JIRA issues in the
process.


Regards,
-- 
Ilya Kasnacheev


On Mon, 3 Sep 2018 at 22:34, Courtney Robinson <[email protected]> wrote:

> Hi,
>
> We've got Ignite in production and decided to start using some fulltext
> matching as well.
> I've investigated and can't figure out why my queries are not matching.
>
> I construct a query entity, e.g. new QueryEntity(keyClass, valueClass), and
> in debug I can see it generates a list of fields,
> e.g. a, b, c.a, c.b.
> I then expected to be able to match on those fields that are marked as
> indexed. Everything is annotation driven. The appropriate fields have been
> annotated and appear to be detected as such
> when I inspect what gets put into the QueryEntityDescriptor, i.e. all
> expected indices and indexed fields are present.
>
> In GridLuceneIndex I see that the Lucene document generated has fields a, b
> (c.a and c.b are not included). Now a couple of questions arise:
>
> 1. Is there a way to get Ignite to index the nested fields as well so that
> c.a and c.b end up in the doc?
>
> 2. If you use a composite object as a key, its fields are extracted into
> the top level, so if you have Key.a and Value.a you cannot index both, since
> Key.a becomes a, which collides with Value.a. Can this be changed, or are
> there any known reasons why it couldn't be? (I'm happy to send a PR
> doing so, but I suspect the answer to this is linked to the answer to the
> first question.)
>
> 3. The docs simply say you can use Lucene syntax. I presume that means the
> syntax described at
> https://lucene.apache.org/core/2_9_4/queryparsersyntax.html is all valid.
> Checking the code, that appears to be the case, as it uses
> a MultiFieldQueryParser in GridLuceneIndex. However, when I try to run a
> query such as a:<my-text>, none of the indexed documents match. In debug
> mode I've enabled parser.setAllowLeadingWildcard(true); and if I do a
> simple searcher.search * I get back the list of expected documents.
>
> What's even more odd is that I tried querying each of the 6 indexed fields
> found in idxdFields in GridLuceneIndex and only 1 of them matches. For the
> others, the values are typed exactly, and wildcards or other free-text
> forms do not match either.
>
> 4. I couldn't see a way to provide a custom GridLuceneIndex. I found the
> two places where it's constructed in the code base and it doesn't look like
> I can inject instances. Is it OK to construct and use a custom
> GridLuceneDirectory/IndexWriter/Searcher and so on in the same way
> GridLuceneIndex does, so I can write a custom IndexingSpi to change how
> indexing happens?
> There are a number of things I'd like to customise, and from looking at the
> current impl. these things aren't injectable; I guess it's not considered a
> prime use case.
>
> Yeah, the analyzer and a number of other things would be handy to change.
> Ideally I'd also want to customise how a field is indexed, e.g. to be able
> to do term matches with Lucene queries.
>
> Looking at this impl as well, it passes Integer.MAX_VALUE and pulls back
> all matches. That'll surely kill our nodes for some of the use cases we're
> considering.
> I'd also like to implement paging. The searcher API has a nice option to
> pass in the last doc so it can continue from there, which could be used to
> implement something like deep paging.
>
> 5. If I were to write a custom IndexingSpi to make all of this happen, how
> do I get additional parameters through so that I could have paging params
> passed?
>
> Ideally I could customise the indexing, searching and paging through
> standard Ignite means, but I can't find any way of doing that in the
> current code. Short of writing a custom IndexingSpi, I think I've gone as
> far as I can with debugging and could do with a few pointers on how to go
> about this.
>
> FYI, SQL isn't a great option for this part of the product. We're
> generating and compiling Java classes at runtime, and generating SQL to do
> the queries is an order of magnitude more work than indexing the relatively
> few fields we need and then searching. However, paging would be an issue
> right off the bat, as there can be several million matches to a query. We
> can't have Ignite pulling all of those into memory.
>
> Thanks in advance
>
> Courtney
>
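
The paging idea in the quoted message (continuing a search from the last
document already returned, as Lucene's IndexSearcher.searchAfter does) can be
illustrated with a small, library-free sketch. Everything below is a
hypothetical illustration, not Ignite or Lucene code: the names
SearchAfterSketch, Hit, RANKING and page are made up for this example, and an
in-memory list stands in for the index. It assumes hits are ranked by
descending score with ties broken by ascending doc id, and it keeps at most
pageSize hits in memory per request rather than collecting every match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of "search after" paging; not Ignite or Lucene code.
class SearchAfterSketch {
    /** Stand-in for a scored hit: a document id plus a relevance score. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Assumed ranking: higher score first, ties broken by lower doc id. */
    static final Comparator<Hit> RANKING = (x, y) -> {
        int byScore = Float.compare(y.score, x.score);
        return byScore != 0 ? byScore : Integer.compare(x.doc, y.doc);
    };

    /**
     * Returns the next pageSize hits that rank strictly after the cursor.
     * Pass after == null to get the first page.
     */
    static List<Hit> page(Iterable<Hit> hits, Hit after, int pageSize) {
        // Heap ordered worst-first, so the weakest retained hit is evicted cheaply.
        PriorityQueue<Hit> best = new PriorityQueue<>(RANKING.reversed());
        for (Hit h : hits) {
            if (after != null && RANKING.compare(h, after) <= 0)
                continue; // ranks at or before the cursor: served on an earlier page
            best.add(h);
            if (best.size() > pageSize)
                best.poll(); // never hold more than pageSize hits
        }
        List<Hit> result = new ArrayList<>(best);
        result.sort(RANKING);
        return result;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
            new Hit(1, 0.9f), new Hit(2, 0.5f), new Hit(3, 0.9f),
            new Hit(4, 0.7f), new Hit(5, 0.1f));

        Hit cursor = null;
        List<Hit> current;
        // Walk all pages, two hits at a time, carrying only the last hit as a cursor.
        while (!(current = page(hits, cursor, 2)).isEmpty()) {
            for (Hit h : current)
                System.out.println("doc=" + h.doc + " score=" + h.score);
            cursor = current.get(current.size() - 1);
        }
    }
}
```

With the sample data, the pages come back as docs 1 and 3, then 4 and 2, then
5. The caller only has to carry the last hit of the previous page between
requests, which is the kind of extra parameter a custom paging API would need
to pass through.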
