Hello!

Unfortunately, the fulltext feature doesn't seem to have much traction, so I recommend investigating on your side, possibly filing JIRA issues in the process.

Regards,
--
Ilya Kasnacheev

Mon, 3 Sep 2018 at 22:34, Courtney Robinson <[email protected]>:

> Hi,
>
> We've got Ignite in production and decided to start using some fulltext
> matching as well.
> I've investigated and can't figure out why my queries are not matching.
>
> I construct a query entity, e.g. new QueryEntity(keyClass, valueClass), and
> in debug I can see it generates a list of fields,
> e.g. a, b, c.a, c.b
> I then expected to be able to match on those fields that are marked as
> indexed. Everything is annotation driven. The appropriate fields have been
> annotated and appear to be detected as such
> when I inspect what gets put into the QueryEntityDescriptor, i.e. all
> expected indices and indexed fields are present.
>
> In GridLuceneIndex I see that the Lucene document generated has fields a, b
> (c.a and c.b are not included). Now a couple of questions arise:
>
> 1. Is there a way to get Ignite to index the nested fields as well so that
> c.a and c.b end up in the doc?
>
> 2. If you use a composite object as a key, its fields are extracted into
> the top level, so if you have Key.a and Value.a you cannot index both since
> Key.a becomes a, which collides with Value.a. Can this be changed? Are
> there any known reasons why it couldn't be? (I'm happy to send a PR
> doing so, but I suspect the answer to this is linked to the answer to the
> first question.)
>
> 3. The docs simply say you can use Lucene syntax. I presume that means the
> syntax that appears in
> https://lucene.apache.org/core/2_9_4/queryparsersyntax.html is all valid;
> checking the code, that appears to be the case, as it uses
> a MultiFieldQueryParser in GridLuceneIndex. However, when I try to run a
> query such as a:<my-text>, none of the indexed documents match. In debug
> mode I've enabled parser.setAllowLeadingWildcard(true); and if I do a
> simple searcher.search * I get back the list of expected documents.
>
> What's even more odd is I tried querying each of the 6 indexed fields as
> found in idxdFields in GridLuceneIndex and only 1 of them matches. The other
> values are being typed exactly, but wildcards and other free-text
> forms do not match either.
>
> 4. I couldn't see a way to provide a custom GridLuceneIndex. I found the
> two cases where it's constructed in the code base and it doesn't look like I
> can inject instances. Is it OK to construct and use a custom
> GridLuceneDirectory/IndexWriter/Searcher and so on in the same way
> GridLuceneIndex does it, so I can do a custom IndexingSpi to change how
> indexing happens?
> There are a number of things I'd like to customise and, from looking at the
> current impl., these things aren't injectable. I guess it's not considered a
> prime use case, maybe.
>
> Yeah, the analyzer and a number of things would be handy to change.
> Ideally I also want to customise how a field is indexed, e.g. to be able to do
> term matches with Lucene queries.
>
> Looking at this impl as well, it passes Integer.MAX_VALUE and pulls back
> all matches. That'll surely kill our nodes for some of the use cases we're
> considering.
> I'd also like to implement paging; the searcher API has a nice option to
> pass through a last doc it can continue from, to potentially implement
> something like deep paging.
>
> 5. If I were to do a custom IndexingSpi to make all of this happen, how do
> I get additional parameters through so that I could have paging params
> passed?
>
> Ideally I could customise the indexing, searching and paging through
> standard Ignite means, but I can't find any means of doing that in the
> current code, and short of doing a custom IndexingSpi I think I've gone as
> far as I can debugging and could do with a few pointers on how to go about
> this.
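
On the paging idea in points 4 and 5 above, here is a rough sketch of what "continue from the last doc" paging looks like, in plain Java with no Ignite or Lucene classes involved. The names SearchAfterPaging, Hit, RANKING and page are all made up for illustration, and the ranking (score descending, then doc id) is an assumption. The point is only that each call keeps at most pageSize hits in memory and the caller passes back the last hit it saw as a cursor.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model of cursor-based ("search after") paging over scored hits. */
final class SearchAfterPaging {

    /** A scored hit; docId breaks ties so the ranking is a total order. */
    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Assumed ranking: best score first, then ascending docId. */
    static final Comparator<Hit> RANKING = (a, b) -> {
        int byScore = Float.compare(b.score, a.score);
        return byScore != 0 ? byScore : Integer.compare(a.docId, b.docId);
    };

    /**
     * Returns up to pageSize hits that rank strictly after the cursor.
     * A null cursor means "start from the top".
     */
    static List<Hit> page(Iterable<Hit> hits, Hit after, int pageSize) {
        // Heap ordered by reversed ranking: the worst hit kept so far is on top.
        PriorityQueue<Hit> best = new PriorityQueue<>(RANKING.reversed());
        for (Hit h : hits) {
            if (after != null && RANKING.compare(h, after) <= 0) {
                continue; // already returned on an earlier page
            }
            best.add(h);
            if (best.size() > pageSize) {
                best.poll(); // drop the worst hit so memory stays bounded
            }
        }
        List<Hit> out = new ArrayList<>(best);
        out.sort(RANKING);
        return out;
    }
}
```

A caller would ask for page(hits, null, n) first, then pass the last element of each page as the cursor for the next call until an empty list comes back. How such a cursor could be carried through an IndexingSpi query is a separate question this sketch does not answer.
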
>
> FYI, SQL isn't a great option for this part of the product. We're
> generating and compiling Java classes at runtime, and generating SQL to do
> the queries is an order of magnitude more work than indexing the relatively
> few fields we need and then searching. But off the bat the paging would be
> an issue, as there can be several million matches to a query. Can't have
> Ignite pulling all of those into memory.
>
> Thanks in advance
>
> Courtney
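
To make the collision described in question 2 of the quoted message concrete, here is a small plain-Java model. It is not Ignite code: FieldFlattening, flattenBare and flattenPrefixed are hypothetical names, the overwrite-on-clash behaviour is an assumption chosen only to show that one of the two values is lost, and the "_key." / "_val." prefixes are just one possible naming scheme.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of flattening key fields and value fields into one document. */
final class FieldFlattening {

    /**
     * Merges both field maps under their bare names.
     * Assumption for this sketch: on a name clash the value field wins,
     * so the key field with the same name is no longer searchable.
     */
    static Map<String, Object> flattenBare(Map<String, Object> key,
                                           Map<String, Object> val) {
        Map<String, Object> doc = new LinkedHashMap<>(key);
        doc.putAll(val);
        return doc;
    }

    /**
     * Hypothetical alternative: qualify every field name with its origin
     * so that Key.a and Value.a end up as two distinct document fields.
     */
    static Map<String, Object> flattenPrefixed(Map<String, Object> key,
                                               Map<String, Object> val) {
        Map<String, Object> doc = new LinkedHashMap<>();
        key.forEach((name, v) -> doc.put("_key." + name, v));
        val.forEach((name, v) -> doc.put("_val." + name, v));
        return doc;
    }
}
```

With a key holding field a and a value holding fields a and b, flattenBare yields two document fields while flattenPrefixed yields three. Qualified names would also change what users have to type in a query (for example _val.a:text instead of a:text), which is one reason such a change might not be straightforward.
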
