Hi, I have filed an issue against lucene-core: https://issues.apache.org/jira/browse/LUCENE-9130

I wrote a test case and found that a BooleanQuery with MUST clauses works, but the PhraseQuery fails.
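
One way to check Adrien's guess about positions (quoted further down) without touching the index is to compare the positions the analysis chain reports with the consecutive positions that `builder.add(term)` assigns. The following is a plain-Java sketch of that arithmetic only, not Lucene code. The class name, the method names, and the sample increments are hypothetical. It assumes the analyzer reports a position increment greater than 1 wherever it drops a token (for example a stop word or punctuation), which has not been confirmed for this case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: models token positions only.
final class PhrasePositionSketch {

    // Absolute positions as recorded at index time: start at -1 and
    // add each token's position increment.
    static List<Integer> absolutePositions(int[] increments) {
        List<Integer> positions = new ArrayList<>();
        int pos = -1;
        for (int inc : increments) {
            pos += inc;
            positions.add(pos);
        }
        return positions;
    }

    // Largest difference between the indexed position of the i-th term and
    // the position i it gets when terms are added one after another.
    static int maxDrift(int[] increments) {
        List<Integer> indexed = absolutePositions(increments);
        int max = 0;
        for (int i = 0; i < indexed.size(); i++) {
            max = Math.max(max, indexed.get(i) - i);
        }
        return max;
    }

    public static void main(String[] args) {
        // Made-up increments: three tokens were dropped by the analyzer.
        int[] increments = {1, 1, 2, 1, 2, 2, 1};
        System.out.println(absolutePositions(increments)); // [0, 1, 3, 4, 6, 8, 9]
        System.out.println(maxDrift(increments));          // 3
    }
}
```

If the drift comes out larger than the slop (2 here), that would be a plausible reason for the phrase not matching while the MUST-only BooleanQuery, which ignores positions, still does. In Lucene itself the corresponding change would be to read `PositionIncrementAttribute` next to `CharTermAttribute` in the token loop and pass the accumulated position to `PhraseQuery.Builder.add(Term, int)`.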

小鱼儿 <ctengc...@gmail.com> wrote on Fri, Jan 10, 2020 at 7:14 PM:

> The explain API helps! Thanks for the hint!
> I found that one case failed because I had carelessly added another
> filter condition, but the other case (which is analyzed into 30 terms)
> still fails, and I don't know why.
> I guess I need to write a unit test that uses the MultiTerms.getTerms API to
> find out whether there is a mismatch in the analyzer's processing or a
> capacity limit in PhraseQuery...
>
> Mikhail Khludnev <m...@apache.org> wrote on Fri, Jan 10, 2020 at 6:21 PM:
>
>> Hello,
>> Sometimes IndexSearcher.explain(Query, int) helps to analyse mismatches.
>>
>> On Fri, Jan 10, 2020 at 1:13 PM 小鱼儿 <ctengc...@gmail.com> wrote:
>>
>> > After directly calling the Analyzer.tokenStream() method to extract terms
>> > from the query, I still cannot get results. I don't know why...
>> >
>> > Code that builds the index:
>> >
>> > IndexWriterConfig iwc = new IndexWriterConfig(analyzer); // new SmartChineseAnalyzer();
>> >
>> > Code that runs the query:
>> > (1) Extract terms from the query text:
>> >
>> > public List<String> analysis(String fieldName, String text) {
>> >     List<String> terms = new ArrayList<String>();
>> >     TokenStream stream = analyzer.tokenStream(fieldName, text);
>> >     try {
>> >         stream.reset();
>> >         while(stream.incrementToken()) {
>> >             CharTermAttribute termAtt = stream.getAttribute(CharTermAttribute.class);
>> >             String term = termAtt.toString();
>> >             terms.add(term);
>> >         }
>> >         stream.end();
>> >     } catch (IOException e) {
>> >         e.printStackTrace();
>> >         log.error(e.getMessage(), e);
>> >     }
>> >     return terms;
>> > }
>> >
>> > (2) Code to construct a PhraseQuery:
>> >
>> > private Query buildPhraseQuery(Analyzer analyzer, String fieldName, String queryText, int slop) {
>> >     PhraseQuery.Builder builder = new PhraseQuery.Builder();
>> >     builder.setSlop(2); //? max is 2;
>> >     List<String> terms = analyzer.analysis(fieldName, queryText);
>> >     for(String termKeyword : terms) {
>> >         Term term = new Term(fieldName, termKeyword);
>> >         builder.add(term);
>> >     }
>> >     Query query = builder.build();
>> >     return query;
>> > }
>> >
>> > Using BooleanQuery also failed:
>> >
>> > private Query buildBooleanANDQuery(Analyzer analyzer, String fieldName, String queryText) {
>> >     BooleanQuery.Builder builder = new BooleanQuery.Builder();
>> >     List<String> terms = analyzer.analysis(fieldName, queryText);
>> >     log.info("terms: "+StringUtils.join(terms, ", "));
>> >     for(String termKeyword : terms) {
>> >         Term term = new Term(fieldName, termKeyword);
>> >         builder.add(new TermQuery(term), BooleanClause.Occur.MUST);
>> >     }
>> >     return builder.build();
>> > }
>> >
>> > Adrien Grand <jpou...@gmail.com> wrote on Fri, Jan 10, 2020 at 4:53 PM:
>> >
>> > > It should match. My guess is that you might not be reusing the same
>> > > positions as set by the analysis chain when creating the phrase query?
>> > > Can you show us how you build the phrase query?
>> > >
>> > > On Fri, Jan 10, 2020 at 9:24 AM 小鱼儿 <ctengc...@gmail.com> wrote:
>> > >
>> > > > I use SmartChineseAnalyzer for indexing and add a document with a
>> > > > TextField whose value is a long sentence; when analyzed, it yields
>> > > > 18 terms.
>> > > >
>> > > > Then I use the same value to construct a PhraseQuery, setting the
>> > > > slop to 2 and adding the 18 terms consecutively...
>> > > >
>> > > > I expect the search API to find this document, but it returns nothing.
>> > > >
>> > > > Where am I wrong?
>> > > >
>> > >
>> > >
>> > > --
>> > > Adrien
>> > >
>> >
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>