Hi, I have filed an issue against lucene-core:
https://issues.apache.org/jira/browse/LUCENE-9130
I wrote a test case and found that a BooleanQuery with MUST clauses works,
but the PhraseQuery fails.

小鱼儿 <ctengc...@gmail.com> wrote on Fri, Jan 10, 2020 at 7:14 PM:

> The explain API helps! Thanks for the hint!
> I found that one case failed because I had carelessly added another
> filter condition, but the other case (which is analyzed into 30 terms)
> still fails, and I don't know why.
> I guess I need to write a unit test that uses the MultiTerms.getTerms API to
> find out whether there is a mismatch in the analyzer's processing or a
> capacity limit in PhraseQuery...
>
> Mikhail Khludnev <m...@apache.org> wrote on Fri, Jan 10, 2020 at 6:21 PM:
>
>> Hello,
>> Sometimes IndexSearcher.explain(Query, int) helps to analyze mismatches.
>>
>> On Fri, Jan 10, 2020 at 1:13 PM 小鱼儿 <ctengc...@gmail.com> wrote:
>>
>> > After calling Analyzer.tokenStream() directly to extract the terms from the
>> > query, I still cannot get any results. I don't know why...
>> >
>> > Code used when building the index:
>> >            IndexWriterConfig iwc = new IndexWriterConfig(analyzer); // new SmartChineseAnalyzer();
>> >
>> > Code for the query:
>> > (1) Extract terms from the query text:
>> >
>> > public List<String> analysis(String fieldName, String text) {
>> >     List<String> terms = new ArrayList<String>();
>> >     TokenStream stream = analyzer.tokenStream(fieldName, text);
>> >     try {
>> >         stream.reset();
>> >         while (stream.incrementToken()) {
>> >             CharTermAttribute termAtt = stream.getAttribute(CharTermAttribute.class);
>> >             String term = termAtt.toString();
>> >             terms.add(term);
>> >         }
>> >         stream.end();
>> >     } catch (IOException e) {
>> >         e.printStackTrace();
>> >         log.error(e.getMessage(), e);
>> >     }
>> >     return terms;
>> > }
>> >
>> > (2) Code to construct a PhraseQuery:
>> >
>> > private Query buildPhraseQuery(Analyzer analyzer, String fieldName, String queryText, int slop) {
>> >     PhraseQuery.Builder builder = new PhraseQuery.Builder();
>> >     builder.setSlop(2); //? max is 2;
>> >     List<String> terms = analyzer.analysis(fieldName, queryText);
>> >     for (String termKeyword : terms) {
>> >         Term term = new Term(fieldName, termKeyword);
>> >         builder.add(term);
>> >     }
>> >     Query query = builder.build();
>> >     return query;
>> > }
>> >
>> > Using a BooleanQuery also failed:
>> >
>> > private Query buildBooleanANDQuery(Analyzer analyzer, String fieldName, String queryText) {
>> >     BooleanQuery.Builder builder = new BooleanQuery.Builder();
>> >     List<String> terms = analyzer.analysis(fieldName, queryText);
>> >     log.info("terms: " + StringUtils.join(terms, ", "));
>> >     for (String termKeyword : terms) {
>> >         Term term = new Term(fieldName, termKeyword);
>> >         builder.add(new TermQuery(term), BooleanClause.Occur.MUST);
>> >     }
>> >     return builder.build();
>> > }
>> >
>> > Adrien Grand <jpou...@gmail.com> wrote on Fri, Jan 10, 2020 at 4:53 PM:
>> >
>> > > It should match. My guess is that you might not be reusing the same positions
>> > > as set by the analysis chain when creating the phrase query. Can you show
>> > > us how you build the phrase query?
>> > >
>> > > On Fri, Jan 10, 2020 at 9:24 AM 小鱼儿 <ctengc...@gmail.com> wrote:
>> > >
>> > > > I use SmartChineseAnalyzer for indexing and add a document with a
>> > > > TextField whose value is a long sentence that, when analyzed, yields 18
>> > > > terms.
>> > > >
>> > > > I then use the same value to construct a PhraseQuery, setting the slop to 2
>> > > > and adding the 18 terms consecutively...
>> > > >
>> > > > I expect the search api to find this document, but it returns empty.
>> > > >
>> > > > Where did I go wrong?
>> > > >
>> > >
>> > >
>> > > --
>> > > Adrien
>> > >
>> >
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
>
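
A note on the position question Adrien raised in the thread. PhraseQuery.Builder.add(Term) places each term immediately after the previous one, while the index stores the positions produced by the analysis chain, and those can contain gaps when a filter drops tokens. The sketch below is plain Java with no Lucene dependency and only illustrates the arithmetic; the class name, method names, and increment values are hypothetical. In real code the increments would presumably be read from PositionIncrementAttribute during analysis, and the resulting position passed to PhraseQuery.Builder.add(Term, int).

```java
import java.util.ArrayList;
import java.util.List;

public class PhrasePositionSketch {

    /**
     * Converts per-token position increments into absolute positions.
     * Counting starts at -1, so a first token with increment 1 lands on position 0.
     */
    public static List<Integer> absolutePositions(int[] increments) {
        List<Integer> positions = new ArrayList<>();
        int position = -1;
        for (int increment : increments) {
            position += increment;
            positions.add(position);
        }
        return positions;
    }

    /** Positions obtained when every term is simply appended after the previous one. */
    public static List<Integer> consecutivePositions(int termCount) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < termCount; i++) {
            positions.add(i);
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical increments: a value above 1 means tokens were dropped before this one.
        int[] increments = {1, 1, 2, 1, 3};
        System.out.println(absolutePositions(increments));           // [0, 1, 3, 4, 7]
        System.out.println(consecutivePositions(increments.length)); // [0, 1, 2, 3, 4]
    }
}
```

With these made-up increments the last term ends up 3 positions away from where the consecutively built query expects it, which would likely exceed a slop of 2. Whether this is the cause of the failure reported in LUCENE-9130 is not confirmed anywhere in this thread.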
