Hi Steven,
I have implemented the sentence-specific proximity search as suggested below.
Unfortunately, it still doesn't respect sentence boundaries in my search.
I am using # as a delimiter between my sentences while indexing the content:
------------
// Join all sentences into one field value, with " # " marking each sentence end.
ArrayList<String> sentencesList = sentenceScanner.getAllSentences();
StringBuilder textWithToken = new StringBuilder();
for (String sentence : sentencesList) {
    textWithToken.append(sentence).append(" # ");
}
addFieldToDocument(document, IFIELD_TEXT, textWithToken.toString(), true, true);
------------
* The IndexWriter is initialized with StandardAnalyzer when the document is
added.
This is how I am performing my search:
------------
Query query = null;
// Collapse whitespace, then build one SpanTermQuery per query term.
strQuery = strQuery.trim().replaceAll("\\s+", " ");
String[] spanTerms = strQuery.split(" ");
SpanQuery[] spanQueries = new SpanQuery[spanTerms.length];
for (int count = 0; count < spanTerms.length; count++) {
    spanQueries[count] = new SpanTermQuery(new Term(field, spanTerms[count]));
}
// Terms must occur in order, within `span` positions of each other.
SpanQuery nearQuery = new SpanNearQuery(spanQueries, span, true);
if (!withinSentence) {
    query = nearQuery;
} else {
    // Reject any match whose span overlaps the sentence delimiter term.
    SpanQuery queryExclude = new SpanTermQuery(new Term(field, "#"));
    query = new SpanNotQuery(nearQuery, queryExclude);
}
bQuery.add(query, BooleanClause.Occur.MUST);
------------
When I print the final query to the console, this is how it looks in each
case:
With no sentence boundary:
+(author:amanda) +spanNear([text:efficiency, text:delta], 10, true) +(year:2009 year:2010)
With sentence boundary:
+(author:amanda) +spanNot(spanNear([text:efficiency, text:delta], 10, true), text:#) +(year:2009 year:2010)
My guess is that the index isn't storing the sentence delimiter # as a
separate term, in which case the spanNot exclusion would never match anything.
Any hints or pointers on where my implementation goes wrong would be highly
appreciated.
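To make that guess concrete, here is a minimal plain-Java sketch of what I suspect is happening. It makes no Lucene calls. The class and method names are hypothetical, and dropPunctuation is only my assumption about how a punctuation-dropping analyzer would treat the text, not the actual StandardAnalyzer logic. The slop check is a simplified stand-in for an ordered spanNear.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
class SentenceProximitySketch {

    // Whitespace-only tokenization: the "#" delimiter survives as its own token.
    static List<String> keepDelimiter(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase());
        }
        return tokens;
    }

    // Assumption: splitting on every non-letter, non-digit character
    // approximates an analyzer that discards punctuation, so "#" vanishes.
    static List<String> dropPunctuation(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase());
        }
        return tokens;
    }

    // True if `first` is followed by `second` with at most `slop` tokens
    // in between and no "#" token between the two.
    static boolean nearWithinSentence(List<String> tokens, String first,
                                      String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) continue;
            for (int j = i + 1; j < tokens.size() && j - i - 1 <= slop; j++) {
                if (tokens.get(j).equals("#")) break; // sentence boundary reached
                if (tokens.get(j).equals(second)) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "The efficiency was low # The delta was high # ";
        // Delimiter kept: the boundary blocks the match. Prints false.
        System.out.println(nearWithinSentence(keepDelimiter(text), "efficiency", "delta", 10));
        // Delimiter dropped: nothing separates the sentences. Prints true.
        System.out.println(nearWithinSentence(dropPunctuation(text), "efficiency", "delta", 10));
    }
}
```

If this is what happens in my index, the text:# clause has no term to match, and both queries above would return the same hits.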
Thanks.