Hi,
I've built something similar: basically a large free-text textbox for
simple queries. I chose to search one big concatenated full-text field
instead of searching separate fields. I call QueryParser.Parse several
times, first with the full-text field, then once each for my title and
contributor fields. I then combine them in a BooleanQuery where only the
full-text clause is required (MUST). This boosts documents that have the
words in the title or contributor field, without requiring the words to
be present there. I've since learned about building custom queries and
weights to generate your own boosts that way, but never bothered to
rebuild how my search works.

However, one step is to detect whether the user has entered any field
names and, if so, to use the user-provided query as-is. This works
because I only send it through the QueryParser, not the
MultiFieldQueryParser. Here's the code I have that takes a user-provided
string and does all this. Some methods are not shown and are left as an
exercise for the implementor. ;)

The FieldsInQuery method uses a QueryVisitor to walk the query tree and
extract all field names. You can find one at
https://github.com/devhost/Corelicious/blob/master/Corelicious.Lucene/QueryVisitor.cs
You would basically need to override VisitField(String field) to store
all fields in a HashSet<String>. The Parse method is just a wrapper for
QueryParser.Parse that handles any exceptions thrown.
MarkClausesAsOptional uses another QueryVisitor to set all clauses to
optional (override VisitOccur and return Occur.SHOULD).
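
As a rough illustration of the Parse wrapper described above, a minimal sketch might look like the following. It assumes Lucene.Net 3.0.x, the same version family as the LUCENE_30 constant used in the quoted question. The QueryHelpers class name, the optional operator parameter, and the choice to return null on a ParseException are assumptions inferred from how FreeTextQuery calls Parse, not the original implementation.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical container class; in practice the method would probably
// live next to FreeTextQuery as a private helper.
public static class QueryHelpers
{
    // Parses 'value' against a single default field.
    // Returns null instead of throwing when the input is not valid
    // query syntax (assumed behaviour, based on the null checks in
    // FreeTextQuery).
    public static Query Parse(string field, string value, Analyzer analyzer,
        QueryParser.Operator defaultOperator = QueryParser.Operator.OR)
    {
        var parser = new QueryParser(
            Lucene.Net.Util.Version.LUCENE_30, field, analyzer)
        {
            // OR: any word may match; AND: every word must match.
            DefaultOperator = defaultOperator
        };

        try
        {
            return parser.Parse(value);
        }
        catch (ParseException)
        {
            // Malformed user input: signal "no query" to the caller.
            return null;
        }
    }
}
```

Defaulting the operator to OR matches QueryParser's own default, which is why the three-argument calls for the title and contributor fields would behave the same as an unconfigured parser.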
private Query FreeTextQuery(String value, Analyzer analyzer) {
    if (String.IsNullOrEmpty(value))
        return null;

    var ftQuery = Parse(IndexFields.FullTextField, value, analyzer,
        QueryParser.Operator.OR);
    if (ftQuery == null)
        return null;

    // Use the parsed query if there's a manually specified field.
    if (FieldsInQuery(ftQuery).Any(f => f != IndexFields.FullTextField))
        return ftQuery;

    // We have a query without any specified fields. We should parse
    // this as all-words-required.
    ftQuery = Parse(IndexFields.FullTextField, value, analyzer,
        QueryParser.Operator.AND);

    var outerQuery = new BooleanQuery();
    outerQuery.Add(ftQuery, Occur.MUST);

    var titleQuery = Parse(IndexFields.TitleField, value, analyzer);
    if (titleQuery != null) {
        titleQuery = MarkClausesAsOptional(titleQuery);
        outerQuery.Add(titleQuery, Occur.SHOULD);
    }

    var contributorQuery = Parse(IndexFields.ContributorField, value,
        analyzer);
    if (contributorQuery != null) {
        contributorQuery = MarkClausesAsOptional(contributorQuery);
        outerQuery.Add(contributorQuery, Occur.SHOULD);
    }

    return outerQuery;
}
// Simon
On 2013-02-14 15:05, Omri Suissa wrote:
No one? :)
On Wed, Feb 6, 2013 at 12:41 PM, Omri Suissa wrote:
Hi,
I'm using MultiFieldQueryParser to allow advanced search in my application.
In some cases the user doesn't send the field names and in some cases the
user sends them.
In this case the user sent the following query:
(((name:10th AND name:10th) AND (name:10th AND name:10th) AND name:10th
AND name:10th) AND name:10th)
As you can see all the conditions are the same (name:10th).
My code looks like this:
MultiFieldQueryParser queryParser = new MultiFieldQueryParser(
    Lucene.Net.Util.Version.LUCENE_30, fields, analyzer, boosts);
queryParser.AllowLeadingWildcard = true;
try
{
    objQuery = queryParser.Parse(realQuery);
    return objQuery;
}
catch (ParseException pe)
{
    return null;
}
Here the [fields] variable is a list of default fields used if the user
didn't send fields (not relevant in this case), the [analyzer] is
StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30), and [boosts] holds
some default boosts (also not relevant in this case).
The result I got back is:
+(+(+(((name:"name 10th" +(((name:10th AND name:"name 10th ? name 10th")
+(+(((name:10th AND name:10th) AND (name:"name 10th ? name 10th ? name
10th" +(((name:10th AND name:10th) AND (name:10th AND name:"name 10th ?
name 10th ? name 10th ? name 10th") +(((name:10th AND name:10th) AND
(name:10th AND name:10th) AND name:"name 10th ? name 10th ? name 10th ?
name 10th ? name 10th" +(((name:10th AND name:10th) AND (name:10th AND
name:10th) AND name:10th AND name:"name 10th ? name 10th ? name 10th ? name
10th ? name 10th ? name 10th") +(((name:10th AND name:10th) AND (name:10th
AND name:10th) AND name:10th AND name:10th) AND name:"name 10th ? name 10th
? name 10th ? name 10th ? name 10th ? name 10th ? name 10th"
As you can see, this is not what I expected. The query searches for
different things:
- name:10th
- name:"name 10th ? name 10th"
- name:"name 10th ? name 10th ? name 10th"
- name:"name 10th ? name 10th ? name 10th ? name 10th"
- and so on…
Why is that? The way I see it, the user sent the same condition over and
over again, with some brackets and ANDs between them that should not affect
a thing…
If this were an "if" condition in C#, it would be just like saying "if
(name.Contains("10th") == true)".
Thanks,
Omri