[jira] Commented: (LUCENE-1418) QueryParser can throw NullPointerException during parsing of some queries in case if default field passed to constructor is null

Alexei Dets (JIRA) Thu, 13 Nov 2008 12:11:06 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647381#action_12647381
 ]


Alexei Dets commented on LUCENE-1418:
-------------------------------------

> You shouldn't be passing null to the queryparser as a field right?
Wrong. It is not documented anywhere and in practice this _WORKS_ _PERFECTLY_ 
except for one case of one particular syntax error in a query string that is 
not handled correctly in this case. (All?) other syntax errors correctly cause 
ParseException and all valid queries are parsed successfully. BTW, this is 
exactly what does MultiFieldQueryParser in its constructor - it calls 
super(null, analyzer).

> its not part of the query syntax, so it doesn't make sense to feed a "" 
> instead of the null
And this is also wrong because Lucene query parser syntax allows for queries 
without explicit field specification (i.e. "cat AND dog" is a valid query). 
From the current docs it is not obvious how to properly construct a query 
parser that will correctly parse such queries, not clear what should be passed 
as a field in a QueryParser constructor because there is no any particular 
field, the same query parser instance is used to parse queries for different 
fields. If I'll pass not null and not empty string then AFAIR instead of "+cat 
+ dog" I'll get as a result "+field:cat +field:doc" and this is not acceptable 
as this will cause a creation of a separate query parser instance for each 
field - a real overkill I'd say.

Really there are two issues:
1) it should be documented how should be created "fieldless" query parser: that 
empty string should be used as a field name, that null is unsupported and can 
give undefined results (alternative is to fix this problem with 
NullPointerException and document that both ways are supported - null and empty 
string);
2) currently there is no way to use the same query parser instance for parsing 
queries for the different fields (except set default field to empty string and 
manually add "field:" prefix to each query before parsing).

What I propose:
1) document that null field gives undefined behavior (and optionally add null 
check in constructor  /IMHO documentation can be enough/ or document that null 
is officially supported);
2) add public setField method to be able to change the field (may be also 
setAnalyzer can be usefull?);
3) add default constructor QueryParser() - and document that in this case field 
(and analyzer) must be set later before parsing.

I guess currently the same thing can be achieved using static method 
MultiFieldQueryParser.parse(String[] queries, String[] fields, 
BooleanClause.Occur[] flags, Analyzer analyzer) but having the way to make it 
using regular QueryParser can be also usefull (i.e. significantly less overhead 
- static methods in MultiFieldQueryParser are internally constructing a new 
parser instance per field!). QueryParser can also benefit from a static parse 
method similar to MultiFieldQueryParser: public static Query 
QueryParser.parse(String query, String field, Analyzer analyzer).

I think I can come up with the patch if we all agree on something :-)

> QueryParser can throw NullPointerException during parsing of some queries in 
> case if default field passed to constructor is null
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1418
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1418
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: QueryParser
>    Affects Versions: 2.4
>         Environment: CentOS 5.2 (probably any applies)
>            Reporter: Alexei Dets
>            Priority: Minor
>
> In case if QueryParser was constructed using "QueryParser(String f,  Analyzer 
> a)" constructor and f equals null then QueryParser can fail with 
> NullPointerException during parsing of some queries that _does_ contain field 
> name but have unbalanced parenthesis.
> Example 1:
> Query:  field:(expr1) expr2)
> Result:
> java.lang.NullPointerException
>       at org.apache.lucene.index.Term.<init>(Term.java:50)
>       at org.apache.lucene.index.Term.<init>(Term.java:36)
>       at 
> org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:543)
>       at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1324)
>       at 
> org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1211)
>       at 
> org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1168)
>       at 
> org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1128)
>       at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:170)
> Example2:
> Query:  field:(expr1) "expr2")
> Result:
> java.lang.NullPointerException
>       at org.apache.lucene.index.Term.<init>(Term.java:50)
>       at org.apache.lucene.index.Term.<init>(Term.java:36)
>       at 
> org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:543)
>       at 
> org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:612)
>       at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1459)
>       at 
> org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1211)
>       at 
> org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1168)
>       at 
> org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1128)
>       at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:170)
> Workaround: pass in constructor empty string as a default field name - in 
> this case QueryParser.parse method will throw ParseException (expected result 
> because query string is wrong) instead of NullPointerException.
> It is not obvious to me how to fix this so I'll describe my usecase, may be 
> I'm doing something completely wrong.
> Basically I have a set of per-field queries entered by user and need to 
> programmatically construct (after some preprocessing) one real Lucene query 
> combined from these user-entered per-field subqueries.
> To achieve this I basically do the following (simplified a bit):
> QueryParser parser = new QueryParser(null, analyzer); // I'll always provide 
> a field name in a query string as it is different each time and I don't have 
> any default
> BooleanQuery query = new BooleanQuery();
> Query subQuery1 = parser.parse(field1 + ":(" + queryString1 + ')');
> query.add(subQuery1, operator1); // operator = BooleanClause.Occur.MUST, 
> BooleanClause.Occur.MUST_NOT or BooleanClause.Occur.SHOULD
> Query subQuery2 = parser.parse(field2 + ":(" + queryString2 + ')');
> query.add(subQuery2, operator2); 
> Query subQuery3 = parser.parse(field3 + ":(" + queryString3 + ')');
> query.add(subQuery3, operator3); 
> ...
> IMHO either QueryParser constructor should be changed to throw 
> NullPointerException/InvalidArgumentException in case of null field passed 
> (and API documentation updated) or QueryParser.parse behavior should be fixed 
> to correctly throw ParseException instead of NullPointerException. Also IMHO 
> of a great help can be _public_ setField/getField methods of QueryParser 
> (that set/get field), this can help in use cases like my:
> QueryParser parser = new QueryParser(null, analyzer); // or add constructor 
> with analyzer _only_ for such cases
> BooleanQuery query = new BooleanQuery();
> parser.setField(field1);
> Query subQuery1 = parser.parse(queryString1);
> query.add(subQuery1, operator1);
> parser.setField(field2);
> Query subQuery2 = parser.parse(queryString2);
> query.add(subQuery2, operator2); 
> ...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

[jira] Commented: (LUCENE-1418) QueryParser can throw NullPointerException during parsing of some queries in case if default field passed to constructor is null

Reply via email to