[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734300#action_12734300
 ] 

Luis Alves commented on LUCENE-1486:
------------------------------------

Hi Mark H

I would like to propose an option 5:
5) Re-engineer QueryParser.jj to support a formally defined syntax for 
acceptable "within phrase" operators, e.g. *, ~, ( ).
    I propose doing this with the new QP implementation. (I can write 
the new JavaCC QP for this.)
    (This implies that the code will be in contrib in 2.9 and become part of 
core in 3.0.)

I also want to propose changing the complexphrase syntax to use single quotes;
this way we can have both implementations for phrases.

Here is a summary:
- the complexqueryparser would support all Lucene syntax, even for phrases
- we could add single-quoted text to identify complexphrases:
    1) Wildcard/fuzzy/range clauses can be used to define a phrase element (as 
       opposed to simply single terms)
    2) Brackets are used to group/define the acceptable variations for a given 
       phrase element, e.g. "(john OR jonathon) smith"
    3) Supported operators: OR, *, ~, ( ), ?
    4) Disallow fields, proximity, boosting and operators on single-quoted 
       phrases (I'm making an assumption here, Mark H please comment)
    5) Single quotes need to be escaped; double quotes will be treated as 
       regular punctuation characters inside single-quoted strings
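
To illustrate point 5, here is a minimal sketch of how a single-quoted
string could be scanned. It is not code from the patch. The class and method
names are hypothetical, and it assumes backslash is the escape character,
which the proposal above does not specify.

```java
class SingleQuoteScanner {
    /**
     * Reads a single-quoted phrase that starts at index 'start' and returns
     * its body with escaped single quotes unescaped.
     * Assumption: \' is the escape sequence (not stated in the proposal).
     */
    static String readSingleQuoted(String input, int start) {
        if (start >= input.length() || input.charAt(start) != '\'') {
            throw new IllegalArgumentException("expected opening single quote at " + start);
        }
        StringBuilder body = new StringBuilder();
        int i = start + 1;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length() && input.charAt(i + 1) == '\'') {
                body.append('\''); // escaped single quote becomes a literal one
                i += 2;
            } else if (c == '\'') {
                return body.toString(); // unescaped single quote closes the phrase
            } else {
                body.append(c); // everything else, including ", is an ordinary character
                i++;
            }
        }
        throw new IllegalArgumentException("unterminated single-quoted phrase");
    }

    public static void main(String[] args) {
        System.out.println(readSingleQuoted("'(john OR jonathon) smith'", 0));
    }
}
```

In a real implementation this would be a token definition in the JavaCC
grammar rather than hand-written scanning; the sketch only shows the intended
quoting behaviour.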


Mark H, can you please elaborate on these other operators: "+" "-" "^" 
"AND" "&&" "||" "NOT" "!" ":" "[" "]" "{" "}"?

Example:
A query with a single-quoted complex phrase followed by a fielded term and a 
normal phrase:

query: '(john OR jonathon) smith~0.3 order*' order:sell  "stock market"  
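
As a rough sketch of how the example query breaks down into three top-level
clauses, the following splits on whitespace outside of quotes. The class name
is hypothetical and this is not how the JavaCC grammar would do it; it again
assumes backslash escaping inside quoted text.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelClauseSplitter {
    /** Splits a query on whitespace that is not inside '...' or "..." (quotes are kept). */
    static List<String> split(String query) {
        List<String> clauses = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char openQuote = 0; // 0 means we are not inside a quoted clause
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (openQuote != 0) {
                current.append(c);
                if (c == '\\' && i + 1 < query.length()) {
                    current.append(query.charAt(++i)); // keep the escaped character verbatim
                } else if (c == openQuote) {
                    openQuote = 0; // matching quote closes the clause
                }
            } else if (c == '\'' || c == '"') {
                openQuote = c;
                current.append(c);
            } else if (Character.isWhitespace(c)) {
                if (current.length() > 0) {
                    clauses.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            clauses.add(current.toString()); // also flushes an unterminated quote as-is
        }
        return clauses;
    }

    public static void main(String[] args) {
        String q = "'(john OR jonathon) smith~0.3 order*' order:sell  \"stock market\"  ";
        for (String clause : split(q)) {
            System.out.println(clause);
        }
    }
}
```

The first clause would go to the complex phrase handling, the second is an
ordinary fielded term, and the third is a regular phrase query.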



> Wildcards, ORs etc inside Phrase queries
> ----------------------------------------
>
>                 Key: LUCENE-1486
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1486
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: QueryParser
>    Affects Versions: 2.4
>            Reporter: Mark Harwood
>            Assignee: Mark Harwood
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: ComplexPhraseQueryParser.java, 
> junit_complex_phrase_qp_07_21_2009.patch, 
> junit_complex_phrase_qp_07_22_2009.patch, LUCENE-1486.patch, 
> LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
> TestComplexPhraseQuery.java
>
>
> An extension to the default QueryParser that overrides the parsing of 
> PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
> The implementation feels a little hacky - this is arguably better handled in 
> QueryParser itself. This works as a proof of concept  for much of the query 
> parser syntax. Examples from the Junit test include:
>               // wildcards and fuzzies are OK in phrases
>               checkMatches("\"j*   smyth~\"", "1,2");
>               // boolean logic works
>               checkMatches("\"(jo* -john)  smith\"", "2");
>               // position logic works
>               checkMatches("\"jo*  smith\"~2", "1,2,3");
>               
>               // mixing fields in a phrase is bad
>               checkBadQuery("\"jo*  id:1 smith\"");
>               // phrases inside phrases are bad
>               checkBadQuery("\"jo* \"smith\" \"");
>               // range queries inside phrases are not supported
>               checkBadQuery("\"jo* [sma TO smZ]\" \"");
> Code plus Junit test to follow...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

