We are using QueryParser.parse(userEnteredQuery) to get a programmatic Query object. We would like to boost documents that contain some of the query terms as "mini phrases". For example, when the user searches for *professional development leader*, we would like to get back all the documents that contain all three terms, but rank higher those documents that contain some of the terms next to each other, such as "*professional development*", "*development leader*" or "*professional development leader*". We want to keep using QueryParser and avoid dealing with its syntax. Therefore, if the user's query text contains special QueryParser characters, we prefer to give up on adding the phrase boost. For example, for the user query *professional^0.5 -development title:leader* we wouldn't use the phrase boost. We assume this is something other people need as well. Is there a standard solution?
The naive approach would be to check manually whether the user query contains any Lucene syntax characters (+ - ~ ^ etc.), then split the user query into terms on whitespace, create phrase queries from the combinations of adjacent terms, and add them as SHOULD clauses (optionally with some boosting) to the original query (which is a MUST clause). Are there any other ideas or known solutions? And what are the performance implications of this naive solution? How is adding a significant number of additional phrase queries as SHOULD clauses likely to affect search-time performance?
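To make the naive approach concrete, here is a minimal sketch of its pre-processing step in plain Java, with no Lucene dependency. The names `MiniPhrases`, `hasQuerySyntax` and `contiguousPhrases` are made up for illustration. The character list is meant to follow the special characters of the classic QueryParser syntax, and the check is deliberately conservative: any hyphen, or a single `&` or `|`, also disables the boost. Splitting on whitespace is an assumption too, since the real analyzer may tokenize, stem or drop terms differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MiniPhrases {
    // Characters treated as query syntax (assumed list; conservative on purpose).
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\/";

    // True if the raw user text looks like it uses QueryParser syntax.
    static boolean hasQuerySyntax(String query) {
        for (int i = 0; i < query.length(); i++) {
            if (SPECIAL.indexOf(query.charAt(i)) >= 0) {
                return true;
            }
        }
        // Upper-case boolean operators are syntax as well.
        for (String token : query.trim().split("\\s+")) {
            if (token.equals("AND") || token.equals("OR") || token.equals("NOT")) {
                return true;
            }
        }
        return false;
    }

    // All runs of two or more adjacent terms, shortest runs first.
    // Returns an empty list when the boost should be skipped.
    static List<String> contiguousPhrases(String query) {
        List<String> phrases = new ArrayList<>();
        String trimmed = query.trim();
        if (trimmed.isEmpty() || hasQuerySyntax(trimmed)) {
            return phrases;
        }
        String[] terms = trimmed.split("\\s+");
        for (int len = 2; len <= terms.length; len++) {
            for (int start = 0; start + len <= terms.length; start++) {
                String[] run = Arrays.copyOfRange(terms, start, start + len);
                phrases.add(String.join(" ", run));
            }
        }
        return phrases;
    }
}
```

For *professional development leader* this yields the three phrases from the example above. Each one would then be turned into a phrase query and added as a SHOULD clause next to the parsed MUST query; that part is not shown here. Note that n terms produce n(n-1)/2 phrases, so capping the maximum phrase length may be worth considering for long queries.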