Re: Analyzer for supporting hyphenated words

2015-07-24 Thread Diego Socaceti
Hi Alessandro, after talking to our customer: yes, it needs to be a mix of classic and quoted queries in one userCriteria. Before we look into the details of the QueryParser: I'm currently using org.apache.lucene.queryparser.classic.QueryParser from 5.2.1. Is this the right QueryParser to use?

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Diego Socaceti
Hi Alessandro, I guess code says more than words :) ... public static final String EXACT_SEARCH_FORMAT = "\"%s\""; public static final String MULTIPLE_CHARACTER_WILDCARD = "*"; ... if (isExactCriteriaString(userCriteria)) { String userCriteriaEscaped = String.format(EXACT_SEARCH_FORMAT,

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Diego Socaceti
Sorry, a little code refactoring typo: curTokenProcessed should be userCriteriaProcessed ... public static final String EXACT_SEARCH_FORMAT = "\"%s\""; public static final String MULTIPLE_CHARACTER_WILDCARD = "*"; ... if (isExactCriteriaString(userCriteria)) { String userCriteriaEscaped =

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Alessandro Benedetti
Yes, what I meant is that you can still use your analyser when the query is not in quotes. When it is in quotes, you can directly build a TermQuery out of it. Of course, it is not such a simple scenario: do you think the quoted and unquoted query parts are 2 different sets of queries, which
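The split suggested above (analyse the unquoted parts, treat quoted parts as exact terms) could start with something like the following. This is only a sketch in plain Java with no Lucene calls: the class name CriteriaSplitter, the Segment type and the regular expression are hypothetical and not taken from the thread. It assumes that quotes are plain double quotes and that an unbalanced quote is simply left inside an unquoted token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits a user criteria string into quoted (exact)
// and unquoted (to be analysed) segments. Not part of Lucene.
final class CriteriaSplitter {

    // One piece of the user input and whether it was surrounded by quotes.
    static final class Segment {
        final String text;
        final boolean quoted;

        Segment(String text, boolean quoted) {
            this.text = text;
            this.quoted = quoted;
        }

        @Override
        public String toString() {
            return (quoted ? "exact:" : "analysed:") + text;
        }
    }

    // Either a double-quoted run (group 1) or a run of non-whitespace (group 2).
    private static final Pattern PART = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    static List<Segment> split(String userCriteria) {
        List<Segment> segments = new ArrayList<>();
        Matcher m = PART.matcher(userCriteria);
        while (m.find()) {
            if (m.group(1) != null) {
                // Quoted part: keep the content verbatim, skip empty quotes.
                if (!m.group(1).isEmpty()) {
                    segments.add(new Segment(m.group(1), true));
                }
            } else {
                // Unquoted part: would go through the analyzer / query parser.
                segments.add(new Segment(m.group(2), false));
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        // prints [analysed:wi-fi, exact:FD-A320-REC-SIM-1, analysed:jean-*]
        System.out.println(split("wi-fi \"FD-A320-REC-SIM-1\" jean-*"));
    }
}
```

Each exact segment could then become a single un-analysed term and each analysed segment could be handed to the query parser; how the resulting queries are combined is left open here, as it is in the thread.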

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Alessandro Benedetti
I read it briefly, correct me if I am wrong, but that is to parse the content within the quotes. But we are still at the String level. I want to see how you build the PhraseQuery :) Taking a look at the code, the PhraseQuery allows you to add as many terms as you want. What you need to do is to not

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Diego Socaceti
Hi Alessandro, sorry that I forgot the important part. Here it is: ... public static final String EXACT_SEARCH_FORMAT = "\"%s\""; public static final String MULTIPLE_CHARACTER_WILDCARD = "*"; ... if (isExactCriteriaString(userCriteria)) { String userCriteriaEscaped =

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Diego Socaceti
Hi Alessandro, yes, I want the user to be able to surround the query with "" to run the phrase query with a NOT tokenized phrase. What do I have to do? Thanks and kind regards On Tue, Jul 21, 2015 at 2:47 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Hey Jack, reading the doc:

Re: Analyzer for supporting hyphenated words

2015-07-22 Thread Alessandro Benedetti
As a start, Diego, how do you currently parse the user query to build the Lucene queries? Cheers 2015-07-22 8:35 GMT+01:00 Diego Socaceti socac...@gmail.com: Hi Alessandro, yes, I want the user to be able to surround the query with "" to run the phrase query with a NOT tokenized phrase.

Re: Analyzer for supporting hyphenated words

2015-07-21 Thread Jack Krupansky
If you don't explicitly enable automatic phrase queries, the Lucene query parser will assume an OR operator on the sub-terms when a whitespace-delimited term analyzes into a sequence of terms. See:

Re: Analyzer for supporting hyphenated words

2015-07-21 Thread Alessandro Benedetti
Hi Diego, let me try to help: I find this a little bit confusing: For our customer it is important to find the word - *wi-fi* by wi, *fi*, wifi, wi-fi - jean-pierre by jean, pierre, jean-pierre, jean-* But: The (exact) query *FD-A320-REC-SIM-1* returns FD-A320-REC-SIM-1

Re: Analyzer for supporting hyphenated words

2015-07-21 Thread Alessandro Benedetti
Hey Jack, reading the doc: Set to true if phrase queries will be automatically generated when the analyzer returns more than one term from whitespace delimited text. NOTE: this behavior may not be suitable for all languages. Set to false if phrase queries should only be generated when

Analyzer for supporting hyphenated words

2015-07-17 Thread Diego Socaceti
Hi all, I'm new to Lucene and tried to write my own analyzer to support hyphenated words like wi-fi, jean-pierre, etc. For our customer it is important to find the word - wi-fi by wi, fi, wifi, wi-fi - jean-pierre by jean, pierre, jean-pierre, jean-* The analyzer: public class
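To make the requirement concrete, here is a rough sketch of the set of index terms that would let wi-fi be found by wi, fi, wifi and wi-fi. It is plain Java, not a Lucene Analyzer and not the analyzer from this post; the class name HyphenVariants and its variants method are hypothetical. It assumes lowercasing and that the joined form (wifi) is wanted for every hyphenated word, which the post only states for wi-fi. A prefix query such as jean-* would be served by the full hyphenated term.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper illustrating which terms a hyphen-aware analyzer
// might emit for one input word. Not part of Lucene.
final class HyphenVariants {

    static List<String> variants(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        // LinkedHashSet keeps insertion order and drops duplicates.
        Set<String> out = new LinkedHashSet<>();
        if (!lower.isEmpty()) {
            out.add(lower);                 // full form, e.g. wi-fi
        }
        String[] parts = lower.split("-");
        for (String part : parts) {
            if (!part.isEmpty()) {
                out.add(part);              // each part, e.g. wi, fi
            }
        }
        String joined = String.join("", parts);
        if (!joined.isEmpty()) {
            out.add(joined);                // joined form, e.g. wifi
        }
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        System.out.println(variants("wi-fi"));       // prints [wi-fi, wi, fi, wifi]
        System.out.println(variants("jean-pierre")); // prints [jean-pierre, jean, pierre, jeanpierre]
    }
}
```

In a real analyzer these variants would normally be emitted as tokens at the same position by a token filter; the list form here is only meant to show which terms need to end up in the index.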