I haven't really been following this thread, but it's gotten so long i got interested.
from whta i can tell skimming the discussion so far, it seems like the biggest confusion is about the definition of a "phrase" and what analyzers do with "quote" characters and what the QueryParser does with "quote" charcters -- when ultimately you don't seem to really care about "phrases" in a textual searching sense; nor do you seem to care about any of the "features" of the QueryParser. it seems that what you care about is: 1) making documents, and adding a list of "text chunks" to those documents (what you've been calling phrases) 2) you then want to be able to search for "almost-exact" matches on those "text chunks" ... these matches should be "exactish" because you don't want partial matches based on white spaces, or splitting on hyphens, but they shouldn't be truely exact because you want some simple normalization... : actually would like to "normalize" a phrase (spaces) or a hyphenated word or : an underscored word to the same value -- e.g. MS-WORD or ms_WORd or "MS : Word" --> ms_word. ...in which case, you should: a) write yourself an analyzer which does no "tokenizing" (ie: each input Field value generates a single token) but does the normalization you want. b) use this Analyzer when you add the fields to your documents, even though you don't want *real* tokenization, add make the field type TOKENIZED so your analyzer gets used. c) when you get some text input to serach on, pass it to the same Analyzer, take the Token you get back and manualy construct a TermQuery out of it for the neccessary field. ...that's it. that's all she wrote -- don't even look in QueryParser's general direction, at all. -Hoss --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]