The QueryParser also fails to parse Hebrew acronyms correctly. This is not an integral part of the current discussion, but I thought this would be the best place to bring it up.
Hebrew acronyms are written as letters with a single double-quote char inside the word, for example MNK"L (Hebrew for CEO). That double-quote char usually sits in the second-to-last position of the word, but in some cases it comes earlier (MNK"LIT).

Since the QP expects a pair of double-quotes enclosing a phrase, it throws an exception when such a word is passed to it, and it produces an incorrect phrase query when two acronyms are used together in a query string. Not sure which is worse.

Perhaps while you're at it you could make sure a phrase query is only created when a quote is followed by a space, and hence is definitely at the end of a word, rather than assuming a quote is equivalent to white space?

There's no good open Hebrew analyzer for Lucene yet, so there has been little motivation to fix this, but I'm working on one as we speak and hope to have something to show in the next few days or weeks. It would be nice to have at least this issue closed within the Lucene core code.

Thanks,
Itamar Syn-Hershko
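
To make the suggestion concrete, here is a minimal sketch of one possible workaround on the caller's side, not a patch to the QueryParser itself. It treats a double-quote as part of a word when it has a letter on both sides, and backslash-escapes it before the string reaches the parser. The class and method names (AcronymQuotes, isInWordQuote, escapeInWordQuotes) are hypothetical, and the sketch assumes the parser honours a backslash as an escape for the quote char; whether the quote then survives analysis depends on the analyzer in use.

```java
// Hypothetical helper, not part of Lucene: pre-processes a raw query string
// so that quotes embedded in a word (as in Hebrew acronyms) are escaped.
final class AcronymQuotes {
    private AcronymQuotes() {}

    /** True if the char at index i is a double-quote with a letter on each side. */
    static boolean isInWordQuote(String s, int i) {
        return s.charAt(i) == '"'
                && i > 0 && i + 1 < s.length()
                && Character.isLetter(s.charAt(i - 1))
                && Character.isLetter(s.charAt(i + 1));
    }

    /** Returns the query with every in-word quote prefixed by a backslash. */
    static String escapeInWordQuotes(String query) {
        StringBuilder out = new StringBuilder(query.length() + 4);
        for (int i = 0; i < query.length(); i++) {
            if (isInWordQuote(query, i)) {
                out.append('\\'); // assumption: the parser reads \" as a literal quote
            }
            out.append(query.charAt(i));
        }
        return out.toString();
    }
}
```

With this heuristic, MNK"L becomes MNK\"L, while the quotes around a phrase such as "foo bar" are left alone because they touch the start or end of the string or a space. A known gap: a closing quote followed directly by a letter, as in "foo"bar, would also be escaped.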