Hi there,

I tested the new token definitions with the Lucene sources from
2001-10-19. The query works fine with search terms starting with the
German umlauts 'ä', 'ö', 'ü'.

Ralf Zimmermann

> From: Halácsy Péter [mailto:[EMAIL PROTECTED]]
>
> I think IDENTIFIER_CHAR doesn't need to be the first char so my
> proposal is:
> <TERM: ( ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "*", "?",
>          "~", "{", "}", "[", "]" ] )+ >

That looks like the right approach to me.

> On the other hand IDENTIFIER, ALPHA_CHAR, ALPHANUM_CHAR tokens are
> defined but are not used. So let's remove them!

> ps: I don't understand the definition of WILD_TERM. It states that a
> wild term must end with identifier_char, so cannot end with
> *. Is it the right definition?

Yes. The code for handling a final asterisk (PrefixQuery) is different
from the general term wildcarding code (WildcardQuery).

These changes yield the following token definitions in QueryParser.jj:

<*> TOKEN : {
  <#_NUM_CHAR:   ["0"-"9"] >
| <#_TERM_CHAR:  ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "*", "?",
                   "~", "{", "}", "[", "]" ] >
| <#_NEWLINE:    ( "\r\n" | "\r" | "\n" ) >
| <#_WHITESPACE: ( " " | "\t" ) >
| <#_QCHAR:      ( "\\" (<_NEWLINE> | ~["a"-"z", "A"-"Z", "0"-"9"] ) ) >
| <#_RESTOFLINE: (~["\r", "\n"])* >
}

<DEFAULT> TOKEN : {
  <AND:      ("AND" | "&&") >
| <OR:       ("OR" | "||") >
| <NOT:      ("NOT" | "!") >
| <PLUS:     "+" >
| <MINUS:    "-" >
| <LPAREN:   "(" >
| <RPAREN:   ")" >
| <COLON:    ":" >
| <CARAT:    "^" >
| <STAR:     "*" >
| <QUOTED:   "\"" (~["\""])+ "\"">
| <NUMBER:   (["+","-"])? (<_NUM_CHAR>)+ "." (<_NUM_CHAR>)+ >
| <TERM:     (<_TERM_CHAR>)+ >
| <FUZZY:    "~" >
| <WILDTERM: <_TERM_CHAR>
             ( ~["\"", " ", "\t", "(", ")", ":", "&", "|", "^", "~",
                 "{", "}", "[", "]" ] )+
             <_TERM_CHAR>>
| <RANGEIN:  "[" (~["]"])+ "]">
| <RANGEEX:  "{" (~["}"])+ "}">
}

<DEFAULT> SKIP : {
  <<_WHITESPACE>>
}

Can folks try these and tell me if it solves the problem?
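For anyone who wants a quick sanity check without rebuilding the parser, here is a rough sketch in Python (not JavaCC) that mimics only the TERM and WILDTERM definitions above with regular expressions. The names TERM_EXCLUDED, WILD_EXCLUDED and classify are made up for illustration, and the sketch ignores every other token (AND, NUMBER, QUOTED and so on), so treat it as an approximation of the lexer's behaviour rather than the generated QueryParser.

```python
import re

# Characters excluded from _TERM_CHAR in the grammar above
# (quote, space, tab, parentheses, colon, &, |, ^, *, ?, ~, braces, brackets).
TERM_EXCLUDED = '" \t():&|^*?~{}[]'

# The middle part of WILDTERM excludes the same set, except that
# '*' and '?' are allowed there.
WILD_EXCLUDED = TERM_EXCLUDED.replace("*", "").replace("?", "")

TERM_CHAR = "[^" + re.escape(TERM_EXCLUDED) + "]"
WILD_CHAR = "[^" + re.escape(WILD_EXCLUDED) + "]"

# TERM:     (<_TERM_CHAR>)+
TERM = re.compile(TERM_CHAR + "+")
# WILDTERM: <_TERM_CHAR> ( wild chars )+ <_TERM_CHAR>
WILDTERM = re.compile(TERM_CHAR + WILD_CHAR + "+" + TERM_CHAR)


def classify(text):
    """Return "TERM" or "WILDTERM" if the whole string matches, else None.

    TERM is checked first because it is declared before WILDTERM in the
    grammar, and JavaCC prefers the earlier token when match lengths tie.
    """
    if TERM.fullmatch(text):
        return "TERM"
    if WILDTERM.fullmatch(text):
        return "WILDTERM"
    return None


if __name__ == "__main__":
    for sample in ["über", "ärger", "öl", "te*t", "te?t", "test*"]:
        print(repr(sample), "->", classify(sample))
```

In this sketch, terms starting with an umlaut classify as TERM, and a string such as test* matches neither pattern as a whole. That is consistent with the note above about the final asterisk: presumably the lexer would emit TERM followed by STAR there, which is the PrefixQuery path rather than the WILDTERM one.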
Ideally we should add some cases for this to the junit tests, but I
can't get junit to work at all right now... Have the junit tests ever
run correctly from ant since the move to Jakarta? Can someone more
familiar with junit have a look at this?

Doug

--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
