I don't want the users to have to use escape characters. I'd rather they didn't have to use quotes.
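
One way to get there without touching the parser might be to rewrite the query string before it reaches QueryParser. What follows is only a rough sketch in plain Java, not anything that ships with Lucene: the class name ColonEscaper and the method escapeExtraColons are made up for illustration. It assumes that terms are separated by whitespace and that only the first unescaped colon in a term is a field delimiter. Quoted phrases and already-escaped characters are passed through unchanged.

```java
// Hypothetical helper (not part of Lucene): escapes every colon after the
// first one in each whitespace-delimited term, so that ID:CI:123 becomes
// ID:CI\:123 before the string is handed to the query parser.
final class ColonEscaper {
    private ColonEscaper() {}

    static String escapeExtraColons(String query) {
        StringBuilder out = new StringBuilder(query.length() + 8);
        boolean inQuotes = false;   // inside a "quoted phrase"
        boolean seenColon = false;  // current term already has its field colon
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                // Already-escaped character: copy the pair through untouched.
                out.append(c).append(query.charAt(++i));
            } else if (c == '"') {
                inQuotes = !inQuotes;
                out.append(c);
            } else if (inQuotes) {
                // Colons inside quotes need no escaping.
                out.append(c);
            } else if (Character.isWhitespace(c)) {
                // Whitespace ends the current term (assumption, see above).
                seenColon = false;
                out.append(c);
            } else if (c == ':') {
                if (seenColon) {
                    out.append('\\'); // second or later colon: escape it
                }
                seenColon = true;
                out.append(c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this in front of the parser, a user typing ID:CI:123 would have ID:CI\:123 passed on. It knows nothing about grouping such as field:(a b:c), so it is a starting point rather than a complete answer.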

Of course, I think someone needs to go into the internals anyway. On 1.4.3 I get an index out of array bounds error (not a nice parse exception) when it tries to parse the following, which it should be able to do:

    ["fred" TO "joe"]

Maybe this is fixed in 1.9, but I tried it on the www.lucenebook.com search, assuming that was using a recent version, and that generates a server error!

It's a real shame that the QueryParserTokenManager had no comments put in to explain what on earth it's doing!

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Chris Hostetter
Sent: 21 January 2006 18:46
To: [email protected]
Subject: Re: Handling of colons in QueryParserTokenManager

if you are flexible in the syntax you are willing to support, you can tell your users that they need to escape the colons that aren't meant as field identifiers...

    ID:CI\:123

...alternately, you can tell them they have to quote colons...

    ID:"CI:123"

...then you can avoid the whole painful mess of the parser internals.

: Date: Sat, 21 Jan 2006 13:10:56 -0000
: From: Gwyn Carwardine <[EMAIL PROTECTED]>
: Reply-To: [email protected]
: To: [email protected]
: Subject: Handling of colons in QueryParserTokenManager
:
: Hello, I'm new here. I've actually started using dotLucene but I think I
: need to make a change to the QueryParser but it's so complicated to try and
: understand what it's doing I thought I'd ask if maybe one of you guys could
: point me in the right direction?
:
: In my implementation of Lucene I have the need to store keywords that are of
: the form "<key>:<identity>" for example CI:123. Whilst I can store this in
: Lucene using Field.Keyword("ID","CI:123") I can't easily look it up by using
: QueryParser which I need to do.
:
: Whenever I parse the query ID:CI:123 it parses it as "ID:ci". Now I've
: already made a small hack so that non-tokenized values are indexed as
: lowercase so at least I can get them back if I use ID:CI\:123 but colons are
: commonly used and I really don't want to have to escape them everywhere.
:
: What I want to achieve is that query parser will parse ID:CI:123 as
: field(ID) value(CI:123). I understand that colon is a special character but
: it's only used to delimit fields and values in which case it makes sense to
: react to the first colon, the second colon should be treated as part of the
: text which the analyzer could strip out or keep (in my case because I'm
: using a custom analyzer).
:
: Does this make sense? How do I go about changing the QueryParserTokenManager
: to achieve this? Perhaps you can point me to some documentation that
: describes the code even?
:
: Any help gratefully received!
:
: Thanks,
: Gwyn Carwardine
:
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: [EMAIL PROTECTED]
: For additional commands, e-mail: [EMAIL PROTECTED]
:

-Hoss
