RE: Handling of colons in QueryParserTokenManager

Gwyn Carwardine Sat, 21 Jan 2006 11:20:50 -0800

I don't want the users to have to use escape characters. I'd rather they
didn't have to use quotes.
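
One way to get that without touching the parser internals might be to
pre-process the query string before it reaches QueryParser, escaping every
colon after the first one in each term. The following is only a minimal
sketch: ColonEscaper and escapeInnerColons are hypothetical names, not part of
Lucene or dotLucene, and the sketch assumes terms are separated by whitespace
and that colons inside quoted phrases should be left alone.

```java
public class ColonEscaper {
    // Hypothetical helper, not part of Lucene. Within each
    // whitespace-separated term it leaves the first colon alone (assumed to
    // be the field delimiter) and backslash-escapes every later colon.
    public static String escapeInnerColons(String query) {
        StringBuilder out = new StringBuilder(query.length() + 8);
        boolean inQuotes = false;   // currently inside a "quoted phrase"
        boolean seenColon = false;  // field delimiter already seen in this term
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                // Already-escaped character: copy both characters unchanged.
                out.append(c).append(query.charAt(++i));
            } else if (c == '"') {
                inQuotes = !inQuotes;
                out.append(c);
            } else if (inQuotes) {
                out.append(c);      // quoted text is passed through untouched
            } else if (Character.isWhitespace(c)) {
                seenColon = false;  // whitespace starts a new term
                out.append(c);
            } else if (c == ':') {
                if (seenColon) {
                    out.append('\\');  // second or later colon: escape it
                }
                seenColon = true;
                out.append(c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this, ID:CI:123 would be rewritten to ID:CI\:123 before parsing, while
ID:CI\:123 and ID:"CI:123" pass through unchanged. It does not try to handle
grouping such as field:(a b:c), so treat it as a starting point only.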


Of course I think someone needs to go into the internals anyway... on 1.4.3
I get an array index out of bounds error (not a nice parse exception) when
it tries to parse the following (which it should be able to do):

["fred" TO "joe"]

Maybe this is fixed in 1.9 but I tried it on the www.lucenebook.com search
assuming that was using a recent version and that generates a server error!

It's a real shame that the QueryParserTokenManager had no comments put in to
explain what on earth it's doing!



-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Chris Hostetter
Sent: 21 January 2006 18:46
To: [email protected]
Subject: Re: Handling of colons in QueryParserTokenManager


if you are flexible in the syntax you are willing to support, you can tell
your users that they need to escape the colons that aren't meant as field
identifiers...

        ID:CI\:123

...alternately, you can tell them they have to quote colons...

        ID:"CI:123"

...then you can avoid the whole painful mess of the parser internals.


: Date: Sat, 21 Jan 2006 13:10:56 -0000
: From: Gwyn Carwardine <[EMAIL PROTECTED]>
: Reply-To: [email protected]
: To: [email protected]
: Subject: Handling of colons in QueryParserTokenManager
:
: Hello, I'm new here. I've actually started using dotLucene but I think I
: need to make a change to the QueryParser but it's so complicated to try and
: understand what it's doing I thought I'd ask if maybe one of you guys could
: point me in the right direction?
:
: In my implementation of Lucene I have the need to store keywords that are of
: the form "<key>:<identity>" for example CI:123. Whilst I can store this in
: Lucene using Field.Keyword("ID","CI:123") I can't easily look it up by using
: QueryParser which I need to do.
:
: Whenever I parse the query ID:CI:123 it parses it as "ID:ci". Now I've
: already made a small hack so that non-tokenized values are indexed as
: lowercase so at least I can get them back if I use ID:CI\:123 but colons are
: commonly used and I really don't want to have to escape them everywhere
:
: What I want to achieve is that query parser will parse ID:CI:123 as
: field(ID) value(CI:123). I understand that colon is a special character but
: it's only used to delimit fields and values in which case it makes sense to
: react to the first colon, the second colon should be treated as part of the
: text which the analyzer could strip out or keep (in my case because I'm
: using a custom analyzer).
:
: Does this make sense? How do I go about changing the QueryParserTokenManager
: to achieve this? Perhaps you can point me to some documentation that
: describes the code even?
:
: Any help gratefully received!
:
: Thanks,
: Gwyn Carwardine
:
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: [EMAIL PROTECTED]
: For additional commands, e-mail: [EMAIL PROTECTED]
:



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]