Treating them as two separate words when quoted is indicative of your
analyzer not being sufficient for your domain. What Analyzer are you
using? Do you have knowledge of what it is tokenizing text into?
I have created a custom analyzer (CobolAnalyzer) which contains some custom
stop words
Heinrich [mailto:[EMAIL PROTECTED]
Sent: 15 December 2003 18:32
To: 'Lucene Users List'
Subject: RE: Disabling modifiers?
If you don't want to fiddle with the JavaCC source of QueryParser.jj, you
could work with a regular expression that works in front of the actual query
parser. I just did something
I think it is a problem with the indexing. I've found another example...
WS-CA-PP00-PROCESS-YYMM
I've looked at the index, and it has been tokenized into 3 words...
WS
CA-PP00-PROCESS
YYMM
Looks as though I might have to use a custom tokenizer as well as an
analyzer then, but any ideas as to
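For what it's worth, here is a minimal sketch of the splitting rule a custom tokenizer would need so that a name like WS-CA-PP00-PROCESS-YYMM stays in one piece. It is plain Java with no Lucene classes; the class name CobolNameSplitter and the method tokens are made up for illustration, and the assumption that a name is a run of letters and digits joined by single hyphens may not cover every COBOL dialect. The same rule would still have to be wired into a real Tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of Lucene, only illustrates the splitting rule.
final class CobolNameSplitter {
    // Assumed shape of a name: alphanumeric runs joined by single hyphens.
    private static final Pattern NAME =
            Pattern.compile("[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*");

    // Returns every name-shaped token in the text, keeping inner hyphens.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = NAME.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints each token on its own line.
        for (String t : tokens("MOVE WS-CA-PP00-PROCESS-YYMM TO WS-OUT.")) {
            System.out.println(t);
        }
    }
}
```

The point is only that the hyphen is treated as part of the token rather than as a separator, so the name is indexed as a single term.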
On Tuesday, December 16, 2003, at 05:46 AM, Iain Young wrote:
Treating them as two separate words when quoted is indicative of your
analyzer not being sufficient for your domain. What Analyzer are you
using? Do you have knowledge of what it is tokenizing text into?
I have created a custom
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 16 December 2003 12:31
To: Lucene Users List
Subject: Re: Disabling modifiers?
On Tuesday, December 16, 2003, at 07:28 AM, Erik Hatcher wrote:
And yes, if you are using StandardTokenizer, you are probably
this with
some simple changes to StandardTokenizer.jj.
- Original Message -
From: Iain Young [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Tuesday, December 16, 2003 7:46 AM
Subject: RE: Disabling modifiers?
I think it is a problem with the indexing. I've found another
On Monday, December 15, 2003, at 12:12 PM, Iain Young wrote:
A quick question. Is there any way to disable the - and + modifiers in
the QueryParser?
Not currently.
I've had a bit of success by putting quotes around the offending
names (as suggested on this list), but the results are still
If you don't want to fiddle with the JavaCC source of QueryParser.jj, you
could work with a regular expression that works in front of the actual query
parser. I just did something similar because I input Lucene's query strings
into a latent semantic analysis algorithm and remove words with + and ?
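A rough sketch of that kind of pre-processing step, in case it helps. It uses only java.util.regex and strips a leading + or - from each whitespace-separated term before the string would be handed to the query parser. The class name ModifierStripper and the method stripModifiers are invented for this example, and it is not necessarily the expression described above. Hyphens inside a name are left alone.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processor: runs on the raw query string before parsing.
final class ModifierStripper {
    // One or more '+' or '-' at the start of the string or right after whitespace.
    private static final Pattern LEADING_MODIFIER =
            Pattern.compile("(^|\\s)[+-]+");

    // Drops leading modifiers but keeps the whitespace that preceded them.
    static String stripModifiers(String query) {
        return LEADING_MODIFIER.matcher(query).replaceAll("$1");
    }

    public static void main(String[] args) {
        // The cleaned string is what would then be passed to the query parser.
        System.out.println(stripModifiers("+foo -bar WS-CA-PP00-PROCESS-YYMM"));
    }
}
```

Note that this only takes the modifiers out of the query string; it does nothing about how the analyzer splits hyphenated names at index time.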