The issue is that you are analyzing the search query but not the field at 
index time.  The StandardAnalyzer that you are using at search time 
lowercases the query before it is run against the index, so it cannot match 
the mixed-case terms kept by the NOT_ANALYZED field.  You have a few options 
that I can think of:

1 - use a different analyzer at search time (one that doesn't affect case; if 
there isn't one, create one yourself)
2 - analyze the field at index time (optionally storing the original value in a 
non-analyzed state, if you want the original Domain)
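
To make option 2 concrete, here is a rough sketch in plain Java. It is not 
Lucene code: the class DomainIndexSketch and its methods are made up for 
illustration, and a sorted map stands in for an index of untokenized terms. 
It assumes the query side always lowercases, as described above, and shows 
how lowercasing the indexed term while keeping the original value alongside 
it makes the two sides agree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for an index of untokenized terms (not the Lucene API). */
final class DomainIndexSketch {
    // Indexed term -> original values kept alongside it (the "stored" copy).
    private final TreeMap<String, List<String>> terms = new TreeMap<>();
    private final boolean lowercaseAtIndexTime;

    DomainIndexSketch(boolean lowercaseAtIndexTime) {
        this.lowercaseAtIndexTime = lowercaseAtIndexTime;
    }

    /** Indexes one domain; the original spelling is always retained. */
    void add(String domain) {
        String term = lowercaseAtIndexTime ? domain.toLowerCase(Locale.ROOT) : domain;
        terms.computeIfAbsent(term, k -> new ArrayList<>()).add(domain);
    }

    /** Prefix search. The prefix is always lowercased, mimicking the search side. */
    List<String> prefixSearch(String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        // Terms are sorted, so all terms starting with p are contiguous from p onward.
        for (Map.Entry<String, List<String>> e : terms.tailMap(p, true).entrySet()) {
            if (!e.getKey().startsWith(p)) {
                break;
            }
            hits.addAll(e.getValue());
        }
        return hits;
    }
}
```

With lowercaseAtIndexTime set to false, a search for www.BidC only finds 
domains that were already all lowercase.  With it set to true, every case 
variant is found, and the original spelling is still returned.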

The KeywordAnalyzer probably isn't what you want: if you use it at search 
time you won't be able to use wildcard searching, which only works for you 
if you don't need wildcards.


-Jeff


-----Original Message-----
From: Michel Nadeau [mailto:aka...@gmail.com]
Sent: Mon 12/14/2009 4:36 PM
To: java-user@lucene.apache.org
Subject: Lower/Uppercase problem when searching in a not-analyzed field
 
Hi !

My Lucene 3.0.0 index contains a field "DOMAIN" that contains an Internet
domain name - like

* www.DomainName.com
* www.domainname.com
* www.DomainName.com/path/to/document/doc.html?a=2

This field is indexed like this -

doc.add(new Field("DOMAIN", sValue, Field.Store.YES,
Field.Index.NOT_ANALYZED));

When I search in this field, my search query looks like this:

DOMAIN:www.DomainName*

My problem is that it seems it never returns domains with uppercase letters.

For example, I display all documents (using ConstantScoreQuery), and see
this domain name: www.BidClerk.com
...So I know it's there - and so I search for: DOMAIN:www.BidC* - well it
will *never* be found !

But any all-lowercase domain will be found, every time.

My guess is that the problem is the analyzer I'm using - a StandardAnalyzer:

QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content", new
StandardAnalyzer(Version.LUCENE_CURRENT));
q = parser.parse(QUERY);

So here are my questions:
* Should I use a KeywordAnalyzer instead?
* If I have domains like WWW.ASK.COM, www.ask.com, www.Ask.com, WwW.AsK.CoM
and I search for "DOMAIN:www.ask.com", will they all be found regardless of
case?

Thanks!

- Mike
aka...@gmail.com



