Rob, look at the third hit:
http://www.lucenebook.com/search?query=bi-grams
Otis
----- Original Message -----
From: Rob Young <[EMAIL PROTECTED]>
> That sounds like just what I'm looking for. Do you know if this is
> covered in Lucene in Action, or where I can find more information about it?
E
using overlapping n-grams. Searching the list archive may
give you some background if Lucene in Action doesn't have enough info on this
topic.
-----Original Message-----
From: Rob Young [mailto:[EMAIL PROTECTED]]
Sent: Thursday, May 11, 2006 11:39 AM
To: java-user@lucene.apache.org
Subject: Re:
That sounds like just what I'm looking for. Do you know if this is
covered in Lucene in Action, or where I can find more information about it?
Eric Isakson wrote:
You might consider using overlapping bi-gram tokenization with stripped out
whitespace and a PhraseQuery.
So your tokenized conten
----- Original Message -----
From: "Eric Isakson" <[EMAIL PROTECTED]>
To:
Sent: Thursday, May 11, 2006 3:54 PM
Subject: RE: Searching across spaces
You might consider using overlapping bi-gram tokenization with strippe
mean to do this too, but without knowing the
exact domain of compound words that you wish to support, this is probably the
best you will be able to do.
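As a rough illustration of the bi-gram suggestion above, here is a plain-Python sketch. It is not Lucene code: the helper names `bigrams` and `phrase_match` are made up for this example, and `phrase_match` only imitates what an exact (slop 0) PhraseQuery over the bi-gram tokens would do. The assumption is that both the indexed text and the query are lowercased and have all whitespace removed before being cut into overlapping two-character tokens.

```python
def bigrams(text):
    """Lowercase, drop all whitespace, and emit overlapping character bi-grams."""
    squeezed = "".join(text.lower().split())
    return [squeezed[i:i + 2] for i in range(len(squeezed) - 1)]


def phrase_match(doc_tokens, query_tokens):
    """Return True if query_tokens occur as one contiguous run in doc_tokens.

    This stands in for an exact phrase query: same tokens, same order,
    no gaps between them.
    """
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


doc = bigrams("spongebob squarepants")
print(phrase_match(doc, bigrams("sponge bob")))    # True
print(phrase_match(doc, bigrams("bob square")))    # True
print(phrase_match(doc, bigrams("sponge pants")))  # False
```

Because the whitespace is gone before tokenizing, "sponge bob" and "spongebob" produce the same token sequence, which is why no list of compound words is needed. The cost is a larger index and matches that ignore word boundaries entirely.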
-----Original Message-----
From: Robert Young [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 10, 2006 2:09 PM
To: java-user@lucene.apache.org
Yes, I looked at the synonym solution from Lucene in Action but, as
you point out, I have to know about it ahead of time. The only
solution I've had so far is to index the term without the spaces as
well and then run two searches, one with spaces and one without. It
would work but it just seems
I suspect you have to do some fancy indexing. That is, index the following
terms: sponge bob square pants spongebob squarepants.
But this requires that you understand all the variations you want to hit on
ahead of time.
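To make the "index the extra terms" idea concrete, here is a small Python sketch using a toy in-memory inverted index rather than Lucene. Everything in it is hypothetical: the `COMPOUNDS` table, and the `expand`, `build_index`, and `search` helpers exist only for this example. The table is exactly the "know the variations ahead of time" requirement mentioned above.

```python
# Hypothetical table of known compounds and their parts; this is the
# knowledge that has to exist before indexing.
COMPOUNDS = {
    "spongebob": ["sponge", "bob"],
    "squarepants": ["square", "pants"],
}


def expand(text):
    """Return the original tokens plus the parts of any known compound."""
    terms = []
    for token in text.lower().split():
        terms.append(token)
        terms.extend(COMPOUNDS.get(token, []))
    return terms


def build_index(docs):
    """Build a toy inverted index mapping term -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for term in expand(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every whitespace-separated query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


docs = {1: "SpongeBob SquarePants", 2: "kitchen sponge"}
index = build_index(docs)
print(search(index, "sponge bob"))  # {1}
print(search(index, "spongebob"))   # {1}
```

Document 1 is indexed under sponge, bob, square, pants, spongebob and squarepants, so either spelling of the query finds it. A compound missing from the table gets no extra terms and will not match the split form.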
Or, you could conceivably deal with wildcard queries, but I think this is
th
Hi,
How can I search across spaces in the document when the spaces aren't
present in the search? For example, if the document contains
"spongebob squarepants" but the user searches on "sponge bob", I would
like to get the result.
Thanks
Rob