Wildcard stuff gets fun and interesting. Have a look at
SpanRegexQuery in the contrib/regex codebase. It is a more
generalized version of WildcardQuery, but within the SpanQuery family.
Erik
On Apr 13, 2006, at 11:22 AM, Erick Erickson wrote:
More of the same....
So, now all I want to do is a SpanNearQuery on a wildcard. No problem
if I can use a wildcard query, but I can't because of the
"TooManyClauses" issue.
We have two types of wildcards. The first is truncation, which I can
handle by indexing successively shorter terms with a special character
at the end, e.g. cat$ ca$ c$. This is neat: it gives me relevance,
speed, all the virtues.
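A minimal sketch of the truncation scheme described above, in plain Java
with no Lucene calls. The class and method names are hypothetical, and
'$' as the marker character is taken from the example terms; it assumes
the marker never occurs in real terms and that the unmarked term is
indexed separately for exact matches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene.
final class TruncationTerms {
    // Marker appended to every indexed prefix (assumed absent from real terms).
    static final char MARKER = '$';

    // Index time: "cat" -> [cat$, ca$, c$], longest prefix first.
    static List<String> expand(String term) {
        List<String> out = new ArrayList<>();
        for (int end = term.length(); end >= 1; end--) {
            out.add(term.substring(0, end) + MARKER);
        }
        return out;
    }

    // Query time: a trailing-'*' query such as "ca*" becomes the single
    // term "ca$". Anything without a trailing '*' is returned unchanged.
    static String queryTerm(String userTerm) {
        if (!userTerm.endsWith("*")) {
            return userTerm;
        }
        String prefix = userTerm.substring(0, userTerm.length() - 1);
        if (prefix.isEmpty()) {
            throw new IllegalArgumentException("empty prefix");
        }
        return prefix + MARKER;
    }
}
```

Because "ca*" turns into the one term "ca$", it can go into an ordinary
term-based query without expanding into many clauses. The cost is a
larger index: one extra term per character of each original term.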
The second is patterns like a??b????c, for which I don't see any clever
indexing schemes, but let me think about that more at lunch <G>.
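For reference, a pattern such as a??b????c maps directly onto a regular
expression, which is what a regex-based query would need as input. The
sketch below is plain Java with a hypothetical class name. It assumes
'?' means exactly one character, '*' means any run of characters, and
that the resulting expression is matched against the whole term.

```java
// Hypothetical helper, not part of Lucene.
final class WildcardToRegex {
    // Translate a wildcard pattern into java.util.regex syntax.
    static String toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '?') {
                sb.append('.');                 // exactly one character
            } else if (c == '*') {
                sb.append(".*");                // zero or more characters
            } else if (Character.isLetterOrDigit(c)) {
                sb.append(c);                   // literal, no escaping needed
            } else {
                sb.append('\\').append(c);      // escape punctuation such as '.' or '$'
            }
        }
        return sb.toString();
    }
}
```

With this, toRegex("a??b????c") yields "a..b....c". Matching still has
to walk the term dictionary, so this avoids writing the enumeration by
hand rather than avoiding its cost.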
Without asking you to write *too* much for me, what should I look at in
order to make this work with SpanNear? I'm guessing that I have to
iterate the wildcard terms, collect a filter for documents "close
enough" and use that filtered query (again thanks for all your help
figuring this out) to restrict the returned documents.
What I have in mind is enumerating over the first wildcard term. For
each successive term, ask if other terms are "close enough" and use
that answer to possibly turn off that document in the filter.
Eventually I'll have a filter that is what I want.
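A rough sketch of the "close enough" test behind such a filter, in plain
Java. Nothing here is Lucene API: the class, the method names, and the
Map-of-positions input are hypothetical stand-ins for whatever the index
reports per document. It assumes each position array is sorted
ascending, and it sets bits for matching documents rather than clearing
bits for non-matching ones.

```java
import java.util.BitSet;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
final class ProximityFilterSketch {
    // True if some position in a is within maxDistance of some position in b.
    // Both arrays must be sorted ascending; runs in O(a.length + b.length).
    static boolean withinDistance(int[] a, int[] b, int maxDistance) {
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (Math.abs(a[i] - b[j]) <= maxDistance) {
                return true;
            }
            // Advance whichever side is behind; that is the only move
            // that can shrink the gap.
            if (a[i] < b[j]) i++; else j++;
        }
        return false;
    }

    // first/second map document number -> sorted positions of each term.
    static BitSet buildFilter(Map<Integer, int[]> first,
                              Map<Integer, int[]> second,
                              int maxDistance, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (Map.Entry<Integer, int[]> e : first.entrySet()) {
            int[] other = second.get(e.getKey());
            if (other != null && withinDistance(e.getValue(), other, maxDistance)) {
                bits.set(e.getKey());
            }
        }
        return bits;
    }
}
```

For several expanded wildcard terms, the bit sets built for each pair of
terms would be OR-ed together. The work grows with the number of term
pairs times the documents they share, which is where the expense comes
from.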
Now, this all seems expensive and complex. If the answer is "don't go
there, use clever indexing or don't try it with very many documents
because it'll take forever", then that in itself would save me a bunch
of work.
Thanks again
Erick