Wildcard stuff gets fun and interesting. Have a look at
SpanRegexQuery in the contrib/regex codebase. It is a more
generalized version of WildcardQuery, but within the SpanQuery family.
Erik
On Apr 13, 2006, at 11:22 AM, Erick Erickson wrote:
More of the same
So, now all I
OK, we've defined the problem away for the present. Which is probably a very
good thing. Scratch that, there's no doubt that it's a very good thing.
I'm still interested in anything you have to say, of course, but for now I'm
off and running
Thanks again for all your help
Erick
More of the same
So, now all I want to do is a SpanNearQuery on a wildcard. No problem if I
can use a wildcard query, but I can't because of the "TooManyClauses" issue.
We have two types of wildcards, truncation (which I can handle by indexing
successively shorter terms and a special characte
Chris:
Again, many thanks. Of course only *after* you mentioned that Hits is not
entirely efficient when looking at many docs did I remember TopDocs and
TopFieldDocs had been mentioned. Senior moments and all that
I completely missed the fact that ConstantScoreQuery doesn't take a query.
I'll
: Let's claim that all my clauses contain wildcards. What I *think* that means
: is that I can't very well use a filter "the normal way" since searchers
: require a query. And I don't want a query with a wildcard term.
the beauty of ConstantScoreQuery is that it can wrap any filter ... so you
can
I think I'm almost there. Thanks, y'all have been a great help. Here are, I
hope, my last questions and I'll be all done beating on this
Let's claim that all my clauses contain wildcards. What I *think* that means
is that I can't very well use a filter "the normal way" since searchers
require a
Chris:
Thanks for that exposition, that helps me greatly. I didn't mention that I
tried increasing the MaxAllowedClause count and ran out of memory. And that
I don't trust those kinds of tweaks anyway. They'll blow up sometime,
somewhere and I'll get a phone call because our product is offline an
: If I understand this right, I could build my own BooleanQuery in chunks of,
: say, 1,000 terms each by just adding words given me by the WildcardTermEnum,
: right?
if you took that approach, you would avoid getting a TooManyClauses
exception, but you could far more easily avoid it by increasein
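The chunking idea quoted above can be sketched in plain Java. This is only an illustration under stated assumptions: the `TermChunks` class and `chunk` method are hypothetical names, the term list stands in for whatever the term enumeration returns, and the step that turns each chunk into its own BooleanQuery is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits expanded wildcard terms
// into fixed-size groups so that no single query exceeds a clause limit.
class TermChunks {
    static List<List<String>> chunk(List<String> terms, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < terms.size(); start += size) {
            int end = Math.min(start + size, terms.size());
            // Copy the slice so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(terms.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("cat", "car", "cab", "can", "cap");
        // Chunk size 2 is only for the demo; the message suggests about 1,000.
        System.out.println(chunk(terms, 2));
    }
}
```

Each inner list would become the clauses of one query; how the partial results are combined is not covered here.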
Erik:
Thanks, that helps a lot. I won't waste any more time chasing CSRQ, which is
definitely a plus.
I have to admit that I was hoping for a "RTFM, page ### (Read The FIELD
Manual)" response. Although since I completely missed WildcardTermEnum
maybe I *did* get the response I hoped for. I have
Eric,
Wildcard queries are tricky business. WildcardQuery by itself
without leveraging any analysis tricks is what you've got, but you
may want to consider injecting rotated tokens. For example, the word
cat would be indexed as "cat$", "at$c", "t$ca", and "$cat" (all in
the same positio
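The rotation described above can be sketched in plain Java. This is a minimal illustration, not Lucene analysis code: `RotatedTokens`, `rotations`, and `prefixFor` are hypothetical names. The `prefixFor` helper shows one common way such rotations are used, namely rotating a single-wildcard pattern so the wildcard lands at the end and the lookup becomes a prefix match; that use is an assumption, since the message is cut off before it says so.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene.
class RotatedTokens {
    // End-of-word marker, matching the "$" used in the message.
    static final char END = '$';

    // Returns every rotation of term + END.
    static List<String> rotations(String term) {
        String marked = term + END;
        List<String> out = new ArrayList<>();
        for (int i = 0; i < marked.length(); i++) {
            out.add(marked.substring(i) + marked.substring(0, i));
        }
        return out;
    }

    // Assumption: a pattern with exactly one '*' is rotated so the '*'
    // falls at the end, leaving a plain prefix to match against rotations.
    static String prefixFor(String pattern) {
        int star = pattern.indexOf('*');
        if (star < 0 || star != pattern.lastIndexOf('*')) {
            throw new IllegalArgumentException("expected exactly one '*': " + pattern);
        }
        return pattern.substring(star + 1) + END + pattern.substring(0, star);
    }

    public static void main(String[] args) {
        System.out.println(rotations("cat"));   // [cat$, at$c, t$ca, $cat]
        System.out.println(prefixFor("c*t"));   // t$c
    }
}
```

With this scheme, the pattern `c*t` becomes the prefix `t$c`, which matches the rotation `t$ca` of "cat".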
OK, I know I'm asking you to write my code for me (or at least point me to
an example), but I'm at my wits' end, so please rescue me
This is a reprise of TooManyClauses. We have a large amount of text, and a
requirement to do a wildcard query. Of course, it's way too big to use
Wildcard or t
11 matches