Re: I just don't get wildcards at all.

2006-04-13 Thread Erik Hatcher
Wildcard stuff gets fun and interesting. Have a look at SpanRegexQuery in the contrib/regex codebase. It is a more generalized version of WildcardQuery, but within the SpanQuery family. Erik On Apr 13, 2006, at 11:22 AM, Erick Erickson wrote: More of the same So, now all I

Re: I just don't get wildcards at all.

2006-04-13 Thread Erick Erickson
OK, we've defined the problem away for the present, which is probably a very good thing. Scratch that, there's no doubt that it's a very good thing. I'm still interested in anything you have to say, of course, but for now I'm off and running. Thanks again for all your help. Erick

Re: I just don't get wildcards at all.

2006-04-13 Thread Erick Erickson
More of the same. So, now all I want to do is a SpanNearQuery on a wildcard. No problem if I can use a wildcard query, but I can't because of the "TooManyClauses" issue. We have two types of wildcards, truncation (which I can handle by indexing successively shorter terms and a special characte

Re: I just don't get wildcards at all.

2006-04-10 Thread Erick Erickson
Chris: Again, many thanks. Of course, only *after* you mentioned that Hits is not entirely efficient when looking at many docs did I remember that TopDocs and TopFieldDocs had been mentioned. Senior moments and all that. I completely missed the fact that ConstantScoreQuery doesn't take a query. I'll

Re: I just don't get wildcards at all.

2006-04-10 Thread Chris Hostetter
: Let's claim that all my clauses contain wildcards. What I *think* that means : is that I can't very well use a filter "the normal way" since searchers : require a query. And I don't want a query with a wildcard term. The beauty of ConstantScoreQuery is that it can wrap any filter ... so you can

Re: I just don't get wildcards at all.

2006-04-09 Thread Erick Erickson
I think I'm almost there. Thanks, y'all have been a great help. Here are, I hope, my last questions, and I'll be all done beating on this. Let's claim that all my clauses contain wildcards. What I *think* that means is that I can't very well use a filter "the normal way" since searchers require a

Re: I just don't get wildcards at all.

2006-04-08 Thread Erick Erickson
Chris: Thanks for that exposition, that helps me greatly. I didn't mention that I tried increasing the MaxAllowedClause count and ran out of memory. And that I don't trust those kinds of tweaks anyway. They'll blow up sometime, somewhere and I'll get a phone call because our product is offline an

Re: I just don't get wildcards at all.

2006-04-08 Thread Chris Hostetter
: If I understand this right, I could build my own BooleanQuery in chunks of, : say, 1,000 terms each by just adding words given me by the WildCardTermEnum, : right? if you took that approach, you would avoid getting a TooManyClauses exception, but you could far more easily avoid it by increasein
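The chunking idea in the question above can be sketched in plain Java. This is only a rough illustration of the partitioning step: it splits an already-enumerated list of matching terms into groups of at most 1,000, and deliberately uses no Lucene classes. The class name TermChunker and the method chunk are hypothetical, made up for this sketch; turning each group into its own BooleanQuery and combining the groups is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits a list of terms into
// consecutive groups so that no single group exceeds chunkSize entries.
class TermChunker {

    static <T> List<List<T>> chunk(List<T> terms, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < terms.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, terms.size());
            // Copy the slice so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(terms.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            terms.add("term" + i); // stand-in for terms from a term enumeration
        }
        for (List<String> group : chunk(terms, 1000)) {
            System.out.println("group of " + group.size() + " terms");
        }
    }
}
```

Each group would then presumably back one sub-query of at most 1,000 clauses, which is the arrangement the question describes.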

Re: I just don't get wildcards at all.

2006-04-08 Thread Erick Erickson
Erik: Thanks, that helps a lot. I won't waste any more time chasing CSRQ, which is definitely a plus. I have to admit that I was hoping for a "RTFM, page ### (Read The FIELD Manual)" response. Although since I completely missed WildCardTermEnum, maybe I *did* get the response I hoped for. I have

Re: I just don't get wildcards at all.

2006-04-08 Thread Erik Hatcher
Erick, Wildcard queries are tricky business. WildcardQuery by itself without leveraging any analysis tricks is what you've got, but you may want to consider injecting rotated tokens. For example, the word cat would be indexed as "cat$", "at$c", "t$ca", and "$cat" (all in the same positio
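The rotated-token example above can be reproduced with a few lines of plain Java. This is a minimal sketch, assuming "$" is the end-of-word marker as in the example; the class name RotatedTokens and the method rotations are hypothetical, and nothing here touches Lucene's analysis API (wiring the rotations into a token stream at the same position is not shown).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: generates every rotation of a
// term after appending an end-of-word marker.
class RotatedTokens {

    static final char MARKER = '$'; // marker taken from the example in the message

    static List<String> rotations(String term) {
        String marked = term + MARKER;
        List<String> result = new ArrayList<>();
        for (int i = 0; i < marked.length(); i++) {
            // Move the first i characters to the end of the string.
            result.add(marked.substring(i) + marked.substring(0, i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [cat$, at$c, t$ca, $cat]
        System.out.println(rotations("cat"));
    }
}
```

A term of n characters yields n + 1 tokens, so the index grows accordingly; that trade-off is worth weighing before adopting the approach.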

I just don't get wildcards at all.

2006-04-07 Thread Erick Erickson
OK, I know I'm asking you to write my code for me (or at least point me to an example), but I'm at my wits' end, so please rescue me. This is a reprise of TooManyClauses. We have a large amount of text, and a requirement to do a wildcard query. Of course, it's way too big to use Wildcard or t