Re: wildcard and span queries

2006-10-23 Thread Erick Erickson
I thought I'd update folks on the continuing saga. Many thanks to all who've contributed to my education. Here's our current resolution: it turns out that the PM will cope with restricting wildcards in two ways. 1) There must be at least 3 non-wildcard characters. 2) Wildcards cannot appear in the
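
The first restriction above (at least 3 non-wildcard characters) could be checked before a term ever reaches the query parser. The following is only a minimal sketch in plain Java, assuming `*` and `?` are the only wildcard characters; the class name `WildcardPolicy` and its methods are hypothetical and not part of Lucene. The second restriction is cut off in the archived preview, so it is not modeled here.

```java
public final class WildcardPolicy {
    // Assumption: the minimum of 3 comes from the message above.
    private static final int MIN_LITERAL_CHARS = 3;

    private WildcardPolicy() {}

    /** Counts the characters in the term that are neither '*' nor '?'. */
    public static int literalCount(String term) {
        int count = 0;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c != '*' && c != '?') {
                count++;
            }
        }
        return count;
    }

    /** Returns true if the term has at least 3 non-wildcard characters. */
    public static boolean isAllowed(String term) {
        return literalCount(term) >= MIN_LITERAL_CHARS;
    }
}
```

With this check, terms such as `eri*` or `jon?es` would pass, while `sm*` would be rejected before any term expansion takes place.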

Re: wildcard and span queries

2006-10-11 Thread Erick Erickson
Problem 3482: I'm probably close to being able to start work. Except... How to count hits with SrndQuery? Or, more generally, with arbitrary wildcards and boolean operators? So, say I've indexed a book by page. That is, each page is a document. I know a particular page matches my query because

Re: wildcard and span queries

2006-10-11 Thread Erik Hatcher
Erick - what about using getSpans() from the SpanQuery that is generated? That should give you what you're after I think. Erik On Oct 11, 2006, at 2:17 PM, Erick Erickson wrote: Problem 3482: I'm probably close to being able to start work. Except... How to count hits with

Re: wildcard and span queries

2006-10-11 Thread Paul Elschot
On Wednesday 11 October 2006 20:30, Erik Hatcher wrote: Erick - what about using getSpans() from the SpanQuery that is generated? That should give you what you're after I think. Erik You can also use skipTo(docNr) on the spans to skip to the docNr of the book that you're after. A

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson
OK, I'm using the surround code, and it seems to be working...with the following questions (always, more questions)... I'm sometimes getting a TooManyBasicQueries exception. I can control this by initializing BasicQueryFactory with a larger number. Do you have any cautions about upping this

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson
OK, forget the stuff about TooManyBooleanClauses. I finally figured out that if I specify the surround to have the same semantics as a SpanRegex (i.e., and(eri*, mal*)) it blows up with TooManyBooleanClauses. So that makes more sense to me now. Specifying 20w(eri*, mal*) is what I was using
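
One way to picture why a wildcard such as `eri*` can run into a clause or basic-query limit: the wildcard is expanded into every index term it matches, and the number of expanded terms is capped. The sketch below is a toy model in plain Java, not Lucene's or the surround package's code. The class `PrefixExpansion`, the `expand` method and the `maxTerms` parameter are hypothetical names, and an `IllegalStateException` stands in for the real exceptions mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.List;

public final class PrefixExpansion {
    private PrefixExpansion() {}

    /**
     * Collects the index terms that start with the given prefix.
     * Throws IllegalStateException as soon as more than maxTerms terms match,
     * which is the toy equivalent of hitting a clause limit.
     */
    public static List<String> expand(List<String> indexTerms, String prefix, int maxTerms) {
        List<String> matches = new ArrayList<>();
        for (String term : indexTerms) {
            if (term.startsWith(prefix)) {
                if (matches.size() == maxTerms) {
                    throw new IllegalStateException(
                        "prefix '" + prefix + "' expands to more than " + maxTerms + " terms");
                }
                matches.add(term);
            }
        }
        return matches;
    }
}
```

In this model, raising `maxTerms` makes the exception go away at the cost of a larger expanded query, which is roughly the trade-off being asked about when initializing BasicQueryFactory with a larger number.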

Re: wildcard and span queries

2006-10-09 Thread Paul Elschot
Erick, On Monday 09 October 2006 21:20, Erick Erickson wrote: OK, forget the stuff about TooManyBooleanClauses. I finally figured out that if I specify the surround to have the same semantics as a SpanRegex (i.e., and(eri*, mal*)) it blows up with TooManyBooleanClauses. So that makes more

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson
I've already started that conversation with the PM, I'm just trying to get a better idea of what's possible. I'll whimper tooth and nail to keep from having to do a lot of work to add a feature to a product that nobody in their right mind would ever use <G>. As far as the grammar, we don't

Re: wildcard and span queries

2006-10-09 Thread Doron Cohen
Erick Erickson wrote on 09/10/2006 13:09:21: ... The kicker is that what we are indexing is OCR data, some of which is pretty trashy. So you wind up with interesting words in your index, things like rtyHrS. So the whole question of allowing very specific queries on detailed

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson
Doron: Thanks for the suggestion, I'll certainly put it on my list, depending upon what the PM decides. This app is genealogy research, and users *can* put in their own wildcards... This is why I love this list... lots of smart people giving me suggestions I never would have thought of <G>...

Re: wildcard and span queries

2006-10-06 Thread Paul Elschot
On Friday 06 October 2006 14:37, Erick Erickson wrote: ... Fortunately, the PM agrees that it's silly to think about span queries involving OR or NOT for this app. So I'm left with something like Jo*n AND sm*th AND jon?es WITHIN 6. OR works much the same as term expansion for wildcards. The

Re: wildcard and span queries

2006-10-06 Thread Erick Erickson
Paul: Splendid! Now if I just understood a single thing about the SrndQuery family <G>. I followed your link, and took a look at the text file. That should give me enough to get started. But if you wanted to e-mail me any sample code or long explanations of what this all does, I would forever be

Re: wildcard and span queries

2006-10-06 Thread Paul Elschot
Erick, On Friday 06 October 2006 22:01, Erick Erickson wrote: Paul: Splendid! Now if I just understood a single thing about the SrndQuery family <G>. I followed your link, and took a look at the text file. That should give me enough to get started. But if you wanted to e-mail me any

Re: wildcard and span queries

2006-10-06 Thread Mark Miller
Paul's parser is beyond my feeble comprehension...but I would start by looking at SrndTruncQuery. It looks to me like this enumerates each possible match just like a SpanRegexQuery does...I am too lazy to figure out what the visitor pattern is doing so I don't know if they then get added to a

Re: wildcard and span queries

2006-10-06 Thread Paul Elschot
Mark, On Friday 06 October 2006 22:46, Mark Miller wrote: Paul's parser is beyond my feeble comprehension...but I would start by looking at SrndTruncQuery. It looks to me like this enumerates each possible match just like a SpanRegexQuery does...I am too lazy to figure out what the visitor