On Fri, Mar 2, 2012 at 6:22 PM, su ha <s_han...@yahoo.com> wrote:
> Hi,
> I'm new to Lucene. I've indexed some documents with Lucene and need to
> sanitize them to ensure that they do not contain any social security
> numbers (3 digits, 2 digits, 4 digits).
>
> (How) can I write a query (with the QueryParser) that searches for this
> pattern?
>
> E.g. I can do [000 to 999] or [00 to 99] or [0000 to 9999], but this causes
> hits with any 2-, 3- or 4-digit number.
> With something like "[000 to 999] [00 TO 99] [0000 TO 9999]", I get no hits at all.
>
> Is this possible with the default QueryParser?
> Or is there some other programmatic way to do it?

The programmatic way is to wrap each range query (TermRangeQuery in Lucene 3.x) in a SpanMultiTermQueryWrapper and then combine the three wrapped queries in a SpanNearQuery, with a slop of 0 and in-order matching so that the numbers must appear next to each other. The default QueryParser probably can't do it. I believe someone was enhancing it for wildcards, but I'm not sure whether range queries were included in that work.

TX
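
To make the matching rule concrete, here is a minimal plain-Java sketch of what an ordered, zero-slop span match over three term ranges checks. It is a model of the semantics only and does not use Lucene. The class and method names (SsnSpanSketch, TermRange, findOrderedAdjacent, ssnRanges) are hypothetical, and it assumes the analyzer emits the three digit groups as separate, consecutive tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of "range, range, range, adjacent and in order".
class SsnSpanSketch {

    /** Inclusive range that compares terms as strings, the way a term range query does. */
    static final class TermRange {
        final String lower;
        final String upper;

        TermRange(String lower, String upper) {
            this.lower = lower;
            this.upper = upper;
        }

        boolean matches(String term) {
            return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
        }
    }

    /** The three ranges from the question: 3 digits, 2 digits, 4 digits. */
    static List<TermRange> ssnRanges() {
        return Arrays.asList(
                new TermRange("000", "999"),
                new TermRange("00", "99"),
                new TermRange("0000", "9999"));
    }

    /**
     * Returns every token position where the ranges match consecutive tokens
     * in the given order (the analogue of slop 0 with in-order matching).
     */
    static List<Integer> findOrderedAdjacent(List<String> tokens, List<TermRange> ranges) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i + ranges.size() <= tokens.size(); i++) {
            boolean allMatch = true;
            for (int j = 0; j < ranges.size(); j++) {
                if (!ranges.get(j).matches(tokens.get(i + j))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                starts.add(i);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("ssn", "123", "45", "6789", "on", "file");
        System.out.println(findOrderedAdjacent(tokens, ssnRanges())); // prints [1]
    }
}
```

Assuming the Lucene 3.x constructors, the equivalent query would be built roughly as `new SpanNearQuery(clauses, 0, true)`, where each clause is a `new SpanMultiTermQueryWrapper<TermRangeQuery>(new TermRangeQuery(field, lower, upper, true, true))`. Check these signatures against the Javadoc of the release in use. Because the ranges compare terms as strings, they do not enforce length: the token sequence "12345", "6", "7" also matches, so hits are best treated as candidates and confirmed with a stricter check such as a digit-count test.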