Re: Wildcard Terms and total word or phrase count

2015-11-29 Thread Michael Wilkowski
.com] > Sent: Sunday, November 29, 2015 12:18 PM > To: java-user@lucene.apache.org > Subject: Re: Wildcard Terms and total word or phrase count > > You didn't post your code that creates the index. Make sure you are using a > tokenized TextField rather than a single-token Str

RE: Wildcard Terms and total word or phrase count

2015-11-29 Thread Kunzman, Douglas *
Sent: Sunday, November 29, 2015 12:18 PM To: java-user@lucene.apache.org Subject: Re: Wildcard Terms and total word or phrase count You didn't post your code that creates the index. Make sure you are using a tokenized TextField rather than a single-token StringField. -- Jack Krupansky On F

Re: Wildcard Terms and total word or phrase count

2015-11-29 Thread Jack Krupansky
You didn't post your code that creates the index. Make sure you are using a tokenized TextField rather than a single-token StringField. -- Jack Krupansky On Fri, Nov 27, 2015 at 4:06 PM, Kunzman, Douglas * < douglas.kunz...@fda.hhs.gov> wrote: > Hi - > > This is my first Lucene project, my other

RE: Wildcard Terms and total word or phrase count

2015-11-29 Thread Kunzman, Douglas *
-Original Message- From: Michael Wilkowski [mailto:m...@silenteight.com] Sent: Sunday, November 29, 2015 3:38 AM To: java-user@lucene.apache.org Subject: Re: Wildcard Terms and total word or phrase count It is because your index does not contain term quar* and this statistics function is not

Re: Wildcard Terms and total word or phrase count

2015-11-29 Thread Michael Wilkowski
It is because your index does not contain the term quar*, and this statistics function is not a query (you have to pass the exact form of the term). To count terms that meet search criteria you can run a search query with a custom collector and count the results. Or use a normal search query returning TopDocs and jus
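
To illustrate the point about exact terms, here is a minimal plain-Java sketch that does not use Lucene at all. It assumes a hypothetical in-memory term dictionary (a sorted map from term to total frequency) and shows why looking up the literal string "quar*" finds nothing, while walking every term that starts with "quar" gives a total. The class and method names are made up for this example.

```java
import java.util.TreeMap;

class PrefixTermCount {
    /** Exact lookup: the literal string "quar*" is not a term, so it counts as 0. */
    static long exactCount(TreeMap<String, Long> termFreqs, String term) {
        return termFreqs.getOrDefault(term, 0L);
    }

    /** Sums the frequencies of every term that starts with the given prefix. */
    static long prefixCount(TreeMap<String, Long> termFreqs, String prefix) {
        long total = 0;
        // Range scan over the sorted terms: [prefix, prefix + U+FFFF).
        for (long freq : termFreqs.subMap(prefix, true, prefix + Character.MAX_VALUE, false).values()) {
            total += freq;
        }
        return total;
    }

    public static void main(String[] args) {
        TreeMap<String, Long> termFreqs = new TreeMap<>();
        termFreqs.put("quarantine", 3L);
        termFreqs.put("quarter", 2L);
        termFreqs.put("queen", 5L);
        System.out.println(exactCount(termFreqs, "quar*")); // 0
        System.out.println(prefixCount(termFreqs, "quar")); // 5
    }
}
```

In a real index the same idea would mean enumerating the matching terms first and then summing their statistics, rather than passing the wildcard pattern to the statistics call.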

RE: Wildcard searches

2014-02-06 Thread Allison, Timothy B.
s.com] Sent: Thursday, February 06, 2014 5:19 PM To: java-user@lucene.apache.org Subject: RE: Wildcard searches Sorry, but I don't know what exactly you mean by compile from these locations. Do you mean I could download and customize the code? Regards, Raghu -Original Message- Fro

RE: Wildcard searches

2014-02-06 Thread raghavendra.k.rao
-user@lucene.apache.org Subject: RE: Wildcard searches Sorry, you're right. I'm not sure that it analyzes multiterm components, either. The Surround query parser also has similar limitations. Best bet might be to compile: https://issues.apache.org/jira/i#browse/LUCENE-5205 or https://issues.

RE: Wildcard searches

2014-02-06 Thread Allison, Timothy B.
--Original Message- From: raghavendra.k@barclays.com [mailto:raghavendra.k@barclays.com] Sent: Thursday, February 06, 2014 11:49 AM To: java-user@lucene.apache.org Subject: RE: Wildcard searches Thank you, Tim. I have read that ComplexPhraseQueryParser has issues while searching in more

RE: Wildcard searches

2014-02-06 Thread raghavendra.k.rao
ComplexPhraseQueryParser that you may be aware of? I am looking for some examples. Thanks! Regards, Raghu -Original Message- From: Allison, Timothy B. [mailto:talli...@mitre.org] Sent: Thursday, February 06, 2014 8:02 AM To: java-user@lucene.apache.org Subject: RE: Wildcard searches Ditto Jack on

RE: Wildcard searches

2014-02-06 Thread Allison, Timothy B.
Ditto Jack on ComplexPhraseQueryParser. See also: https://issues.apache.org/jira/i#browse/LUCENE-5205 -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Wednesday, February 05, 2014 6:59 PM To: java-user@lucene.apache.org Subject: Re: Wildcard searches Take

Re: Wildcard searches

2014-02-05 Thread Michael Sokolov
On 2/5/2014 6:30 PM, raghavendra.k@barclays.com wrote: Hi, Can Lucene support wildcard searches such as the ones shown below? Indexed value is "XYZ CORPORATION LIMITED". If you index the value as a single token (KeywordTokenizer), there is nothing really special about the examples you gav

Re: Wildcard searches

2014-02-05 Thread Jack Krupansky
Take a look at the complex phrase query parser. See: http://lucene.apache.org/core/4_6_0/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html See also: https://issues.apache.org/jira/browse/LUCENE-1486 -- Jack Krupansky -Original Message- From: raghave

Re: wildcard search not working on file paths

2013-10-14 Thread Ian Lea
You seem to be indexing paths delimited by backslash then saying a search for Samples/* doesn't match anything. No surprises there, if I've read your code correctly. Since you are creating wildcard queries directly from Terms, I don't think that Lucene escaping is relevant here. But the presence

Re: wildcard search not working on file paths

2013-10-14 Thread nischal reddy
Hi Ian, Please find a sample program below which better illustrates the scenario public class TestWriter { public static void main(String[] args) throws IOException { createIndex(); searchIndex(); } public static void createIndex() throws IOException { Di

Re: wildcard search not working on file paths

2013-10-14 Thread Ian Lea
Seems to me that it should work. I suggest you show us a complete self-contained example program that demonstrates the problem. -- Ian. On Mon, Oct 14, 2013 at 12:42 PM, nischal reddy wrote: > Hi Ian, > > Actually im able to do wildcard searches on all the fields except the > "filePath" field

Re: wildcard search not working on file paths

2013-10-14 Thread nischal reddy
Hi Ian, Actually I'm able to do wildcard searches on all the fields except the "filePath" field. I am able to do both leading and trailing wildcard searches on all the fields, but when I do a wildcard search on the filePath field it somehow does not work. An example file path would look something li

Re: wildcard search not working on file paths

2013-10-14 Thread Ian Lea
Do some googling on leading wildcards and read things like http://www.gossamer-threads.com/lists/lucene/java-user/175732 and pick an option you like. -- Ian. On Mon, Oct 14, 2013 at 9:12 AM, nischal reddy wrote: > Hi, > > I have problem with doing wild card search on file path fields. > > i ha

Re: Wildcard question

2013-10-09 Thread Jack Krupansky
You get to decide: class QueryParser extends QueryParserBase: /** * Set to true to allow leading wildcard characters. * * When set, * or ? are allowed as * the first character of a PrefixQuery and WildcardQuery. * Note that this can produce very slow * queries on big indexes. * * Default: fals

Re: Wildcard in PhraseQuery

2013-08-27 Thread mark harwood
See  http://lucene.apache.org/core/4_3_1/queryparser/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.html From: Ian Lea To: java-user@lucene.apache.org Sent: Tuesday, 27 August 2013, 10:16 Subject: Re: Wildcard in PhraseQuery See the

Re: Wildcard in PhraseQuery

2013-08-27 Thread Ian Lea
See the FAQ: http://wiki.apache.org/lucene-java/LuceneFAQ#Can_I_combine_wildcard_and_phrase_search.2C_e.g._.22foo_ba.2A.22.3F -- Ian. On Tue, Aug 27, 2013 at 5:11 AM, Chuming Chen wrote: > Hi All, > > Can I use wildcard in a phrase query in Lucene/Solr? Can anybody point me > some directions

Re: Wildcard in a text field

2013-02-08 Thread Steve Rowe
Hi Nicolas, For trailing '*' only: On the query side, you can use a front-side EdgeNGramTokenFilter with a large max gram size, followed by a PatternReplaceFilter with pattern "(.*)" and replacement "$1*". Steve On Feb 8, 2013, at 10:14 AM, Nicolas Roduit wrote: > For instance, I have a lis
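
The query-side idea above can be sketched in plain Java without any Lucene classes. This is only an imitation of what a front edge n-gram filter followed by a "(.*)" to "$1*" replacement would emit; the class and method names are hypothetical. For the query word "buddhist" it produces "b*", "bu*", "bud*", "budd*" and so on, so one of the generated terms equals an indexed tag such as "budd*".

```java
import java.util.ArrayList;
import java.util.List;

class EdgeGramWildcards {
    /** Front edge n-grams of the token, each with a literal '*' appended. */
    static List<String> edgeGramsWithStar(String token, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            out.add(token.substring(0, len) + "*");
        }
        return out;
    }

    public static void main(String[] args) {
        // One generated term per prefix length of the query word.
        System.out.println(edgeGramsWithStar("buddhist", 1, 32));
    }
}
```

This only works if the indexed tag keeps its trailing '*' as part of the term, which depends on the analyzer used at index time.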

Re: Wildcard in a text field

2013-02-08 Thread Jack Krupansky
From: Nicolas Roduit Sent: Friday, February 08, 2013 10:14 AM To: java-user@lucene.apache.org Subject: Re: Re: Wildcard in a text field For instance, I have a list of tags related to a text. Each text with its list of tags are put in a document and indexed by Lucene. If we consider that a tag

Re: Re: Wildcard in a text field

2013-02-08 Thread Ian Lea
So you want a query for "buddhist" to match the indexed term "budd*"? Or "bud*" or "bu*" or "b*"? Assuming you are using an analyzer that preserves the *, given "buddhist" you could search for exactly "b*", "bu*" etc. I'll let you think about ? rather than *. -- Ian. On Fri, Feb 8, 2013 at 3:14 P

Re: Re: Wildcard in a text field

2013-02-08 Thread Nicolas Roduit
For instance, I have a list of tags related to a text. Each text, with its list of tags, is put in a document and indexed by Lucene. Suppose a tag is "buddh*" and I would like to make a query (e.g. "buddha" or "buddhism" or "buddhist") and find the documents that contain "buddh*". T

Re: Wildcard in a text field

2013-02-08 Thread Jack Krupansky
That description is too vague. Could you provide a couple of examples of index text and queries and what you expect those queries to match. If you simply want to query for "*" and "?" in "string" fields, escape them with a backslash. But if you want to escape them in "text" fields, be sure to

Re: Wildcard Case Sensitivity

2011-01-20 Thread Jack Krupansky
Wildcards only work for a single term. At index time the underscore in TEST_TYPE is treated as if it were a space separator, producing two terms. At query time the existence of the wildcard suppresses ALL analysis of the term (although that behavior may vary between query parsers), so that the

RE: Wildcard searches????

2010-02-05 Thread Fuad Efendi
etter to post in SOLR; it is just configuration settings without hard coding...) > -Original Message- > From: Ted Dunning [mailto:ted.dunn...@gmail.com] > Sent: February-05-10 6:45 PM > To: gene...@lucene.apache.org > Cc: java-user@lucene.apache.org > Subje

RE: Wildcard searches????

2010-02-05 Thread Fuad Efendi
questions in solr-u...@lucene.apache.org... -Fuad > -Original Message- > From: Niclas Rothman [mailto:n...@lechill.com] > Sent: February-05-10 6:12 PM > To: gene...@lucene.apache.org > Cc: java-user@lucene.apache.org > Subject: RE: Wildcard searches > > Hi Fuad

RE: Wildcard searches????

2010-02-05 Thread Digy
http://en.wikipedia.org/wiki/Crossposting -Original Message- From: Niclas Rothman [mailto:n...@lechill.com] Sent: Saturday, February 06, 2010 12:12 AM To: gene...@lucene.apache.org Cc: java-user@lucene.apache.org Subject: RE: Wildcard searches Hi Fuad and thanks for your reply! The

RE: Wildcard searches????

2010-02-05 Thread Fuad Efendi
zed). -Fuad P.S. Don't use * when you create Lucene document; use it as part of query. > -Original Message- > From: Niclas Rothman [mailto:n...@lechill.com] > Sent: February-05-10 4:44 PM > To: gene...@lucene.apache.org > Cc: java-user@lucene.apache.org > Subj

RE: Wildcard searches and document boost

2009-12-20 Thread Uwe Schindler
That would be an option. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: TorAtle [mailto:toratle...@gmail.com] > Sent: Sunday, December 20, 2009 5:20 PM > To: java-user@lucene.apache.org &

Re: Wildcard searches and document boost

2009-12-20 Thread TorAtle
> You need to change multiTermRewriteMethod of QueryParser. > qp.setMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE); Thanks. So the normal way of doing this is setting the rewrite method to scoring, and if BooleanQuery.TooManyClauses is catched then switch back to constant sc

Re: Wildcard searches and document boost

2009-12-19 Thread AHMET ARSLAN
> For instance, when searching for "wash*" I want > "Washington" (the city) to > appear before "Washington Park", so I have boosted the > "Washington" > document. Unfortunately, when using WildcardQuery, the > score is always 1.0. > > Luke says my query has been rewritten to > "ConstantScore(myFie

RE: Wildcard searches and document boost

2009-12-19 Thread Uwe Schindler
WildcardQuery is a subclass of MultiTermQuery. Read the javadocs about MTQ; then you might understand what's going on and what you can do. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: TorAtle [mail

Re: Wildcard query

2009-08-31 Thread Anshum
e word and then apply /expand the > wild card query. Is this a bug with queryparser? OR Analyzer will not stem > the word if it finds a wild card operator in it? > > Regards > Ganesh > > > - Original Message - > From: "Anshum" > To: > Sent: Mon

Re: Wildcard query

2009-08-31 Thread Ganesh
parser? OR Analyzer will not stem the word if it finds a wild card operator in it? Regards Ganesh - Original Message - From: "Anshum" To: Sent: Monday, August 31, 2009 3:01 PM Subject: Re: Wildcard query > Hi Ganesh, > > Its the snowball analyzer that uses English

Re: Wildcard query

2009-08-31 Thread Anshum
Hi Ganesh, It's the Snowball analyzer, which uses the English stemmer, that is causing this behavior. When you search for 'attention', the query gets parsed to 'attent', whereas when you search for 'attenti' it stays as it is because the stemmer is not able to fit it anywhere. Could you tell me what

Re: Wildcard search fails

2009-08-13 Thread Erick Erickson
Several, all of which boil down to "what analyzers are you using during indexing and searching?". Without that information, we can't say much. Also, I'd recommend you get a copy of Luke and examine your index to see whether what's in there is what you expect. And query.toString and (as Grant says)

Re: Wildcard query ...

2008-10-13 Thread Chris Hostetter
BooleanQuery picks a Scorer based on the number of clauses and what their options are ... all of the scorers it might pick from are smart enough to continuously reorder the clauses having them "skip ahead" to the next document they match, beyond whatever docIds it already knows can't match (ba

Re: Wildcard and Literal Searches combined

2008-06-25 Thread Chris Hostetter
: My users require wildcard searches. Sometimes their search phrases contain : spaces. I am having trouble trying to implement a wildcard search on strings : containing spaces, so if the term includes spaces I force a literal search : by adding double quotes to the search term. : So the search str

Re: Wildcard and Literal Searches combined

2008-06-24 Thread Erick Erickson
Do you require that the words be right next to each other? You can, of course, set your default to AND (it's OR unless you change it explicitly). That'll give you documents that have both Dublin and City. You don't need wildcards at all in this case. If you require exact matches, you can use Phras

RE: Wildcard and Literal Searches combined

2008-06-24 Thread Jon Loken
: mick l [mailto:[EMAIL PROTECTED] Sent: 24 June 2008 14:22 To: java-user@lucene.apache.org Subject: RE: Wildcard and Literal Searches combined Cheers, Out of interest, how did you go about it in the end? jloken wrote: > > Hi, > > I posed a similar question on 09 May 08. > >

RE: Wildcard and Literal Searches combined

2008-06-24 Thread mick l
Cheers, Out of interest, how did you go about it in the end? jloken wrote: > > Hi, > > I posed a similar question on 09 May 08. > > The response was as below. I did not go down this route however, as a > wild carded 'exact' phrase is in a way contradictory. > Regards > Jon > > > > Hi, >

RE: Wildcard and Literal Searches combined

2008-06-24 Thread Jon Loken
Hi, I posed a similar question on 09 May 08. The response was as below. I did not go down this route however, as a wild carded 'exact' phrase is in a way contradictory. Regards Jon Hi, Here's a searchable mailing list archive: http://www.gossamer-threads.com/lists/lucene/java-user/ As re

RE: Wildcard & filters

2007-10-12 Thread Beard, Brian
Mark, That's working right out of the box. Thanks again. -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Friday, October 12, 2007 2:06 PM To: java-user@lucene.apache.org Subject: Re: Wildcard & filters No probwas a bit hasty though: replace = n

Re: Wildcard & filters

2007-10-12 Thread Mark Miller
Subject: Re: Wildcard & filters Something along these lines: public class WildcardFilter extends Filter { private Term term; public WildcardFilter(Term term) { this.term = term; } @Override public BitSet bits(IndexReader reader) throws IOException { BitSet

RE: Wildcard & filters

2007-10-12 Thread Beard, Brian
Mark, Thanks so much. -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Friday, October 12, 2007 1:54 PM To: java-user@lucene.apache.org Subject: Re: Wildcard & filters Something along these lines: public class WildcardFilter extends Filter { private Term

Re: Wildcard & filters

2007-10-12 Thread Mark Miller
Something along these lines: public class WildcardFilter extends Filter { private Term term; public WildcardFilter(Term term) { this.term = term; } @Override public BitSet bits(IndexReader reader) throws IOException { BitSet bits = new BitSet(); WildcardTermE

Re: Wildcard vs Term query

2007-09-26 Thread Erik Hatcher
WildcardQuery won't be slower than TermQuery if there are no wildcard characters. Beyond what QueryParser does, WildcardQuery itself reverts to a TermQuery: public Query rewrite(IndexReader reader) throws IOException { if (this.termContainsWildcard) { return super.rewrite(r

Re: Wildcard vs Term query

2007-09-26 Thread John Byrne
I'm not using the QueryParser at all. I need to do a little more with the terms, so I'm explicitly creating a Query from a single term. What I was hoping was to avoid something like this: ... if (term.contains("*") || term.contains("?")) { return new WildcardQuery(... } else { return new T
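
The branching described in this message can be isolated into a small helper. The following is a plain-Java sketch with no Lucene dependency: the enum and method names are hypothetical, and the caller would map each result to the query class it wants to build.

```java
class QueryKindChooser {
    enum Kind { TERM, WILDCARD }

    /** WILDCARD if the text contains '*' or '?', otherwise TERM. */
    static Kind choose(String text) {
        boolean hasWildcard = text.indexOf('*') >= 0 || text.indexOf('?') >= 0;
        return hasWildcard ? Kind.WILDCARD : Kind.TERM;
    }

    public static void main(String[] args) {
        System.out.println(choose("smith")); // TERM
        System.out.println(choose("sm*th")); // WILDCARD
    }
}
```

Note that this sketch treats every '*' or '?' as a wildcard, including ones the user may have meant literally; handling escapes is left out.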

Re: Wildcard vs Term query

2007-09-26 Thread mark harwood
Are you using the out of the box Lucene QueryParser? It will automatically detect wildcard queries by the presence of * or ? chars. If the user input does not contain these characters a plain TermQuery is used. BooleanQuery.setMaxClauseCount can be used to control the upper limit on terms produ

Re: Wildcard queries don't work on untokenized fields (Lucene 2.2.0)

2007-07-12 Thread Mark Miller
On the QueryParser: /** * Whether terms of wildcard, prefix, fuzzy and range queries are to be automatically * lower-cased or not. Default is true. */ public void setLowercaseExpandedTerms(boolean lowercaseExpandedTerms) { this.lowercaseExpandedTerms = lowercaseExpandedTerms; } -

Re: Wildcard query with untokenized punctuation (again)

2007-06-15 Thread Erick Erickson
he query <> is parsed to > PhraseQuery("smith ann"). > And that seems right, from a user standpoint. > > In fact, considering this, I realize <> should be parsed > to MultiPhraseQuery("smith", "ann*"), not <<+smith +ann*>> as I sa

RE: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Renaud Waldura
his issue: how to get QueryParser to generate MultiPhraseQueries. Got some good ideas from it, but unfortunately no complete solution. I'll keep on hacking. --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Thursday, June 14, 2007 12:07 PM To: java-user@

Re: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Mark Miller
", "ann*"), not <<+smith +ann*>> as I said earlier. B. Getting hairy. Any hope? --Renaud -Original Message- From: Mark Miller [mailto:[EMAIL PROTECTED] Sent: Thursday, June 14, 2007 6:43 AM To: java-user@lucene.apache.org Subject: Re: Wildcard query with unt

RE: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Renaud Waldura
r [mailto:[EMAIL PROTECTED] Sent: Thursday, June 14, 2007 6:43 AM To: java-user@lucene.apache.org Subject: Re: Wildcard query with untokenized punctuation (again) Gotto agree with Erick here...best idea is just to preprocess the query before sending it to the QueryParser. My first thought i

Re: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Mark Miller
Got to agree with Erick here... the best idea is just to preprocess the query before sending it to the QueryParser. My first thought is always to get out the sledgehammer... - Mark Erick Erickson wrote: Well, perhaps the simplest thing would be to pre-process the query and make the comma into a whi

Re: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Mathieu Lecarme
If you don't use the same tokenizer for indexing and searching, you will have trouble like this. Mixing exact match (with ") and wildcard (*) is a strange idea. Typographical rules say that you have a space after a comma, no? Is your field tokenized? M. Renaud Waldura wrote: > My very simple

Re: Wildcard query with untokenized punctuation (again)

2007-06-14 Thread Erick Erickson
Well, perhaps the simplest thing would be to pre-process the query and make the comma into a whitespace before sending anything to the query parser. I don't know how generalizable that sort of solution is in your problem space though Best Erick On 6/13/07, Renaud Waldura <[EMAIL PROTECTED]>
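
A minimal sketch of the pre-processing step suggested here, in plain Java. It simply turns commas into spaces and collapses repeated whitespace before the string would be handed to a query parser; the class and method names are hypothetical.

```java
class CommaPreprocessor {
    /** Replaces commas with spaces, then collapses runs of whitespace. */
    static String commasToSpaces(String rawQuery) {
        return rawQuery.replace(',', ' ').trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(commasToSpaces("smith,ann*")); // smith ann*
    }
}
```

As the message itself cautions, this is a blunt tool: it also rewrites commas inside quoted phrases, which may or may not be acceptable for a given application.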

Re: Wildcard query with untokenized punctuation (again)

2007-06-13 Thread Mark Miller
After taking a quick look, I don't see how you can do this without modifying the QueryParser. In QueryParser.jj you will find the conflict of interest at line 891. This line will cause a match on smith,ann* and trigger a wildcard term match on the whole piece. This is again caused by the fact

Re: Wildcard searches with * or ? as the first character - Thanks

2007-03-14 Thread Oystein Reigem
Thanks Steven and Antony. I read the FAQ not very long ago, but that slipped my attention. Or perhaps it's a recent change. - Øystein - -- Øystein Reigem, The department of culture, language and information technology (Aksis), Allegt 27, N-5007 Bergen, Norway. Tel: +47 55 58 32 42. Fax: +47

Re: Wildcard searches with * or ? as the first character

2007-03-13 Thread Antony Bowesman
I have read that with Lucene it is not possible to do wildcard searches with * or ? as the first character. Wildcard searches with * as the Lucene supports it. If you are using QueryParser to parse your queries see http://lucene.apache.org/java/docs/api/org/apache/lucene/queryParser/QueryPars

RE: Wildcard searches with * or ? as the first character

2007-03-13 Thread Steven Parkes
It's possible to do leading wildcard searches in Lucene as of 2.1. See http://wiki.apache.org/lucene-java/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695 (http://tinyurl.com/366suf) -Original Message- From: Oystein Reigem [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 13, 2007 11

RE: Wildcard query with untokenized punctuation

2007-03-12 Thread Chris Hostetter
: You're entirely correct about the analyzer (I'm using one that breaks on : non-alphanumeric characters, so all punctuation is ignored). To be : honest, I hadn't thought about altering this, but I guess I could; just : reticent that there might be unforeseen consequences. this is where the PerF

RE: Wildcard query with untokenized punctuation

2007-03-10 Thread Doron Cohen
reasoning behind not analyzing wildcard queries is also explained in the FAQ: "Are Wildcard, Prefix, and Fuzzy queries case sensitive?" Regards, Doron > > --Colin McGuigan > > -Original Message- > From: Doron Cohen [mailto:[EMAIL PROTECTED] > Sent: Saturday, Mar

RE: Wildcard query with untokenized punctuation

2007-03-10 Thread McGuigan, Colin
arch 10, 2007 2:08 AM To: java-user@lucene.apache.org Subject: Re: Wildcard query with untokenized punctuation Hi Colin, Is it possible that you are using an analyzer that breaks words on non letters? For instance SimpleAnalyzer? if so, the doc text: pagefile.sys is indexed as two words: pagefile

Re: Wildcard query with untokenized punctuation

2007-03-10 Thread Doron Cohen
Hi Colin, Is it possible that you are using an analyzer that breaks words on non letters? For instance SimpleAnalyzer? if so, the doc text: pagefile.sys is indexed as two words: pagefile sys At search time, the query text: pagefile.sys is also parsed-tokenized into a two words query: prof
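
The effect described here can be reproduced with a small stand-in tokenizer written in plain Java. It is only an imitation of an analyzer that splits on non-letters and lowercases, not Lucene's own code, and the names are hypothetical. It shows that "pagefile.sys" ends up as the two terms "pagefile" and "sys", so no single indexed term contains the dot.

```java
import java.util.ArrayList;
import java.util.List;

class LetterSplit {
    /** Splits on any non-letter character and lowercases each token. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("pagefile.sys")); // [pagefile, sys]
    }
}
```

A wildcard term such as "pagefile.s*" is compared against individual indexed terms, so with this kind of tokenization there is nothing for it to match.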

RE: Wildcard query with untokenized punctuation

2007-03-09 Thread McGuigan, Colin
-Original Message- From: Steffen Heinrich [mailto:[EMAIL PROTECTED] Sent: Fri 3/9/2007 4:31 PM To: java-user@lucene.apache.org Subject: Re: Wildcard query with untokenized punctuation On 9 Mar 2007 at 15:10, McGuigan, Colin wrote: >> I have a "filename" field in Luc

Re: Wildcard query with untokenized punctuation

2007-03-09 Thread Steffen Heinrich
On 9 Mar 2007 at 15:10, McGuigan, Colin wrote: > I have a "filename" field in Lucene that holds a value, like this: > pagefile.sys > Hi Colin, I'm still _very_ new to lucene, but isn't that what the un-tokenized indexing is for? Like in 1.9.1 doc.add(Field.Keyword("filename", "pagefile.sys"));

Re: wildcard and span queries

2006-10-23 Thread Erick Erickson
I thought I'd update folks on the continuing saga. Many thanks to all who've contributed to my education. Here's our current resolution; It turns out that the PM will cope with restricting wildcards two ways. 1> there must be at least 3 non-wildcard characters 2> wildcards cannot appear in the fi
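
The first restriction above (at least 3 non-wildcard characters) is easy to check before a query is built. Here is a plain-Java sketch with hypothetical names. The second restriction is cut off in this excerpt, so it is deliberately not implemented here.

```java
class WildcardGuard {
    /** Counts the characters in the term that are neither '*' nor '?'. */
    static int literalCount(String term) {
        int count = 0;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c != '*' && c != '?') {
                count++;
            }
        }
        return count;
    }

    /** True if the term has at least minLiterals non-wildcard characters. */
    static boolean hasEnoughLiterals(String term, int minLiterals) {
        return literalCount(term) >= minLiterals;
    }

    public static void main(String[] args) {
        System.out.println(hasEnoughLiterals("eri*", 3)); // true
        System.out.println(hasEnoughLiterals("jo*", 3));  // false
    }
}
```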

RE: Wildcard Search and "Note: You cannot use a * or ? symbol as the first character of a search"

2006-10-20 Thread Steven Parkes
ospodnetic [mailto:[EMAIL PROTECTED] Sent: Friday, October 20, 2006 11:00 AM To: java-user@lucene.apache.org Subject: Re: Wildcard Search and "Note: You cannot use a * or ? symbol as the first character of a search" Larry, the patch for that is already in JIRA, the issue-tracking

RE: Wildcard Search and "Note: You cannot use a * or ? symbol as the first character of a search"

2006-10-20 Thread Vladimir Olenin
Don't know Lucene internals, but I'd say you'd have to create your own 'reverse' B-Tree of some kind (Lucene gurus will probably advise you on the place where this can be changed in the Lucene). Even if this functionality can't be redefined in Lucene itself, you can easily implement it by yourself

Re: Wildcard Search and "Note: You cannot use a * or ? symbol as the first character of a search"

2006-10-20 Thread Otis Gospodnetic
Larry, the patch for that is already in JIRA, the issue-tracking system, and might be committed soon. Otis - Original Message From: "Fillion, Larry" <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, October 20, 2006 1:58:04 PM Subject: Wildcard Search and "Note: You cann

Re: wildcard and span queries

2006-10-11 Thread Paul Elschot
On Wednesday 11 October 2006 20:30, Erik Hatcher wrote: > Erick - what about using getSpans() from the SpanQuery that is > generated? That should give you what you're after I think. > > Erik You can also use skipTo(docNr) on the spans to skip to the docNr of the book that you're after.

Re: wildcard and span queries

2006-10-11 Thread Erik Hatcher
Erick - what about using getSpans() from the SpanQuery that is generated? That should give you what you're after I think. Erik On Oct 11, 2006, at 2:17 PM, Erick Erickson wrote: Problem 3482: I'm probably close to being able to start work. Except... How to count hits with SrndQu

Re: wildcard and span queries

2006-10-11 Thread Erick Erickson
Problem 3482: I'm probably close to being able to start work. Except... How to count hits with SrndQuery? Or, more generally, with arbitrary wildcards and boolean operators? So, say I've indexed a book by page. That is, each page is a document. I know a particular page matches my query because

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson
Doron: Thanks for the suggestion, I'll certainly put it on my list, depending upon what the PM decides. This app is genealogy research, and users *can* put in their own wildcards... This is why I love this list... lots of smart people giving me suggestions I never would have thought of ... Th

Re: wildcard and span queries

2006-10-09 Thread Doron Cohen
"Erick Erickson" <[EMAIL PROTECTED]> wrote on 09/10/2006 13:09:21: > ... The kicker is that what we are indexing is > OCR data, some of which is pretty trashy. So you wind up with "interesting" > words in your index, things like rtyHrS. So the whole question of allowing > very specific queries on d

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson
I've already started that conversation with the PM, I'm just trying to get a better idea of what's possible. I'll whimper tooth and nail to keep from having to do a lot of work to add a feature to a product that nobody in their right mind would ever use . As far as the grammar, we don't actually

Re: wildcard and span queries

2006-10-09 Thread Paul Elschot
Erick, On Monday 09 October 2006 21:20, Erick Erickson wrote: > OK, forget the stuff about "TooManyBooleanClauses". I finally figured out > that if I specify the surround to have the same semantics as a SpanRegex ( > i.e, and(eri*, mal*)) it blows up with TooManyBooleanClauses. So that makes > mor

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson
OK, forget the stuff about "TooManyBooleanClauses". I finally figured out that if I specify the surround to have the same semantics as a SpanRegex ( i.e, and(eri*, mal*)) it blows up with TooManyBooleanClauses. So that makes more sense to me now. Specifying 20w(eri*, mal*) is what I was using bef

Re: wildcard and span queries

2006-10-09 Thread Erick Erickson
OK, I'm using the surround code, and it seems to be working...with the following questions (always, more questions)... I'm getting an exception sometimes of TooManyBasicQueries. I can control this by initializing BasicQueryFactory with a larger number. Do you have any cautions about upping this

Re: wildcard and span queries

2006-10-06 Thread Paul Elschot
Mark, On Friday 06 October 2006 22:46, Mark Miller wrote: > Paul's parser is beyond my feeble comprehension...but I would start by > looking at SrndTruncQuery. It looks to me like this enumerates each > possible match just like a SpanRegexQuery does...I am too lazy to figure > out what the visi

Re: wildcard and span queries

2006-10-06 Thread Mark Miller
Paul's parser is beyond my feeble comprehension...but I would start by looking at SrndTruncQuery. It looks to me like this enumerates each possible match just like a SpanRegexQuery does...I am too lazy to figure out what the visitor pattern is doing so I don't know if they then get added to a b

Re: wildcard and span queries

2006-10-06 Thread Paul Elschot
Erick, On Friday 06 October 2006 22:01, Erick Erickson wrote: > Paul: > > Splendid! Now if I just understood a single thing about the SrndQuery family > . > > I followed your link, and took a look at the text file. That should give me > enough to get started. > > But if you wanted to e-mail me

Re: wildcard and span queries

2006-10-06 Thread Erick Erickson
Paul: Splendid! Now if I just understood a single thing about the SrndQuery family . I followed your link, and took a look at the text file. That should give me enough to get started. But if you wanted to e-mail me any sample code or long explanations of what this all does, I would forever be y

Re: wildcard and span queries

2006-10-06 Thread Paul Elschot
On Friday 06 October 2006 14:37, Erick Erickson wrote: ... > Fortunately, the PM agrees that it's silly to think about span queries > involving OR or NOT for this app. So I'm left with something like Jo*n AND > sm*th AND jon?es WITHIN 6. OR works much the same as term expansion for wildcards. > T

Re: Wildcard with Words Ending in 'e'

2006-05-04 Thread Chris Hostetter
: not very useful when combined. I'm going to investigate other means of : taking into account the wildcard queries and handling them in a : different way. Rest assured that if I find a solution I'll post it to : the user group. The most straightforward solution is to have two indexed files -- o

Re: Wildcard with Words Ending in 'e'

2006-05-04 Thread Gordon Rogers
Chris, your shot in the dark was correct: I am using Porter stemming. Thanks to your excellent response I now understand why wildcards and stemming are not very useful when combined. I'm going to investigate other means of taking into account the wildcard queries and handling them in a different

Re: Wildcard with Words Ending in 'e'

2006-05-03 Thread Chris Hostetter
I'm going to take a shot in the dark and guess that the field in question is analyzed using some form of Stemming which strips off trailing "e" characters (Porter Stemmer seems to do this). In the past, there was a bug in the way WildcardQuery works, so that the metacharacter "?" matches 0 or 1 c

Re: wildcard support

2006-03-21 Thread Erik Hatcher
If you've rewritten a WildcardQuery and get an empty query, that means that no terms matched the wildcard expression in the index pointed to by the IndexReader you provided. Erik On Mar 21, 2006, at 5:08 AM, Raghavendra Prabhu wrote: Hi I am using the highlightertest.java to extr

Re: wildcard search constraints?

2006-02-17 Thread Chris Hostetter
: I have some strange behaviors: : : (1) The field query slug:abc*hurr2* works only if the field type is : "Keyword". The query fails if the type is "Text". : : (2) On the other hand, query slug:abc-nws-hurr29 only works if the field : type is "Text" and fails if the type is "Keyword" I think the

Re: Wildcard and Fuzzy queries - no best fragments generated - ??

2005-12-27 Thread Erik Hatcher
d wise, to always call Query.rewrite before highlighting. No worries or pitfalls that I know of. Erik Thanks again, - Dmitry From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tue 12/27/2005 12:13 PM To: java-user@lucene.apache.org Subject: Re: Wi

RE: Wildcard and Fuzzy queries - no best fragments generated - ??

2005-12-27 Thread Dmitry Goldenberg
ik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tue 12/27/2005 12:13 PM To: java-user@lucene.apache.org Subject: Re: Wildcard and Fuzzy queries - no best fragments generated - ?? On Dec 27, 2005, at 2:34 PM, Dmitry Goldenberg wrote: > What do you mean by _rewriting_ the query? I checked all the >

Re: Wildcard and Fuzzy queries - no best fragments generated - ??

2005-12-27 Thread Erik Hatcher
a "..." String result = highlighter.getBestFragments(tokenStream, text, 3, "..."); System.out.println(result); } From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tue 12/27/2005 11:03 AM To: java-user@lucene.apache.org Subjec

RE: Wildcard and Fuzzy queries - no best fragments generated - ??

2005-12-27 Thread Dmitry Goldenberg
EMAIL PROTECTED] Sent: Tue 12/27/2005 11:03 AM To: java-user@lucene.apache.org Subject: Re: Wildcard and Fuzzy queries - no best fragments generated - ?? You have to _rewrite_ the Query for this to work. This, I believe, is mentioned in the javadocs. I think you are hijacking a thread with your r

Re: Wildcard and Fuzzy queries - no best fragments generated - ??

2005-12-27 Thread Erik Hatcher
You have to _rewrite_ the Query for this to work. This, I believe, is mentioned in the javadocs. I think you are hijacking a thread with your recent postings. Please create a new message rather than reply to one and change the subject. Thanks. Erik On Dec 27, 2005, at 1:55 PM

Re: Wildcard

2005-12-03 Thread Erik Hatcher
On Dec 2, 2005, at 6:21 PM, John Powers wrote: Hello, Lucene only lets you use a wildcard after a term, not before, correct? What work arounds are there for that? If I have an item 108585-123 And another 332323-123 How can I look for all the -123 family of items? To clarify something that no

RE: Wildcard

2005-12-02 Thread Pasha Bizhan
Hi, > From: John Powers [mailto:[EMAIL PROTECTED] > Lucene only lets you use a wildcard after a term, not before, correct? > What work arounds are there for that? RegexQuery? http://svn.apache.org/repos/asf/lucene/java/trunk/src/java/org/apache/lucene /search/regex/ Also: http://www.mail-arch

Re: Wildcard

2005-12-02 Thread Marc Hadfield
The standard way to do this is to additionally index the reverse of all strings/tokens, potentially in a different field "reverse:", ie index forward:abcd as well as reverse:dcba. Then in queries of the form "*cd", reverse the query to "dc*" so that you end up with "reverse:dc*" in your
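
The reversed-token trick described here can be sketched in plain Java without Lucene. The helper names are hypothetical. The idea is that a leading-wildcard pattern such as "*cd" becomes the prefix pattern "dc*", which is then matched against the reversed copy of each token. The item numbers in the test come from the original question in this thread.

```java
class ReversedField {
    /** Reverses a token, e.g. "abcd" becomes "dcba". */
    static String reverse(String token) {
        return new StringBuilder(token).reverse().toString();
    }

    /** Turns a leading-wildcard pattern like "*cd" into the prefix pattern "dc*". */
    static String toReversedPrefixQuery(String leadingWildcard) {
        if (leadingWildcard.length() < 2 || !leadingWildcard.startsWith("*")) {
            throw new IllegalArgumentException("expected a pattern of the form *suffix");
        }
        return reverse(leadingWildcard.substring(1)) + "*";
    }

    /** Checks a token against a "*suffix" pattern via its reversed form. */
    static boolean matches(String indexedToken, String leadingWildcard) {
        String query = toReversedPrefixQuery(leadingWildcard);
        String prefix = query.substring(0, query.length() - 1);
        return reverse(indexedToken).startsWith(prefix);
    }

    public static void main(String[] args) {
        System.out.println(toReversedPrefixQuery("*cd"));   // dc*
        System.out.println(matches("108585-123", "*-123")); // true
    }
}
```

The cost of this design is a second copy of every token in the index; the benefit is that a suffix search becomes an ordinary prefix search.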
