Re: WildCardQuery
Robinson Raju writes:
> The way I have done it is: if there is a wildcard, use WildcardQuery,
> otherwise use the parser. Here searchFields is an array which contains
> the column names, and searchString is the value to be searched.
>
>     if ((searchString.indexOf(IOOSConstants.ASTERISK) > -1)
>             || (searchString.indexOf(IOOSConstants.QUESTION_MARK) > -1)) {
>         WildcardQuery wQuery = new WildcardQuery(new Term(searchFields[0], searchString));
>         booleanQuery.add(wQuery, true, false);
>         if (searchFields.length > 1) {
>             WildcardQuery wQuery2 = new WildcardQuery(new Term(searchFields[1], searchString));
>             booleanQuery.add(wQuery2, true, false);
>         }
>     } else {
>         Query query = MultiFieldQueryParser.parse(searchString, searchFields, flags, analyzer);
>         booleanQuery.add(query, true, false);
>     }
>     Query queryfilter = MultiFieldQueryParser.parse(filterString, filterFields, flags, analyzer);
>     QueryFilter queryFilter = new QueryFilter(queryfilter);
>     hits = parallelMultiSearcher.search(booleanQuery, queryFilter);
>
> In the meanwhile, I thought I would tokenize the string on spaces if the
> input contains spaces and then add the tokens one by one to booleanQuery.
> But this gave a StringIndexOutOfBoundsException, so I am still trying...
> Thanks for your help. I would greatly appreciate it if you could give me
> more pointers.

Did you look at the output of query.toString(defaultfield)? That's usually
the best way to see if a constructed query is what you expect it to be.

Why isn't creating wildcard queries left to the query parser?

Morus

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: WildCardQuery
On Fri, 1 Oct 2004, Robinson Raju wrote:

> analyzer is StandardAnalyzer. I use MultiFieldQueryParser to parse.
> The flow is this: I have indexed a database view. Now I need to search
> against a few columns. I take in the search criteria and search field,
> construct a wildcard query and add it to a boolean query:
>
>     WildcardQuery wQuery = new WildcardQuery(new Term(searchFields[0], searchString));

What is the value of searchString? Is it a word? QueryParser syntax is
not applied here. What does ab* return?

>     booleanQuery.add(wQuery, true, false);
>     Query queryfilter = MultiFieldQueryParser.parse(filterString, filterFields, flags, analyzer);
>     hits = parallelMultiSearcher.search(booleanQuery, queryFilter);
>
> When I don't use wildcards, it is taken as
>
>     +((ITM_SHRT_DSC:natal ITM_SHRT_DSC:tylenol) (ITM_LONG_DSC:natal ITM_LONG_DSC:tylenol))
>
> But when a wildcard is used, it is taken as
>
>     +ITM_SHRT_DSC:nat* tylenol +ITM_LONG_DSC:nat* Tylenol

Are the ITM_XXX fields tokenized?

sv

> The first returns around 300 records, the second 0.
> Any help would be appreciated.
>
> Thanks
> Robin
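A minimal sketch of what the questions above are getting at, assuming the fields are tokenized into single words: a pattern such as "nat* tylenol" handed to a single Term contains a space, so no single indexed word can match it. Splitting the input on whitespace first gives one pattern per word. The class and method names are hypothetical and no Lucene calls are used; in a real application each piece would presumably become its own WildcardQuery or TermQuery clause.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only (hypothetical names): one pattern per whitespace-separated word. */
final class PerTokenWildcardSketch {

    /** Splits user input on whitespace, skipping empty pieces, so each word becomes its own pattern. */
    static List<String> splitPatterns(String searchString) {
        List<String> patterns = new ArrayList<>();
        for (String piece : searchString.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                patterns.add(piece);
            }
        }
        return patterns;
    }

    /** True if the pattern contains one of the wildcard characters discussed in the thread. */
    static boolean hasWildcard(String pattern) {
        return pattern.indexOf('*') > -1 || pattern.indexOf('?') > -1;
    }

    public static void main(String[] args) {
        for (String p : splitPatterns("nat* tylenol")) {
            System.out.println(p + (hasWildcard(p) ? " -> wildcard clause" : " -> plain term clause"));
        }
    }
}
```

Skipping empty pieces also covers input that is blank or has repeated spaces, which is one plausible place for an index-out-of-bounds error when tokenizing by hand.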
RE: WildCardQuery
Can you be a little more precise about how you process your documents?

1) What's your analyser? SimpleAnalyzer?
2) How do you parse the query? Out-of-the-box QueryParser?

> Can we not enter a space or do an OR search with two words, one of which
> has a wildcard?

Simple answer, yes.

Complicated answer: words are delimited by your tokeniser, which is
included in your analyser (hence my question above). The asterisk syntax
comes from using a query parser that transforms the query into a
PrefixQuery object.

sv

On Fri, 1 Oct 2004, Robinson Raju wrote:

> Hi,
>
> Would there be a problem if one enters a space while using wildcards?
> Say I search for 'abc' and get 100 hits as results.
> 'man' gives 200
> 'abc man' gives 300
> but
> 'ab* man'
> 'abc ma*'
> 'ab* ma*'
> 'ab* OR ma*'
> all return 0 results.
> Can we not enter a space or do an OR search with two words, one of which
> has a wildcard?
>
> Regards,
> Robin
Re: WildCardQuery
analyzer is StandardAnalyzer. I use MultiFieldQueryParser to parse.

The flow is this: I have indexed a database view. Now I need to search
against a few columns. I take in the search criteria and search field,
construct a wildcard query and add it to a boolean query:

    WildcardQuery wQuery = new WildcardQuery(new Term(searchFields[0], searchString));
    booleanQuery.add(wQuery, true, false);
    Query queryfilter = MultiFieldQueryParser.parse(filterString, filterFields, flags, analyzer);
    hits = parallelMultiSearcher.search(booleanQuery, queryFilter);

When I don't use wildcards, it is taken as

    +((ITM_SHRT_DSC:natal ITM_SHRT_DSC:tylenol) (ITM_LONG_DSC:natal ITM_LONG_DSC:tylenol))

But when a wildcard is used, it is taken as

    +ITM_SHRT_DSC:nat* tylenol +ITM_LONG_DSC:nat* Tylenol

The first returns around 300 records, the second 0.
Any help would be appreciated.

Thanks
Robin
RE: WildCardQuery
Hi,

I think it doesn't have anything to do with the number of characters used
with a wildcard, because 'z*' works and 'a*' does not.
Does Lucene have a limitation on the number of hits fetched?
The error that I get is:

org.apache.lucene.search.BooleanQuery$TooManyClauses
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:79)
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:71)
    at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:61)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:228)
    at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:162)
    at org.apache.lucene.search.Query.weight(Query.java:84)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:85)
    at org.apache.lucene.search.MultiSearcherThread.run(ParallelMultiSearcher.java:251)
java.lang.NullPointerException
    at org.apache.lucene.search.MultiSearcherThread.hits(ParallelMultiSearcher.java:281)
    at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:85)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:64)
    at org.apache.lucene.search.Hits.<init>(Hits.java:43)
    at org.apache.lucene.search.Searcher.search(Searcher.java:33)
    at com.abcea.oos.search.helper.TestSearcher.search(TestSearcher.java:161)
    at com.abcea.oos.search.helper.TestSearcher.main(TestSearcher.java:270)

This e-mail and any files transmitted with it are for the sole use of the
intended recipient(s) and may contain confidential and privileged
information. If you are not the intended recipient, please contact the
sender by reply e-mail and destroy all copies of the original message.
Any unauthorised review, use, disclosure, dissemination, forwarding,
printing or copying of this email or any action taken in reliance on this
e-mail is strictly prohibited and may be unlawful.
RE: WildCardQuery
Thanks a lot, Paul, for solving the problem.
I added booleanQuery.setMaxClauseCount(1) and there was no problem after that.

Regards,
Robin
Re: WildCardQuery
On Tuesday 21 September 2004 06:50, Raju, Robinson (Cognizant) wrote:
> Is there a limitation in Lucene when it comes to wildcard search?
> Is it a problem if we use less than 3 characters along with a wildcard (*)?
> It gives me an error if I try using 45*, *34, *3, etc.:
> Too Many Clauses Error
> It doesn't happen if '?' is used instead of '*'.
> The intriguing thing is that it is not consistent: 00* doesn't fail.
> Am I missing something?

The number of clauses added to the query equals the number of indexed
terms that match the wildcard. As each clause ends up using some buffer
memory internally, a maximum was introduced to avoid running out of
memory. You can change the maximum number of added clauses using
BooleanQuery.setMaxClauseCount(), but then it is advisable to monitor
memory usage and, if necessary, increase the heap space for the JVM.

Regards,
Paul Elschot
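The behaviour described here, one clause per matching indexed term with a cap on the total, can be illustrated with a small sketch. It is not Lucene code: the class name, the term list and the cap of 3 are made up for the example, and the wildcard is translated to a regular expression purely for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Illustration only: mimics expanding a wildcard into one clause per matching indexed term. */
final class ClauseLimitSketch {

    /** Translates '*' (any run of characters) and '?' (exactly one character) into a regex. */
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    /** Returns the matching terms, or throws once more than maxClauses terms match. */
    static List<String> expand(String wildcard, List<String> indexedTerms, int maxClauses) {
        Pattern p = toRegex(wildcard);
        List<String> clauses = new ArrayList<>();
        for (String term : indexedTerms) {
            if (p.matcher(term).matches()) {
                if (clauses.size() >= maxClauses) {
                    throw new IllegalStateException("too many clauses for " + wildcard);
                }
                clauses.add(term);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Hypothetical index contents: many terms start with 'a', only one with 'z'.
        List<String> terms = Arrays.asList("apple", "april", "arrow", "atlas", "zebra");
        System.out.println(expand("z*", terms, 3));   // one matching term, under the cap
        try {
            expand("a*", terms, 3);                    // four matching terms, over the cap
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions the failure depends on how many indexed terms happen to match, not on how many characters precede the wildcard, which would be consistent with 'z*' and 00* working while 'a*' and 45* fail, and with '?' patterns matching far fewer terms.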
WildCardQuery
Is there a limitation in Lucene when it comes to wildcard search?
Is it a problem if we use less than 3 characters along with a wildcard (*)?
It gives me an error if I try using 45*, *34, *3, etc.:
Too Many Clauses Error
It doesn't happen if '?' is used instead of '*'.
The intriguing thing is that it is not consistent: 00* doesn't fail.
Am I missing something?

Robin
WildcardQuery (and FuzzyQuery) problems or bugs?
Hi,

I've got problems while using wildcard and fuzzy searches.

While indexing a bunch of documents it can happen that a field of all
documents is left empty (empty String). I don't know whether this
field will be included in the index at all. But this leads to a problem
while searching.

If I create some queries on that field (actually the QueryParser does this
for me) and run a search with these queries, I would expect no results.
However, WildcardQuery and FuzzyQuery sometimes generate a
NullPointerException in WildcardTermEnum.termCompare(...).

It seems to me that this does not happen if I add an additional unused
field to all documents, but it may be that this works for me for other
reasons.

To explain what I mean, I've attached some lines of test code.
By the way, I'm using the Lucene nightly build 20020814.

Am I missing something or is this a bug?

Another question: a WildcardQuery that is constructed of only
non-wildcard characters gives me a StringIndexOutOfBoundsException while
searching. Shouldn't this be handled in a smarter way?

Thanks in advance,
Bjoern

import org.apache.lucene.search.*;
import org.apache.lucene.index.*;
import org.apache.lucene.store.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.document.*;
import org.apache.lucene.queryParser.*;

public class Tester {

    public void testMe(boolean additionalField) throws Exception {
        RAMDirectory indexStore = new RAMDirectory();
        IndexWriter writer = new IndexWriter(indexStore, new SimpleAnalyzer(), true);
        Document doc = new Document();
        doc.add(Field.Text("bar0", "foo"));
        doc.add(Field.Text("bar1", ""));
        // magical unused extra field
        if (additionalField)
            doc.add(Field.Text("bar2", "foo"));
        writer.addDocument(doc);
        writer.optimize();
        writer.close();

        IndexSearcher searcher = new IndexSearcher(indexStore);
        Query[] queries = new Query[3];
        queries[0] = new WildcardQuery(new Term("bar0", "f?o")); // works as expected
        queries[1] = new TermQuery(new Term("bar1", "foo"));     // this one too
        queries[2] = new WildcardQuery(new Term("bar1", "f?o")); // strange
        Hits result;
        for (int q = 0; q < queries.length; q++) {
            System.out.println("Query " + q + " (" + queries[q].getClass().getName() + ")");
            System.out.println("\ttoString (\"bar0\") = " + queries[q].toString("bar0"));
            System.out.println("\ttoString (\"bar1\") = " + queries[q].toString("bar1"));
            result = searcher.search(queries[q]);
            if (result.length() == 0) {
                System.out.println("\tno results");
            }
            for (int i = 0; i < result.length(); i++) {
                Document d = result.doc(i);
                System.out.println("\tresult: " + d.get("bar0") + " - " + d.get("bar1"));
            }
        }
    }

    public static void main(String[] _argv) {
        Tester t = new Tester();
        try {
            System.out.println("## With additional unused extra field it works");
            t.testMe(true);
        } catch (Throwable th) {
            System.err.println("OOPS: " + th.getMessage());
            th.printStackTrace();
        }
        try {
            System.out.println("\n## Without additional unused extra field it won't work");
            t.testMe(false);
        } catch (Throwable th) {
            System.err.println("OOPS: " + th.getMessage());
            th.printStackTrace();
        }
    }
}
AW: WildcardQuery
It works with the nightly builds and probably with 1.2-RC5 :-)

Christian

-----Original Message-----
From: Otis Gospodnetic
Sent: 07 May 2002 17:31
To: Lucene Users List
Subject: Re: WildcardQuery

> Yes, me too. I just tried it on some Lucene index (the search at
> blink.com) and it doesn't seem to work (try searching for travel and
> then *vel).
> I'm assuming the original poster confused something...
>
> Otis
Re: AW: WildcardQuery
Left wildcards seem to work if you explicitly use a WildcardQuery, e.g.

    Term t = new Term("id", "*ucene");
    Query query = new WildcardQuery(t);

but if you use QueryParser with an analyzer, e.g.

    Analyzer analyzer = new StandardAnalyzer();
    Query query = QueryParser.parse("*ucene", "id", analyzer);

you get an exception:

    org.apache.lucene.queryParser.ParseException: Lexical error at line 1, column 1. Encountered: "*" (42), after : ""
        at org.apache.lucene.queryParser.QueryParser.parse(Unknown Source)

Tested on RC5. I haven't tried other ways of building a query.

In my simple tests, terms with left and right wildcards like *lucene*
worked too, even if the whole word was included.

--
Ian.
RE: AW: WildcardQuery
> Hi... I am both a newbie to Lucene and to using this list, so please
> forgive me if I make some mistakes.
>
> I am trailing onto this post because I cannot seem to get the wildcard
> function to work at all, while all of the other features seem to work
> just fine. I am using a very standard application (actually, it is just
> the demo version slightly modified) with the StandardAnalyzer and the
> QueryParser. But the wildcard feature (using either ? or *) just doesn't
> work. I must be missing something very basic. I would appreciate any
> ideas. Thanks!

Basic wildcard support (i.e. ignoring things like left wildcards) comes
pretty much out of the box. Attached is a copy of the program I was
playing with before sending the earlier message. It uses StandardAnalyzer
and the static QueryParser.parse() method, so it doesn't work with left
wildcards. I haven't tried ? rather than *.

Hope this helps.

--
Ian.
RE: AW: WildcardQuery
Sorry - I said I was going to send some code ...

--
Ian.

import org.apache.lucene.queryParser.*;
import org.apache.lucene.search.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;
import org.apache.lucene.document.*;
import org.apache.lucene.store.*;

public class LuceneTest {

    RAMDirectory ramdir;
    Analyzer analyzer;
    IndexWriter writer;
    IndexReader reader;
    Searcher searcher;

    public LuceneTest() {
        analyzer = new StandardAnalyzer();
        ramdir = new RAMDirectory();
    }

    public static void main(String args[]) throws Exception {
        LuceneTest ld = new LuceneTest();
        ld.load();
        ld.search();
    }

    // Index four single-keyword documents.
    void load() throws Exception {
        writer = new IndexWriter(ramdir, analyzer, true);
        add("january");
        add("february");
        add("june");
        add("july");
        writer.close();
    }

    void add(String s) throws Exception {
        Document d = new Document();
        d.add(Field.Keyword("id", s));
        System.out.println("Adding " + s);
        writer.addDocument(d);
    }

    // Run a handful of wildcard searches through QueryParser.
    void search() throws Exception {
        reader = IndexReader.open(ramdir);
        searcher = new IndexSearcher(reader);
        search("jan*");
        search("jan*y");
        search("j*y");
        search("j*");
        search("*y");
    }

    void search(String s) throws Exception {
        Query query = QueryParser.parse(s, "id", analyzer);
        Hits hits = searcher.search(query);
        System.out.println(s + " matched " + hits.length());
        for (int i = 0; i < hits.length(); i++) {
            System.out.println(" " + hits.doc(i).get("id"));
        }
    }
}
Re: WildcardQuery
I thought Lucene didn't support left wildcards like the following:

*ucene
Re: WildcardQuery
Hi Joel,

Lucene does seem to support left wildcards; take a look at this:
http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg01473.html

-jaggi
Re: WildcardQuery
Yes, me too. I just tried it on some Lucene index (the search at
blink.com) and it doesn't seem to work (try searching for travel and then
*vel).
I'm assuming the original poster confused something...

Otis

--- Joel Bernstein wrote:
> I thought Lucene didn't support left wildcards like the following:
>
> *ucene
WildcardQuery
I am pretty happy with the results of WildcardQueries like *ucen*, which
matches lucene, but *lucene* doesn't match lucene. Is there a reason for
this? And what would the patch be? It should be in WildcardTermEnum.
I am wondering if somebody has already patched it.

Thanks,
Chris
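The expectation in this question is that '*' may also match an empty run of characters, so that *lucene* matches lucene. A minimal, stand-alone matcher with that behaviour might look like the sketch below. It is not the code in WildcardTermEnum; the class and method names are invented for illustration.

```java
/** Illustration only: a wildcard matcher where '*' may match zero characters. */
final class WildcardMatchSketch {

    /** '*' matches any run of characters (including none); '?' matches exactly one character. */
    static boolean matches(String pattern, String text) {
        int p = 0;        // position in pattern
        int t = 0;        // position in text
        int star = -1;    // index of the most recent '*' in the pattern, or -1
        int mark = 0;     // text position where that '*' started matching
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                star = p++;          // remember the star and first try matching nothing
                mark = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;
                t++;
            } else if (star != -1) {
                p = star + 1;        // backtrack: let the last star absorb one more character
                t = ++mark;
            } else {
                return false;
            }
        }
        // Any trailing stars match the empty remainder.
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }

    public static void main(String[] args) {
        System.out.println(matches("*ucen*", "lucene"));
        System.out.println(matches("*lucene*", "lucene"));
        System.out.println(matches("*vel", "travel"));
    }
}
```

Because the star is first tried against nothing and only extended on a mismatch, leading and trailing stars around a complete word still match that word.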
Re: WildcardQuery
This is fixed in the nightly builds.

--Peter

On 5/6/02 4:14 PM, Christian Schrader wrote:
> I am pretty happy with the results of WildcardQueries like *ucen* that
> matches lucene, but *lucene* doesn't match lucene. Is there a reason for
> this? And what would be the patch. It should be in WildcardTermEnum.
> I am wondering if somebody already patched it?
>
> Thanks,
> Chris
Re: WildcardQuery
If I understand you correctly, you tried to search for '*new*'. I
believe you can't use an asterisk (*) as the first character of the query
term. So new* is valid, while *new or *new* is not.

Otis

--- Serge A. Redchuk wrote:
> Hello sampreet,
>
> Tuesday, December 11, 2001, 6:44:29 AM, you wrote:
>
> sic> Hi All,
> sic> This must be simple enough, but can anyone please explain to me when a
> sic> WildcardQuery is created in QueryParser, i.e. what special characters in the
> sic> query string are required to build a WildcardQuery within QueryParser?
>
> Moreover, when I achieved a complex search like this:
>     path:*new* comp*
> by combining WildcardQueries in a BooleanQuery (NOT by QueryParser),
> and then got that query using boolq.toString(...), the QueryParser COULD
> NOT parse this string!
>
> Isn't it strange?
> QueryParser.parse( bquery.toString( ... ) ) does not work :-(
>
> --
> Best regards,
> Serge
Re[2]: WildcardQuery
Hello Otis,

I strongly disagree, because I really _can_ search for anything like
'*new*'.
_Simply_because_I_have_working_code_that_does_it_

Here's a slice of the output of my program:

Boolean wildcard search:
built query: bee*
news41:beem;
news42:beem;
news4:beem;

Boolean wildcard search:
built query: *ee
f3:qthree;

Boolean wildcard search:
built query: +be* +path:*ws42
news42:beem;

Boolean wildcard search:
built query: +path:*ws4 +be*
news4:beem;

As you can see, the first search returned 3 entries, but the third
returned only one, as did the fourth. And the second search returned only
the entry f3:qthree; (as we expected from the built query *ee).

I achieved this by combining WildcardQueries in a BooleanQuery, but did
not achieve it with a simple call to QueryParser.parse.

--
Best regards,
Serge
Re[4]: WildcardQuery
Hello Benjamin,

Tuesday, December 11, 2001, 5:28:46 PM, you wrote:

BK> Sergej
BK> Could you please provide sample code to demonstrate how you did that?

Of course (please correct me if I turn out to be wrong, but I hope I am not hallucinating :-)

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.*;

    // search function: one WildcardQuery per (field, pattern) pair in the map
    void searchBooleanWildcard( HashMap terms, boolean req ) throws IOException {
        System.out.println( "Boolean wildcard search:" );
        BooleanQuery bQuery = new BooleanQuery();
        for( Iterator it = terms.entrySet().iterator(); it.hasNext(); ){
            Map.Entry entry = (Map.Entry)it.next();
            String where = (String)entry.getKey();    // field name
            String what = (String)entry.getValue();   // wildcard pattern
            WildcardQuery wQuery = new WildcardQuery( new Term( where, what ) );
            //System.out.println( "Add to query: [" + where + ", " + what + "]" );
            bQuery.add( wQuery, req, false );         // required = req, prohibited = false
        }
        System.out.println( "built query: " + bQuery.toString( "body" ) );
        Searcher searcher = new IndexSearcher( rdir ); // rdir: index location, a field of this class
        this.showHits( searcher.search( bQuery ) );
    }

    // used at the end of the function above
    private void showHits( Hits hits ) throws IOException {
        for( int i = 0; i < hits.length(); i++ ){
            System.out.println( hits.doc( i ).get( "path" ) + ":" +
                                hits.doc( i ).get( "body" ) + "; Score: " + hits.score( i ) );
        }
        System.out.println( "" );
    }

Please do not forget that a HashMap cannot hold more than one value for the same key. So the function searchBooleanWildcard(HashMap hmap) can only combine search requests for different field names (I hope this explanation is clear enough). For example, if the index was built from three kinds of fields, [body, ...], [path, ...], [type, ...], we can add no more than three pairs to the HashMap hmap.

And an example of a search:

    HashMap phQeryTerms = new HashMap();
    phQeryTerms.put( "body", "*e*n" );
    sr.searchBooleanWildcard( phQeryTerms, true );

Corresponding output:

    Boolean wildcard search:
    built query: +*e*n
    news7:bean; Score: 1.0
    news73:beeemN; Score: 0.25
    news71:jEaN; Score: 0.25

This assumes, of course, that the following ( path, body ) pairs are indexed:

    news7, bean
    news71, jEaN
    news72, lion
    news73, beeemN

BK> Best regards
BK> Benjamin

BK> -----Original Message-----
BK> From: Serge A. Redchuk [mailto:[EMAIL PROTECTED]]
BK> Sent: 11 December 2001 15:24
BK> To: [EMAIL PROTECTED]
BK> Subject: Re[2]: WildcardQuery

--
Best regards, Serge mailto:[EMAIL PROTECTED]
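The one-pattern-per-field limit mentioned in this message comes from keying the clauses by field name. A possible way around it, sketched here without Lucene so that it runs on its own, is to keep the clauses in a list of (field, pattern) pairs instead of a map. Everything in the block is an assumption made for illustration: the class MultiClauseSketch, its Clause type, the helper toRegex, and the simplification that each field value is a single lowercased term. It uses java.util.regex in place of WildcardQuery, so it only mimics the "all clauses required" behaviour of the posted code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical illustration; the names here are not Lucene classes. */
final class MultiClauseSketch {

    /** One required wildcard clause: a field name plus a pattern using '*' and '?'. */
    static final class Clause {
        final String field;
        final String pattern;

        Clause(String field, String pattern) {
            this.field = field;
            this.pattern = pattern;
        }
    }

    /** Translates a '*' / '?' wildcard pattern into an equivalent regular expression. */
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    /** Returns the paths of documents whose fields satisfy every clause. */
    static List<String> search(List<Map<String, String>> docs, List<Clause> clauses) {
        List<String> paths = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            boolean all = true;
            for (Clause c : clauses) {
                String value = doc.get(c.field);
                // Assumption: the whole field value is one term, lowercased as an analyzer might do.
                if (value == null
                        || !toRegex(c.pattern).matcher(value.toLowerCase(Locale.ROOT)).matches()) {
                    all = false;
                    break;
                }
            }
            if (all) {
                paths.add(doc.get("path"));
            }
        }
        return paths;
    }

    /** The (path, body) pairs that appear in the program output quoted in the thread. */
    static List<Map<String, String>> sampleDocs() {
        String[][] rows = { {"news4", "beem"}, {"news41", "beem"}, {"news42", "beem"}, {"f3", "qthree"} };
        List<Map<String, String>> docs = new ArrayList<>();
        for (String[] row : rows) {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("path", row[0]);
            doc.put("body", row[1]);
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        // Two clauses on different fields, as in "+path:*ws4 +be*".
        System.out.println(search(sampleDocs(),
                Arrays.asList(new Clause("path", "*ws4"), new Clause("body", "be*"))));
        // Two clauses on the SAME field, which a HashMap keyed by field cannot express.
        System.out.println(search(sampleDocs(),
                Arrays.asList(new Clause("body", "b*"), new Clause("body", "*m"))));
    }
}
```

Applied to the posted function, the same idea would mean passing a list of pairs and adding one WildcardQuery per pair; that variant is not shown here because it has not been tested against Lucene.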
RE: Re[4]: WildcardQuery
Thank you Sergej :)

-----Original Message-----
From: Serge A. Redchuk [mailto:[EMAIL PROTECTED]]
Sent: 11 December 2001 15:55
To: Lucene Users List
Subject: Re[4]: WildcardQuery
Re: Flaws in WildcardQuery design....
--- Robert J. Lebowitz [EMAIL PROTECTED] wrote:
> I've been experimenting with the new WildcardQuery class, and since
> there isn't really any documentation on its use, I've been sort of
> poking at it to see how it is used. From what I've seen so far, you
> must construct the query by passing it a Term object. However, the
> String that is passed to the constructor of the Term must end with
> an asterisk.

Hmm... are you certain that WildcardQuery is used? If you are using QueryParser, then terms ending with an asterisk are handled by PrefixQuery and not WildcardQuery. Although I think WildcardQuery could possibly be used with terms ending with an asterisk, it was never tested this way, as it is assumed that PrefixQuery would handle such cases.

> Question 1: Has the QueryParser been updated such that it can handle
> wildcard terms using the new WildcardQuery? I.e., can it return some
> kind of BooleanQuery that incorporates some terms utilizing wildcard
> searches (and others that don't)?

Yes.
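The split described in this reply (a trailing asterisk goes to PrefixQuery, other wildcard terms go to WildcardQuery) can be pictured with a small classifier. This is only a sketch of the rule as stated in the message, not QueryParser's actual grammar; the class QueryKindSketch, the enum Kind and the method classify are hypothetical names, and the treatment of a lone "*" is an assumption.

```java
/** Hypothetical illustration of the dispatch rule described in the reply; not QueryParser code. */
final class QueryKindSketch {

    enum Kind { TERM, PREFIX, WILDCARD }

    /** Decides which kind of query a single, already tokenized term would map to. */
    static Kind classify(String term) {
        boolean hasStar = term.indexOf('*') >= 0;
        boolean hasQuestionMark = term.indexOf('?') >= 0;
        if (!hasStar && !hasQuestionMark) {
            return Kind.TERM;                       // no wildcard characters at all
        }
        // A prefix term has exactly one '*', in last position, after at least one literal character.
        boolean onlyTrailingStar = !hasQuestionMark
                && term.length() > 1
                && term.indexOf('*') == term.length() - 1;
        return onlyTrailingStar ? Kind.PREFIX : Kind.WILDCARD;
    }

    public static void main(String[] args) {
        for (String term : new String[] { "news", "new*", "*new*", "n?ws", "ne*s" }) {
            System.out.println(term + " -> " + classify(term));
        }
    }
}
```

Under this rule new* would be classified as PREFIX, which matches the reply's point that a WildcardQuery built by hand from a trailing-asterisk term takes a path the parser itself never exercises.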