Re: WildCardQuery

2004-10-05 Thread Morus Walter
Robinson Raju writes:
 The way I have done it is: if there is a wildcard, use WildcardQuery;
 otherwise use the other path (MultiFieldQueryParser).
 Here searchFields is an array that contains the column names, and
 searchString is the value to be searched.
 
 if ((searchString.indexOf(IOOSConstants.ASTERISK) > -1)
     || (searchString.indexOf(IOOSConstants.QUESTION_MARK) > -1))
 {
     WildcardQuery wQuery = new WildcardQuery(new Term(
         searchFields[0], searchString));
     booleanQuery.add(wQuery, true, false);
     if (searchFields.length > 1)
     {
         WildcardQuery wQuery2 = new WildcardQuery(new Term(
             searchFields[1], searchString));
         booleanQuery.add(wQuery2, true, false);
     }
 }
 else
 {
     Query query = MultiFieldQueryParser.parse(searchString,
         searchFields, flags, analyzer);
     booleanQuery.add(query, true, false);
 }
 Query queryfilter = MultiFieldQueryParser.parse(filterString,
     filterFields, flags, analyzer);
 QueryFilter queryFilter = new QueryFilter(queryfilter);
 hits = parallelMultiSearcher.search(booleanQuery, queryFilter);
 
 In the meantime, I thought I would tokenize the string on spaces
 if the input contains spaces and then add the tokens one by one to
 booleanQuery. But this gave a StringIndexOutOfBoundsException.
 
 So I am still trying...
 Thanks for your help. I would greatly appreciate it if you could give me
 more pointers.
 
Did you look at the output of query.toString(defaultfield)?
That's usually the best way to see whether a constructed query is what
you expect it to be.

Why isn't creating wildcard queries left to the query parser?

Morus

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: WildCardQuery

2004-10-04 Thread Stephane James Vaucher
On Fri, 1 Oct 2004, Robinson Raju wrote:

 analyzer is StandardAnalyzer.
 i use MultiFieldQueryParser to parse.

 The flow is this:
 I have indexed a Database view. Now i need to search against a few columns
 i take in the search criteria and search field ,
 construct a wildcard query and add it to a boolean query

 WildcardQuery wQuery = new WildcardQuery(new Term(searchFields[0],
 searchString));

What is the value of searchString? Is it a single word? QueryParser syntax is not
applied here.
What does ab* return?

 booleanQuery.add(wQuery, true, false);
 Query queryfilter = MultiFieldQueryParser.parse(filterString,
 filterFields, flags, analyzer);
 hits = parallelMultiSearcher.search(booleanQuery,queryFilter);

 when i dont use wild cards , it is taken as
 +((ITM_SHRT_DSC:natal ITM_SHRT_DSC:tylenol) (ITM_LONG_DSC:natal
 ITM_LONG_DSC:tylenol))
 But when wildcard is used , it is taken as
   +ITM_SHRT_DSC:nat* tylenol +ITM_LONG_DSC:nat* Tylenol

Are the ITM_XXX fields tokenized?

sv

 the first return around 300 records , the second , 0.

 any help would be appreciated
 Thanks
 Robin




RE: WildCardQuery

2004-10-01 Thread Stephane James Vaucher
Can you be a little more precise about how you process your documents?

1) What's your analyser? SimpleAnalyzer?
2) How do you parse the query? Out-of-the-box QueryParser?

 can we not enter space or do an OR search with two words one of which
 has a wildcard ?

Simple answer, yes.

Complicated answer: words are delimited by your tokeniser, which is part
of your analyser (hence my question above). The asterisk syntax comes
from using a query parser that transforms the query into a PrefixQuery
object.
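
To make the point about tokens concrete, here is a minimal plain-Java sketch. It does not use any Lucene classes: the index is modelled as a list of single-word terms, which is an assumption about what a tokenising analyzer would produce, and the names TokenWildcardSketch, toRegex, matchWhole and matchPerToken are made up for illustration. It shows why a pattern containing a space cannot match any single indexed word, while handling each word separately can.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class TokenWildcardSketch {

    // Translate a wildcard pattern ('*' = any run of characters,
    // '?' = exactly one character) into a regular expression.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // Treat the whole input as ONE pattern, as a single wildcard term would.
    static List<String> matchWhole(String input, List<String> terms) {
        Pattern p = toRegex(input);
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (p.matcher(term).matches()) {
                hits.add(term);
            }
        }
        return hits;
    }

    // Split the input on whitespace first, then OR the per-word patterns.
    static List<String> matchPerToken(String input, List<String> terms) {
        String[] words = input.trim().split("\\s+");
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            for (String word : words) {
                if (toRegex(word).matcher(term).matches()) {
                    hits.add(term);
                    break; // one matching word is enough for an OR
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed index contents: one lower-case word per term.
        List<String> terms = Arrays.asList("natal", "nature", "tylenol", "aspirin");
        System.out.println(matchWhole("nat* tylenol", terms));
        System.out.println(matchPerToken("nat* tylenol", terms));
    }
}
```

In this model the whole-string pattern finds nothing because no indexed term contains a space, whereas the per-word version finds the terms matching either word. Case is a separate concern that the sketch ignores: a term handed directly to a query object is not run through the analyzer, so a capitalised word may also fail to match lower-cased index terms.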

sv

On Fri, 1 Oct 2004, Robinson Raju wrote:
 Hi,
 Would there be a problem if one enters a space while using wildcards?
 Say I search for 'abc': I get 100 hits.
 'man' gives 200.
 'abc man' gives 300.
 But
 'ab* man'
 'abc ma*'
 'ab* ma*'
 'ab* OR ma*'
 ...
 all return 0 results.
 Can we not enter a space or do an OR search with two words, one of which
 has a wildcard?

 Regards,
 Robin




Re: WildCardQuery

2004-10-01 Thread Robinson Raju
The analyzer is StandardAnalyzer.
I use MultiFieldQueryParser to parse.

The flow is this:
I have indexed a database view. Now I need to search against a few columns.
I take in the search criteria and search field,
construct a wildcard query, and add it to a boolean query:

WildcardQuery wQuery = new WildcardQuery(new Term(searchFields[0],
    searchString));
booleanQuery.add(wQuery, true, false);
Query queryfilter = MultiFieldQueryParser.parse(filterString,
    filterFields, flags, analyzer);
hits = parallelMultiSearcher.search(booleanQuery,queryFilter);

When I don't use wildcards, the query is taken as
+((ITM_SHRT_DSC:natal ITM_SHRT_DSC:tylenol) (ITM_LONG_DSC:natal
ITM_LONG_DSC:tylenol))
But when a wildcard is used, it is taken as
+ITM_SHRT_DSC:nat* tylenol +ITM_LONG_DSC:nat* Tylenol

The first returns around 300 records; the second, 0.

any help would be appreciated
Thanks
Robin




RE: WildCardQuery

2004-09-22 Thread Raju, Robinson (Cognizant)

Hi,
I think it doesn't have anything to do with the number of characters
used with a wildcard, because 'z*' works and 'a*' does not.
Does Lucene have a limitation on the number of hits fetched?
The error that I get is:

org.apache.lucene.search.BooleanQuery$TooManyClauses
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:79)
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:71)
    at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:61)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:228)
    at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:162)
    at org.apache.lucene.search.Query.weight(Query.java:84)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:85)
    at org.apache.lucene.search.MultiSearcherThread.run(ParallelMultiSearcher.java:251)
java.lang.NullPointerException
    at org.apache.lucene.search.MultiSearcherThread.hits(ParallelMultiSearcher.java:281)
    at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:85)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:64)
    at org.apache.lucene.search.Hits.<init>(Hits.java:43)
    at org.apache.lucene.search.Searcher.search(Searcher.java:33)
    at com.abcea.oos.search.helper.TestSearcher.search(TestSearcher.java:161)
    at com.abcea.oos.search.helper.TestSearcher.main(TestSearcher.java:270)


This e-mail and any files transmitted with it are for the sole use of the intended 
recipient(s) and may contain confidential and privileged information.
If you are not the intended recipient, please contact the sender by reply e-mail and 
destroy all copies of the original message.
Any unauthorised review, use, disclosure, dissemination, forwarding, printing or 
copying of this email or any action taken in reliance on this e-mail is strictly
prohibited and may be unlawful.




RE: WildCardQuery

2004-09-22 Thread Raju, Robinson (Cognizant)

Thanks a lot, Paul, for solving the problem.
I added booleanQuery.setMaxClauseCount(1) and there was no problem
after that.

Regards,
Robin




Re: WildCardQuery

2004-09-21 Thread Paul Elschot
On Tuesday 21 September 2004 06:50, Raju, Robinson (Cognizant) wrote:
 Is there a limitation in Lucene when it comes to wildcard search ?
 Is it a problem if we use less than 3 characters along with a
 wildcard(*).
 Gives me error if I try using 45* , *34 , *3 ..etc .
 Too Many Clauses Error
 Doesn't happen if '?' is used instead of '*'.
 The intriguing thing is , that it is not consistent . 00* doesn't fail.
 Am I missing something ?

The number of clauses added to the query equals the number of
indexed terms that match the wildcard. As each clause ends up using
some buffer memory internally, a maximum was introduced to
avoid running out of memory.
You can change the maximum number of added clauses using
BooleanQuery.setMaxClauseCount(), but then it is advisable
to monitor memory usage and, if necessary, increase the heap space for the JVM.
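
The explanation above also accounts for why 00* and z* can work while 45* and a* fail: what matters is how many indexed terms each pattern matches, not how short the pattern is. The following is a minimal plain-Java sketch of that idea. It is only a model, not Lucene's code; ClauseLimitSketch, TooManyClausesException and expandPrefix are hypothetical names, and the term list is invented.

```java
import java.util.Arrays;
import java.util.List;

class ClauseLimitSketch {

    // Hypothetical stand-in for the limit described above; not a Lucene class.
    static class TooManyClausesException extends RuntimeException {
        TooManyClausesException(String message) {
            super(message);
        }
    }

    // Count the indexed terms that start with the prefix. Each match stands
    // for one clause the expanded query would need; exceeding maxClauses fails.
    static int expandPrefix(String prefix, List<String> indexedTerms, int maxClauses) {
        int clauses = 0;
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                clauses++;
                if (clauses > maxClauses) {
                    throw new TooManyClausesException(
                        "prefix '" + prefix + "' needs more than " + maxClauses + " clauses");
                }
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Invented index contents: one term under "00", four under "45".
        List<String> terms = Arrays.asList("00a", "450", "451", "452", "453");
        int limit = 3;

        System.out.println("00* -> " + expandPrefix("00", terms, limit) + " clause(s)");
        try {
            expandPrefix("45", terms, limit);
        } catch (TooManyClausesException e) {
            System.out.println("45* -> " + e.getMessage());
        }
        // Raising the limit lets the same expansion through, at the cost of more memory.
        System.out.println("45* -> " + expandPrefix("45", terms, 10) + " clause(s)");
    }
}
```

The same trade-off applies to the real setting: a higher maximum accepts broader patterns but allows a single query to grow correspondingly large.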

Regards,
Paul Elschot





WildCardQuery

2004-09-20 Thread Raju, Robinson (Cognizant)

Is there a limitation in Lucene when it comes to wildcard search?
Is it a problem if we use fewer than 3 characters along with a
wildcard (*)?
I get a "Too Many Clauses" error if I try using 45*, *34, *3, etc.
It doesn't happen if '?' is used instead of '*'.
The intriguing thing is that it is not consistent: 00* doesn't fail.
Am I missing something?

Robin




WildcardQuery (and FuzzyQuery) problems or bugs?

2002-08-15 Thread Bjoern Feustel

Hi,

  I've got problems while using wildcard and fuzzy searches.

While indexing a bunch of documents, it can happen that a field is left
empty (an empty String) in all documents.

I don't know whether this field is included in the index at all.

But this leads to a problem while searching. If I create some queries
(actually the QueryParser does this for me) on that field and run a
search with them, I would expect no results.

However, WildcardQuery and FuzzyQuery sometimes throw a
NullPointerException in WildcardTermEnum.termCompare(...).

It seems to me that this does not happen if I add an additional unused
field to all documents, but it may be that this works for me for other
reasons. To explain what I mean, I've attached some lines of test code.

BTW: I'm using the Lucene nightly build 20020814.

Am I missing something or is this a bug?

Another question: a WildcardQuery that is constructed of only
non-wildcard characters gives me a StringIndexOutOfBoundsException while
searching. Shouldn't this be handled in a smarter way?
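
Until the library handles that case itself, one possible caller-side workaround is to check for wildcard characters before choosing the query type. The sketch below is plain Java with no Lucene dependency; WildcardGuardSketch, Kind, hasWildcard and choose are hypothetical names, and the assumption is simply that a pattern without '*' or '?' should be treated as an ordinary term.

```java
class WildcardGuardSketch {

    // Which kind of query the caller would go on to build.
    enum Kind { TERM, WILDCARD }

    // True if the text contains at least one wildcard character.
    static boolean hasWildcard(String text) {
        return text.indexOf('*') >= 0 || text.indexOf('?') >= 0;
    }

    // Pick a plain term query for literal text, a wildcard query otherwise.
    static Kind choose(String text) {
        return hasWildcard(text) ? Kind.WILDCARD : Kind.TERM;
    }

    public static void main(String[] args) {
        System.out.println("foo -> " + choose("foo"));
        System.out.println("f?o -> " + choose("f?o"));
        System.out.println("fo* -> " + choose("fo*"));
    }
}
```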

Thanks in advance,

  Bjoern



import org.apache.lucene.search.*;
import org.apache.lucene.index.*;
import org.apache.lucene.store.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.document.*;
import org.apache.lucene.queryParser.*;

public class Tester {

    public void testMe(boolean additionalField) throws Exception {
        RAMDirectory indexStore = new RAMDirectory();
        IndexWriter writer = new IndexWriter(indexStore, new SimpleAnalyzer(), true);
        Document doc = new Document();
        doc.add(Field.Text("bar0", "foo"));
        doc.add(Field.Text("bar1", ""));    // the empty field

        // magical unused extra field
        if (additionalField)
            doc.add(Field.Text("bar2", "foo"));

        writer.addDocument(doc);
        writer.optimize();
        writer.close();

        IndexSearcher searcher = new IndexSearcher(indexStore);

        Query[] queries = new Query[3];
        queries[0] = new WildcardQuery(new Term("bar0", "f?o")); // works as expected
        queries[1] = new TermQuery(new Term("bar1", "foo"));     // this one too
        queries[2] = new WildcardQuery(new Term("bar1", "f?o")); // strange

        Hits result;

        for (int q = 0; q < queries.length; q++) {
            System.out.println("Query " + q + " (" + queries[q].getClass().getName() + ")");
            System.out.println("\ttoString (\"bar0\") = " + queries[q].toString("bar0"));
            System.out.println("\ttoString (\"bar1\") = " + queries[q].toString("bar1"));

            result = searcher.search(queries[q]);

            if (result.length() == 0) {
                System.out.println("\tno results");
            }
            for (int i = 0; i < result.length(); i++) {
                Document d = result.doc(i);
                System.out.println("\tresult: " + d.get("bar0") + " - " + d.get("bar1"));
            }
        }
    }

    public static void main(String[] _argv) {
        Tester t = new Tester();
        try {
            System.out.println("## With additional unused extra field it works");
            t.testMe(true);
        }
        catch (Throwable th) {
            System.err.println("OOPS: " + th.getMessage());
            th.printStackTrace();
        }
        try {
            System.out.println("\n## Without additional unused extra field it won't work");
            t.testMe(false);
        }
        catch (Throwable th) {
            System.err.println("OOPS: " + th.getMessage());
            th.printStackTrace();
        }
    }
}





AW: WildcardQuery

2002-05-24 Thread Christian Schrader

It works with the nightly builds and probably with 1.2-RC5 :-)

Christian
 -Original Message-
 From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
 Sent: 07 May 2002 17:31
 To: Lucene Users List
 Subject: Re: WildcardQuery
 
 
 Yes, me too.  I just tried it on some Lucene index (the search at
 blink.com) and it doesn't seem to work (try searching for travel and
 then *vel).
 I'm assuming the original poster confused something...
 
 Otis
 




Re: AW: WildcardQuery

2002-05-24 Thread Ian Lea

Left wildcards seem to work if you explicitly use a
WildcardQuery, e.g.

Term t = new Term("id", "*ucene");
Query query = new WildcardQuery(t);


but if you use QueryParser with an analyzer, e.g.

Analyzer analyzer = new StandardAnalyzer();
Query query = QueryParser.parse("*ucene", "id", analyzer);

you get an exception:

org.apache.lucene.queryParser.ParseException: 
Lexical error at line 1, column 1.  Encountered: "*" (42), after : ""
  at org.apache.lucene.queryParser.QueryParser.parse(Unknown Source)


Tested on RC5.  I haven't tried other ways of building a query.  In my
simple tests, terms with both left and right wildcards, like *lucene*,
worked too, even if the whole word was included.
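
For readers who want to see what "matching with a left wildcard" means at the term level, here is a small plain-Java wildcard matcher. It is an illustrative sketch only and is not the code in WildcardTermEnum; LeftWildcardSketch and matches are made-up names. The point it demonstrates is that '*' may match zero characters, so *lucene* should match the whole word lucene, and that a leading '*' gives no fixed prefix to narrow the candidates, so every term has to be tested.

```java
class LeftWildcardSketch {

    // Returns true if text matches pattern, where '*' matches any run of
    // characters (including none) and '?' matches exactly one character.
    static boolean matches(String pattern, String text) {
        int p = 0;          // position in pattern
        int t = 0;          // position in text
        int star = -1;      // position of the most recent '*' in pattern
        int mark = 0;       // text position where that '*' started matching

        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                star = p;   // remember the star and first try matching nothing
                mark = t;
                p++;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;        // single character matched
                t++;
            } else if (star != -1) {
                p = star + 1;   // backtrack: let the last '*' absorb one more character
                mark++;
                t = mark;
            } else {
                return false;
            }
        }
        // Any trailing '*' in the pattern can match the empty remainder.
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;
        }
        return p == pattern.length();
    }

    public static void main(String[] args) {
        String[] patterns = { "*ucene", "*lucene*", "*ucen*", "luc?ne" };
        for (String pattern : patterns) {
            System.out.println(pattern + " vs lucene: " + matches(pattern, "lucene"));
        }
    }
}
```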



--
Ian.
[EMAIL PROTECTED]



--
Searchable personal storage and archiving from http://www.digimem.net/





RE: AW: WildcardQuery

2002-05-24 Thread Ian Lea

 Hi... I am both a newbie to Lucene and to using this list, so please
 forgive me if I make some mistakes. I am trailing onto this post because I
 cannot seem to get the wildcard function to work at all, while all of the
 other features seem to work just fine.  I am using a very standard
 application (actually, it is just the demo version slightly modified) with
 the StandardAnalyzer and the QueryParser.  But the wildcard feature (using
 either ? or *) just doesn't work. I must be missing something very
 basic.  I would appreciate any ideas. Thanks!

Basic wildcard support (i.e. ignoring things like left wildcards)
comes pretty much out of the box.  Attached is a copy of the
program I was playing with before sending the earlier message.
It uses StandardAnalyzer and the static QueryParser.parse()
method so doesn't work with left wildcards.  I haven't tried
? rather than *.


Hope this helps.



--
Ian.
[EMAIL PROTECTED]



RE: AW: WildcardQuery

2002-05-24 Thread Ian Lea

Sorry - I said I was going to send some code ...


--
Ian.

import org.apache.lucene.queryParser.*;
import org.apache.lucene.search.*;
import org.apache.lucene.index.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;
import org.apache.lucene.document.*;
import org.apache.lucene.store.*;


public class LuceneTest {

    RAMDirectory ramdir;
    Analyzer analyzer;
    IndexWriter writer;
    IndexReader reader;
    Searcher searcher;

    public LuceneTest() {
        analyzer = new StandardAnalyzer();
        ramdir = new RAMDirectory();
    }


    public static void main(String args[]) throws Exception {
        LuceneTest ld = new LuceneTest();
        ld.load();
        ld.search();
    }


    // Build a small in-memory index with one keyword per document.
    void load() throws Exception {
        writer = new IndexWriter(ramdir, analyzer, true);
        add("january");
        add("february");
        add("june");
        add("july");
        writer.close();
    }


    void add(String s) throws Exception {
        Document d = new Document();
        d.add(Field.Keyword("id", s));
        System.out.println("Adding " + s);
        writer.addDocument(d);
    }


    void search() throws Exception {
        reader = IndexReader.open(ramdir);
        searcher = new IndexSearcher(reader);
        search("jan*");
        search("jan*y");
        search("j*y");
        search("j*");
        // Left wildcard: QueryParser.parse() rejects this, as noted in the
        // earlier message.
        search("*y");
    }


    // Parse the string with QueryParser and print every matching id.
    void search(String s) throws Exception {
        Query query = QueryParser.parse(s, "id", analyzer);
        Hits hits = searcher.search(query);
        System.out.println(s + " matched " + hits.length());
        for (int i = 0; i < hits.length(); i++) {
            System.out.println("  " + hits.doc(i).get("id"));
        }
    }
}




Re: WildcardQuery

2002-05-07 Thread Joel Bernstein

I thought Lucene didn't support left wildcards like the following:

*ucene





Re: WildcardQuery

2002-05-07 Thread Jagadesh Nandasamy

Hi Joel,
 Lucene does seem to support left wildcards; take a look at this:

http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg01473.html

-jaggi








Re: WildcardQuery

2002-05-07 Thread Otis Gospodnetic

Yes, me too.  I just tried it on some Lucene index (the search at
blink.com) and it doesn't seem to work (try searching for travel and
then *vel).
I'm assuming the original poster confused something...

Otis





WildcardQuery

2002-05-06 Thread Christian Schrader

I am pretty happy with the results of WildcardQueries like *ucen* that
matches lucene, but *lucene* doesn't match lucene. Is there a reason for
this? And what would be the patch.
It should be in WildcardTermEnum. I am wondering if somebody already patched
it?

Thanks, Chris






Re: WildcardQuery

2002-05-06 Thread Peter Carlson

This is fixed in the nightly builds.

--Peter





Re: WildcardQuery

2001-12-11 Thread Otis Gospodnetic

If I understand you correctly, you tried to search for '*new*'.  I
believe you can't use an asterisk (*) as the first character of the query
term. So new* is valid, while *new or *new* is not.

Otis

--- Serge A. Redchuk [EMAIL PROTECTED] wrote:
 Hello sampreet,
 
 Tuesday, December 11, 2001, 6:44:29 AM, you wrote:
 
 sic Hi All,
 
 sic This must be simple enough, but can anyone please explain to me when a
 sic WildcardQuery is created in QueryParser, i.e. what special characters in the
 sic query string are required to build a WildcardQuery within QueryParser?
 
 Moreover, when I achieved a complex search like this: path:*new* comp*
 by combining WildcardQueries in a BooleanQuery (NOT BY QueryParser), and
 then got that query using boolq.toString(...); - the QueryParser COULD
 NOT parse this string !!!
 
 Isn't that strange? :
 
    QueryParser.parse( bquery.toString( ... ) )   - does not work :-(
 
 -- 
 Best regards,
  Serge  mailto:[EMAIL PROTECTED]
 




Re[2]: WildcardQuery

2001-12-11 Thread Serge A. Redchuk

Hello Otis,

I strongly disagree, because I really _can_ search for
anything like '*new*'.

_Simply_because_I_have_working_code_that_does_it_

Here's a slice of output of my program:

Boolean wildcard search:
built query: bee*
news41:beem;
news42:beem;
news4:beem;

Boolean wildcard search:
built query: *ee
f3:qthree;

Boolean wildcard search:
built query: +be* +path:*ws42
news42:beem;

Boolean wildcard search:
built query: +path:*ws4 +be*
news4:beem;

As you can see, the first search returned 3 entries, but the third
returned only one, as did the fourth.
The second search returned only the entry f3:qthree;
(as expected for the built query *ee).

I achieved this by combining WildcardQueries in a BooleanQuery, but
could not achieve it with a simple call to QueryParser.parse.

Tuesday, December 11, 2001, 4:22:04 PM, you wrote:

OG If I understand you correctly, you tried to search for '*new*'.  I
OG believe you can't use an asterisk (*) as the first query of the query
OG term. So, new* is valid, while *new or *new* is not.

OG Otis

OG --- Serge A. Redchuk [EMAIL PROTECTED] wrote:
 Hello sampreet,
 
 Tuesday, December 11, 2001, 6:44:29 AM, you wrote:
 
 sic Hi All,
 
 sic This must be simple enough, but can anyone please explain to me when a
 sic WildcardQuery is created in QueryParser, i.e. what special characters in
 sic the query string are required to build a WildcardQuery within QueryParser?
 
 Moreover, when I built a complex search like this: path:*new* comp*
 by combining WildcardQueries in a BooleanQuery (NOT with QueryParser), and
 then got that query as a string using boolq.toString(...), the QueryParser
 COULD NOT parse this string!!!
 
 Isn't that strange?
 
 QueryParser.parse( bquery.toString( ... ) )   - does not work :-(
 



-- 
Best regards,
 Serge  mailto:[EMAIL PROTECTED]






Re[4]: WildcardQuery

2001-12-11 Thread Serge A. Redchuk

Hello Benjamin,

Tuesday, December 11, 2001, 5:28:46 PM, you wrote:

BK Sergej

BK Could you please provide a sample code to demonstrate how you did that?

Of course:
(Please correct me if I turn out to be wrong in the end, but I hope I am
not hallucinating :-)

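import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.WildcardQuery;

// search function: builds one WildcardQuery per ( field, pattern ) pair
// and combines them in a BooleanQuery
void searchBooleanWildcard( HashMap terms, boolean req ) throws IOException {
  System.out.println( "Boolean wildcard search:" );
  BooleanQuery bQuery = new BooleanQuery();
  for( Iterator it = terms.entrySet().iterator(); it.hasNext(); ){
    Map.Entry entry = (Map.Entry)it.next();
    String where = (String)entry.getKey();    // field name
    String what = (String)entry.getValue();   // wildcard pattern
    WildcardQuery wQuery = new WildcardQuery( new Term( where, what ) );
    //System.out.println( "Add to query: [" + where + ", " + what + "]" );
    bQuery.add( wQuery, req, false );         // required = req, prohibited = false
  }
  System.out.println( "built query: " + bQuery.toString( "body" ) );
  Searcher searcher = new IndexSearcher( rdir );  // rdir is defined elsewhere in the class
  this.showHits( searcher.search( bQuery ) );
}

// used at the end of the function above
private void showHits( Hits hits ) throws IOException {
  for( int i = 0; i < hits.length(); i++ ){
    System.out.println( hits.doc( i ).get( "path" ) + ":"
        + hits.doc( i ).get( "body" ) + "; Score: " + hits.score( i ) );
  }
  System.out.println( "" );
}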

Please do not forget that a HashMap can't contain more than one value
for the same key, so searchBooleanWildcard( HashMap terms, boolean req )
can only combine search requests for different field names. (I hope this
explanation is clear enough.)

For example, if the index was built from 3 types of fields:
[body, ...], [path, ...], [type, ...]
we can add no more than 3 pairs to the HashMap.
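
The HashMap limitation above can be seen in a small plain-Java sketch. It
does not use Lucene at all; the class name FieldTermMapSketch and both
helper methods are hypothetical, and the list of pairs is only one possible
workaround, not something the code in this message does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: shows why a map keyed by field name
// can hold just one wildcard pattern per field.
class FieldTermMapSketch {

    // Puts two patterns under the same field name; the second replaces the first.
    static Map<String, String> putTwice() {
        Map<String, String> terms = new HashMap<>();
        terms.put("body", "be*");
        terms.put("body", "*ee");   // overwrites "be*": one value per key
        return terms;
    }

    // Assumed alternative: a list of ( field, pattern ) pairs keeps both entries.
    static List<String[]> asPairs() {
        List<String[]> pairs = new ArrayList<>();
        pairs.add(new String[] {"body", "be*"});
        pairs.add(new String[] {"body", "*ee"});
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println("map entries:  " + putTwice().size());
        System.out.println("list entries: " + asPairs().size());
    }
}
```

With a list of pairs, a loop like the one in searchBooleanWildcard could
add two patterns for the same field, at the cost of changing the method's
parameter type.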

And an example of a search:
HashMap phQeryTerms = new HashMap();
phQeryTerms.put( "body", "*e*n" );
sr.searchBooleanWildcard( phQeryTerms, true );

Corresponding output:
Boolean wildcard search:
built query: +*e*n
news7:bean; Score: 1.0
news73:beeemN; Score: 0.25
news71:jEaN; Score: 0.25

This is, of course, with the following pairs indexed
( path , body ):
news7, bean
news71, jEaN
news72, lion
news73, beeemN
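
To make the matching in this example concrete, here is a minimal sketch of
'*' and '?' matching in plain Java. It is not Lucene's WildcardQuery
implementation; the class and method names are hypothetical. It also
assumes the body terms were lowercased at index time, which is only a
guess at why jEaN and beeemN match the pattern *e*n.

```java
// Hypothetical illustration only, not Lucene code.
class WildcardMatchSketch {

    // Returns true if text matches pattern, where '*' matches any run of
    // characters (including none) and '?' matches exactly one character.
    static boolean matches(String pattern, String text) {
        int p = 0, t = 0;
        int starP = -1, starT = -1;   // position of the last '*' and the text index it started at
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                starP = p;            // remember the star; first try matching it against nothing
                starT = t;
                p++;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;
                t++;
            } else if (starP >= 0) {
                p = starP + 1;        // backtrack: let the last star absorb one more character
                starT++;
                t = starT;
            } else {
                return false;
            }
        }
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;                      // trailing stars match the empty string
        }
        return p == pattern.length();
    }

    public static void main(String[] args) {
        String[] bodies = {"bean", "jEaN", "lion", "beeemN"};
        for (String body : bodies) {
            // Assumption: terms are lowercased before matching.
            System.out.println(body + " -> " + matches("*e*n", body.toLowerCase()));
        }
    }
}
```

Under these assumptions lion is the only body rejected, since it contains
no e, which agrees with the three hits listed above.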
  
BK Best regards

BK Benjamin


-- 
Best regards,
 Serge  mailto:[EMAIL PROTECTED]






RE: Re[4]: WildcardQuery

2001-12-11 Thread Benjamin Kopic

Thank you Sergej :)





Re: Flaws in WildcardQuery design....

2001-10-21 Thread Dave Kor


--- Robert J. Lebowitz [EMAIL PROTECTED] wrote:
 I've been experimenting with the new WildcardQuery class and since there
 isn't really any documentation on its use, I've been sort of poking at it to
 see how it is used.
 
 From what I've seen so far, you must construct the query by passing it a
 Term object.  However, the String that is passed to the constructor for the
 Term must end with an asterisk.

Hmm.. are you certain that WildcardQuery is used? If you are using
QueryParser, then terms ending with an asterisk are handled by
PrefixQuery and not WildcardQuery.

Although I think WildcardQuery could possibly be used with terms
ending with an asterisk, it was never tested this way, as it is
assumed that PrefixQuery would handle such cases.
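
The split described above can be sketched in plain Java. This is not the
QueryParser's actual code: the class TermKindSketch and its classify
method are hypothetical, and the rule (a single trailing asterisk means
PrefixQuery, any other '*' or '?' means WildcardQuery) is an assumption
drawn from this message. It does not model whether a given parser version
accepts a leading asterisk at all.

```java
// Hypothetical illustration only, not QueryParser code.
class TermKindSketch {

    // Returns the name of the Lucene query class a term would map to
    // under the assumed rule described in the lead-in.
    static String classify(String term) {
        int star = term.indexOf('*');        // index of the first '*', or -1
        int question = term.indexOf('?');    // index of the first '?', or -1
        if (star < 0 && question < 0) {
            return "TermQuery";              // no wildcard characters at all
        }
        // The first '*' being the last character means it is the only '*'.
        boolean onlyTrailingStar =
                question < 0 && term.length() > 1 && star == term.length() - 1;
        return onlyTrailingStar ? "PrefixQuery" : "WildcardQuery";
    }

    public static void main(String[] args) {
        String[] samples = {"new", "new*", "*new*", "n?w"};
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

Printing query.toString( field ) on a parsed query remains the more
reliable way to check what a real parser built.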

 Question 1:  Has the QueryParser been updated such that it can handle
 wildcard terms using the new WildcardQuery?  I.e., can it return some kind
 of BooleanQuery that incorporates some terms utilizing wildcard searches
 (and others that don't)?

Yes.



