query parser
I want to use QueryParser to parse my query string, but the default field should be a group of fields, so that each term is searched across several fields. Can anyone let me know how? For example, if my query is "new books", then "new" should be searched in different fields (content and title), and "books" should also be searched in those fields (content and title). How do I accomplish this, and how can I extend QueryParser to do the above?
Re: query parser
Take a look at the class MultiFieldQueryParser; I think it does exactly what you want. GR, Rainer

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
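For reference, a minimal sketch of what that might look like against the Lucene 1.9-era API; the field names and the choice of StandardAnalyzer here are just examples, not something from the original thread:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.MultiFieldQueryParser;
import org.apache.lucene.search.Query;

public class MultiFieldExample {
    public static void main(String[] args) throws Exception {
        // Each term of the query is searched in every listed field, so
        // "new books" expands to roughly:
        //   (content:new title:new) (content:books title:books)
        String[] fields = {"content", "title"};
        Query q = MultiFieldQueryParser.parse(
                "new books", fields, new StandardAnalyzer());
        System.out.println(q.toString());
    }
}
```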
RE: Get only count
Does this mean that the collect method can be called for a document with score = 0?

-----Original Message----- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 07, 2006 6:35 PM To: java-user@lucene.apache.org Subject: Re: Get only count

On 3/7/06, [EMAIL PROTECTED] wrote: Can a matching document have a score equal to zero? Yes. Scorers don't generally use the score to determine if a document matched the query. Scores <= 0.0f are currently screened out at the top-level search functions, but not when you use a HitCollector yourself. -Yonik

-----Original Message----- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 07, 2006 6:20 PM To: java-user@lucene.apache.org Subject: Re: Get only count

On 3/7/06, [EMAIL PROTECTED] wrote: While you added if (score > 0.0f). The Javadoc contains the line: HitCollector.collect(int,float) is called for every non-zero scoring document. That should probably read "is called for every matching document". -Yonik
Re: query parser
Hi Rainer, Thanks. I have one more question: how do I set different boosts for each field using the query parser? Can I set a different boost for each field? Rgds Prabhu
Re: Lucene 1.9.1 and timeToString() apparent incompatibility with 1.4.3
: Thanks Chris for making it clear, I had read the comment but I had not : understood that it implied incompatibility. But will the code be preserved : in Lucene 2.0, in light of the comment contained in the Lucene 1.9.1 : announcement ? I don't really know, it's currently being discussed in LUCENE-500... http://issues.apache.org/jira/browse/LUCENE-500 -Hoss
Re: Lucene 1.9.1 and timeToString() apparent incompatibility with 1.4.3
Thanks Chris, I think I'll opt for re-creating the index now, using the new 1.9.1 code. Sooner or later, it seems to me, the deprecated code will be removed anyway. Better to face the pain now than later; it also makes it possible for me to take advantage of the new date resolution features. Even though I can live without them, they can be a performance boost. Victor
MultiField Query Parser
Hi, I need different boosts for the fields which we define in the multi-field query parser. How can this be accomplished? Rgds Prabhu
Re: MultiField Query Parser
You could try to inherit from MultiFieldQueryParser:

public class BoostableMultiFieldQueryParser extends MultiFieldQueryParser {
    // TODO: add the constructors of the super class

    public static Query parse(String query, String[] fields,
            BooleanClause.Occur[] flags, Analyzer analyzer, float[] boosts)
            throws ParseException {
        if (fields.length != flags.length)
            throw new IllegalArgumentException("fields.length != flags.length");
        BooleanQuery bQuery = new BooleanQuery();
        for (int i = 0; i < fields.length; i++) {
            QueryParser qp = new QueryParser(fields[i], analyzer);
            Query q = qp.parse(query);
            q.setBoost(boosts[i]); // ATTENTION: the only new line !!!
            bQuery.add(q, flags[i]);
        }
        return bQuery;
    }
}

I copied the code of the method parse(String, String[], BooleanClause.Occur[], Analyzer) and added the parameter float[] boosts. I marked the only line I have inserted. You have to add the constructors from the super class to get the class compiled. I didn't have the time to test this idea; please post a reply if it works, if you try this. Rainer
Re: RangeQuery and RangeFilter
See http://wiki.apache.org/jakarta-lucene/FilteringOptions --- Anton Potehin wrote: Which is faster, RangeQuery or RangeFilter?
1.4.3 and 64bit support? out of memory??
hi all, I've been trying to load a 6GB index on Linux (16GB RAM) but am having no success. I wrote a program that allocates memory and it was able to allocate as much RAM as I requested (I stopped at 12GB). However, I am receiving the following stack trace when trying to load the indexes:

JVMDUMP013I Processed Dump Event uncaught, detail java/lang/OutOfMemoryError.
Exception in thread main java.lang.OutOfMemoryError
  at org.apache.lucene.index.TermInfosReader.readIndex(TermInfosReader.java:82)
  at org.apache.lucene.index.TermInfosReader.<init>(TermInfosReader.java:45)
  at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:112)
  at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:89)
  at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:118)
  at org.apache.lucene.store.Lock$With.run(Lock.java:109)
  at org.apache.lucene.index.IndexReader.open(IndexReader.java:111)
  at org.apache.lucene.index.IndexReader.open(IndexReader.java:106)
  at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:43)

Any ideas? Thanks in advance,
Re: Get only count
On Wednesday 08 March 2006 09:25, [EMAIL PROTECTED] wrote: Does this mean that the collect method can be called for a document with score = 0?

The collect() method is called after next() on the top-level Scorer has returned true. In between, score() is called on that Scorer to provide the score value, but the score value is not tested. Most Scorers give only positive score values for matching documents. This is implemented in the IndexSearcher.search(...) and Scorer.score(HitCollector) methods. Regards, Paul Elschot
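Tying this back to the original "Get only count" question: a count-only collector is straightforward. The sketch below uses the HitCollector API of this era and deliberately ignores the score value, so (per the discussion above) it counts every matching document, including any that happen to score zero:

```java
import org.apache.lucene.search.HitCollector;

// Counts every document passed to collect(), without looking at the score.
public class CountingHitCollector extends HitCollector {
    private int count = 0;

    public void collect(int doc, float score) {
        count++; // the score is deliberately not tested
    }

    public int getCount() {
        return count;
    }
}
```

Usage would be along the lines of: searcher.search(query, collector); followed by collector.getCount().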
Re: 1.4.3 and 64bit support? out of memory??
z shalev wrote: hi all, i've been trying to load a 6GB index on linux (16GB RAM) but am having no success. i wrote a program that allocates memory and it was able to allocate as much RAM as i requested (stopped at 12GB)

Was your program that got up to 12GB of memory written in Java, and using the same JVM with the same -Xmx settings as your Lucene program? Dan -- Daniel Armbrust Biomedical Informatics Mayo Clinic Rochester daniel.armbrust(at)mayo.edu http://informatics.mayo.edu/
RE: Lucene Scoring
Hi, From: [EMAIL PROTECTED] Anyone have a doc or something that would allow me to explain this to execs? A Lucene Scoring for Dummies idea... explaining a math algorithm to an exec or someone with no background is not that easy :) http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html And the Lucene book: - 3.3: Understanding Lucene scoring http://lucenebook.com/search?query=scoring Pasha Bizhan
Re: 1.4.3 and 64bit support? out of memory??
yes, 100%. Dan Armbrust wrote: Was your program that got up to 12GB of memory written in Java, and using the same JVM with the same -Xmx settings as your Lucene program?
Does Lucene support on-disk search?
Hi, I heard that Lucene loads the index into memory to do a search, which does not sound quite right to me. I will not be surprised if Lucene is smart enough to load the index into memory when it is feasible, but I'd be surprised if it ALWAYS loads the index into memory to do the search, which I think would have scalability problems. Could someone clarify this? Thanks! By the way, could someone please share some experience on the performance of Lucene: say, on a data set of a few gigabytes and a reasonable query, what would be the average search time? Xiaocheng
Re: Lucene Scoring
[EMAIL PROTECTED] wrote: Anyone have a doc or something that would allow me to explain this to execs? Roughly speaking: * Documents containing *all* the search terms are good * Matches on rare words are better than for common words * Long documents are not as good as short ones * Documents which mention the search terms many times are good ...although there are more factors you can choose to add, like emphasising individual query terms or individual docs in the index. Cheers Mark
Re: Does Lucene support on-disk search?
Lucene _can_ load the index into memory, but it doesn't have to; if you want further details, see the Javadocs on RAMDirectory versus FSDirectory. I think you will find it has good performance on a few gigs of data. Results, of course, vary based on what you are asking it to do and what kind of hardware you have. -Grant -- Grant Ingersoll Sr. Software Engineer Center for Natural Language Processing Syracuse University School of Information Studies 335 Hinds Hall Syracuse, NY 13244 http://www.cnlp.org Voice: 315-443-5484 Fax: 315-443-6886
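A rough sketch of the two options Grant mentions, against the Lucene 1.x API; the index path is a placeholder, and the RAMDirectory copy constructor taking a path is assumed from the API of this era:

```java
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class DirectoryExample {
    public static void main(String[] args) throws Exception {
        // On-disk search (the default): index files stay on disk and
        // are read through the OS file cache via FSDirectory.
        IndexSearcher diskSearcher = new IndexSearcher("/path/to/index");

        // Optional in-memory search: copy the whole index into RAM first.
        Directory ram = new RAMDirectory("/path/to/index");
        IndexSearcher ramSearcher = new IndexSearcher(ram);
    }
}
```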
Re: 1.4.3 and 64bit support? out of memory??
: i am receiving the following stack trace:
:
: JVMDUMP013I Processed Dump Event uncaught, detail java/lang/OutOfMemoryError.
: Exception in thread main java.lang.OutOfMemoryError
: at org.apache.lucene.index.TermInfosReader.readIndex(TermInfosReader.java:82)

is it possible that parts of your application are eating up all of the heap in your JVM before this exception is encountered? Possibly by opening the index many times without closing it? More specifically, if you write a 4-line app that does nothing but open your index and then close it again, do you get an OOM? ...

public class Main {
    public static void main(String[] args) throws Exception {
        Searcher s = new IndexSearcher("/your/index/path");
        s.close();
    }
}

-Hoss
Re: Lucene Scoring
: Roughly speaking: : : * Documents containing *all* the search terms are good : * Matches on rare words are better than for common words : * Long documents are not as good as short ones : * Documents which mention the search terms many times are good Be wary of the distinction between "term" and "word" and how that affects statements like "Long documents are not as good as short ones" ... If you have a title field and body field and one document has a really long body, but a very short title, then a search on the title isn't going to be penalized by the length of the body ... you have to choose your words carefully. -Hoss
Re: 1.4.3 and 64bit support? out of memory??
hey chris, I will check and let you know. Just to make sure: basically I see the OS allocating memory (up to about 4GB) while loading the indexes into memory, and then crashing in the TermInfosReader class. What I noticed was that the crash occurred when Lucene tried to create a Term array with the following code: new Term[indexSize]. I assume, since this is an array, Java was trying to allocate consecutive blocks in memory, and this is hard to find, even on a 16GB RAM machine, especially since (if I'm not mistaken) indexSize here is the termEnum size (which in my case is rather large). I will get back to you about the one-liner; if you have any other thoughts I'd be extremely happy to hear them, as this problem is a major road block. thanks a million
Re: 1.4.3 and 64bit support? out of memory??
z shalev wrote: i assume, since this is an array java was trying to allocate consecutive blocks in memory and this is hard to find, even in a 16 GB RAM machine

That's not exactly how memory works. When a program looks to allocate a chunk of memory, the chunk is allocated from the virtual memory space. In the case of Windows XP on a 32-bit machine, the maximum contiguous virtual memory is somewhere just below 2GB in a best-case scenario (usually it's more like 1.5GB), regardless of the amount of physical RAM. In the case of a 64-bit machine, though, the virtual memory space is much, much larger than your 16GB of RAM, so there should be no problem allocating ridiculous amounts of memory (or, for that matter, memory-mapping ridiculously large files to a byte buffer). It wasn't mentioned explicitly, so it's probably worth checking... you are using the 64-bit JVM, right? If you were still using the 32-bit JVM, that would certainly exhibit this sort of behaviour. Daniel -- Daniel Noll Nuix Pty Ltd Suite 79, 89 Jones St, Ultimo NSW 2007, Australia Ph: +61 2 9280 0699 Web: http://www.nuix.com.au/ Fax: +61 2 9212 6902
Lucene Ranking/scoring
Hi, Just wondering how I can rank search results by a combination of fields. I know there is a multi-field sort, but that is just a sorting method: results are sorted by the first field and then by the second field, and so on. What I need is a weighted combination. For example, I want to assign a weight of 2 to a title match, 1.5 to an abstract match, and 3 to a date match (i.e. how close the last-modified date is). The final score would be 2*inTitle + 1.5*inAbstract + 3*date, instead of sorting by date and then by title within the same date. I checked Lucene's Scorer, Similarity, and ScoreDocComparator and can't find an answer. Implementing ScoreDocComparator seems the closest, but it can only sort the result by one field. Field boosts do not work because the boosting factor has to be set at index time; what I need is setting the weight at query time. Please help. Thanks. Yang
Re: Lucene Ranking/scoring
Hi Yang, Boosting works at query time as well as index time. If you are using the QueryParser, specify boosts like so: title:foo^2 abstract:foo^1.5 date:mydate^3. If you are building queries programmatically, then use the Query.setBoost() method. That will boost relative to how a non-boosted query would score, but keep in mind that you still have tf/idf factors in the score. If you need to get rid of the tf/idf factors, either write your own ScoreDocComparator, or use a FunctionQuery. -Yonik http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server
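A sketch of the programmatic form Yonik describes, using the field names and boost values from Yang's example (BooleanClause.Occur is the 1.9-era way to add clauses):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

public class BoostExample {
    public static void main(String[] args) {
        // Programmatic equivalent of: title:foo^2 abstract:foo^1.5
        TermQuery title = new TermQuery(new Term("title", "foo"));
        title.setBoost(2.0f);

        TermQuery abs = new TermQuery(new Term("abstract", "foo"));
        abs.setBoost(1.5f);

        BooleanQuery q = new BooleanQuery();
        q.add(title, BooleanClause.Occur.SHOULD);
        q.add(abs, BooleanClause.Occur.SHOULD);

        System.out.println(q.toString());
    }
}
```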
Atomic index/search for a phrase
Hi All, I am trying to index and search a phrase (multiple words separated by spaces). How should I index it so that it remains atomic? I have observed that if I index the phrase as a keyword, Lucene doesn't let me retrieve the phrase in search. Please advise. Urvashi
RE: Lucene Ranking/scoring
Hi Yonik, Thanks very much for your suggestion. The query boost works great for keyword matching, but in my case I need to rank the results by date and title. For example, title:foo^2 abstract:foo^1.5 date:2004^3 will only boost documents with date=2004. What I need is boosting by the distance from the specified date, which means 2003 would rank higher than 2002, 2002 higher than 2001, etc. I implemented a customized ScoreDocComparator class which works fine for one field, but I ran into trouble when trying to combine other fields together. I'm still looking at FunctionQuery; I don't know yet if I can figure something out. Any suggestions? Thanks. Yang
RangeQuery, FilteredQuery and HitCollector
Hello, I would like to use a Filter for a RangeQuery (to avoid a potential TooManyClauses exception), and found http://wiki.apache.org/jakarta-lucene/FilteringOptions. The wiki says that FilteredQuery is the best option. But interestingly, when I used that option with a HitCollector, the FilteredQuery test fails. Am I missing something, or is FilteredQuery with a HitCollector forbidden, or is this a bug? Please refer to my test code.

--

import junit.framework.TestCase;
import org.apache.lucene.analysis.cjk.CJKAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumberTools;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.FilteredQuery;
import org.apache.lucene.search.HitCollector;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.RangeFilter;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import java.io.IOException;
import java.io.Serializable;
import java.util.Collection;
import java.util.HashSet;

public class FilteredRangeQueryTest extends TestCase {
    private Directory ramDir;

    protected void setUp() throws Exception {
        ramDir = new RAMDirectory();
        addDocuments();
    }

    public void testRangeQuery() throws Exception {
        IndexSearcher searcher = new IndexSearcher(ramDir);
        Filter filter = RangeFilter.Less("num", NumberTools.longToString(1L));
        Term term = new Term("attid", NumberTools.longToString(113L));
        Query query = new TermQuery(term);
        Hits hits = searcher.search(query, filter);
        assertEquals(0, hits.length());
        HitCollector hitCollector = new TestHitCollector();
        ((TestHitCollector) hitCollector).setSearcher(searcher);
        // This test PASSES:
        searcher.search(query, filter, hitCollector);
        assertEquals(0, ((TestHitCollector) hitCollector).getIds().size());
    }

    public void testFilteredQuery() throws Exception {
        IndexSearcher searcher = new IndexSearcher(ramDir);
        Filter filter = RangeFilter.Less("num", NumberTools.longToString(1L));
        Term term = new Term("attid", NumberTools.longToString(113L));
        Query query = new TermQuery(term);
        FilteredQuery fq = new FilteredQuery(query, filter);
        Hits hits = searcher.search(fq);
        assertEquals(0, hits.length());
        HitCollector hitCollector = new TestHitCollector();
        ((TestHitCollector) hitCollector).setSearcher(searcher);
        // This test FAILS:
        searcher.search(fq, hitCollector);
        assertEquals(0, ((TestHitCollector) hitCollector).getIds().size());
    }

    private void addDocuments() throws IOException {
        IndexWriter writer = new IndexWriter(ramDir, new CJKAnalyzer(), true);
        Document doc = new Document();
        doc.add(Field.Keyword("num", NumberTools.longToString(1000L)));
        doc.add(Field.Keyword("attid", NumberTools.longToString(113L)));
        doc.add(Field.Keyword("itid", "111"));
        writer.addDocument(doc);
        writer.optimize();
        writer.close();
    }

    public class TestHitCollector extends HitCollector implements Serializable {
        private transient Searcher searcher;
        private transient Collection res;

        public TestHitCollector() {
        }

        public void setSearcher(Searcher searcher) {
            res = new HashSet();
            this.searcher = searcher;
        }

        public void collect(int i, float v) {
            try {
                final Document doc = searcher.doc(i);
                res.add(doc.get("itid"));
            } catch (IOException e) {
                // ignored
            }
        }

        public Collection getIds() {
            return res;
        }
    }
}

Thanks, Youngho