Database integration best practices ...
Hi!

Like many others, I want to use Lucene as a frontend for searching content that is buried in a relational database. As far as I can see this should be no problem: build one document per row in the tables. Since many of you have already taken this approach, I would appreciate any suggestions on the following issues:

- Consistency
  What is the best way to maintain consistency between the database and the Lucene index? I can think of two solutions:
  - update the index on every insert
  - ignore the index at insert time and do a full reindex periodically (e.g. nightly)

- Transactional issues
  What is the best way to make a database insert plus an index insert atomic?

- Content separation
  My content in the database is spread across multiple tables, but there are clusters of related tables. For example, I have 3 tables describing authors of papers. My solution would be a separate index for each of those clusters. When the user does a search, every index must of course be searched separately. Is maintaining a separate index for every topic a good idea?

One might ask why I don't search against the database directly. Well, I would have to build a search interface (think of boolean issues) on my own, which is definitely something I do not have time for. Additionally, my database (PostgreSQL) doesn't support full-text searches (yet).

Any additional input from your experience is very welcome!

Thx in advance,
Peter

--
To unsubscribe, e-mail: mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]
Re: Database integration best practices ...
I forgot one thing to ask: search results should be anchored to a unique id which maps to a serial in the database. If my search now returns multiple such ids, what is the best way to transform this into a row-fetching SQL statement? I think I would end up with something like:

SELECT * FROM atable WHERE id = 12 AND id = 23 AND id = 34 AND ...

and so on. For this purpose it would be nice to limit the Lucene search results, so that the SQL statement can be limited. Any better idea?

Thx,
Peter
RE: Database integration best practices ...
Just a small suggestion: replace AND with OR in your SELECT statement.

I used to store some fields in the Lucene index in order to show them on the results page. Otherwise, I use:

SELECT * FROM atable WHERE id IN (12, 23, 34, ...)

Elie

-----Original Message-----
From: Peter Sojan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 27 March 2002 09:59
To: Lucene Users List
Subject: Re: Database integration best practices ...
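The IN (...) form Elie suggests can be generated from the ids that come back from the search. What follows is only a minimal sketch, assuming the ids have already been read from the hits into a list. The class name InClauseBuilder and the maxIds cap are hypothetical; the table name atable is the one used in the messages above. The sketch emits JDBC-style ? placeholders instead of concatenating the values into the statement.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Lucene or JDBC. */
final class InClauseBuilder {

    /** Keeps at most maxIds ids so that the SQL statement stays bounded. */
    static List<Integer> limit(List<Integer> ids, int maxIds) {
        return new ArrayList<>(ids.subList(0, Math.min(maxIds, ids.size())));
    }

    /**
     * Builds "SELECT * FROM <table> WHERE id IN (?, ?, ...)" with one
     * placeholder per id. The table name is concatenated as-is, so it must
     * be a trusted constant, never user input.
     */
    static String selectByIds(String table, int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("need at least one id");
        }
        StringBuilder sql = new StringBuilder("SELECT * FROM ")
                .append(table)
                .append(" WHERE id IN (");
        for (int i = 0; i < idCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append('?');
        }
        return sql.append(')').toString();
    }
}
```

Capping the list before building the statement is one way to get the "limited" SQL Peter asks about; the ids would then be bound to the placeholders in order.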
Re: Database integration best practices ...
On Wed, Mar 27, 2002 at 10:53:30AM +, geoff webb wrote:
> Either:
>
> SELECT * FROM atable WHERE id IN ( 12, 23, 34 ... )
>
> OR
>
> SELECT * FROM atable WHERE id = 12 OR id = 23 OR id = 34 OR

Of course it has to be OR'ed. Must have been a Freudian typo :)

> Flow of retrieving an entry would be:
> search index - present results (from index) - select desired result (from database)

This should be the right way to go. I just don't want to let my index grow that much, but as you mention, going directly into the database for displaying results would cause prohibitive bottlenecks in the backend ...

Thx
Peter
StopFilter-troubles
Dear Lucene users,

Does someone have an answer to the following question? If I add a StopFilter to my Analyzer, the stop words I gave it are left out of the query. So far, so good. But when my query looks like this:

(field1 : x) AND (field2 : stopword) AND (field1 : y)

the StopFilter does its work, but the resulting query is a big mess:

(field1 : x) AND ( ) AND (field1 : y)

and because of that the search results are no good. I had hoped it would search for (field1 : x) AND (field1 : y). I think the StopFilter does a poor job here. Is anyone familiar with this problem, and does anyone have an answer for me?

Puk Witte.
Re: StopFilter-troubles
I tried something like this on one Lucene index:

description:travel AND description:a

The results were the same as this query:

description:travel

This seems right to me.

Otis
Retrieve all documents from Index, How to?
Hi all,

I indexed all records in my database. One of the fields in my index stores the primary key from the database (each key is in a different document). Is there a way to retrieve all documents from the index? I need to validate that all records are indexed. I tried searching for * but it returns an empty result.

I'm using StandardAnalyzer with Field.Keyword for the PrimaryKey field, and everything else is Field.Text.

Thanks for your help.
TihonOne
RE: StopFilter-troubles
Dear all, especially Otis Gospodnetic (thanks for your answer),

without ( )'s the StopFilter is indeed doing a good job, but if I put them around parts of the query, then the search result is wrong. For example:

(field1 : x) AND (field2 : stopword) AND (field1 : y)

So I'm afraid my problem is not solved yet. But maybe someone can try it with the ()'s in their own tool and tell me whether they get the same problem. Then I will know whether I made a mistake.

Puk Witte
RE: Lucene with Number+Text
Hi,

I am indexing as a text field. Searches for 05qzFebqz01 and 05q* do not work. I am using a StandardAnalyzer. A search for 05* works. Searches on another word, cq6r, work fine. Any idea why this is happening?

Thanks!
Aruna.

-----Original Message-----
From: Ian Lea [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 25, 2002 3:56 PM
To: Lucene Users List
Subject: Re: Lucene with Number+Text

Good thinking. In my test, using a Text field, searches for 1727a and 1727* both return a hit, but if I switch to Keyword they don't.

--
Ian.

[EMAIL PROTECTED] (Shannon Booher) wrote:
> I think I have seen a similar problem.
> Are you guys using Keyword or Text fields?
RE: StopFilter-troubles
I don't know enough about the query parser to be able to answer that question, but why do you really need those parentheses? It would also be great if you could submit this as a bug at http://jakarta.apache.org/lucene/

Thanks,
Otis
Re: Retrieve all documents from Index, How to?
Hi One,

Try doing something like this for all records:

doc.add(Field.Text("type", "product"));

Then search for

type:product

It will return all the records.

William.

From: Tihon One [EMAIL PROTECTED]
Reply-To: Lucene Users List [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Retrieve all documents from Index, How to?
Date: Wed, 27 Mar 2002 16:05:16 +
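Once a search returns every document, the validation TihonOne asks about comes down to comparing two sets of primary keys: those in the database and those read back from the index. A minimal sketch of that comparison follows, assuming both key sets fit in memory and have already been loaded; the class name IndexCoverage is hypothetical and nothing in it touches Lucene itself.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration; the key sets are assumed to be loaded elsewhere. */
final class IndexCoverage {

    /** Returns the database keys that have no matching document in the index, sorted. */
    static Set<String> missingFromIndex(Set<String> databaseKeys, Set<String> indexedKeys) {
        Set<String> missing = new TreeSet<>(databaseKeys); // copy, so the input is not modified
        missing.removeAll(indexedKeys);
        return missing;
    }
}
```

An empty result means every record has a document; anything else is the list of rows that still need indexing.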
RE: Retrieve all documents from Index, How to?
Dear Otis Gospodnetic,

these parentheses seem (and are) rather unnecessary, but users of my program fill in text fields and boolean radio buttons, and the program then builds a query out of them. My program has a lot of fields (about twenty), so a query will often get rather complicated. I thought about it and made a query-maker tool that also takes care of the correct use of parentheses. As a result there are sometimes parentheses that are not useful, but are a side effect of this tool. I thought this would not lead to any problems; unfortunately I had not thought about my StopFilter.

But the problem also occurs with a more common query:

(field : AND field : y) OR (field : stopword AND field : stopword)

In this case I get a NullPointerException.

I am afraid I deleted your mail by accident. Could you please mail me the address you gave me for reporting the bug a second time?

Thanks,
Puk Witte

PS If there is someone else who could help me, please react!
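Until the parser behaviour is sorted out, a query-maker tool like the one Puk describes could avoid producing empty groups by skipping stop-word terms before it adds the parentheses. This is only a sketch of that workaround, not a fix in Lucene: the class QueryAssembler is hypothetical, and it assumes the tool knows the same stop-word list that was given to the StopFilter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical query-string builder; it only assembles text and calls no Lucene API. */
final class QueryAssembler {
    private final Set<String> stopWords;          // assumed to match the StopFilter's list
    private final List<String> clauses = new ArrayList<>();

    QueryAssembler(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    /** Adds "(field:term)" unless the term is blank or a stop word. */
    QueryAssembler and(String field, String term) {
        String trimmed = term == null ? "" : term.trim();
        if (!trimmed.isEmpty() && !stopWords.contains(trimmed.toLowerCase(Locale.ROOT))) {
            clauses.add("(" + field + ":" + trimmed + ")");
        }
        return this;
    }

    /** Joins the surviving clauses with AND; returns an empty string when none are left. */
    String build() {
        return String.join(" AND ", clauses);
    }
}
```

With this, the example from the thread would be assembled as (field1:x) AND (field1:y), and the caller can skip the search entirely when build() returns an empty string.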
RE: Term
Hi All,

I just tried this again and it seems to work fine. I'm not sure what I did wrong the first time. Just a follow-up.

-----Original Message-----
From: Aruna Raghavan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 27, 2002 12:45 PM
To: Lucene Users List
Subject: Term

Hi,

While adding documents using something like the following:

document.add(Field.Text("object number", m_strObjectNumber));

I used the string "object number" as you can see. I cannot find the values for "object number" when I do a search. I am using a StandardAnalyzer. Any idea why this is happening?

Thanks,
Aruna.
Re: Term
Aruna,

> Hi,
> While adding documents using something like the following:
> document.add(Field.Text("object number", m_strObjectNumber));
> I used the string "object number" as you can see. I cannot find the values
> for "object number" when I do a search. I am using a StandardAnalyzer.
> Any idea why this is happening?

You would need to pose a query like this:

object number:54321

However, this is parsed by the standard analyzer as a query looking for the term 'object' in the default field and for the term '54321' in the field named 'number'.

There are three workarounds:
- change your field name to e.g. objectnumber, and query by: objectnumber:54321
- use 'object number' as the default field for searching.
- construct the query without using the standard analyzer.

I think the best solution would be to change the field name into something shorter like 'onr', which allows for easy querying.

Regards,
Ype
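Ype's first workaround amounts to keeping whitespace out of field names. A small sketch of a helper that could enforce this, assuming the same helper is applied both when indexing and when building queries; the class and method names are hypothetical.

```java
/** Hypothetical helper; apply it at index time and at query time so the names agree. */
final class FieldNames {

    /** Removes all whitespace so the name survives "name:value" query syntax. */
    static String toQueryable(String fieldName) {
        return fieldName.replaceAll("\\s+", "");
    }
}
```

With this, "object number" becomes objectnumber, which matches the query form objectnumber:54321 from the message above.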
RE: Term
Ype,

Thanks for the response. I think the reason my search worked was that "object number" got indexed as "object", and the searcher searched for "object" as well.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 27, 2002 1:31 PM
To: [EMAIL PROTECTED]
Subject: Re: Term
Lexical Error??? Help Please!
I just started having trouble with Lucene. I'm getting this lexical error almost out of nowhere. What does this mean? All I can understand from it is that there is a * that is causing a problem, but there are no *'s in the text being searched!

Thanks,
Alan

[Default] Searching holdings against newly inserted research
[Default] java.rmi.ServerError: Transaction rolled back; nested exception is:
        org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 30. Encountered: * (42), after :
[Default] org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 30. Encountered: * (42), after :
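The trace shows the query parser meeting a * at column 30 of the query string; the thread does not establish where that character came from. If the query string is assembled from record text rather than typed by a user, one defensive option is to strip query-syntax characters before handing the text to the parser. The sketch below makes that assumption. The class QueryText is hypothetical, and the set of characters treated as special is a guess that should be checked against the query syntax of the Lucene version in use.

```java
/** Hypothetical sanitizer; the character list is an assumption, not taken from Lucene. */
final class QueryText {

    /** Replaces characters assumed to be query syntax with spaces, then collapses whitespace. */
    static String stripSpecials(String text) {
        return text
                .replaceAll("[*?:~^()\\[\\]{}\"+\\-!]", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }
}
```

Logging the exact query string just before it is parsed would also show which input contains the * at column 30.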
Field.Text arguments
I'm confused about using Fields. Here are the two methods that are confusing me:

public static final Field Text(String name, Reader value)
public static final Field Text(String name, String value)

The difference is that one takes a Reader and the other a String. I have a field that will have pretty large contents after running through my analyzer (1500 to 6000 characters).

When I use the second of the two methods above, my string is not run through the analyzer, but is stored in the index. When I use the first method, passing in a StringReader based on the String, I don't get anything indexed at all (and therefore it's difficult to know whether it was analyzed).

Is there some other Field type that I should be using for text that I want analyzed and indexed, where the text can be fairly long?

Here's a rough outline of the order in which I'm doing things. FragmentAnalyzer is my own custom class that seems to normally work:

Document document = new Document();
Reader reader = new StringReader(text);
document.add(Field.Text("contents", reader));
...
FragmentAnalyzer analyzer = new FragmentAnalyzer();
IndexWriter writer = new IndexWriter(pathToIndex, analyzer, isCreateNewIndex);
writer.addDocument(document);
writer.close();

rob
Re: Field.Text arguments
Hi,

that's interesting. If you use Field.Text(String name, Reader value), the field should be indexed but not stored. Strange; I had no problems, but I didn't use a StringReader, just file readers.

Try to create your own customized field, passing a string that is not stored. I don't remember the documentation exactly, but this should be possible by passing the right parameters to the Field constructor.

regards
joe

In reply to: Robert A. Decker [EMAIL PROTECTED], Thu, 28 Mar 2002 00:22:36 +0100 (MET)
Re: Field.Text arguments
I think I may be confused about the terminology. What is meant by 'not stored'? The comment on the method that takes a Reader as an argument states that it 'is tokenized and indexed, but not stored in the index verbatim'. I took this to mean that it stores the version of the text after it is run through the analyzer, which is exactly what I want.

Now that I've looked at the index files more closely, I'm starting to think that perhaps the text is being stored. It's hard to tell, though.

I want to be able to get at the contents of the stored field, and can do so easily when I use the method that takes a String as an argument. Here's how I'm trying to get the field back:

Field contentsField = doc.getField("contents");

I get null back when I use the Reader-as-argument Field method, but get the correct, but unanalyzed, text back when I use the String-as-argument Field method.

thanks,
rob

In reply to: Joe Hajek, Thu, 28 Mar 2002
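The behaviour Rob reports (null from getField for the Reader variant, verbatim text for the String variant) is consistent with the distinction Joe draws between indexed and stored. The toy model below illustrates that distinction only; it is not Lucene code. ToyDocument and its methods are hypothetical, and the lower-casing and splitting stand in for whatever a real analyzer would do.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy model only: mimics the stored/indexed distinction, not Lucene's classes. */
final class ToyDocument {
    private final Map<String, String> stored = new HashMap<>();
    private final Map<String, Set<String>> indexed = new HashMap<>();

    /** Tokenizes the value for searching; keeps the verbatim text only when store is true. */
    void add(String name, String value, boolean store) {
        Set<String> terms = indexed.computeIfAbsent(name, k -> new HashSet<>());
        for (String token : value.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        if (store) {
            stored.put(name, value);
        }
    }

    /** A field is searchable whether or not its text was stored. */
    boolean matches(String name, String term) {
        Set<String> terms = indexed.get(name);
        return terms != null && terms.contains(term);
    }

    /** Returns the verbatim text, or null when the field was indexed without storing. */
    String get(String name) {
        return stored.get(name);
    }
}
```

In this model, what comes back from get() is always the original text, never the analyzed tokens, which matches Rob's observation that the stored text was "correct, but unanalyzed".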
Re: Chainable Filter contribution
Dan,

Totally my bad. I had since changed it but hadn't posted it to the list because I didn't think anyone found it useful. Here's the correct version. I haven't really documented it since it's pretty straightforward. Just holler if you need any help.

Regards,
Kelvin

----- Original Message -----
From: Armbrust, Daniel C. [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, March 28, 2002 5:17 AM
Subject: Chainable Filter contribution

I found this in the mailing list, and I do need something like this, as I need to apply more than one filter at a time. I'm fairly new to Lucene, however, and my knowledge of BitSets is very limited. My question, if you would be so kind as to donate a minute of time to me, is: how does this combine the filters? From my naive look through it, it seems that all filter results would get discarded except for the last filter that was applied.

Thanks,
Dan

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Filter;

import java.io.IOException;
import java.util.BitSet;

/**
 * <p>
 * A ChainableFilter allows multiple filters to be chained
 * such that the result is the intersection of all the
 * filters.
 * </p>
 * <p>
 * Order in which filters are called depends on
 * the position of the filter in the chain. It's probably
 * more efficient to place the most restrictive filters
 * /least computationally-intensive filters first.
 * </p>
 *
 * @author <a href="mailto:[EMAIL PROTECTED]">Kelvin Tan</a>
 */
public class ChainableFilter extends Filter
{
    /** The filter chain */
    private Filter[] chain = null;

    /**
     * Creates a new ChainableFilter.
     *
     * @param chain The chain of filters.
     */
    public ChainableFilter(Filter[] chain)
    {
        this.chain = chain;
    }

    public BitSet bits(IndexReader reader) throws IOException
    {
        BitSet result = null;
        for (int i = 0; i < chain.length; i++)
        {
            result = chain[i].bits(reader);
        }
        return result;
    }
}

ChainableFilter.java
Description: Binary data
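Kelvin's corrected attachment is not reproduced in the archive, so the following is only a guess at the idea, not his code. The Javadoc describes the result as the intersection of all filters, and java.util.BitSet.and() is the standard way to intersect bit sets. The sketch works on plain BitSets so that it runs without Lucene; the class name BitSetChain is hypothetical. Inside a bits(IndexReader) method, each element of the list would be the value returned by chain[i].bits(reader).

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper showing the intersection step only; it does not extend Lucene's Filter. */
final class BitSetChain {

    /** ANDs all bit sets together; returns an empty set for an empty chain. */
    static BitSet intersect(List<BitSet> results) {
        BitSet combined = null;
        for (BitSet bits : results) {
            if (combined == null) {
                combined = (BitSet) bits.clone(); // copy so the first filter's bits are not mutated
            } else {
                combined.and(bits);               // keep only documents allowed by every filter so far
            }
        }
        return combined == null ? new BitSet() : combined;
    }
}
```

The clone matters if a filter caches and reuses its BitSet; and() modifies the receiver in place.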
Re: Question on the FAQ list with filters
On Wed, Mar 27, 2002 at 03:52:21PM -0600, Armbrust, Daniel C. wrote:
> From the FAQ:
>
> 16. What is filtering and how is it performed?
>
> * Search Query - in this approach, provide your custom filter object when you call the search() method. This filter will be called exactly once to evaluate every document that resulted in a non-zero score.
>
> * Selective Collection - in this approach you perform the regular search and, when you get back the hit list, collect only those hits that match your filtering criteria. In this approach, your filter is called only for hits that are returned by the search method, which may be only a subset of the non-zero matches (useful when evaluating your search filter is expensive).
>
> ***
>
> I don't see why the second way is useful. Yes, your filter is called only for hits that got returned by the search method, but aren't those the same hits that the search() method would run through the filter? Maybe I'm just not reading it closely enough.
>
> Is my assumption that it is faster to provide a filter to the search() method than to do a selective collection correct?

It depends. That's more or less the point of the FAQ answer, though it could be more clearly expressed.

The gist of the FAQ seems to be that you can either do the filtering BEFORE you do the search, or AFTER you do the search. Obviously the question is which is more expensive: filtering out inappropriate documents, or searching for the possible hits?

If filtering is cheaper, you do the filtering first, then do the search. If filtering is expensive, you do the search first, then do the filtering.

You should also factor in which is more restrictive: will either the filter or the search drop out a large number of the documents? If you can arrange it so that one is both cheaper and drops out the majority of the documents, you win.

In either case, you implement some sort of object to which you can hand an org.apache.lucene.index.TermDocs and get back a yes or no as to whether it's a valid possible search result.
From looking at the source for:

org.apache.lucene.search.Filter,
org.apache.lucene.search.DateFilter, and
org.apache.lucene.search.IndexSearcher,

...it appears that you instantiate your Filter subclass, then for filtering BEFORE the search, you pass YourFilter an IndexReader and get back a BitSet. Or, more to the point, when you invoke IndexSearcher.search(), you pass it YourFilter and a HitCollector, and IndexSearcher.search() gets the BitSet from YourFilter.

A BitSet, from the JDK API, is a vector of bit values (i.e. 1 or 0, corresponding to the Java boolean values true and false). It appears, from looking at the source, that each bit in the BitSet corresponds to the document at the same sequential position in the index.

IndexSearcher.search() has an inner class (this is a bit ambiguous and it's been a year since I've looked at inner classes, so I'm going to just handwave and move along :-) with a collect() method that loops through the TermDocs, skipping the ones for which BitSet.get() returns false.

I'm not sure exactly how you would use an org.apache.lucene.search.Filter to do the filtering AFTER, but presumably that would involve just handing it the TermDocs in question, or maybe IndexReader and Hits both implement a common interface... uhm, no, that's not it. Well, I guess you use your own class for the filter. That's what I ended up doing anyway, in my ignorance of the Filter abstract class. I ended up doing my filtering AFTER, btw, because it involved some expensive lookups in other documents.

There's actually a third option: figure out a way to implement your filter as an additional boolean clause in your search. However, that may or may not be feasible, or the Lucene Filter mechanism may not have been intended to address such cases.

To be honest, the design of the Filter seems less well thought out than the rest of Lucene, like it's an afterthought.
I really oughta join the developers list, I guess, so I can put my money where my mouth is, and submit changes to clarify the docs, etc, when I go roaming through the source.

Steven J. Owens
[EMAIL PROTECTED]
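The two orderings Steven describes can be put side by side in a small model. This is a sketch on plain document numbers, not Lucene's search loop: FilterOrder and both method names are hypothetical, and matchingDocs stands in for whatever the search would produce. The first method mirrors the collect() loop that skips documents whose bit is not set; the second applies an arbitrary check only to hits that were actually returned.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.function.IntPredicate;

/** Toy comparison of filtering via a precomputed BitSet versus filtering returned hits. */
final class FilterOrder {

    /** BitSet approach: a document is collected only if its bit is set. */
    static List<Integer> collectWithBits(int[] matchingDocs, BitSet allowed) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : matchingDocs) {
            if (allowed.get(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    /** Filter afterwards: the (possibly expensive) check runs once per returned hit. */
    static List<Integer> filterAfter(int[] matchingDocs, IntPredicate expensiveCheck) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : matchingDocs) {
            if (expensiveCheck.test(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

The trade-off is where the cost lands: the BitSet has to be computed for the whole index up front, while the after-the-fact check costs something per hit but never runs for documents the search did not return.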
Re: Chainable Filter contribution
Stephan,

I honestly don't know. There's going to be a /contrib section set up soon, though, so I think it might go in there at least. Does it matter? :)

Regards,
Kelvin

----- Original Message -----
From: Strittmatter Stephan (external) [EMAIL PROTECTED]
To: 'Lucene Users List' [EMAIL PROTECTED]
Sent: Thursday, March 28, 2002 2:54 PM
Subject: RE: Chainable Filter contribution

Hi Kelvin,

I have done something similar, only doing XOR for my chains. But now your improved filter is better than my own, so I think I will replace mine with yours. Will it be part of Lucene in the future?

Regards,
Stephan

-----Original Message-----
From: Kelvin Tan [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 28, 2002 2:58 AM
To: Armbrust, Daniel C.
Cc: [EMAIL PROTECTED]
Subject: Re: Chainable Filter contribution