Re: The case of the disappearing index files
I have also run into this. What I did was wrap my searcher in a singleton object and check whether it is being used by another thread; if it is, the calling thread is put into a wait state until the other thread finishes using the searcher. Maybe it can help.

buics

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
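The approach described in this reply (one shared searcher, with callers waiting their turn) might look roughly like the sketch below. It is not code from the thread and not Lucene API: `ExclusiveHolder` and `withResource` are hypothetical names, and the type parameter `T` stands in for the searcher. A single instance would be created once and shared by the whole application.

```java
import java.util.function.Function;

/**
 * Hypothetical holder that lends one shared resource to one thread at a time.
 * T stands in for the searcher; nothing here is Lucene API.
 */
final class ExclusiveHolder<T> {
    private final T resource;
    private boolean inUse = false;

    ExclusiveHolder(T resource) {
        this.resource = resource;
    }

    /** Waits while another thread holds the resource, then runs the task with it. */
    <R> R withResource(Function<T, R> task) {
        acquire();
        try {
            return task.apply(resource);
        } finally {
            release(); // always hand the resource back, even if the task throws
        }
    }

    private synchronized void acquire() {
        boolean interrupted = false;
        while (inUse) {
            try {
                wait(); // released threads call notifyAll() below
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        inUse = true;
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    private synchronized void release() {
        inUse = false;
        notifyAll();
    }
}
```

Note the trade-off: this serializes all searches, so only one request is served at a time.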
RE: use Lucene LOCAL (looking for a frontend)
Sorry, but I can't send mail to the server. What is the address, or can you help me?

I want to install Lucene on Windows XP. Do you know where I can find information? I've tried, but when I test the demo by executing

    java org.apache.lucene.demo.IndexFiles {full-path-to-lucene}/src

I get the following error:

    C:\Documents and Settings\juan>java org.apache.lucene.demo.IndexFiles e:\lucene\src
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/lucene/demo/IndexFiles

Does anyone know why? Thanks a lot.
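A `NoClassDefFoundError` at startup generally means the JVM could not find the named class on its classpath, so a likely cause here is that the jar containing the demo classes was not on the classpath when `java` was run. The small check below is only a diagnostic sketch; `ClasspathCheck` and `isLoadable` are hypothetical names, and which jar actually holds `org.apache.lucene.demo.IndexFiles` depends on the Lucene release and is not stated in this thread.

```java
final class ClasspathCheck {
    /** Returns true if the named class can be found by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.lucene.demo.IndexFiles";
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(name + (isLoadable(name) ? " is" : " is NOT") + " on the classpath");
    }
}
```

If the class is reported as missing, adding the relevant jars with `java -cp` (or the `CLASSPATH` environment variable) before running the demo is the usual fix.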
The case of the disappearing index files
We have started using Lucene as the indexer for messages on our website. We are seeing a problem where some index files seem to disappear (we've seen the segment file vanish as well as some others). My first thought after looking through some archives is that maybe we are hitting the "too many open files" problem, which would mean that a file gets deleted in preparation for being rewritten but can't be rewritten because there are no file handles left (this is on a Windows XP box).

Since the indexer is pretty straightforward, in that it opens an IndexWriter, adds the new messages received in the last minute, and then closes the IndexWriter, I'm pretty sure it's OK. Besides, we didn't see this problem until we started doing lots of searches.

I'm feeling less comfortable with the search code. Here are a couple of snippets. The first is a transliteration of some code that I saw in a Doug C. posting (it was in v1.2 form and I needed it in v1.3):

    private Searcher m_Searcher = null;
    private long m_LastModified;

    private void getSearcher() throws IOException
    {
        // has the index been modified since last we looked?
        long newModified = IndexReader.getCurrentVersion(m_IndexDirectory);
        if (m_LastModified != newModified)
        {
            // Get a new searcher and orphan the old one w/o closing
            m_Searcher = new IndexSearcher(m_IndexDirectory);
            m_LastModified = newModified;
        }
    }

Here's a somewhat simplified version (I search more fields) of the search code that calls it:

    public synchronized Hits SimpleSearch(String a_SearchString)
        throws IOException, ParseException
    {
        Query q = QueryParser.parse(a_SearchString, "Body", m_Analyzer);
        try
        {
            getSearcher();
        }
        catch (IOException e)
        {
            // if we can't generate searcher, then claim
            // nothing is there
            m_lggr.error(e.getMessage());
            return null;
        }
        Hits hits = m_Searcher.search(q);
        return hits;
    }

The caller can then walk through the hits list to get the messages.
Originally, I would close the searcher after I got the hits, but I found that you can't access the documents in the Hits structure once the IndexSearcher is closed. (Looking at the source, it appears that the Hits list doesn't actually contain the documents; it only holds references to them and uses the Searcher object to fetch them.) So I now never close the Searcher, though I'll create a new one if the index has been modified since the last time I looked.

One other thing: I know the web guy using this is creating a new object every time he does a search (which I will talk to him about, since I think that's the wrong thing to do based on what I've read). Is that my only problem? Do I really want to wait until garbage collection deletes the old Searchers before the files they have opened get closed?

Does anyone see anything wrong with the above code, or anything I should do to optimize it? Suggestions, anyone?

Scott
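One possible alternative to leaving orphaned searchers for the garbage collector is reference counting: each caller holds a reference while it is still walking its hits, and the old searcher is closed as soon as the last reference is released. The sketch below is generic Java, not Lucene API and not something proposed in this thread. `RefCounted` and `VersionedHolder` are hypothetical names, `T` stands in for a searcher adapted to `java.io.Closeable`, and the `latestVersion` argument plays the role of the value returned by `IndexReader.getCurrentVersion` in the code above.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

/** Hypothetical wrapper: closes the resource when the last reference is released. */
final class RefCounted<T extends Closeable> {
    private final T resource;
    private int refs = 1; // the reference owned by whoever created this wrapper
    private boolean closed = false;

    RefCounted(T resource) {
        this.resource = resource;
    }

    T get() {
        return resource;
    }

    synchronized void retain() {
        if (closed) {
            throw new IllegalStateException("resource already closed");
        }
        refs++;
    }

    synchronized void release() {
        if (closed) {
            throw new IllegalStateException("released more often than retained");
        }
        if (--refs == 0) {
            closed = true;
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }
}

/** Hypothetical holder: reopens on a version change; the old resource closes once idle. */
final class VersionedHolder<T extends Closeable> {
    private final Supplier<T> opener;
    private RefCounted<T> current;
    private long version;

    VersionedHolder(Supplier<T> opener) {
        this.opener = opener;
    }

    /** Returns the current resource with one extra reference; callers must release() it. */
    synchronized RefCounted<T> acquire(long latestVersion) {
        if (current == null || latestVersion != version) {
            RefCounted<T> old = current;
            current = new RefCounted<>(opener.get());
            version = latestVersion;
            if (old != null) {
                old.release(); // drop the holder's own reference
            }
        }
        current.retain();
        return current;
    }
}
```

A caller would call `acquire(...)`, use `get()` for as long as it needs the hits, and call `release()` in a `finally` block. The cost is that every caller has to release reliably; a forgotten release keeps the old files open just as the orphaned searcher does.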
Re: use Lucene LOCAL (looking for a frontend)
Hi,

My company has implemented a socket-based interface for Lucene. To index and query documents you construct an XML document and send it to our "luceneserver", which listens on a socket (it can be on the same machine or a different one). I can email this to you if you wish; it is ~2.5 MB including all the libs needed to run it. It is currently licensed under the GPL.

By the way, installing Tomcat to test the Lucene webapp locally is not too difficult.

Hamish Carpenter
Re: AW: use Lucene LOCAL (looking for a frontend)
On Jan 28, 2004, at 9:37 AM, Sebastian Fey wrote:

> This is JSP. I think I need to set up a web server to run it. (Sorry, all this web and server stuff really isn't my field... :) ) Is there actually a way to use Lucene without a web server?

Yes. Lucene has *nothing* to do with web applications. It is completely orthogonal. Look at the many Lucene articles (mine at java.net are the most recent). You have to write some Java code, but you can easily write a few lines of code that search an index and output the results.

In fact, look at Luke if you want a pre-built desktop application to browse/search an index. Also, look at the lucli project in the sandbox for a command-line tool. The Lucene website has pointers to all that I've mentioned here.

Erik
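To illustrate that a front-end does not need a web server, here is a minimal sketch of a console loop that reads queries and prints results. It deliberately leaves Lucene out so that it runs on its own: `SearchBackend` and `ConsoleSearch` are hypothetical names, and the stand-in backend in `main` just fabricates one result per query. In a real program the backend would delegate to a Lucene searcher.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Collections;
import java.util.List;

/** Hypothetical seam: a real implementation would delegate to a Lucene searcher. */
interface SearchBackend {
    List<String> search(String query);
}

final class ConsoleSearch {
    /**
     * Reads one query per line and prints the matches.
     * Stops at end of input or at the first blank line; returns the number of queries run.
     */
    static int run(Reader in, Writer out, SearchBackend backend) {
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out, true);
        int queries = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null && !line.trim().isEmpty()) {
                String query = line.trim();
                List<String> hits = backend.search(query);
                writer.println(hits.size() + " hit(s) for \"" + query + "\"");
                for (String hit : hits) {
                    writer.println("  " + hit);
                }
                queries++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        writer.flush();
        return queries;
    }

    public static void main(String[] args) {
        // Stand-in backend so the sketch runs without an index.
        SearchBackend demo = q -> Collections.singletonList("doc-about-" + q + ".html");
        run(new InputStreamReader(System.in), new OutputStreamWriter(System.out), demo);
    }
}
```

Keeping the search call behind a small interface means the same backend could later sit behind a desktop or web front-end without changes.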
RE: use Lucene LOCAL (looking for a frontend)
Why don't you take a look at Luke? That way you can play with the index you built and work from there. If you're looking to replicate something like Luke, I'd get studying now ;).

http://www.getopt.org/luke/
AW: use Lucene LOCAL (looking for a frontend)
Hi again, one more question...

> No offense intended at all, but you'll really need some Java experience
> to do stuff with Lucene. There is no real good out-of-the-box
> front-end at the moment, unless you went with something like Searchblox
> (www.searchblox.com).

This is JSP. I think I need to set up a web server to run it. (Sorry, all this web and server stuff really isn't my field... :) )

Is there actually a way to use Lucene without a web server?

thx :)
RE: use Lucene LOCAL (looking for a frontend)
For an "out of the box" job, I found Searchblox pretty impressive and easy to install.
AW: use Lucene LOCAL (looking for a frontend)
>>> How you present the search results will be up to you and the needs of
>>> your project.
>>
>> I have NO experience with Java.
>> It would be nice to see an example of a web interface that uses
>> Lucene, to have something to start with.
>
> No offense intended at all,

:)

> but you'll really need some Java experience to do stuff with Lucene.
> There is no real good out-of-the-box front-end at the moment,
> unless you went with something like Searchblox (www.searchblox.com).

Nice, I'll take a look.

> My JavaDevWithAnt project provides a front-end (using Struts) similar
> to the one that comes with the demo. You can get JavaDevWithAnt (and
> build it yourself) at http://www.ehatchersolutions.com/JavaDevWithAnt

Thanks for the info; I'll do some further reading about all this stuff.
AW: use Lucene LOCAL (looking for a frontend)
> Not being funny, but if you have no experience in Java, then why are you
> using a Java API for index building/text searching?

I'm just testing some possibilities. Although I can't write a Java application, I can read one, and if someone gives me something to start with, I'm sure I'll manage. If Lucene seems to be the best solution, I'll spend some time learning Java.
Re: use Lucene LOCAL (looking for a frontend)
On Jan 28, 2004, at 9:01 AM, Sebastian Fey wrote:

>> How you present the search results will be up to you and the needs of
>> your project.
>
> I have NO experience with Java. It would be nice to see an example of a
> web interface that uses Lucene, to have something to start with.

No offense intended at all, but you'll really need some Java experience to do stuff with Lucene. There is no real good out-of-the-box front-end at the moment, unless you went with something like Searchblox (www.searchblox.com).

My JavaDevWithAnt project provides a front-end (using Struts) similar to the one that comes with the demo. You can get JavaDevWithAnt (and build it yourself) at http://www.ehatchersolutions.com/JavaDevWithAnt

Erik
RE: use Lucene LOCAL (looking for a frontend)
Not being funny, but if you have no experience in Java, then why are you using a Java API for index building/text searching?
RE: use Lucene LOCAL (looking for a frontend)
> To index local files leverage some of the
> code I have put in my java.net articles, or use the Ant task
> that resides in the sandbox repository, or write your own.

I'm satisfied with the index I have for now, but I'll take a look later on...

> How you present the search results will be up to you and the needs of
> your project.

I have NO experience with Java. It would be nice to see an example of a web interface that uses Lucene, to have something to start with.

thx, Sebastian
Re: use Lucene LOCAL (looking for a frontend)
Lucene is a Java API and can be used within any type of Java program (command-line, web, etc.). It is up to you, as the developer embedding Lucene, to put whatever kind of interface you want on it.

To index local files, leverage some of the code I have put in my java.net articles, use the Ant task that resides in the sandbox repository, or write your own.

How you present the search results will be up to you and the needs of your project.

Erik
use Lucene LOCAL (looking for a frontend)
Hi,

My task is to implement a search engine for some documentation in HTML. The files are not online but local. The "getting started" guide on the Lucene home page only explains how to set up Lucene with Tomcat, and I have never set up a web server.

I was able to create an index of my files, but now the web front-end is missing. I think it's in luceneweb.war, right?

So, my question: how can I use Lucene locally? Can someone provide an HTML front-end?

thx in advance, Sebastian
Re: arrays of values in a field
Erik Hatcher wrote:

> On Jan 27, 2004, at 2:27 PM, Gabe wrote:
>
>> If I have a group of documents and I want to filter on a category, it is
>> fairly straightforward. I just create a Field that contains the category
>> and filter on it. However, what if I want the field "category" to have
>> multiple possible values? Is there a known best way to filter on that?
>>
>> I imagine it is possible to "hack" it by, say, creating a field with value:
>> |category1|category2|category3| etc. And then query "|category1|"
>>
>> I was wondering if there was a better way.
>
> Simply add multiple (probably Keyword) fields with the same name.
> Lucene supports this nicely.

There are other tricks you can use here, too... In one of my projects I had a need to store a list of weighted keywords. No problem storing multiple tokens under the same field name, as Erik explained above. However, in Lucene you can only apply a single boost value to a field. I ended up encoding the keywords like "10.0 keyword" and then writing an analyzer which skips the initial numbers when processing this particular field (which was stored, indexed and tokenized).

--
Best regards,
Andrzej Bialecki

Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
FreeBSD developer (http://www.freebsd.org)
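The "10.0 keyword" encoding mentioned above could be split back into a weight and a keyword along the following lines. This is only a guess at one way to handle such values, written in plain Java; it is not Andrzej's analyzer and uses no Lucene API. `WeightedKeyword` is a hypothetical name, and the default weight of 1.0 for values without a leading number is an assumption.

```java
/** Hypothetical value type for a keyword encoded as "<weight> <keyword>". */
final class WeightedKeyword {
    static final float DEFAULT_WEIGHT = 1.0f; // assumed default, not from the thread

    final float weight;
    final String keyword;

    private WeightedKeyword(float weight, String keyword) {
        this.weight = weight;
        this.keyword = keyword;
    }

    /** Parses e.g. "10.0 keyword"; a value without a leading number keeps the default weight. */
    static WeightedKeyword parse(String encoded) {
        String trimmed = encoded.trim();
        int space = trimmed.indexOf(' ');
        if (space < 0) {
            return new WeightedKeyword(DEFAULT_WEIGHT, trimmed);
        }
        try {
            float weight = Float.parseFloat(trimmed.substring(0, space));
            return new WeightedKeyword(weight, trimmed.substring(space + 1).trim());
        } catch (NumberFormatException e) {
            // The first token is not a number, so the whole string is the keyword.
            return new WeightedKeyword(DEFAULT_WEIGHT, trimmed);
        }
    }
}
```

For example, `WeightedKeyword.parse("2.5 open source")` yields weight 2.5 and keyword "open source", while `parse("search engine")` keeps the whole string as the keyword.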