RE: Search Performance
Are you creating new IndexSearchers or IndexReaders on each search? Caching your IndexSearchers has a dramatic effect on speed.

David Townsend

-----Original Message-----
From: Michael Celona [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 15:55
To: Lucene Users List
Subject: Search Performance

What is single-handedly the best way to improve search performance? I have an index in the 2G range stored on the local file system of the searcher. Under a load test of 5 simultaneous users my average search time is ~4700 ms. Under a load test of 10 simultaneous users my average search time is ~1 ms. I have given the JVM 2G of memory and am using dual 3GHz Xeons. Any ideas?

Michael

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Search Performance
IndexSearchers are thread safe, so you can use the same object on multiple requests. If the index is static and not constantly updating, just keep one IndexSearcher for the life of the app. If the index changes and you need that instantly reflected in the results, you need to check whether the index has changed; if it has, create a new cached IndexSearcher. To check for changes, monitor the version number of the index, obtained via IndexReader.getCurrentVersion(indexName).

David

-----Original Message-----
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 16:15
To: Lucene Users List
Subject: Re: Search Performance

Try a singleton pattern or a static field.

Stefan

Michael Celona wrote:
I am creating new IndexSearchers... how do I cache my IndexSearcher...

Michael
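A minimal sketch of the caching approach David describes, assuming the Lucene 1.4-era API (the class and field names here are mine, not from the thread):

```java
// Cached IndexSearcher that is re-opened only when the index version
// changes.  IndexSearcher is thread safe, so the same instance can serve
// concurrent requests.
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import java.io.IOException;

public class SearcherCache {
    private final String indexPath;
    private IndexSearcher searcher;
    private long version = -1;

    public SearcherCache(String indexPath) {
        this.indexPath = indexPath;
    }

    public synchronized IndexSearcher getSearcher() throws IOException {
        long current = IndexReader.getCurrentVersion(indexPath);
        if (searcher == null || current != version) {
            if (searcher != null) {
                // NOTE: closing immediately is unsafe if other threads are
                // mid-search; in production, delay the close or reference-count.
                searcher.close();
            }
            searcher = new IndexSearcher(indexPath);
            version = current;
        }
        return searcher;
    }
}
```

Callers would simply do `cache.getSearcher().search(query)` on every request instead of constructing a new searcher.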
RE: Multi-threading problem: couldn't delete segments
The problem could be that you're writing to an index with multiple processes. This can happen if you're using a shared file system (NFS?). We saw this problem when we had two IndexWriters getting access to a single index at the same time. Usually if you're working on a single machine the file locks prevent this from happening.

-----Original Message-----
From: Luke Francl [mailto:[EMAIL PROTECTED]
Sent: 13 January 2005 18:13
To: Lucene Users List
Subject: Re: Multi-threading problem: couldn't delete segments

I didn't get any response to this post so I wanted to follow up (you can read the full description of my problem in the archives: http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgNo=11986). Here's an additional piece of information: I wrote a small program to confirm that on Windows, you can't rename a file while another thread has it open. If I am performing a search, is it possible that the IndexReader is holding open the segments file when there is an attempt by my indexing code to overwrite it with File.renameTo()?

Thanks,
Luke Francl

On Thu, 2005-01-06 at 17:43, Luke Francl wrote:
We are having a problem with Lucene in a high concurrency create/delete/search situation. I thought I fixed all these problems, but I guess not.
Lucene Book in UK
Sorry if this is the wrong forum, but I wondered what's happened to 'Lucene in Action' in the UK. Looking forward to reading it, but amazon.co.uk report it as a 'hard to find' item and are now quoting a 4-6 week delivery time and tacking on a rare book charge. Amazon.com are quoting shipping in 24hrs. Is this a new 'Boston Tea Party'?

cheers,
David
hits.length() changes during delete process.
I have a delete script:

    IndexSearcher searcher = new IndexSearcher(reader);
    Hits hits = searcher.search(query);
    log.info("there are " + hits.length() + " hits");
    for (int i = 0; i < hits.length(); i++) {
        log.info(hits.length() + " " + i + " " + hits.id(i));
        reader.delete(hits.id(i));
    }

which iterates through the results of a search and deletes the returns. I keep getting an ArrayIndexOutOfBoundsException. I've found the reason is that hits.length() actually changes during the iteration, in large regular steps: the hits length is initially 10003; after 100 deletions hits.length() changes to 9903; after 200 deletions hits.length() changes to 9803; and then it changes again after 400, 800, 1600 and 3200 deletions. So the short question is: should the Hits object be changing, and what is the best way to delete all the results of a search (it's a range query, so I can't use delete(Term term))?

cheers,
David
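Hits caches only part of the result set and re-runs the query as you iterate past the cached portion, so deleting documents mid-iteration shrinks the result set underneath you. One safe pattern (a sketch, not the only approach) is to collect all the matching doc ids first, then delete in a second pass:

```java
// Collect every matching doc id before touching the index, so the Hits
// object is never re-queried after documents start disappearing.
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class DeleteByQuery {
    static void deleteAll(IndexReader reader, Query query) throws IOException {
        IndexSearcher searcher = new IndexSearcher(reader);
        Hits hits = searcher.search(query);
        List ids = new ArrayList();
        for (int i = 0; i < hits.length(); i++) {
            ids.add(new Integer(hits.id(i)));   // pass 1: record ids only
        }
        for (int i = 0; i < ids.size(); i++) {
            reader.delete(((Integer) ids.get(i)).intValue());  // pass 2: delete
        }
    }
}
```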
RE: Making lucene work in weblogic cluster
No I didn't. If you look for NFS in the archives, there is an alternate solution out there. I suppose I should get around to submitting the patch.

-----Original Message-----
From: Praveen Peddi [mailto:[EMAIL PROTECTED]
Sent: 08 October 2004 16:10
To: lucenelist
Subject: Making lucene work in weblogic cluster

While I was going through the mailing list in solving the lucene cluster problem, I came across this thread. Does anyone know if David Townsend had submitted the patch he was talking about? http://www.mail-archive.com/[EMAIL PROTECTED]/msg06252.html

I am interested in looking at the NFS solution (mounting the shared drive on each server in the cluster). I don't know if anyone has used this solution in a cluster, but it seems a better approach than the RemoteSearchable interface and a DB-based index (SQLDirectory). I am currently looking at 2 options:

Index on shared drive: Use a single index dir on a shared drive (NFS, etc.), which is mounted on each app server. All the servers in the cluster write to this shared drive when objects are modified. Problems: 1) Known problems like file locking etc. (The above thread talks about moving the locking mechanism to a DB, but I have no idea how.) 2) Performance.

Index per server: Create copies of the index dir for each machine. Requires regular updates, etc. Each server maintains its own index and searches on its own index. Problems: 1) Modifying the index is complex. When objects are modified on a server1 that does not run the search system, server1 needs to notify all servers in the cluster about these modifications so that each server can update its own index. This may involve some kind of remote communication mechanism, which will perform badly since our index is modified a lot.

So I am still reviewing both options and trying to figure out which one is best and how to solve the above problems. If you have any ideas, please shoot them over. I would appreciate any help in making lucene clusterable (both indexing and searching).

Praveen

** Praveen Peddi, Sr Software Engg, Context Media, Inc. email: [EMAIL PROTECTED] Tel: 401.854.3475 Fax: 401.861.3596 web: http://www.contextmedia.com ** Context Media - The Leader in Enterprise Content Integration
RE: Making lucene work in weblogic cluster
Doug discusses the locking issue, with a potential solution: http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgId=1619988

-----Original Message-----
From: Praveen Peddi [mailto:[EMAIL PROTECTED]
Sent: 08 October 2004 16:10
To: lucenelist
Subject: Making lucene work in weblogic cluster
RE: Moving from a single server to a cluster
Would it be cheeky to ask you to post the docs to the group? It would be interesting to read how you've tackled this.

-----Original Message-----
From: Nader Henein [mailto:[EMAIL PROTECTED]
Sent: 08 September 2004 13:57
To: Lucene Users List
Subject: Re: Moving from a single server to a cluster

Hey Ben,

We've been using a distributed environment with three servers and three separate indices for the past two years, since the first stable Lucene release, and it has been great. Recently, over the past two months, I've been working on a redesign of our Lucene app, and I've shared my findings and plans with Otis, Doug and Erik. They pointed out a few faults in my logic which you will probably come across soon enough; they mainly have to do with keeping your updates atomic (not too hard) and your deletes atomic (a little more tricky). Give me a few days and I'll send you both the early document and the newer version that deals squarely with Lucene in a distributed environment with a high-volume index.

Regards,
Nader Henein

Ben Sinclair wrote:
My application currently uses Lucene with an index living on the filesystem, and it works fine. I'm moving to a clustered environment soon and need to figure out how to keep my indexes together. Since the index is on the filesystem, each machine in the cluster will end up with a different index. I looked into JDBC Directory, but it's not tested under Oracle and doesn't seem like a very mature project. What are other people doing to solve this problem?
RE: word documents search
Is this a wind-up?

-----Original Message-----
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: 24 August 2004 13:16
To: Lucene Users List
Subject: word documents search

Can Lucene search Word documents? If so, please give me information about it.

regards,
Santosh kumar

---SOFTPRO DISCLAIMER-- Information contained in this E-MAIL and any attachments are confidential being proprietary to SOFTPRO SYSTEMS is 'privileged' and 'confidential'. If you are not an intended or authorised recipient of this E-MAIL or have received it in error, You are notified that any use, copying or dissemination of the information contained in this E-MAIL in any manner whatsoever is strictly prohibited. Please delete it immediately and notify the sender by E-MAIL. In such a case reading, reproducing, printing or further dissemination of this E-MAIL is strictly prohibited and may be unlawful. SOFTPRO SYSYTEMS does not REPRESENT or WARRANT that an attachment hereto is free from computer viruses or other defects. The opinions expressed in this E-MAIL and any ATTACHEMENTS may be those of the author and are not necessarily those of SOFTPRO SYSTEMS.
RE: pdf search
Hi Santosh, Lucene doesn't search PDFs per se. To make anything searchable you have to first extract the content and then put it into Lucene in a form it understands (i.e. Document objects). So in order to search your PDFs you first need to extract the info from the PDFs using something like PDFBox. So your battle plan should be: forget Lucene for a while, and get the raw data out of all the items you want to search. Then look at the Lucene articles about creating simple searchable indices.

DT

If we didn't train to fight, who'd fight the wars? :)

-----Original Message-----
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: 20 August 2004 13:30
To: Lucene Users List
Subject: Fw: pdf search

How can I search through PDF?

----- Original Message -----
From: Santosh
To: Lucene Users List
Sent: Friday, August 20, 2004 5:59 PM
Subject: pdf search

Hi, I am a newbie to Lucene. I have downloaded the zip file. Now how can I give my own list of words to Lucene? In the demo I saw that Lucene automatically creates an index if we run the java program, but I want to give my own search words. How is that possible?

regards,
Santosh kumar
SoftPro Systems, Hyderabad

The harder you train in peace, the less you bleed in war.
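A sketch of the extract-then-index flow DT describes, assuming the pre-Apache PDFBox (org.pdfbox) and Lucene 1.4-era APIs; the field names "path" and "content" are my own:

```java
// Extract the raw text from a PDF with PDFBox, then hand it to Lucene
// as a Document: the path is stored for display, the body is index-only.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.util.PDFTextStripper;
import java.io.File;

public class PdfIndexer {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter("index", new StandardAnalyzer(), true);
        PDDocument pdf = PDDocument.load(new File(args[0]));
        try {
            String text = new PDFTextStripper().getText(pdf);  // raw text out of the PDF
            Document doc = new Document();
            doc.add(Field.Keyword("path", args[0]));   // stored, untokenized
            doc.add(Field.UnStored("content", text));  // indexed, not stored
            writer.addDocument(doc);
        } finally {
            pdf.close();
        }
        writer.optimize();
        writer.close();
    }
}
```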
RE: searchhelp
JGURU FAQ: http://www.jguru.com/faq/Lucene
OFFICIAL FAQ: http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi
MAIL ARCHIVE: http://www.mail-archive.com/[EMAIL PROTECTED]/

hope this helps.

-----Original Message-----
From: Santosh [mailto:[EMAIL PROTECTED]
Sent: 19 August 2004 11:25
To: Lucene Users List
Subject: Re: searchhelp

I recently joined the list and haven't gone through any previous mails. If you have any mails or related code, please forward them to me.

----- Original Message -----
From: Chandan Tamrakar [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Thursday, August 19, 2004 3:47 PM
Subject: Re: searchhelp

For PDF you need to extract text from pdf files using the pdfbox library, and for Word documents you can use the Apache POI APIs. There are messages posted on the lucene list related to your queries. About databases, I guess someone must have done it. :)

----- Original Message -----
From: Santosh [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, August 19, 2004 3:58 PM
Subject: searchhelp

Hi, I am using the lucene search engine for my application. I am able to search through text files and htmls as specified by lucene. Can you please clarify my doubts: 1. Can lucene search through pdfs and word documents? If yes, then how? 2. Can lucene search through a database? If yes, then how?

thankyou,
santosh
RE: Memo: RE: Query parser and minus signs
Doesn't "en-UK" as a phrase query work? You're probably indexing it as a text field, so it's being tokenised.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: 21 May 2004 16:47
To: Lucene Users List
Subject: Memo: RE: Query parser and minus signs

Hmm, we may have to if there is no workaround. We're not using Java locales, but were trying to stick to the ISO standard, which uses hyphens.

Ryan Sonnek [EMAIL PROTECTED] on 21 May 2004 16:38:
If you're dealing with locales, why not use Java's built-in locale syntax (e.g. en_UK, zh_HK)?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, May 21, 2004 10:36 AM
To: [EMAIL PROTECTED]
Subject: Query parser and minus signs

Hi All, I'm using Lucene on a site that has split content, with a branch containing pages in English and a separate branch in Chinese. Some of the Chinese pages include some (untranslatable) English words, so when a search is carried out in either language you can get pages from the wrong branch. To combat this we introduced a language field into the index which contains the standard language codes: en-UK and zh-HK. When you parse a query, e.g. language:en\-UK, you could reasonably expect the search to recover all pages with the language field set to en-UK (the minus symbol should be escaped by the backslash according to the FAQ). Unfortunately the parser seems to return "en UK" as the parsed query and hence returns no documents. Has anyone else had this problem, or could you suggest a workaround? I have yet to find a solution in the mailing archives or elsewhere.

Many thanks in advance,
Alex Bourne

This transmission has been issued by a member of the HSBC Group (HSBC) for the information of the addressee only and should not be reproduced and / or distributed to any other person.
Each page attached hereto must be read in conjunction with any disclaimer which forms part of it. This transmission is neither an offer nor the solicitation of an offer to sell or purchase any investment. Its contents are based on information obtained from sources believed to be reliable but HSBC makes no representation and accepts no responsibility or liability as to its completeness or accuracy.
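One concrete form of the workaround suggested above, assuming the Lucene 1.4-era API: index the language code as an untokenized Keyword field, then query it with a TermQuery that bypasses the analyzer entirely, so the hyphen in "en-UK" survives. The field name "language" comes from the thread; the helper names are mine:

```java
// Keyword fields are stored and indexed but NOT tokenized, so "en-UK"
// remains a single term; a TermQuery matches it without any parsing.
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class LanguageFilter {
    // at index time:
    static void tagLanguage(Document doc, String code) {
        doc.add(Field.Keyword("language", code));
    }

    // at search time: require the user's query AND the language term
    static Query restrict(Query userQuery, String code) {
        BooleanQuery q = new BooleanQuery();
        q.add(userQuery, true, false);                                  // required
        q.add(new TermQuery(new Term("language", code)), true, false);  // required
        return q;
    }
}
```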
RE: about search and update one index simultaneously
There is no problem with updating and searching simultaneously. Two threads updating the same index simultaneously over NFS can be a problem, as the locking does not work reliably. Have a look through the archives for NFS; there are some solutions scattered about.

David

-----Original Message-----
From: xuemei li [mailto:[EMAIL PROTECTED]
Sent: 18 May 2004 23:01
To: [EMAIL PROTECTED]
Subject: about search and update one index simultaneously

Hi all,

Can we do search and update on one index simultaneously? Does anyone know anything about it? I have done some experiments. Right now the search is blocked when the index is being updated. The error on the search node is like this:

caught a class java.io.IOException with message: Stale NFS file handle

Thanks,
Xuemei Li
RE: Getting a field value from a large indexed document is slow.
You say the content is indexed, but is it stored? If it is, index the content of the document but don't store it, e.g.

    doc.add(Field.UnStored("content", content));

-----Original Message-----
From: Paul Williams [mailto:[EMAIL PROTECTED]
Sent: 14 May 2004 16:22
To: 'Lucene Users List'
Subject: Getting a field value from a large indexed document is slow.

Hi, I hope someone can help! I am using Lucene to make a searchable repository of electronic documents (MS Office, PDFs etc.). Some of these documents can contain a large amount of text (about 500K in some cases), which is indexed to make it searchable. Doing the search and getting the hits found is not affected by the size of the document found. But when I try to access a field (my document id) in the document, i.e.

    // Create Lucene Doc with value
    Document doc = hits.doc(i);
    String number = doc.get("Field10");

the creation of the Lucene document can take up to a second per hit. I don't actually use any of the other fields apart from getting my ID value from Field10. So my question is: is there a smarter way of getting out the 'Field10' value without populating all the rest of the fields in the Lucene document, and therefore reducing the time taken for this action?

Paul
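A sketch of the suggestion above, assuming the Lucene 1.4-era Field factory methods: store only the small ID field and index the large body without storing it, so hits.doc(i) no longer has to load 500K of stored text just to read "Field10". The field name "contents" is my own choice:

```java
// Only "Field10" is stored; the large body is indexed but never stored,
// which keeps document loading at search time cheap.
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class DocBuilder {
    static Document build(String docId, String bodyText) {
        Document doc = new Document();
        doc.add(Field.Keyword("Field10", docId));      // small: stored + indexed, untokenized
        doc.add(Field.UnStored("contents", bodyText)); // large: indexed only, not stored
        return doc;
    }
}
```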
Returning Separate Hits from Multisearcher
We have a number of small indices and also an uber-index made up of all the smaller indices. We need to do a search across a number of the sub-indices and get back a hit count from each. Currently we search each index separately; we've also tried running multiple queries against the uber-index, with a field denoting which sub-index we are interested in. Obviously this approach is very slow. Is there any way to use MultiSearcher to do this? The problem we currently have with MultiSearcher is there seems to be no way to tell how many hits came from each index. Is there a recommended way to do this, or should we modify MultiSearcher to return information about the hits on each index?

Any ideas?

David Townsend
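One possible answer: MultiSearcher merges the sub-indices into one doc-id space, where each sub-index i owns the id range [starts[i], starts[i+1]). Mapping a merged hit id back to its sub-index lets you count hits per index in a single pass (MultiSearcher.subSearcher(int) exposes this mapping). The following is a pure-Java sketch of that id arithmetic, not the Lucene class itself:

```java
// Count hits per sub-index given the merged doc ids of the hits and the
// "starts" offsets of each sub-index.
public class SubIndexCounter {
    // starts[i] = first merged doc id of sub-index i; starts[last] = total docs
    static int subIndex(int[] starts, int docId) {
        int lo = 0, hi = starts.length - 2;   // binary search over sub-indexes
        while (lo < hi) {
            int mid = (lo + hi + 1) / 2;
            if (starts[mid] <= docId) lo = mid; else hi = mid - 1;
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] starts = {0, 10, 25, 40};       // three sub-indexes of 10, 15, 15 docs
        int[] hitIds = {3, 12, 24, 39};       // merged ids returned by a search
        int[] counts = new int[3];
        for (int i = 0; i < hitIds.length; i++) {
            counts[subIndex(starts, hitIds[i])]++;
        }
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```

With a real MultiSearcher you would replace `subIndex(starts, id)` with `multiSearcher.subSearcher(hits.id(i))`.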
RE: weblogic cluster, index on NFS and locking problem
We work on NFS and have had major problems with locking, which often leads to the indices becoming corrupt. Our solution was to replace file locking with a database system. I can release the code, but I'm not sure of the process or where to put it. It's basically two classes: one that extends the Directory class and one that deals with the database interaction.

David Townsend

-----Original Message-----
From: Dmitri Ilyin [mailto:[EMAIL PROTECTED]
Sent: 04 February 2004 09:49
To: [EMAIL PROTECTED]
Subject: Re: weblogic cluster, index on NFS and locking problem

What is that good for? Unfortunately I don't have any access to the NFS server. It runs at a customer's site in a production environment.

Suggestion: make sure the NFS lock daemon (lockd) is running on the NFS server.
Peter

----- Original Message -----
From: Dmitri Ilyin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, February 03, 2004 1:09 PM
Subject: weblogic cluster, index on NFS and locking problem

Hi,

We run our application on a weblogic cluster. The lucene index service runs on both servers in the cluster and they both write to one index directory, shared via NFS. We have experienced a problem with the commit.lock file, which seems not to be deleted and stays in the index directory, so we could not start indexing any more because lucene could not create/read the commit.lock file. I'm not sure what exactly our problem is. It could be an NFS problem or it could be a usage problem on our side. We are just starting to use lucene and could have done something wrong. We use lucene to index and to search documents. Writes/reads can be concurrent. I saw in the list some messages about problems with lock files on NFS file systems, but I could not really understand what the problem is. How can we improve our solution? What exactly do we have to do to avoid the problem with the leftover commit.lock file?

Thanks for any advice,
regards,
Dmitri
RE: use Lucene LOCAL (looking for a frontend)
Why don't you take a look at Luke? That way you can play with the index you built and work from there. If you're looking to replicate something like Luke, I'd get studying now ;). http://www.getopt.org/luke/

-----Original Message-----
From: Sebastian Fey [mailto:[EMAIL PROTECTED]
Sent: 28 January 2004 14:23
To: Lucene Users List
Subject: AW: use Lucene LOCAL (looking for a frontend)

Not being funny, but if you have no experience in Java, then why are you using a Java API for index building/text searching?

I'm just testing some possibilities. Though I can't write a Java application, I can read it, and if someone gives me something to start with, I'm sure I'll make it. If Lucene seems to be the best solution, I'll spend some time learning Java.
Multiple Creation of Writers
In my system, indices are created and updated by multiple threads. I need to check if an index exists to decide whether to pass true or false to the IndexWriter constructor:

    new IndexWriter(FSDirectory, Analyzer, boolean);

The problem arises when two threads attempt to create the same index after simultaneously finding that the index does not exist. This problem can be reproduced in a single thread by

    writerA = new IndexWriter(new File("c:/import/test"), new StandardAnalyzer(), true);
    writerB = new IndexWriter(new File("c:/import/test"), new StandardAnalyzer(), true);
    add1000Docs(writerA);
    add1000Docs(writerB);

This will throw an IOException: C:\import\test\_a.fnm (The system cannot find the file specified). The only solution I can think of is to create a database/file lock to get around this, or change the Lucene code to obtain a lock before creating an index. Any ideas?

David
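Within a single JVM, the check-then-create race can be closed by putting the existence check and the constructor call inside one synchronized block, assuming the Lucene 1.4-era IndexReader.indexExists API (across multiple processes you would still need an external lock, as the thread discusses):

```java
// Guard the "does it exist?" check and the creation with one lock so no
// two threads can both decide to create the same index.  JVM-local only.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import java.io.File;
import java.io.IOException;

public class WriterFactory {
    private static final Object createLock = new Object();

    static IndexWriter openWriter(File dir) throws IOException {
        synchronized (createLock) {
            boolean create = !IndexReader.indexExists(dir);
            return new IndexWriter(dir, new StandardAnalyzer(), create);
        }
    }
}
```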
RE: Lock obtain timed out
Does this mean that if you can ensure that only one IndexWriter and/or deleting IndexReader is ever open at the same time (e.g. by using a database instead of lucene's locking), there will be no problem with removing locking? If you do not use an IndexReader to do deletion, can you open and close it at any time?

David

-----Original Message-----
From: Morus Walter [mailto:[EMAIL PROTECTED]
Sent: 16 December 2003 11:08
To: Lucene Users List
Subject: Re: Lock obtain timed out

Hohwiller, Joerg writes:
Am I safe disabling the locking???

No.

Can anybody tell me where to get documentation about the locking strategy (I still would like to know why I have that problem)???

I guess -- but given your input I really have to guess; the source you wanted to attach didn't make it to the list -- your problem is that you cannot have a writing (deleting) IndexReader and an IndexWriter open at the same time. There can only be one instance that writes to an index at one time. Disabling locking disables the checks, but then you have to take care yourself. So in practice, disabling locking is useful for read-only access to static indices only.

Morus
RE: indexing/searching a website
I would advise you to use the excellent articles listed here: http://jakarta.apache.org/lucene/docs/resources.html Some good examples, and by the end of it you should have a good understanding of the major classes and their use.

-----Original Message-----
From: Michal S [mailto:[EMAIL PROTECTED]
Sent: 27 November 2003 10:52
To: Lucene Users List
Subject: Re: indexing/searching a website

Another option is to deploy your site and crawl it from the outside (have a look at Nutch at sourceforge - or write your own using HttpClient and some HTML parsing for hyperlinks).

I realize that it will be necessary to write or use an existing html parser. I know that I need one, but I don't know how the whole framework would look (how to translate pages on the webserver to Lucene documents, how to index them, how to search them). The example on the Lucene home page is very simple and doesn't give me many answers.

I would argue that content within the JSP is a bad thing given that you want to index it - perhaps it makes more sense to put the content somewhere easier to get at, like a database?

You are absolutely right. But my client wants to edit the content as easily as possible (via notepad or another text editor). If the content were in a database, it would be necessary to provide my client with some kind of application which could let him update the content. The budget of the project is strongly limited, so I can't afford to allocate more developers to build a content editor.

Thanks for the reply,
Michal.