For best performance you should aim to keep a shared index searcher, or the underlying index reader, open as long as possible. You may of course need to reopen it if/when the index changes. As to scope, you can store it wherever it makes sense for your application.
-- Ian. On Tue, May 4, 2010 at 10:13 AM, Vijay Veeraraghavan <vijay.raghava...@gmail.com> wrote: > Hi, > Thanks for the reply. So I will have a dedicated servlet to search the > index, but does it mean that the indexsearcher does not close the > index, keep it open? Is it not possible to keep it in the application > scope? > > Vijay > > On 5/3/10, Vijay Veeraraghavan <vijay.raghava...@gmail.com> wrote: >> Hi all, >> >> In a clustered environment I search the index from the web >> application. In the web application I am creating IndexReader on each >> request. is it expensive to do like this? I read somewhere in the web >> that try using the same reader as much as possible. Can i keep the >> initially created IndexReader in the session/application scopes and >> use the same for each request? Any other idea? >> >> Viay >> >> On 5/3/10, Vijay Veeraraghavan <vijay.raghava...@gmail.com> wrote: >>> dear all, >>> >>> as replied below, does searching again for the document in the index >>> and if found skip the indexing else index it, is this not similar to >>> indexing all pdf documents once again, is not this overhead? As I am >>> not going to index the details of the pdf (so if an indexed pdf was >>> recreated i need not reindex it) but just the paths of the documents. >>> >>> Vijay >>> >>>>> Hey there, >>>>> >>>>> you might have to implement a some kind of unique identifier using an >>>>> indexed lucene field. When you are indexing you should fire a query >>>>> with >>>>> the >>>>> uuid of your document (maybe the path to you pdf document) and check if >>>>> the >>>>> document is in the index already. You could also do a boolean query >>>>> combining UUID, timestamp and / or a hash value to see if the document >>>>> has >>>>> been changed. if so you can simply update the document by its UUID >>>>> (something like indexwriter.updateDocument(new Term("uuid", >>>>> value),document);) >>>>> >>>>> Unfortunately you have to implement this yourself but it should not be >>>>> that >>>>> much of a deal. >>>>> >>>>> simon >>>>> >>>>> On Mon, May 3, 2010 at 9:21 AM, Vijay Veeraraghavan < >>>>> vijay.raghava...@gmail.com> wrote: >>>>> >>>>>> Dear all, >>>>>> I am using lucene 3.0 to index the pdf reports that I generate >>>>>> dynamically. I index the pdf file name (without extension), file path >>>>>> and its absolute path as fields. I search with the file name without >>>>>> extension; it retrieves a list, as usually 2 or more files are present >>>>>> in the same name in different sub directories. As I create the index >>>>>> for the first time it updates, assuming 100 pdf files in different >>>>>> directories, the files meta info. If again I do indexing, while my >>>>>> report generator scheduler has the produced 500 more pdf files >>>>>> totaling to 600 files in different directories, I wish to index only >>>>>> the new files to the index. But presently it’s doing the whole thing >>>>>> again (600 files). How to implement this functionality? Think of the >>>>>> thousands of pdf files created on each run. >>>>>> >>>>>> P.S: I cannot keep the meta-info of generated pdf files in the java >>>>>> memory, as it exceeds thousands in a single run, and update the index >>>>>> looping this list. >>>>>> >>>>>> new IndexWriter(FSDirectory.open(this.indexDir), new StandardAnalyzer( >>>>>> Version.LUCENE_CURRENT), true, >>>>>> >>>>>> IndexWriter.MaxFieldLength.LIMITED); >>>>>> >>>>>> is the boolean parameter is for this purpose? Please guide me. >>>>>> >>>>>> -- >>>>>> Thanks >>>>>> Vijay Veeraraghavan >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Thanks & Regards >>>>>> Vijay Veeraraghavan >>>>>> >>>>>> --------------------------------------------------------------------- >>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org >>>>>> >>>>>> >>>>> >>>> >>>> >>>> -- >>>> Thanks & Regards >>>> Vijay Veeraraghavan >>>> >>> >>> >>> -- >>> Thanks & Regards >>> Vijay Veeraraghavan >>> >> >> >> -- >> Thanks & Regards >> Vijay Veeraraghavan >> > > > -- > Thanks & Regards > Vijay Veeraraghavan > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org