Hi Jack,

It runs on a local file system.
- Ravi Chintakunta

On 2/15/06, Jack Tang <[EMAIL PROTECTED]> wrote:
> Hi there.
>
> I am facing the same question and looking for the same solution.
> Your solution seems easy :) My question is: what file system does the
> application run on?
> LocalFileSystem or DistributedFileSystem?
>
> Thanks
> /Jack
>
> On 2/9/06, Ravi Chintakunta <[EMAIL PROTECTED]> wrote:
> > Hi David,
> >
> > Thanks for your reply.
> >
> > After posting the question, I have done this in a more optimal way.
> >
> > - I used only a single NutchBean and modified it so that the search
> > method takes the indices being searched as an argument. This single
> > NutchBean creates separate IndexReaders on the merged indices in the
> > directories and keeps them in a map.
> >
> > - Based on the indexes that are searched, NutchBean creates an
> > IndexSearcher using the appropriate IndexReaders. I have added a
> > constructor to IndexSearcher that takes an array of IndexReaders and
> > uses a MultiReader to initialize itself.
> >
> > - The NutchBean creates a single FetchedSegments with the combination
> > of the segments directories in all the directories.
> >
> > The advantages with this are:
> >
> > - A single IndexReader for an index - so no additional filehandles are
> > created.
> > - No opening / closing of readers or segments - this improves performance.
> >
> >
> > - Ravi Chintakunta
> >
> >
> > > This is almost exactly what I've done. I create a new NutchBean for
> > > each search, and point it at whichever of 9 subdirectories the user has
> > > selected; because I really don't want 511 (2^9-1) beans hanging around.
> > >
> > > The reason for the "too many open files" is that the NutchBean doesn't
> > > clean up after itself - I guess because for most people, the NutchBean
> > > is going to be reused.
> > >
> > > I added a close() method to FetchedSegments.Segment in my installation,
> > > to close all the readers. I added a closeSegments() method to
> > > NutchBean, to call close() on each segment that's been opened. Then I
> > > call closeSegments() after each search.
> > >
> > > I realise that NutchBean really wasn't designed to support being
> > > instantiated once per search, but I don't care. It works well, and
> > > performance is not an issue.
> > >
> > > Regards,
> > > David.
> > >
> > >
> > > Date: Mon, 6 Feb 2006 20:59:34 -0500
> > > From: Ravi Chintakunta <[EMAIL PROTECTED]>
> > > To: [email protected]
> > > Subject: [Nutch-general] Dynamic merging of indices
> > > Reply-To: [EMAIL PROTECTED]
> > >
> > > I have multiple indices for the crawls across various intranet sites
> > > stored in separate folders. My search application should support
> > > searching across one or more of these indices dynamically - by way of
> > > checkboxes on the web page. For this, I have modified NutchBean to
> > > create the IndexSearcher and FetchedSegments from the segments
> > > directory (not the merged index directory) in these folders. Based on
> > > the selected intranet sites, a NutchBean is instantiated for the
> > > indices of the selected sites and the results are displayed.
> > >
> > > With this I had the "Too many open files" error and have increased the
> > > number of files limit.
> > >
> > > This seems to work well now. But if I have 5 such sites, then I am
> > > opening 2^5 = 32 times more files than I would have opened.
> > >
> > > My question is: Is there a better way of doing this? Like:
> > >
> > > - Can I open an IndexReader on each of the merged index directories and
> > > dynamically create an IndexSearcher by merging these readers using
> > > MultiReader?
> > >
> > > - Is an IndexReader thread safe and can it be used simultaneously in
> > > different IndexSearchers?
> > >
> > > - Can I create the IndexReader on the merged index directory and
> > > create the corresponding FetchedSegments on the corresponding
> > > non-merged segments directory?
> > >
> > > Thanks
> > > Ravi Chintakunta
>
> --
> Keep Discovering ... ...
> http://www.jroller.com/page/jmars
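
For readers skimming the thread, the single-bean approach described in the
quoted message (open one reader per index directory, keep the readers in a
map, and combine only the selected ones for each search) can be sketched
roughly as below. This is a minimal plain-Java illustration, not Nutch or
Lucene code: SketchReader, DirReader, CombinedReader and ReaderCache are
hypothetical stand-ins for IndexReader, MultiReader and the modified
NutchBean, and document counts stand in for real index contents.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Lucene IndexReader; not the real API.
interface SketchReader {
    int numDocs();
}

// Stand-in for the reader of one index directory; docCount is made-up data.
final class DirReader implements SketchReader {
    private final int docCount;

    DirReader(int docCount) {
        this.docCount = docCount;
    }

    public int numDocs() {
        return docCount;
    }
}

// Stand-in for MultiReader: a view over several already-open readers.
final class CombinedReader implements SketchReader {
    private final List<SketchReader> parts;

    CombinedReader(List<SketchReader> parts) {
        this.parts = new ArrayList<>(parts);
    }

    public int numDocs() {
        int total = 0;
        for (SketchReader r : parts) {
            total += r.numDocs();
        }
        return total;
    }
}

// Opens one reader per index directory, once, and reuses it for every search.
final class ReaderCache {
    private final Map<String, SketchReader> readers = new LinkedHashMap<>();
    private int opened = 0;

    // Register an index directory; a reader is created only on the first call.
    void register(String dir, int docCount) {
        if (!readers.containsKey(dir)) {
            readers.put(dir, new DirReader(docCount));
            opened++;
        }
    }

    // Build a combined view over the selected directories without reopening any.
    SketchReader select(List<String> dirs) {
        List<SketchReader> chosen = new ArrayList<>();
        for (String dir : dirs) {
            SketchReader r = readers.get(dir);
            if (r == null) {
                throw new IllegalArgumentException("unknown index: " + dir);
            }
            chosen.add(r);
        }
        return new CombinedReader(chosen);
    }

    // Number of readers actually opened so far.
    int openedCount() {
        return opened;
    }
}
```

With three sites registered, every possible selection of sites reuses the
same three readers, so the number of open readers grows with the number of
sites rather than with the number of combinations searched.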
