Hi Jack,

It runs on a local file system.

- Ravi Chintakunta

On 2/15/06, Jack Tang <[EMAIL PROTECTED]> wrote:
> Hi there.
>
> I am facing the same question and looking for the same solution.
> Your solution seems easy:) My question is what file system the
> application runs on?
> LocalFileSystem or DistributedFileSystem?
>
> Thanks
> /Jack
>
> On 2/9/06, Ravi Chintakunta <[EMAIL PROTECTED]> wrote:
> > Hi David,
> >
> > Thanks for your reply.
> >
> > After posting the question, I have done this in a more optimal way.
> >
> > - I used only a single NutchBean and modified it so that the search
> > method takes the indices being searched as an argument. This single
> > NutchBean creates separate IndexReaders on the merged indices in the
> > directories and keeps them in a map.
> >
> > - Based on the indexes that are searched, NutchBean creates an
> > IndexSearcher using the appropriate IndexReaders. I have added a
> > constructor to IndexSearcher that takes an array of IndexReaders and
> > uses a MultiReader to initialize itself.
> >
> > - The NutchBean creates a single FetchedSegments with the combination
> > of the segments directories in all the directories.
> >
> > The advantages with this are:
> >
> > - A single IndexReader for an index - so no additional filehandles are 
> > created.
> > - No opening / closing of readers or segments - this improves performance.
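
A minimal sketch of the shared-reader idea described above, using only
the Java standard library. StubReader, CombinedSearcher and
ReaderRegistry are hypothetical stand-ins invented for illustration;
they are not Nutch or Lucene classes. In the real setup the roles would
be played by IndexReader, a searcher built over a MultiReader, and the
modified NutchBean.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one open reader on one merged index directory. */
final class StubReader {
    final String indexDir;
    StubReader(String indexDir) { this.indexDir = indexDir; }
}

/** Hypothetical stand-in for a searcher built over several readers. */
final class CombinedSearcher {
    final List<StubReader> readers;
    CombinedSearcher(List<StubReader> readers) {
        this.readers = Collections.unmodifiableList(new ArrayList<>(readers));
    }
}

/** Opens each index once, keeps the reader in a map, and combines readers per search. */
final class ReaderRegistry {
    private final Map<String, StubReader> readers = new LinkedHashMap<>();
    private int opens = 0;

    ReaderRegistry(List<String> indexDirs) {
        for (String dir : indexDirs) {
            readers.put(dir, new StubReader(dir)); // each index is opened exactly once
            opens++;
        }
    }

    /** Builds a searcher over the selected indices without opening anything new. */
    CombinedSearcher searcherFor(List<String> selected) {
        List<StubReader> chosen = new ArrayList<>();
        for (String dir : selected) {
            StubReader reader = readers.get(dir);
            if (reader == null) {
                throw new IllegalArgumentException("unknown index: " + dir);
            }
            chosen.add(reader);
        }
        return new CombinedSearcher(chosen);
    }

    /** Number of readers opened so far; stays constant after construction. */
    int openCount() { return opens; }
}
```

However many combinations of indices are searched, openCount() stays at
the number of index directories, because every searcher reuses the
readers held in the map.
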
> >
> >
> > - Ravi Chintakunta
> >
> >
> > > This is almost exactly what I've done.  I create a new NutchBean for
> > > each search, and point it at whichever of 9 subdirectories the user has
> > > selected; because I really don't want 511 (2^9-1) beans hanging around.
> > >
> > > The reason for the "too many open files" is that the NutchBean doesn't
> > > clean up after itself - I guess because for most people, the NutchBean
> > > is going to be reused.
> > >
> > > > I added a close() method to FetchedSegments.Segment in my installation,
> > > to close all the readers.  I added a closeSegments() method to
> > > NutchBean, to call close() on each segment that's been opened.  Then I
> > > call closeSegments() after each search.
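
A minimal sketch of the close-after-each-search pattern described
above, using only the Java standard library. StubSegment and
PerSearchBean are hypothetical stand-ins invented for illustration;
they are not the actual Nutch classes, and the real close() would have
to close every reader the segment holds.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a segment that holds open readers. */
final class StubSegment implements Closeable {
    private boolean open = true;

    boolean isOpen() { return open; }

    /** In a real segment this would close each underlying reader. */
    @Override
    public void close() { open = false; }
}

/** Hypothetical stand-in for a bean that is created once per search. */
final class PerSearchBean implements Closeable {
    private final List<StubSegment> segments = new ArrayList<>();

    /** Opens a segment and remembers it so it can be closed later. */
    StubSegment openSegment() {
        StubSegment segment = new StubSegment();
        segments.add(segment);
        return segment;
    }

    /** Closes every segment this bean has opened, then forgets them. */
    void closeSegments() {
        for (StubSegment segment : segments) {
            segment.close();
        }
        segments.clear();
    }

    @Override
    public void close() { closeSegments(); }
}
```

Because the sketch implements Closeable, a per-search bean can be
created in a try-with-resources statement, so the segments are closed
even if the search throws.
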
> > >
> > > I realise that NutchBean really wasn't designed to support being
> > > instantiated once per search, but I don't care.  It works well, and
> > > performance is not an issue.
> > >
> > > Regards,
> > > David.
> > >
> > >
> > > Date: Mon, 6 Feb 2006 20:59:34 -0500
> > > From: Ravi Chintakunta <[EMAIL PROTECTED]>
> > > To: [email protected]
> > > Subject: [Nutch-general] Dynamic merging of indices
> > > Reply-To: [EMAIL PROTECTED]
> > >
> > > I have multiple indices for the crawls across various intranet sites
> > > stored in separate folders. My search application should support
> > > searching across one or more of these indices dynamically - by way of
> > > checkboxes on the web page.  For this, I have modified NutchBean to
> > > create the IndexSearcher and FetchedSegments from the segments
> > > directory (not the merged index directory) in these folders.  Based on
> > > the selected intranet sites, a NutchBean is instantiated for the
> > > indices  of the selected sites and the results are displayed.
> > >
> > > > With this I hit the "Too many open files" error and have increased
> > > > the open-files limit.
> > >
> > > This seems to work well now. But if I have 5 such sites, then I am
> > > > opening 2^5 = 32 times more files than I would have opened.
> > >
> > > My question is: Is there a better way of doing this? Like:
> > >
> > > > - Can I open an IndexReader on each of the merged index directories and
> > > dynamically create an IndexSearcher by merging these readers using
> > > MultiReader?
> > >
> > > - Is an IndexReader thread safe and can it be used simultaneously in
> > > different IndexSearchers?
> > >
> > > - Can I create the IndexReader on the merged index directory and
> > > create the corresponding FetchedSegments on the corresponding
> > > non-merged segments directory?
> > >
> > > Thanks
> > > Ravi Chintakunta
> > >
> > >
> > >
> > >
> > > ********************************************************************************
> > > This email may contain legally privileged information and is intended 
> > > only for the addressee. It is not necessarily the official view or
> > > communication of the New Zealand Qualifications Authority. If you are not 
> > > the intended recipient you must not use, disclose, copy or distribute 
> > > this email or
> > > information in it. If you have received this email in error, please 
> > > contact the sender immediately. NZQA does not accept any liability for 
> > > changes made to this email or attachments after sending by NZQA.
> > >
> > > All emails have been scanned for viruses and content by MailMarshal.
> > > NZQA reserves the right to monitor all email communications through its 
> > > network.
> > >
> > > ********************************************************************************
> > >
> > >
> >
>
>
> --
> Keep Discovering ... ...
> http://www.jroller.com/page/jmars
>
