Can you share the script with everyone?

On 3/31/06, Berlin Brown <[EMAIL PROTECTED]> wrote:
>
> Do you have that shell script?
>
> On 3/30/06, Dan Morrill <[EMAIL PROTECTED]> wrote:
> > Hi folks,
> >
> > It worked, it worked great, I made a shell script to do the work for me.
> > Thank you, thank you, and again, thank you.
> >
> > r/d
> >
> > -----Original Message-----
> > From: Dan Morrill [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, March 30, 2006 5:12 AM
> > To: [email protected]
> > Subject: RE: Multiple crawls how to get them to work together
> >
> > Aled,
> >
> > I'll try that today, excellent, and thanks for the heads up on the db
> > directory. I'll let you know how it goes.
> >
> > r/d
> >
> >
> >
> > -----Original Message-----
> > From: Aled Jones [mailto:[EMAIL PROTECTED]
> > Sent: Thursday, March 30, 2006 12:24 AM
> > To: [email protected]
> > Subject: RE: Multiple crawls how to get them to work together
> >
> > Hi Dan
> >
> > I'll presume you've done the crawls already.
> >
> > Each resulting crawled folder should have 3 folders, db, index and
> > segments.
> >
> > Create your search.dir folder and create a segments folder in that.
> >
> > Each segments folder in each crawl folder should contain folders with
> > timestamps as the names.  Copy the contents of:
> >
> > crawlA/segments
> > crawlB/segments
> > crawlC/segments
> >
> > (i.e. the folders with timestamps as names) into:
> >
> > search.dir/segments
> >
> > Next, delete the duplicates from the segments by running the command:
> >
> > bin/nutch dedup -local search.dir/segments
> >
> > Then you need to merge the segments to create an index folder, so run
> > the command:
> >
> > bin/nutch merge -local search.dir/index search.dir/segments/*
> >
> > You should now have two folders in your search.dir:
> > search.dir/segments
> > search.dir/index
> >
> > That's all you need for serving pages (db folder is only used when
> > fetching).
> >
> > Now just set the searcher.dir property value in nutch-site.xml to the
> > location of search.dir.
> >
> > That's how I've been doing it, although it may not be the "right" way.
> > :-) Hope this helps.
> >
> > Cheers
> > Aled
> >
> >
> > > -----Original Message-----
> > > From: Dan Morrill [mailto:[EMAIL PROTECTED]
> > > Sent: 29 March 2006 18:06
> > > To: [email protected]
> > > Cc: [EMAIL PROTECTED]
> > > Subject: Multiple crawls how to get them to work together
> > >
> > > Hi folks,
> > >
> > >
> > >
> > > I have 3 crawls, crawlA, crawlB, and crawlC. I would like all
> > > of them to be available to the search.jsp page.
> > >
> > >
> > >
> > > I went through the site, saw merge, index, and make new db, and
> > > followed all the directions that I could find, but still have no
> > > resolution on this one. So what I need are some ideas on
> > > where to proceed from here. I intend to have 2 or
> > > 3 boxes each make a crawl, then somehow merge the crawls together
> > > and form a "master" under search.dir. I would also want to
> > > update this one on a regular basis.
> > >
> > >
> > >
> > > Unfortunately, the instructions to date have all been tried,
> > > and none of them have worked. There are also no
> > > indexmerger or indexsegments directives in nutch 0.7.1. Any
> > > support ideas, direct pointers, or even step-by-step
> > > instructions on how to do this would be welcome (outside of what is in the
> > > tutorials, because that has been tried already, including
> > > support ideas from the user mailing list).
> > >
> > >
> > >
> > > Cheers/r/dan
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
>
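
Until the original script is posted, the steps Aled lists in the quoted message could be scripted roughly as below. This is only a sketch, not Dan's script: the function name merge_crawls and the NUTCH override variable are made up for illustration, and it assumes Nutch 0.7-style crawl folders that each contain a segments folder of timestamp-named directories. The two bin/nutch commands are the ones quoted above.

```shell
#!/bin/sh
# Sketch only: combine several crawl folders into one search.dir.
# NUTCH is a hypothetical override for the launcher path; it defaults to
# the bin/nutch used in the quoted instructions.
NUTCH="${NUTCH:-bin/nutch}"

# merge_crawls SEARCH_DIR CRAWL_DIR... (hypothetical helper name)
merge_crawls() {
    search_dir="$1"
    shift

    # Create search.dir and its segments folder.
    mkdir -p "$search_dir/segments" || return 1

    # Copy the timestamp-named segment folders out of every crawl.
    for crawl in "$@"; do
        for seg in "$crawl"/segments/*; do
            [ -d "$seg" ] || continue
            cp -R "$seg" "$search_dir/segments/" || return 1
        done
    done

    # Delete duplicates from the combined segments.
    "$NUTCH" dedup -local "$search_dir/segments" || return 1

    # Merge the segments to create search.dir/index.
    "$NUTCH" merge -local "$search_dir/index" "$search_dir"/segments/* || return 1
}
```

After sourcing the file from the Nutch install directory, a call would look like: merge_crawls search.dir crawlA crawlB crawlC. The searcher.dir property in nutch-site.xml still has to be pointed at search.dir by hand.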



--
www.babatu.com
