Hi List,

Is it possible to disable indexing in Nutch?

In my application I work with Twitter feeds: I pick out the tweets that
contain links and crawl those links. I only care about the content of the
linked pages, and I can already get that content by reading the segments.
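
To make the setup concrete, the link-filtering step is roughly along the
lines below. This is only a simplified Python sketch, not the code from my
application: the names extract_links and write_seed_file and the URL regex
are made up for illustration, and it assumes the tweets are already
available as plain strings.

```python
import re

# Hypothetical pattern for illustration; real tweets may need stricter URL handling.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_links(tweets):
    """Return the unique links found in the tweet texts, in first-seen order."""
    seen = set()
    links = []
    for text in tweets:
        for url in URL_PATTERN.findall(text):
            # Drop trailing punctuation that is rarely part of the URL itself.
            url = url.rstrip(".,;:!?)")
            if url and url not in seen:
                seen.add(url)
                links.append(url)
    return links


def write_seed_file(links, path):
    """Write one URL per line, the plain-text layout of a Nutch seed list."""
    with open(path, "w", encoding="utf-8") as handle:
        for url in links:
            handle.write(url + "\n")
```

Tweets without a link simply contribute nothing, so only the linked pages
end up in the seed list that the crawler reads.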

I don't need the search feature that Nutch provides, which is what the
indexing is for. So, is it possible to turn off indexing in Nutch? I expect
that skipping it would improve the performance of my crawler.


Thanks and regards,
Ch. Arjun Kumar Reddy



On Mon, Jan 31, 2011 at 5:00 PM, Alexander Aristov <
[email protected]> wrote:

> Yes, you can, but only if you use Nutch + Solr.
>
> If you use the old Nutch frontend, then you might break the index and
> searching after merging content or indexes.
>
> If you don't merge, then search should work during crawling.
>
> But remember that results do not become available for searching
> immediately after fetching. All pages must be fetched and then indexed
> before they are searchable.
>
> Best Regards
> Alexander Aristov
>
>
> On 31 January 2011 13:17, .: Abhishek :. <[email protected]> wrote:
>
> > Hi folks,
> >
> > I should thank you all for the great help you have been offering so
> > far. I am learning about Nutch quite well.
> >
> > One more beginner's question here: can I search for something while
> > Nutch is still crawling a site? I believe this is not possible. The
> > reason I am asking is that I am crawling a big site that is also
> > updated frequently with a lot of new pages, and I just wanted to get
> > some quick results while it's on the go.
> >
> > Thanks,
> > Abhi
> >
>
