RE: Crawler/Indexer redesign

Vadim Gritsenko Mon, 04 Feb 2002 14:21:37 -0800

> From: Bernhard Huber [mailto:[EMAIL PROTECTED]]
> 
> hi,
> 
> >How about
> >
> >  Collection crawl(Source)
> >
> >? Then crawler can be ThreadSafe.
> >
> Yes, it would be ThreadSafe, storing all crawled resources in the
> collection.
> Does this work for crawling huge sites?
> 
> My idea was to handle that problem by introducing the Iterator.
> Using an Iterator might allow some crawled resources to be processed
> quite early.
> Using a Collection might delay the processing of the crawled resources
> until the crawling has terminated,
> which might take quite some time.
> 
> Hence it might be better:
> Iterator crawl( Source)


Go for it. Just make sure you are not buffering results from this
Iterator somewhere down the pipe ;)
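
A minimal Java sketch of the Iterator variant, under assumptions that are
not in this thread: the class name LazyCrawler, the String-typed resources,
and the links map (standing in for real fetching and link extraction) are
all hypothetical, and this is not the actual crawler code. It only
illustrates that all crawl state can live in the returned Iterator, so the
crawler itself keeps no per-crawl fields, and that each resource is handed
out before the rest of the site has been visited.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical sketch: a crawler that hands out resources one at a time. */
class LazyCrawler {

    /**
     * Returns an Iterator over everything reachable from start.
     * The links map is an assumption standing in for fetching a resource
     * and extracting its links. All state is held by the Iterator.
     */
    static Iterator<String> crawl(final String start,
                                  final Map<String, List<String>> links) {
        return new Iterator<String>() {
            private final Deque<String> pending = new ArrayDeque<>();
            private final Set<String> seen = new HashSet<>();
            {
                pending.add(start);
                seen.add(start);
            }

            @Override
            public boolean hasNext() {
                return !pending.isEmpty();
            }

            @Override
            public String next() {
                if (pending.isEmpty()) {
                    throw new NoSuchElementException();
                }
                String current = pending.poll();
                // Links are followed only when the caller asks for the
                // next resource, so nothing is crawled ahead of demand.
                List<String> targets =
                        links.getOrDefault(current, Collections.<String>emptyList());
                for (String target : targets) {
                    if (seen.add(target)) {
                        pending.add(target);
                    }
                }
                return current;
            }
        };
    }
}
```

A caller would index each element inside the hasNext()/next() loop as it
arrives. Copying the elements into a List first would bring back the delay
that the Collection signature had.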

Vadim

> 
> bye bernhard

