OK, thanks.

So how do people really use Droids at scale, e.g. for crawling a large number
of web pages?  I happen to use it for something smallish, so I have never had
issues with the queue living in the JVM heap and causing OOMs.  But I imagine
that anyone using it for a larger crawl would hit an OOM sooner or later,
no?

Does this imply that either nobody is using Droids for large-scale crawls, or 
that everyone who does implemented their own, custom disk-backed queue?
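
To make the question concrete, here is a rough sketch of the kind of
disk-backed queue I have in mind.  Note that this is NOT the Droids TaskQueue
interface (I haven't looked at the sources), the class and method names are
made up, and it just stores URL strings in an append-only file using plain
java.io, so only a few offsets live on the heap:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

/**
 * Hypothetical FIFO queue that keeps its entries in an append-only file
 * instead of on the JVM heap. Illustration only, not a Droids class.
 */
final class DiskBackedQueue implements AutoCloseable {
    private final RandomAccessFile file;
    private long readPos;   // file offset of the next entry to hand out
    private long writePos;  // file offset where the next entry is appended
    private long size;      // number of entries not yet polled

    DiskBackedQueue(Path path) throws IOException {
        file = new RandomAccessFile(path.toFile(), "rw");
        // Assumption: always start empty. A real queue would recover its
        // offsets here and compact the file once entries are consumed.
        file.setLength(0);
    }

    /** Appends one entry at the end of the file. */
    synchronized void add(String url) throws IOException {
        file.seek(writePos);
        file.writeUTF(url); // length-prefixed; limited to 65535 encoded bytes
        writePos = file.getFilePointer();
        size++;
    }

    /** Returns the oldest entry not yet polled, or null if the queue is empty. */
    synchronized String poll() throws IOException {
        if (readPos >= writePos) {
            return null;
        }
        file.seek(readPos);
        String url = file.readUTF();
        readPos = file.getFilePointer();
        size--;
        return url;
    }

    synchronized long size() {
        return size;
    }

    @Override
    public synchronized void close() throws IOException {
        file.close();
    }
}
```

With something like this the heap usage stays flat no matter how many URLs
are pending, at the cost of a seek per operation and a file that only grows.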


Thanks,
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Ryan McKinley <[email protected]>
> To: [email protected]
> Sent: Fri, November 13, 2009 5:17:51 PM
> Subject: Re: Queue: in memory or on disk?
> 
> ya, the standard one is in memory.
> 
> It is easy to write one to store things to disk or whatever -- I use one that 
> stores tasks to an h2 database, but it is not general enough to contribute 
> back...
> 
> I think Migfa was looking at replacing the droids Queue interface with a 
> standard java.util.Queue interface
> 
> ryan
> 
> 
> On Nov 13, 2009, at 5:10 PM, Chapuis Bertil wrote:
> 
> > I think the current implementation only provides in-memory queues of tasks.
> > However, since the TaskQueue interface is relatively simple, it shouldn't be
> > too hard to persist the data on disk or to implement a TaskQueue that works
> > with a JMS broker or something else.
> > 
> > 
> > On Nov 12, 2009, at 10:37 PM, Otis Gospodnetic wrote:
> > 
> >> Hello,
> >> 
> >> I haven't looked at the sources.  But who stores items put in the Queue?
> >> Are they in memory, or does something write them to disk, or something else?
> >> 
> >> Thanks,
> >> Otis
> >> --
> >> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
> >> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
> >> 
> > 
