OK, thanks. So how do people actually use Droids at scale, e.g. for crawling a large number of web pages? I happen to use it for something smallish, so I have never had problems with the queue living in the JVM heap and causing OOMs. But I imagine that anyone using it for a larger crawl would hit an OOM sooner or later, no?
Does this imply that either nobody is using Droids for large-scale crawls, or that everyone who does has implemented their own custom disk-backed queue?

Thanks,
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR

----- Original Message ----
> From: Ryan McKinley <[email protected]>
> To: [email protected]
> Sent: Fri, November 13, 2009 5:17:51 PM
> Subject: Re: Queue: in memory or on disk?
>
> ya, the standard one is in memory.
>
> It is easy to write one to store things to disk or whatever -- I use one that
> stores tasks to an h2 database, but it is not general enough to contribute
> back...
>
> I think Migfa was looking at replacing the droids Queue interface with a
> standard java.util.Queue interface
>
> ryan
>
>
> On Nov 13, 2009, at 5:10 PM, Chapuis Bertil wrote:
>
> > I think the current implementation only provides in-memory queues of tasks.
> > However, since the TaskQueue interface is relatively simple, it shouldn't be
> > too hard to persist the data on the disk or to implement a TaskQueue which
> > works with a JMS broker or something else.
> >
> >
> > On Nov 12, 2009, at 10:37 PM, Otis Gospodnetic wrote:
> >
> >> Hello,
> >>
> >> I haven't looked at the sources. But who stores items put in the Queue?
> >> Are they in memory, or does something write them to disk, or something else?
> >>
> >> Thanks,
> >> Otis
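
To make the disk-backed idea from this thread concrete, here is a minimal sketch of a FIFO queue that keeps its entries in a file instead of on the JVM heap. It is only an illustration under stated assumptions: the class name DiskBackedQueue and its offer/poll/size methods are hypothetical, it does not implement the Droids TaskQueue interface (whose exact methods are not shown in this thread), it stores plain strings such as URLs rather than real task objects, and it uses a flat file rather than the h2 database Ryan mentions. It also starts empty on every run, so it does not recover state after a crash.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch: a FIFO queue whose entries live in a file, not on the heap.
// Only two offsets and a counter are kept in memory.
class DiskBackedQueue implements AutoCloseable {
    private final RandomAccessFile file;
    private long readPos;   // file offset of the next unread entry
    private long writePos;  // file offset where the next entry is appended
    private int size;       // number of entries not yet polled

    DiskBackedQueue(File path) throws IOException {
        file = new RandomAccessFile(path, "rw");
        file.setLength(0);  // assumption: always start empty, no crash recovery
    }

    // Append one entry at the end of the file.
    // writeUTF limits each entry to 65535 encoded bytes, which is fine for URLs.
    synchronized void offer(String item) throws IOException {
        file.seek(writePos);
        file.writeUTF(item);
        writePos = file.getFilePointer();
        size++;
    }

    // Read and remove the oldest entry, or return null if the queue is empty.
    synchronized String poll() throws IOException {
        if (size == 0) {
            return null;
        }
        file.seek(readPos);
        String item = file.readUTF();
        readPos = file.getFilePointer();
        size--;
        if (size == 0) {
            // Everything has been consumed: truncate so the file does not grow forever.
            file.setLength(0);
            readPos = 0;
            writePos = 0;
        }
        return item;
    }

    synchronized int size() {
        return size;
    }

    @Override
    public synchronized void close() throws IOException {
        file.close();
    }
}
```

With something like this, heap usage stays constant no matter how many URLs are queued, at the cost of one disk seek per offer and poll. The file is only truncated once the queue drains completely, so a real implementation for a long-running crawl would also need compaction and some form of recovery.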
