I don't know if this is true, but I have heard that if you re-inject a specific URL, its retry date is updated and it will be included as part of a segment.
Haven't had time to test this out yet.

-John

On Thu, Jun 19, 2008 at 9:01 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

> Don't know off the top of my head, but I'd guess no, because Nutch uses
> Hadoop/HDFS. HDFS files are write-once, so I doubt you can just update a
> single URL's data. But you could write a MapReduce job that goes over the
> whole CrawlDb and modifies only the records you need modified. You'll need
> to essentially rewrite the whole CrawlDb and replace the old version.
>
> It would be nice to be able to change specific URL(s) data...
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> > From: Chris Kline <[EMAIL PROTECTED]>
> > To: [email protected]
> > Sent: Tuesday, June 17, 2008 6:19:51 PM
> > Subject: updating retry inteval
> >
> > is there a way to update the retry interval for a specific url?
> >
> > -Chris

--
John Martyniak
Before Dawn Solutions, Inc.
9457 S. University Blvd. #266
Highlands Ranch, CO 80126
o: 1-877-499-1562 x707 (Toll Free)
c: 303-522-1756
e: [EMAIL PROTECTED]
w: http://www.beforedawn.com
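
For what it's worth, the approach Otis describes (go over the whole CrawlDb, change only the records you need, write out a new copy, and replace the old version) can be sketched roughly as below. This is only an illustration of the pattern in plain Python, not Nutch or Hadoop code: `CrawlRecord`, `fetch_interval_secs`, `rewrite_crawldb`, and the example URLs are all made-up names, and a real job would read and write the CrawlDb through Hadoop rather than an in-memory dict.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class CrawlRecord:
    """Hypothetical stand-in for one CrawlDb entry (not Nutch's own class)."""
    fetch_interval_secs: int


def rewrite_crawldb(
    records: Iterable[Tuple[str, CrawlRecord]],
    overrides: Dict[str, int],
) -> Iterator[Tuple[str, CrawlRecord]]:
    """Map-style pass: emit every record, changing only the URLs in `overrides`."""
    for url, record in records:
        if url in overrides:
            # Write-once storage: build a new record instead of editing in place.
            yield url, replace(record, fetch_interval_secs=overrides[url])
        else:
            # Untouched records are copied through unchanged.
            yield url, record


# The "old" database is never modified; a complete new copy is produced.
old_db = {
    "http://example.com/news": CrawlRecord(fetch_interval_secs=2592000),
    "http://example.com/about": CrawlRecord(fetch_interval_secs=2592000),
}
new_db = dict(rewrite_crawldb(old_db.items(), {"http://example.com/news": 3600}))
```

Once the new copy is complete, it would take the place of the old one, which is the "rewrite the whole CrawlDb and replace the old version" step. The cost is a full pass over the data even when only one URL changes.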
