On Sat, Jul 16, 2011 at 1:29 PM, lewis john mcgibbney <[email protected]> wrote:

> Hi Gabriele,
>
> At first this seems like a plausible argument,

Indeed, I think it could be a FAQ. Shall I add it to the Nutch wiki?

> however my question concerns
> what Nutch would do if we wished to change which Solr core to index to?
>
> If we removed this functionality from the crawldb there would be no way to
> determine what Nutch was to fetch and what it wasn't.

Indeed, you confirm my thought.

> > crawled, the fetch status, and the date. This data is maintained beyond
> > fetch so that pages may be re-crawled, after a re-crawling period.
> > At the same time Solr maintains an inverted index of all the fetched
> > pages.
> > It'd seem more efficient if nutch relied on the index instead of
> > maintaining its own crawldb, to !store the same url twice.
> > [BUT THAT'S JUST A KEY/ID, NOT WASTE AT ALL, WOULD ALSO END UP THE SAME
> > IN SOLR]
> >
> > --
> > Regards,
> > K. Gabriele
>
> --
> *Lewis*

--
Regards,
K. Gabriele

--- unchanged since 20/9/10 ---
P.S. If the subject contains "[LON]" or the addressee acknowledges the
receipt within 48 hours then I don't resend the email.
subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
time(x) < Now + 48h) ⇒ ¬resend(I, this).

If an email is sent by a sender that is not a trusted contact or the email
does not contain a valid code then the email is not received. A valid code
starts with a hyphen and ends with "X".
∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
L(-[a-z]+[0-9]X)).

