On Sat, Jul 16, 2011 at 1:29 PM, lewis john mcgibbney <
[email protected]> wrote:

> Hi Gabriele,
>
> At first this seems like a plausible argument,


Indeed, I think it could be a FAQ. Shall I add it to the Nutch wiki?


> however my question concerns
> what Nutch would do if we wished to change the Solr core to index to?
>
> If we removed this functionality from the crawldb there would be no way to
> determine what Nutch was to fetch and what it wasn't.
>

Indeed, you confirm my thought.
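
To make the point concrete, here is a minimal sketch of the kind of state a crawldb has to hold that a search index does not. It is not Nutch code: the `CrawlEntry` class, the status labels, and the `due_for_fetch` helper are all hypothetical names chosen for illustration. The idea is only that URLs which were never fetched still need a record somewhere, and an index of fetched pages cannot supply it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class CrawlEntry:
    # Hypothetical record; the real crawldb schema is not shown in this thread.
    status: str                          # "unfetched" or "fetched"
    fetched_at: Optional[datetime] = None


def due_for_fetch(crawldb: Dict[str, CrawlEntry],
                  now: datetime,
                  recrawl_after: timedelta) -> List[str]:
    """Return URLs that were never fetched or whose re-crawl period has passed."""
    due = []
    for url, entry in crawldb.items():
        if entry.status == "unfetched":
            # Known only to the crawldb: an index of fetched pages has no such row.
            due.append(url)
        elif entry.fetched_at is not None and now - entry.fetched_at >= recrawl_after:
            due.append(url)
    return sorted(due)


now = datetime(2011, 7, 16)
crawldb = {
    "http://example.org/a": CrawlEntry("unfetched"),
    "http://example.org/b": CrawlEntry("fetched", now - timedelta(days=40)),
    "http://example.org/c": CrawlEntry("fetched", now - timedelta(days=1)),
}
due = due_for_fetch(crawldb, now, timedelta(days=30))
```

With a 30-day re-crawl period, `due` holds the never-fetched URL and the one last fetched 40 days ago. Pointing the indexer at a different Solr core would not change this list, because the fetch state lives outside the index.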

>
> > crawled, the fetch status, and the date. This data is maintained beyond
> > fetch so that pages may be re-crawled after a re-crawling period.
> > At the same time, Solr maintains an inverted index of all the fetched
> > pages. It'd seem more efficient if Nutch relied on the index instead of
> > maintaining its own crawldb, to avoid storing the same URL twice.
> > [BUT THAT'S JUST A KEY/ID, NOT WASTE AT ALL, WOULD ALSO END UP THE SAME
> > IN SOLR]
> >
> > --
> > Regards,
> > K. Gabriele
> >
> > --- unchanged since 20/9/10 ---
> > P.S. If the subject contains "[LON]" or the addressee acknowledges the
> > receipt within 48 hours then I don't resend the email.
> > subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> > time(x) < Now + 48h) ⇒ ¬resend(I, this).
> >
> > If an email is sent by a sender that is not a trusted contact or the
> > email does not contain a valid code then the email is not received. A
> > valid code starts with a hyphen and ends with "X".
> > ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> > L(-[a-z]+[0-9]X)).
>
> --
> *Lewis*
>



-- 
Regards,
K. Gabriele

