What you mentioned is correct.

It is stored in the db.
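
To illustrate the idea only, here is a minimal sketch of keeping the first indexed date across refetches. It does not use the Nutch API: the FirstIndexedDate class, the recordIndexed method and the in-memory map are hypothetical stand-ins for metadata persisted in the db, and the date_indexed key and yyyymmdd format are taken from the original question.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for per-URL metadata that survives refetches.
// This is NOT the Nutch WebDB or CrawlDatum API, only an illustration.
public class FirstIndexedDate {
    static final String KEY = "date_indexed";

    // url -> metadata map; assumed to be persisted by the db in a real setup
    private final Map<String, Map<String, String>> metadataByUrl = new HashMap<>();

    // Called on every (re)index of a page. The date is written only the
    // first time, so later refetches return the original value unchanged.
    public String recordIndexed(String url, LocalDate today) {
        Map<String, String> meta =
            metadataByUrl.computeIfAbsent(url, u -> new HashMap<>());
        // BASIC_ISO_DATE formats a LocalDate as yyyyMMdd
        meta.putIfAbsent(KEY, today.format(DateTimeFormatter.BASIC_ISO_DATE));
        return meta.get(KEY);
    }

    public static void main(String[] args) {
        FirstIndexedDate store = new FirstIndexedDate();
        // First fetch sets the date; the refetch a month later keeps it.
        System.out.println(store.recordIndexed("http://example.org/", LocalDate.of(2006, 2, 13)));
        System.out.println(store.recordIndexed("http://example.org/", LocalDate.of(2006, 3, 15)));
    }
}
```

Because the value lives with the page metadata rather than in the index, the index could be rebuilt from scratch and the field filled in again from the stored value.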

Rgds
Prabhu


On 2/17/06, TDLN <[EMAIL PROTECTED]> wrote:
>
> Ah, after reading up on the new metadata facility in JIRA, I think I
> understand better what you mean - the metadata are added to the WebDB and
> persisted across refetches. This way it is possible for the complete index
> to be recreated from scratch while maintaining the first indexed date, which
> otherwise would be lost, right?
>
> Rgrds, Thomas
>
> On 2/17/06, TDLN <[EMAIL PROTECTED]> wrote:
> >
> > I am still using 0.7.1 - I think the CrawlDatum.setMetaData is only part
> > of the trunk.
> >
> > Is it not possible to just "hack" the MoreIndexingFilter and calculate the
> > date_indexed field there (similar to how the lastModified field is
> > calculated), and add a DateIndexedQueryFilter to the
> > org.apache.nutch.searcher.more package?
> >
> > Rgrds, Thomas
> >
> > On 2/17/06, Stefan Groschupf <[EMAIL PROTECTED] > wrote:
> > >
> > > Hey,
> > > Maybe the freshly added CrawlDatum.setMetaData can help you to store
> > > such information.
> > > However, you will need to hack the Nutch code somehow, since this is not
> > > stored yet and there is currently no extension point for such a task.
> > >
> > > HTH
> > > Stefan
> > >
> > > On 13.02.2006 at 17:36, Thomas Delnoij wrote:
> > >
> > > > I have worked through the WritingPluginExample
> > > > <http://wiki.apache.org/nutch/WritingPluginExample> example.
> > > > Now I am wondering if the following makes any sense. I would like
> > > > to store the date (yyyymmdd) the first time a Page was added to the
> > > > Index. I thought I could create a plugin that would add a date_indexed
> > > > field. My hesitation is what happens after the fetch interval, when the
> > > > Page is refetched.
> > > >
> > > > What happens
> > > >
> > > > - if the Page Content has changed? Is the Page updated (i.e. deleted and
> > > >   added) in the index, and would the date_indexed be recalculated (would
> > > >   be ok)?
> > > > - if the Page hasn't changed? Is the Page also updated (would break the
> > > >   meaning of the date_indexed field, not ok)?
> > > >
> > > > Or does this depend on how I organize my generate/fetch/update/index
> > > > cycle, i.e. if I merge my indexes or recreate them from scratch?
> > > >
> > > > Rgrds, Thomas
> > >
> > > ---------------------------------------------
> > > George Orwel was an Optimist
> > > blog: http://www.find23.org
> > > company: http://www.media-style.com
> > >
> > >
> > >
> >
>
>
