On 07/12/2011 03:42 PM, lewis john mcgibbney wrote:
> Hi,
>
> One observation is that you are using the 1.3 branch, which by now contains
> some older code. For example, the Fetcher class has since been updated to
> deal with NUTCH-962, which is mentioned at the top of the class as per your
> URL example.
>
> Can anyone explain what the existing metadata being transferred is as per
> below if it does not include the score as you state?
>
> } else {
>   CrawlDatum newDatum = new CrawlDatum(CrawlDatum.STATUS_LINKED,
>       datum.getFetchInterval());
>   // transfer existing metadata
>   newDatum.getMetaData().putAll(datum.getMetaData());
>   try {
>     scfilters.initialScore(url, newDatum);
>
> I would have imagined that the metadata would have included the relative
> initial score we are discussing if it were to be of use in attributing an
> initial URL's metadata to a redirect?
> Apart from this, with the addition of your datum.getScore(), do the new
> scores attributed to the URL redirects accurately reflect your general
> understanding of the web graph?
I have only been dealing with Nutch 1.2 and 1.3. I tried to set up 2.0
with Eclipse but failed, as described here
(http://lucene.472066.n3.nabble.com/TestFetcher-hangs-td3091057.html).
The new scores were as they should have been, in my opinion (even though
I would say that Nutch's implementation of OPIC isn't exactly what the
publication describes). I don't know what information is passed in the
metadata.
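
To make the metadata-versus-score question concrete, here is a minimal
sketch. It assumes the score is held in a field of its own rather than in
the metadata map; Datum and forRedirect are hypothetical stand-ins written
for this mail, not Nutch classes, and the metadata key is made up.

```java
import java.util.HashMap;
import java.util.Map;

public class RedirectScoreSketch {

    // Hypothetical stand-in for a crawl datum (not Nutch's CrawlDatum).
    // Assumption: the score is a plain field, separate from the metadata map.
    static class Datum {
        float score;
        final Map<String, String> metaData = new HashMap<>();
    }

    // Builds the datum for a redirect target from the datum of the source URL.
    static Datum forRedirect(Datum source, boolean copyScore) {
        Datum target = new Datum();
        // Transfer existing metadata; this alone leaves the score untouched.
        target.metaData.putAll(source.metaData);
        if (copyScore) {
            // Explicitly carry the score over as well.
            target.score = source.score;
        }
        return target;
    }

    public static void main(String[] args) {
        Datum source = new Datum();
        source.score = 1.0f;
        source.metaData.put("example.key", "example value");

        Datum metadataOnly = forRedirect(source, false);
        Datum withScore = forRedirect(source, true);

        System.out.println("metadata only: score=" + metadataOnly.score);
        System.out.println("with score:    score=" + withScore.score);
        System.out.println("metadata:      " + withScore.metaData);
    }
}
```

Under that assumption, copying the metadata map by itself would leave the
redirect target with the default score, which is why the score has to be
passed on explicitly.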