On 3/18/11 4:31 PM, Markus Jelsma wrote:
Hi all,

I'm trying to port the patch from https://issues.apache.org/jira/browse/NUTCH-963
to trunk after committing it to 1.3. There are of course a lot of differences, so
I need a little advice on how to proceed:

- instead of using CrawlDB and CrawlDatum we now need WebTableReader?

Actually you need to use StorageUtils to set up Mapper or Reducer contexts. See other tools, e.g. Fetcher or Generator.

- trunk uses slf4j instead of commons logging now?

Yes.

- a page is now represented by storage.WebPage?

Yes. When you prepare a Job you also need to specify what fields from WebPage you are interested in (and only these fields will be pulled in from the storage). This is all handled by StorageUtils methods.
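
To make the field-selection idea concrete, here is a minimal sketch in plain Java. It is not the Nutch or StorageUtils API: Field, STORE and load are made-up names for illustration, and the in-memory map only stands in for a real storage backend. The point is just that the caller declares up front which fields it wants, and only those are copied out of the stored row.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class FieldProjectionSketch {
    // Hypothetical field names; a stand-in, not Nutch's actual WebPage fields.
    public enum Field { STATUS, FETCH_TIME, CONTENT, TITLE }

    // Hypothetical in-memory "storage": full rows keyed by URL.
    public static final Map<String, Map<Field, String>> STORE = new HashMap<>();

    // Return a copy of the row containing only the requested fields.
    // Fields that were not requested are never copied to the caller.
    public static Map<Field, String> load(String url, Set<Field> wanted) {
        Map<Field, String> projected = new EnumMap<>(Field.class);
        Map<Field, String> full = STORE.get(url);
        if (full == null) {
            return projected; // unknown URL: empty result
        }
        for (Field f : wanted) {
            if (full.containsKey(f)) {
                projected.put(f, full.get(f));
            }
        }
        return projected;
    }

    public static void main(String[] args) {
        Map<Field, String> row = new EnumMap<>(Field.class);
        row.put(Field.STATUS, "fetched");
        row.put(Field.FETCH_TIME, "1300465860");
        row.put(Field.CONTENT, "<html>...</html>");
        STORE.put("http://example.org/", row);

        // A job that only needs status and fetch time asks for just those.
        Set<Field> wanted = EnumSet.of(Field.STATUS, Field.FETCH_TIME);
        Map<Field, String> page = load("http://example.org/", wanted);
        System.out.println(page.keySet()); // CONTENT is not loaded
    }
}
```

In a real tool the set of wanted fields would be handed to StorageUtils when the Job is prepared; Fetcher and Generator show the actual calls.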

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
