Re: Differences 1.x and trunk

Andrzej Bialecki Fri, 18 Mar 2011 08:57:53 -0700

On 3/18/11 4:31 PM, Markus Jelsma wrote:

Hi all,


I'm trying to port the patch from https://issues.apache.org/jira/browse/NUTCH-963
to trunk after committing it to 1.3. There are of course a lot of differences, so
I need a little advice on how to proceed:

- instead of using CrawlDB and CrawlDatum we now need WebTableReader?

Actually you need to use StorageUtils to set up Mapper or Reducer contexts. See other tools, e.g. Fetcher or Generator.

- trunk uses slf4j instead of commons logging now?


Yes.

- a page is now represented by storage.WebPage?

Yes. When you prepare a Job you also need to specify what fields from WebPage you are interested in (and only these fields will be pulled in from the storage). This is all handled by StorageUtils methods.
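
To illustrate the idea of declaring the fields up front, here is a minimal, self-contained sketch in plain Java. It does not use the Nutch or Gora classes: the Field enum, storedRow() and load() are hypothetical stand-ins invented for this example, and the real setup goes through StorageUtils as described above. It only shows that a job asking for two fields never sees the others.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class FieldProjectionSketch {
    // Hypothetical stand-in for the fields of a stored page (not Nutch's own type).
    enum Field { BASE_URL, STATUS, FETCH_TIME, CONTENT, SCORE }

    // Hypothetical stand-in for one complete row in the backing store.
    static Map<Field, String> storedRow() {
        Map<Field, String> row = new EnumMap<>(Field.class);
        row.put(Field.BASE_URL, "http://example.com/");
        row.put(Field.STATUS, "fetched");
        row.put(Field.FETCH_TIME, "1300463873");
        row.put(Field.CONTENT, "<html>...</html>");
        row.put(Field.SCORE, "1.0");
        return row;
    }

    // Copy only the requested fields; everything else stays in the store.
    static Map<Field, String> load(Map<Field, String> row, Set<Field> wanted) {
        Map<Field, String> page = new EnumMap<>(Field.class);
        for (Field f : wanted) {
            if (row.containsKey(f)) {
                page.put(f, row.get(f));
            }
        }
        return page;
    }

    public static void main(String[] args) {
        // The "job" declares the fields it needs before any data is read.
        Set<Field> wanted = EnumSet.of(Field.STATUS, Field.SCORE);
        Map<Field, String> page = load(storedRow(), wanted);
        System.out.println(page.keySet());
    }
}
```

In this sketch the large CONTENT value is never copied, which is the practical benefit of listing only the fields a tool actually reads.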


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
