Thanks! I'll try to come up with a working patch in the next few weeks or so.
On Friday 18 March 2011 16:57:20 Andrzej Bialecki wrote:
> On 3/18/11 4:31 PM, Markus Jelsma wrote:
> > Hi all,
> >
> > I'm giving it a try to patch
> > https://issues.apache.org/jira/browse/NUTCH-963 to trunk after
> > committing to 1.3. There are of course a lot of differences so I need a
> > little advice on how to proceed:
> >
> > - instead of using CrawlDB and CrawlDatum we now need WebTableReader?
>
> Actually you need to use StorageUtils to set up Mapper or Reducer
> contexts. See other tools, e.g. Fetcher or Generator.
>
> > - trunk uses slf4j instead of commons logging now?
>
> Yes.
>
> > - a page is now represented by storage.WebPage?
>
> Yes. When you prepare a Job you also need to specify what fields from
> WebPage you are interested in (and only these fields will be pulled in
> from the storage). This is all handled by StorageUtils methods.

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

