Does this require proficiency in Java?

On Tue, Mar 6, 2018 at 10:51 AM Semyon Semyonov <[email protected]> wrote:
> Here is an unpleasant truth - there is no up-to-date tutorial for Nutch.
> To make it even more interesting, the tutorial can sometimes contradict
> the real behavior of Nutch because of recently introduced features/bugs.
> If you find such cases, please try to fix them and contribute to the
> project.
>
> Welcome to the open source world.
>
> Still, here are my recommendations as a person who started with Nutch
> less than a year ago:
> 1) If you just need a simple crawl, you are in luck. Simply run the crawl
> script, or the individual steps, according to the Nutch crawl tutorial.
> 2) If it is a bit more complex, you start to face problems either with
> configuration or with bugs. Therefore, first have a look at the Nutch
> list archive http://nutch.apache.org/mailing_lists.html ; if that doesn't
> help, try to figure it out yourself; if that doesn't work, ask here or on
> the developer list.
> 3) In most cases, you HAVE to open the code and fix/discover something.
> Nutch is a really complicated system, and you can easily spend 2-3 months
> trying to get a full basic understanding of it. It gets even worse if you
> don't know Hadoop. If you don't, I do recommend reading "Hadoop: The
> Definitive Guide", because, well, Nutch is Hadoop.
>
> Here we are, no pain, no gain.
>
>
> Sent: Tuesday, March 06, 2018 at 7:42 PM
> From: "Eric Valencia" <[email protected]>
> To: [email protected]
> Subject: Re: Need Tutorial on Nutch
> Thank you kindly, Yash. Yes, I did try some of the tutorials, but they
> seem to be missing some of the steps required to scrape successfully
> with Nutch.
>
> On Tue, Mar 6, 2018 at 10:37 AM Yash Thenuan Thenuan <[email protected]> wrote:
>
> > I would suggest starting with the documentation on Nutch's website.
> > You can get an idea of how to start crawling and so on.
> > Apart from that, there are no proper tutorials as such.
> > Just start crawling; if you get stuck somewhere, try to find something
> > related to it on Google and in the Nutch mailing list archives.
> > Ask questions if nothing helps.
> >
> > On 7 Mar 2018 00:01, "Eric Valencia" <[email protected]> wrote:
> >
> > I'm a beginner in Nutch and need the best tutorials to get started. Can
> > you guys let me know how you would advise yourselves if you were
> > starting today (like me)?
> >
> > Eric
> >
>

