This will give you an idea of what is needed:
http://wiki.apache.org/nutch/NutchAdministrationUserInterface
REST API in Nutch (the JIRA comments and the patch will help you here):
https://issues.apache.org/jira/browse/NUTCH-880

Also, it is worth investing some time in getting to know Nutch. This is an old paper by Doug Cutting on Nutch:
http://www.master.netseven.it/files/262-Nutch.pdf

Here is a video of a presentation by Julien at Lucene Eurocon last year:
http://vimeopro.com/user11514798/apache-lucene-eurocon-2012/video/55566234

After that, roll up your sleeves, get the source code and start crawling. These are the relevant tutorials:
http://wiki.apache.org/nutch/NutchTutorial
http://wiki.apache.org/nutch/Nutch2Tutorial

You will also find some configuration and feature-centric documentation on the wiki pages. Here is the wiki main page:
http://wiki.apache.org/nutch/

I think that your work would be a great contribution to Nutch. Looking forward to seeing this feature in the next release cycle.

Thanks,
Tejas Patil

On Sun, May 19, 2013 at 12:30 PM, Ivan Vershinin <[email protected]> wrote:
> Hello!
> I am a student from Estonia (Tartu University). I want to participate in
> GSoC 2013, and selected your project because I have experience in Java and
> Wicket.
> Can you give me some advice on where I can start my investigations?
> Best regards,
> Ivan Vershinin

