Hi Tejas,

Thanks a lot for putting together this new setup guide. It really helped me, and it will probably help many other new Nutch users.
Tony.

On Tue, Jun 11, 2013 at 7:02 AM, Tejas Patil <[email protected]> wrote:

> Hi Tony,
>
> The simplified steps with snapshots are now added to the Nutch wiki [0]. It
> would be helpful if you could try those out and let us know if there are
> any improvements or corrections that you can think of.
>
> PS: A few images look shrunk. I will fix them soon.
>
> [0] : https://wiki.apache.org/nutch/RunNutchInEclipse
>
>
> On Mon, Jun 10, 2013 at 2:57 PM, Tejas Patil <[email protected]> wrote:
>
> > I have created a google doc [0] with several snapshots describing how to
> > set up Nutch 2.x + Eclipse. This is different from the one on the wiki
> > page and tailored for Nutch 2.x. Please try it out and let us know if you
> > still have issues with that. Based on your comments, I will add the same
> > to the Nutch wiki.
> >
> > [0] :
> > https://docs.google.com/document/d/1qvJwrZ9Sc0NAF9p3ie4uV7JsfCHxnrh9QF19HINw48c/edit?usp=sharing
> >
> >
> > On Mon, Jun 10, 2013 at 11:32 AM, Tejas Patil <[email protected]> wrote:
> >
> >> yes.
> >>
> >> - Close the project in Eclipse. Right click on the project, click on
> >>   "Properties" and get the location of the project.
> >> - Go to that location in a terminal.
> >> - Run 'ant eclipse'. (Note that you need to have Apache Ant
> >>   <http://ant.apache.org/manual/index.html> installed and configured.)
> >>
> >> Once you are on the command line, you might as well do this:
> >> Specify the GORA backend in nutch-site.xml, uncomment its dependency in
> >> ivy/ivy.xml and ensure that the store you selected is set as the default
> >> datastore in gora.properties.
> >>
> >>
> >> On Mon, Jun 10, 2013 at 11:21 AM, Tony Mullins <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> So the latest Nutch 2.x includes Tejas' patch (
> >>> https://issues.apache.org/jira/browse/NUTCH-1577), which means if I have
> >>> the latest source then it already has that patch.
> >>>
> >>> Now can someone please help me understand what is meant by the 2nd last
> >>> step, 'Run 'ant eclipse'', on http://wiki.apache.org/nutch/RunNutchInEclipse.
> >>>
> >>> Do I need to go to the location where the source is and give the ant
> >>> command 'ant -f build.xml', or is it something else?
> >>> And after refreshing the source, would Eclipse let me compile and run my
> >>> code?
> >>>
> >>> Thanks,
> >>> Tony
> >>>
> >>>
> >>> On Mon, Jun 10, 2013 at 6:56 PM, Tony Mullins <[email protected]> wrote:
> >>>
> >>> > Hi Lewis,
> >>> >
> >>> > I understand that there may be something wrong on my end. And as I
> >>> > said, I get different errors on running Nutch 2.x with Eclipse, after
> >>> > following different tutorials.
> >>> >
> >>> > My background is in .NET and I might just move to Java, just because
> >>> > of this project (Nutch). But at the moment I am having a difficult time
> >>> > understanding the 'setup/configuration' required to run Nutch in Eclipse.
> >>> >
> >>> > When you say '...*you may find it convenient to patch
> >>> > your dist with Tejas' Eclipse ant target and simply run 'ant eclipse' from
> >>> > within your terminal prior to doing a file, import, existing projects in to
> >>> > workspace from within Eclipse..*.'
> >>> >
> >>> > which patch do I need to get and how do I apply it?
> >>> > And by running 'ant eclipse', do you mean dropping build.xml into the Ant
> >>> > window in Eclipse, OR building the Nutch source by using the "ant -f
> >>> > build.xml" command in a terminal? (By the way, I have done both and both
> >>> > successfully build the source, but Eclipse doesn't run the source.)
> >>> >
> >>> > So could you please guide me here in more detail? I would be really
> >>> > grateful to you and the Nutch community.
> >>> >
> >>> > Thanks,
> >>> > Tony.
> >>> >
> >>> >
> >>> > On Mon, Jun 10, 2013 at 6:38 PM, Lewis John Mcgibbney <[email protected]> wrote:
> >>> >
> >>> >> Hi Tony,
> >>> >> These issues stem from your environment not being correct.
> >>> >> I, like many others, have been able to DEBUG and develop the Nutch 1.7
> >>> >> and 2.x series from within Eclipse.
> >>> >> As you are working with 2.x source, you may find it convenient to patch
> >>> >> your dist with Tejas' Eclipse ant target and simply run 'ant eclipse' from
> >>> >> within your terminal prior to doing a file, import, existing projects in to
> >>> >> workspace from within Eclipse.
> >>> >> I can guarantee you, the reason the tutorial is on the Nutch wiki is
> >>> >> because at some stage, someone (many many people), somewhere has found it
> >>> >> useful for developing Nutch in Eclipse. I don't want to sound like a
> >>> >> balloon here, but your java security exceptions are not a problem with
> >>> >> Nutch... it's your environment.
> >>> >> hth
> >>> >>
> >>> >> On Monday, June 10, 2013, Tony Mullins <[email protected]> wrote:
> >>> >> > Hi,
> >>> >> > Ok, now I have followed this tutorial word by word:
> >>> >> > http://wiki.apache.org/nutch/RunNutchInEclipse#Checkout_Nutch_in_Eclipse .
> >>> >> >
> >>> >> > After getting the new source 2.2, I built it using Ant, which was
> >>> >> > successful, then set the configurations, commented out the 'hsqldb'
> >>> >> > dependency and uncommented the cassandra dependency (as I want to run it
> >>> >> > against cassandra).
> >>> >> > After doing all this, when I run the code from Eclipse I get this error:
> >>> >> > "Exception in thread "main" java.lang.SecurityException: Prohibited package name: java.org.apache.nutch.crawl
> >>> >> > at java.lang.ClassLoader.preDefineClass(ClassLoader.java:649)
> >>> >> > at java.lang.ClassLoader.defineClass(ClassLoader.java:785)
> >>> >> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)...."
> >>> >> >
> >>> >> > and I have red '*' all over my code. Please see the attached image.
> >>> >> >
> >>> >> > Now what do I do?
> >>> >> > Could anyone please tell me whether it is even possible to
> >>> >> > compile/run/debug the latest Nutch 2.x branch from Eclipse?
> >>> >> >
> >>> >> > I need help here...............
> >>> >> >
> >>> >> > Tony !!!
> >>> >> >
> >>> >> > On Mon, Jun 10, 2013 at 12:15 PM, Tejas Patil <[email protected]> wrote:
> >>> >> >>
> >>> >> >> Hi Tony,
> >>> >> >>
> >>> >> >> That tutorial is based on some earlier Nutch version. Please follow
> >>> >> >> http://wiki.apache.org/nutch/RunNutchInEclipse#Checkout_Nutch_in_Eclipse .
> >>> >> >> There have been recent changes to that wiki page and those new steps
> >>> >> >> would take care of getting automation.jar and other dependencies in place.
> >>> >> >>
> >>> >> >>
> >>> >> >> On Sun, Jun 9, 2013 at 11:58 PM, Tony Mullins <[email protected]> wrote:
> >>> >> >>
> >>> >> >> > Hi,
> >>> >> >> >
> >>> >> >> > The last try I made was with this tutorial,
> >>> >> >> > 'run nutch in eclipse | profilerajanimaski'.
> >>> >> >> > After following it word for word (which didn't work for me), I made some
> >>> >> >> > modifications to it: for step 11 I added 'bin', 'gora', 'java', 'test',
> >>> >> >> > 'testprocess', 'testresources'.
> >>> >> >> > And for step 14 I couldn't find
> >>> >> >> > 'src/plugin/url-filter-automation/lib/automation.jar' in my source.
> >>> >> >> >
> >>> >> >> > And when I try to run the main 'Crawler' project it says there are errors
> >>> >> >> > and gives me the option to proceed with errors, and when I proceed with
> >>> >> >> > errors I get this error:
> >>> >> >> >
> >>> >> >> > "InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.
> >>> >> >> > InjectorJob: total number of urls rejected by filters: 0
> >>> >> >> > InjectorJob: total number of urls injected after normalization and filtering: 0
> >>> >> >> > Exception in thread "main" java.lang.RuntimeException: job failed: name=generate: null, jobid=job_local_0002.......
> >>> >> >> > .....
> >>> >> >> > "
> >>> >> >> >
> >>> >> >> > So please help me see what I am doing wrong here, or guide me to a
> >>> >> >> > tutorial which works....
> >>> >> >> > If the latest Nutch 2.2 source doesn't work with these tutorials then
> >>> >> >> > which version of 2.x will work, and how?
> >>> >> >> >
> >>> >> >> > Thanks.
> >>> >> >> > Tony
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > On Mon, Jun 10, 2013 at 7:20 AM, Tejas Patil <[email protected]> wrote:
> >>> >> >> >
> >>> >> >> >> Could you try closing and re-opening Eclipse and then let Eclipse
> >>> >> >> >> rebuild the workspace? BTW: On which packages / classes do you see red
> >>> >> >> >> dots?
> >>> >> >> >>
> >>> >> >> >>
> >>> >> >> >> On Sun, Jun 9, 2013 at 9:23 AM, Lewis John Mcgibbney <[email protected]> wrote:
> >>> >> >> >>
> >>> >> >> >> > Hi Tony,
> >>> >> >> >> > This source has literally just been released.
> >>> >> >> >> > The tutorial on the Nutch wiki has also just been updated but you need
> >>> >> >> >> > to follow it closely and pay attention to each step. It sounds like the
> >>> >> >> >> > red dots problem you're having is explained in the 2nd to last bullet
> >>> >> >> >> > point below
> >>> >> >> >> >
> >>> >> >> >> > http://wiki.apache.org/nutch/RunNutchInEclipse#Checkout_Nutch_in_Eclipse
> >>> >> >> >> >
> >>> >> >> >> > Also, you've not actually said what went wrong!
> >>> >> >> >> > Lewis
> >>> >> >> >> >
> >>> >> >> >> >
> >>> >> >> >> > On Sunday, June 9, 2013, Tony Mullins <[email protected]> wrote:
> >>> >> >> >> > > Hi,
> >>> >> >> >> > >
> >>> >> >> >> > > I am new to Nutch. I am trying to use Nutch with Cassandra and have
> >>> >> >> >> > > successfully built Nutch 2.x (
> >>> >> >> >> > > http://svn.apache.org/repos/asf/nutch/branches/2.x/).
> >>> >> >> >> > >
> >>> >> >> >> > > But I get errors (different errors after following different tutorials)
> >>> >> >> >> > > when I try to run it directly from Eclipse (I am on CentOS 6.4). I have
> >>> >> >> >> > > tried to follow these tutorials to run the Nutch source from Eclipse,
> >>> >> >> >> > > but to no avail.
> >>> >> >> >> > >
> >>> >> >> >> > > http://wiki.apache.org/nutch/RunNutchInEclipse
> >>> >> >> >> > > run nutch in eclipse | profilerajanimaski
> >>> >> >> >> > > http://jarpit83.blogspot.com/2012/07/configuring-nutch-in-eclipse.html
> >>> >> >> >> > > http://techvineyard.blogspot.com/2010/12/build-nutch-20.html
> >>> >> >> >> > >
> >>> >> >> >> > > Whatever I do, I get red "*" on my source and Eclipse doesn't run it,
> >>> >> >> >> > > but it always builds successfully using Ant.
> >>> >> >> >> > >
> >>> >> >> >> > > Pleeeeaaase help me here, could anyone please guide me to a single web
> >>> >> >> >> > > tutorial which could actually help me compile and run the latest Nutch
> >>> >> >> >> > > 2.x with Eclipse (Juno) on CentOS?
> >>> >> >> >> > >
> >>> >> >> >> > > Thanksss.
> >>> >> >> >> > > Tony.
> >>> >> >> >> > >
> >>> >> >> >> >
> >>> >> >> >> > --
> >>> >> >> >> > *Lewis*
> >>> >> >> >> >
> >>> >>
> >>> >> --
> >>> >> *Lewis*
> >>> >>
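A note on the "Prohibited package name: java.org.apache.nutch.crawl" error quoted above, for anyone who lands on this thread with the same message. The JVM refuses to define application classes in packages that start with "java.", so the extra "java." prefix suggests that Eclipse was treating "src" rather than "src/java" as the source folder. That reading is an assumption on my part, not something confirmed in the thread. The sketch below only illustrates how a package name follows from the source root; the class name SourceRootCheck, its methods, and the Crawler.java path are hypothetical and used purely for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class SourceRootCheck {

    // Derives the package name a source file would get, given the folder
    // that is registered as the source root on the build path.
    public static String packageFor(String sourceRoot, String sourceFile) {
        Path root = Paths.get(sourceRoot);
        Path parent = Paths.get(sourceFile).getParent();
        Path relative = root.relativize(parent);
        StringBuilder pkg = new StringBuilder();
        for (Path part : relative) {
            if (pkg.length() > 0) {
                pkg.append('.');
            }
            pkg.append(part.toString());
        }
        return pkg.toString();
    }

    // Packages beginning with "java." are reserved; the class loader rejects them.
    public static boolean isProhibited(String packageName) {
        return packageName.startsWith("java.");
    }

    public static void main(String[] args) {
        // Illustrative path only; adjust to the actual checkout layout.
        String file = "src/java/org/apache/nutch/crawl/Crawler.java";
        for (String root : new String[] {"src", "src/java"}) {
            String pkg = packageFor(root, file);
            String verdict = isProhibited(pkg) ? "prohibited" : "ok";
            System.out.println(root + " -> " + pkg + " (" + verdict + ")");
        }
    }
}
```

If this is indeed the cause, letting 'ant eclipse' generate the project files, as Tejas describes, should avoid having to set the source folders by hand.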

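On Tejas' advice to make sure the selected store is the default datastore in gora.properties: the InjectorJob log quoted above reports MemStore as the Gora storage class, which suggests the Cassandra settings were not being picked up at that point. Below is a small, hedged sketch of checking the gora.datastore.default key with plain java.util.Properties. The class GoraDefaultStoreCheck and its methods are hypothetical helpers, and the sample text stands in for a real gora.properties file; the Cassandra store class name is what I would expect for a Cassandra backend, so verify it against your Gora version.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class GoraDefaultStoreCheck {

    // Returns the value of gora.datastore.default, or null if the key is absent.
    public static String defaultStore(String goraPropertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(goraPropertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("gora.datastore.default");
    }

    // True when the configured default store is the one the caller expects.
    public static boolean matches(String goraPropertiesText, String expectedStoreClass) {
        return expectedStoreClass.equals(defaultStore(goraPropertiesText));
    }

    public static void main(String[] args) {
        // Sample content standing in for conf/gora.properties (assumed value).
        String sample = "# sample gora.properties content\n"
                + "gora.datastore.default=org.apache.gora.cassandra.store.CassandraStore\n";
        String expected = "org.apache.gora.cassandra.store.CassandraStore";
        System.out.println("default store: " + defaultStore(sample));
        System.out.println("matches expected: " + matches(sample, expected));
    }
}
```

This only checks the properties file; the backend named in nutch-site.xml and the dependency uncommented in ivy/ivy.xml still need to agree with it, as described in the quoted steps.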
