Thank you Tejas. Your tips helped a lot.
One more thing: after building, the plugin.folders property must point to build/plugins for the crawl to run. It is crawling fine now.

My remaining concern is locating the object that holds the content and its metadata, so that I can capture it and send it to my DB, as mentioned earlier. How do I do that?

Thanks,
--
Prashant More

On Thu, Feb 7, 2013 at 11:40 AM, Tejas Patil <[email protected]> wrote:

> On Wed, Feb 6, 2013 at 9:23 PM, Prashant More (प्रशांत मोरे) <[email protected]> wrote:
>
> > Thank you Tejas.
> > I have added all the libraries/jars mentioned in [1], along with my source jar and other required jars, to the classpath. The differences between the bin/nutch script and the tutorial [1] are that the script adds Java's tools.jar, and that we do not add Nutch's build directory in Eclipse, since we want to use the source for building Nutch.
>
> Ok.
>
> > I have added tools.jar and, instead of the build directory, I have added Nutch's Java source to the classpath.
> >
> > [1] http://wiki.apache.org/nutch/RunNutchInEclipse
> >
> > Still it is giving the same error.
>
> What is the name of the package that you are adding: is it org.apache.nutch.XXXX or something else?
> How do you compile the code in Eclipse: by running the ant build file or some other way?
> These are the relevant chunks in build.xml [1] that might help you: lines 86-100, 455-460.
> If you are running the ant build file, try to print the classpath formed in the compile-core target ([2] tells how to do that).
> There are 2 possibilities:
> 1. The extra jars you added are not in the classpath: in this case, you can debug the "copy-libs" target and check what is getting copied.
> 2. The extra jars you added are in the classpath and yet you see a compilation error: this would be strange, but it points towards an Eclipse + ant issue and probably won't have to do with Nutch.
>
> [1] : http://svn.apache.org/viewvc/nutch/trunk/build.xml?view=markup
> [2] : http://www.javalobby.org/java/forums/t71033.html
>
> > Thanks,
> > Prashant More
> >
> > On Thu, Feb 7, 2013 at 5:30 AM, Tejas Patil <[email protected]> wrote:
> >
> > > If you look at the bin/nutch script, there are a lot of things that are added to the classpath before the actual Nutch class is invoked. Looking at the script will give you a hint about what is missing. Also, beware of your package naming. The build script looks for source files only in specific places, e.g.
> > > includes="org/apache/nutch/**/*.java"
> > > Tweaking the build file or placing your classes in the right place might help you here.
> > >
> > > thanks,
> > > Tejas Patil
> > >
> > > On Wed, Feb 6, 2013 at 12:30 AM, Prashant More (प्रशांत मोरे) <[email protected]> wrote:
> > >
> > > > Thank you, Tejas.
> > > >
> > > > My DB is already in place for processing. I have configured and used Nutch 1.0 from a shell script, but I want to configure and modify Nutch 1.5 using Eclipse. So at present I do not want to use 2.1.
> > > >
> > > > Thanks,
> > > > Prashant More
> > > >
> > > > On Wed, Feb 6, 2013 at 12:41 PM, Tejas Patil <[email protected]> wrote:
> > > >
> > > > > Have you considered using Nutch 2.x? It has support for doing this. Google "nutch 2.x mySQL" to get some good tutorials like [0].
> > > > >
> > > > > [0] : http://nlp.solutions.asia/?p=180
> > > > >
> > > > > Thanks,
> > > > > Tejas Patil
> > > > >
> > > > > On Tue, Feb 5, 2013 at 10:24 PM, Prashant More (प्रशांत मोरे) <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > > I am modifying the Nutch source to direct the crawled content to a MySQL DB, in my own database structure, for further processing.
> > > > > > Initially, I configured the Nutch 1.5 source with Eclipse Juno, and it crawls the data onto my file system, as expected. Then I wrote some code for directing the crawled data to my DB.
> > > > > >
> > > > > > I added the code to the Nutch source and added the required libraries to the build path. But at build time it is unable to find the packages from my libraries or the Hadoop packages.
> > > > > >
> > > > > > I placed my jars/libraries in NUTCH_HOME/lib, as this is used by build.xml for compiling.
> > > > > >
> > > > > > It shows compile errors while building; however, when I made changes within the Nutch source itself, it did not show any errors.
> > > > > >
> > > > > > Kindly let me know what I am missing.
> > > > > > --
> > > > > > More Prashant
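
The goal stated in the original message, sending crawled content and its metadata to a MySQL table, can be sketched on the database side with plain JDBC. This is only an illustration under stated assumptions, not code from the thread or from Nutch: the class PageStore, the table crawled_page and its three columns are hypothetical names, and the sketch assumes the URL, raw bytes and metadata of a page have already been extracted from whichever Nutch object carries them.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not a Nutch class) that stores one crawled page via JDBC. */
final class PageStore {

    // Assumed table layout: crawled_page(url VARCHAR, content BLOB, metadata TEXT).
    static final String INSERT_SQL =
        "INSERT INTO crawled_page (url, content, metadata) VALUES (?, ?, ?)";

    private PageStore() {}

    /** Flattens metadata into sorted "key=value" lines so it fits a single text column. */
    static String flattenMetadata(Map<String, String> metadata) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(metadata).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** Inserts one page; the caller supplies an open connection to its own database. */
    static void save(Connection conn, String url, byte[] rawContent,
                     Map<String, String> metadata) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, url);
            ps.setBytes(2, rawContent);
            ps.setString(3, flattenMetadata(metadata));
            ps.executeUpdate();
        }
    }
}
```

A parameterized PreparedStatement is used so that URLs and header values containing quotes cannot break the SQL. The MySQL JDBC driver jar would also have to be on the classpath that the build and the crawl actually use, which is the same classpath concern discussed above.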

