Thank you Tejas.

Your tips helped a lot.

One more thing: after building, the plugin.folders property should point
to build/plugins for the crawl to run.
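
For reference, a minimal sketch of that override in conf/nutch-site.xml. It
assumes the property is spelled plugin.folders, as in nutch-default.xml, and
that the relative path build/plugins resolves correctly for the way Nutch is
launched from Eclipse; an absolute path avoids that ambiguity.

```xml
<!-- conf/nutch-site.xml: overrides the default value "plugins" -->
<configuration>
  <property>
    <name>plugin.folders</name>
    <!-- assumption: relative path that resolves from the launch setup;
         replace it with an absolute path if the plugins are not found -->
    <value>build/plugins</value>
  </property>
</configuration>
```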

Now it is crawling fine. My remaining concern is locating the object that
holds the fetched content and its metadata, so that I can capture it and
direct it to my DB, as mentioned earlier. How can I do that?

Thanks,

--
Prashant More


On Thu, Feb 7, 2013 at 11:40 AM, Tejas Patil <[email protected]> wrote:

> On Wed, Feb 6, 2013 at 9:23 PM, Prashant More (प्रशांत मोरे) <
> [email protected]> wrote:
>
> > Thank you Tejas.
> > I have added all the libraries/jars mentioned in [1], along with my source
> > jar and other required jars, to the classpath. Compared with the tutorial
> > [1], the bin/nutch script also adds Java's tools.jar, and I have not added
> > Nutch's build directory in Eclipse because we want to build Nutch from
> > source.
>
> Ok.
>
>
> > I have added tools.jar and, instead of the build directory, I have added
> > Nutch's Java source to the classpath.
> >
> > [1] http://wiki.apache.org/nutch/RunNutchInEclipse
> >
> > Still it is giving the same error.
> >
> What is the name of the package that you are adding: is it
> org.apache.nutch.XXXX or something else?
> How do you compile the code in Eclipse: by running the ant build file or
> some other way?
> These are the relevant chunks in build.xml [1] that might help you: lines
> 86-100 and 455-460.
> If you are running the ant build file, try printing the classpath formed in
> the compile-core target ([2] explains how to do that).
> There are 2 possibilities:
> 1. The extra jars you added are not in the classpath: in this case, you can
> debug the "copy-libs" target and check what is getting copied.
> 2. The extra jars you added are in the classpath and yet you see a
> compilation error: this would be strange, but it points towards an Eclipse +
> ant issue and probably has nothing to do with Nutch.
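>
> A minimal sketch of printing that classpath, assuming the path built in
> build.xml has the id "classpath" (check the id in your copy) and using a
> hypothetical property name, classpath.debug. Place it inside the
> compile-core target, before the javac call:
>
> ```xml
> <!-- assumption: refid must match the <path id="..."> used by compile-core -->
> <!-- expand the path reference into a string with one entry per line -->
> <pathconvert property="classpath.debug" refid="classpath"
>              pathsep="${line.separator}"/>
> <echo message="compile-core classpath:${line.separator}${classpath.debug}"/>
> ```
>
> If a jar you added is missing from the output, that is possibility 1; if it
> is listed and compilation still fails, that is possibility 2.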
>
> [1] : http://svn.apache.org/viewvc/nutch/trunk/build.xml?view=markup
> [2] : http://www.javalobby.org/java/forums/t71033.html
>
>
> >
> > Thanks,
> > Prashant More
> >
> > On Thu, Feb 7, 2013 at 5:30 AM, Tejas Patil <[email protected]> wrote:
> >
> > > If you look at the bin/nutch script, there are a lot of things that are
> > > added to the classpath before the actual nutch class is invoked. Looking
> > > at the script will give you a hint about what is missing. Also, beware of
> > > your package naming. The build script looks for source files only in
> > > specific places, e.g.
> > > includes="org/apache/nutch/**/*.java"
> > > Tweaking the build file or placing your classes in the right place might
> > > help you here.
> > >
> > > thanks,
> > > Tejas Patil
> > >
> > >
> > >
> > > On Wed, Feb 6, 2013 at 12:30 AM, Prashant More (प्रशांत मोरे) <
> > > [email protected]> wrote:
> > >
> > > > Thank you, Tejas.
> > > >
> > > > > My DB is already in place for processing. I have configured and used
> > > > > Nutch 1.0 from the shell script, but I want to configure and modify
> > > > > Nutch 1.5 using Eclipse. So at present I do not want to use 2.1.
> > > >
> > > > Thanks,
> > > > Prashant More
> > > >
> > > > > On Wed, Feb 6, 2013 at 12:41 PM, Tejas Patil <[email protected]> wrote:
> > > >
> > > > > > Have you considered using Nutch 2.x? It has support for doing this.
> > > > > > Search for "nutch 2.x mySQL" to find some good tutorials like [0].
> > > > >
> > > > > [0] : http://nlp.solutions.asia/?p=180
> > > > >
> > > > > Thanks,
> > > > > Tejas Patil
> > > > >
> > > > >
> > > > > On Tue, Feb 5, 2013 at 10:24 PM, Prashant More (प्रशांत मोरे) <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > > >    I am modifying the Nutch source to direct the crawled content to
> > > > > > > a MySQL DB, in my own database structure, for further processing.
> > > > > > > Initially, I configured the Nutch 1.5 source with Eclipse Juno, and
> > > > > > > it crawls the data to my file system, as expected. Then I wrote
> > > > > > > some code for directing the crawled data to my DB.
> > > > > >
> > > > > > > I added the code to the Nutch source and added the required
> > > > > > > libraries to the build path. But at build time it is unable to find
> > > > > > > the packages from my libraries or the Hadoop packages.
> > > > > >
> > > > > > > I placed my jars/libraries in NUTCH_HOME/lib, as this is used by
> > > > > > > build.xml for compiling.
> > > > > >
> > > > > > > It shows a compile error while building; however, when I made
> > > > > > > changes in the Nutch source, it did not show any errors.
> > > > > >
> > > > > > > Kindly let me know what I am missing.
> > > > > > --
> > > > > > More Prashant
> > > > > >
> > > > >
> > > >
> > >
> >
>
