Have you considered using Nutch 2.x? It has built-in support for storing crawled content in MySQL. Search for "nutch 2.x mySQL" to find some good tutorials, such as [0].
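
As a rough sketch of what that setup tends to look like, assuming Nutch 2.0/2.1 with the Gora SQL backend: the property names below are the ones I recall from the stock conf/gora.properties, while the database name, host, user, and password are placeholders, so check them against your own version before relying on them.

```properties
# conf/gora.properties (sketch; values are placeholders, not from the original mail)
# JDBC driver class shipped with MySQL Connector/J
gora.sqlstore.jdbc.driver=com.mysql.jdbc.Driver
# "nutch" is an assumed database name; host and port assume a local MySQL server
gora.sqlstore.jdbc.url=jdbc:mysql://localhost:3306/nutch?createDatabaseIfNotExist=true
gora.sqlstore.jdbc.user=your_mysql_user
gora.sqlstore.jdbc.password=your_mysql_password
```

If I remember correctly, you also need to set storage.data.store.class to org.apache.gora.sql.store.SqlStore in conf/nutch-site.xml and enable the gora-sql and MySQL connector dependencies in ivy/ivy.xml before rebuilding.
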
[0] : http://nlp.solutions.asia/?p=180

Thanks,
Tejas Patil

On Tue, Feb 5, 2013 at 10:24 PM, Prashant More (प्रशांत मोरे) <[email protected]> wrote:

> Hi,
> I am modifying the Nutch source to direct the crawled content to a MySQL
> DB in my own database structure for further processing. Initially, I
> configured the Nutch 1.5 source with Eclipse Juno and it crawls the data
> onto my file system, as expected. Then I wrote some code for directing the
> crawled data to my DB.
>
> I added the code to the Nutch source and added the required libraries to
> the build path. But it is unable to find my packages in the libraries, or
> the Hadoop packages, at build time.
>
> I placed my jars/libraries in NUTCH_HOME/lib, as this is used by build.xml
> for compiling.
>
> It shows compile errors while building; however, when I made changes in
> the Nutch source, it did not show any errors.
>
> Kindly let me know what I am missing.
> --
> More Prashant
>

