You could write a MapReduce job that takes the parse_data directory of 
each segment as input and, inside the map or reduce class (depending on 
your logic), uses JDBC to update MySQL.  The job configuration would 
look something like this:

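    // Build the job from the Nutch configuration.
    JobConf yourjob = new NutchJob(conf);
    // Add the parse_data directory of every segment as an input path.
    for (int i = 0; i < segments.length; i++) {
      LOG.info("Job Runner: adding segment: " + segments[i]);
      yourjob.addInputPath(new Path(segments[i], ParseData.DIR_NAME));
    }
    yourjob.setJobName("yada yada");
    // Keys are URLs, values are the ParseData records for those URLs.
    yourjob.setInputFormat(MapFileInputFormat.class);
    yourjob.setInputKeyClass(UTF8.class);
    yourjob.setInputValueClass(ParseData.class);
    // The JDBC update goes in whichever of these two classes fits your logic.
    yourjob.setMapperClass(YourMapperClass.class);
    yourjob.setReducerClass(YourReducerClass.class);

    JobClient.runJob(yourjob);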
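
For the JDBC side, here is a minimal sketch of a helper that your map or 
reduce method could call for each URL.  Everything in it is an 
assumption for illustration: the class name ParseMetaJdbcUpdater, the 
table layout (url, meta_name, meta_value, with a unique key on url plus 
meta_name), and the idea that you have already copied the metadata out 
of ParseData into a plain Map.  It uses only java.sql, so you still need 
the MySQL driver on the task classpath and a Connection opened by your 
own code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

/** Hypothetical helper, not part of Nutch or Hadoop. */
final class ParseMetaJdbcUpdater {

    /**
     * Builds a MySQL upsert for an assumed table with columns
     * (url, meta_name, meta_value). The table name is concatenated,
     * so pass a constant from your own code, never user input.
     */
    static String upsertSql(String table) {
        return "INSERT INTO " + table + " (url, meta_name, meta_value) VALUES (?, ?, ?) "
             + "ON DUPLICATE KEY UPDATE meta_value = VALUES(meta_value)";
    }

    /**
     * Sends every metadata entry of one URL to MySQL as a single batch.
     * Returns the number of statements that were in the batch.
     */
    static int writeMetadata(Connection conn, String table, String url,
                             Map<String, String> meta) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(upsertSql(table))) {
            for (Map.Entry<String, String> e : meta.entrySet()) {
                ps.setString(1, url);
                ps.setString(2, e.getKey());
                ps.setString(3, e.getValue());
                ps.addBatch();
            }
            return ps.executeBatch().length;
        }
    }
}
```

Opening the connection once per task rather than once per record is 
probably what you want; each map or reduce call would then only invoke 
writeMetadata with the URL key and the metadata taken from the value.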

A second way would be to write an output format that uses JDBC to 
update MySQL.
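
To give an idea of the second approach, the sketch below shows the 
write/close lifecycle that the record writer of such an output format 
would need, without depending on the Hadoop classes themselves.  The 
class name JdbcBatchWriter, the batch size parameter, and the 
three-placeholder SQL convention are all assumptions of mine; you would 
wrap something like this in your own output format and call close() when 
the task finishes.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical stand-in for the record writer an output format would return. */
final class JdbcBatchWriter implements AutoCloseable {
    private final PreparedStatement stmt;
    private final int batchSize;
    private int pending = 0;

    /**
     * The SQL is assumed to have exactly three placeholders,
     * in the order (url, name, value).
     */
    JdbcBatchWriter(Connection conn, String sql, int batchSize) throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.stmt = conn.prepareStatement(sql);
        this.batchSize = batchSize;
    }

    /** Queues one row and flushes once batchSize rows are pending. */
    void write(String url, String name, String value) throws SQLException {
        stmt.setString(1, url);
        stmt.setString(2, name);
        stmt.setString(3, value);
        stmt.addBatch();
        if (++pending >= batchSize) {
            flush();
        }
    }

    /** Sends any queued rows to the database. */
    private void flush() throws SQLException {
        if (pending > 0) {
            stmt.executeBatch();
            pending = 0;
        }
    }

    /** Flushes the remainder and closes the statement; the caller owns the connection. */
    @Override
    public void close() throws SQLException {
        try {
            flush();
        } finally {
            stmt.close();
        }
    }
}
```

Batching keeps the number of round trips to MySQL down, which matters 
when a segment holds many URLs.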

Dennis

jaison wrote:
> I want to take the metadata for a particular URL parsed from the DFS
> filesystem (from the parse_data segment) and update the values in a
> MySQL database that is running on an ext2 file system. Is it possible?
> If so, how?
>   

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
