include nutch jar in mapred jobs
--------------------------------

         Key: NUTCH-209
         URL: http://issues.apache.org/jira/browse/NUTCH-209
     Project: Nutch
        Type: Improvement
    Versions: 0.8-dev    
    Reporter: Doug Cutting
    Priority: Minor
     Fix For: 0.8-dev


I just added a simple way in Hadoop to specify the job jar file.  When 
constructing a JobConf one can specify a class whose containing jar is set to 
be the job's jar.  To take advantage of this in Nutch, we could add a util 
class:

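import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

// A JobConf whose job jar is the jar that contains NutchJob.
public class NutchJob extends JobConf {
  public NutchJob(Configuration conf) {
    super(conf, NutchJob.class);
  }
}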

Then change all of the places where we construct a JobConf to instead construct 
a NutchJob.
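
For anyone wondering how a class can identify the job's jar, here is a rough sketch of one way to find the jar a class was loaded from, using only the standard class loader API. This is not Hadoop's actual code, and it may well do this differently; JobJarLocator, jarPathOf and findContainingJar are hypothetical names used only for illustration. The sketch does not decode URL-escaped characters such as %20 in paths.

```java
import java.net.URL;

// Illustrative sketch only, not Hadoop's implementation.
// All names in this class are hypothetical.
class JobJarLocator {

  // Extracts the jar file path from the path part of a "jar:" URL, e.g.
  // "file:/opt/nutch/nutch.job!/org/Foo.class" -> "/opt/nutch/nutch.job".
  static String jarPathOf(String jarUrlPath) {
    int bang = jarUrlPath.indexOf('!');
    String jar = (bang >= 0) ? jarUrlPath.substring(0, bang) : jarUrlPath;
    return jar.startsWith("file:") ? jar.substring("file:".length()) : jar;
  }

  // Returns the jar that contains the given class, or null when the class
  // was not loaded from a jar (for example, from a build directory).
  static String findContainingJar(Class<?> clazz) {
    String resource = clazz.getName().replace('.', '/') + ".class";
    ClassLoader loader = clazz.getClassLoader();
    URL url = (loader != null)
        ? loader.getResource(resource)
        : ClassLoader.getSystemResource(resource);
    if (url == null || !"jar".equals(url.getProtocol())) {
      return null;
    }
    return jarPathOf(url.getPath());
  }

  public static void main(String[] args) {
    // Prints a jar path when run from a jar, otherwise null.
    System.out.println(findContainingJar(JobJarLocator.class));
  }
}
```

With something along these lines, passing NutchJob.class to the JobConf constructor would be enough to pick up whichever jar NutchJob was packaged in, provided the code is actually running from that jar.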

Finally, we should add an ant target called 'job' that builds a job jar 
containing all of the classes and the plugins, and make it the default 
target.  This way all Nutch code is distributed with each job as it is 
submitted, and daemons only need to be restarted when Hadoop code is 
updated.

Does this sound reasonable?
