Andrzej Bialecki wrote:

Michael Wechner wrote:

Hi

Please excuse me if I'm asking something obvious, but what is actually the purpose
of the nutch-*.job file?


It contains all classes and plugins needed to run a Nutch job on a Hadoop cluster. A Hadoop cluster doesn't have to be used for Nutch; indeed, there are many other interesting applications for it, so the core of Hadoop is independent of any Nutch classes.


ok. I guess this http://wiki.apache.org/nutch/NutchHadoopTutorial


So, as the job is submitted to the cluster, there must be a way to transmit all necessary implementation classes so that tasks on individual nodes can execute the Nutch code. This is the purpose of the job file: it is then expanded on each node,


Are the submission and expansion of the job file done automatically? I mean, does one deploy the job file manually on the master, and it is then
spread automatically to the slaves?

Thanks for the info and clarifications

Michi

and all classes and plugins are loaded by a task's classloader.
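
To make the "one archive, expanded on each node" idea concrete, here is a rough sketch in Python. It is not Hadoop or Nutch code: it assumes the job file can be treated as an ordinary zip-style archive, and the names build_job_file, expand_on_node, run_demo, ExampleJob.class and example-plugin are all made up for illustration. It only mimics packing classes and plugins into one file and unpacking that file into a per-node directory; the classloading step is not reproduced.

```python
import tempfile
import zipfile
from pathlib import Path


def build_job_file(job_path, entries):
    """Pack {archive_name: bytes} into one zip archive (a stand-in for the job file)."""
    with zipfile.ZipFile(job_path, "w") as archive:
        for name, data in entries.items():
            archive.writestr(name, data)


def expand_on_node(job_path, node_dir):
    """Unpack the archive into a node-local directory; return the extracted file names."""
    with zipfile.ZipFile(job_path) as archive:
        archive.extractall(node_dir)
    return sorted(
        p.relative_to(node_dir).as_posix()
        for p in Path(node_dir).rglob("*")
        if p.is_file()
    )


def run_demo():
    # Hypothetical contents: one class file and one plugin descriptor.
    entries = {
        "classes/ExampleJob.class": b"placeholder bytes",
        "plugins/example-plugin/plugin.xml": b"<plugin/>",
    }
    with tempfile.TemporaryDirectory() as tmp:
        job_path = Path(tmp) / "example.job"
        build_job_file(job_path, entries)
        results = {}
        # Each "node" is just a separate directory here; every node
        # receives and expands the same single archive.
        for node in ("node1", "node2"):
            node_dir = Path(tmp) / node
            node_dir.mkdir()
            results[node] = expand_on_node(job_path, node_dir)
        return results


if __name__ == "__main__":
    for node, files in run_demo().items():
        print(node, files)
```

The point of the sketch is only that every node ends up with the same set of classes and plugins from a single shipped file, which is what lets a task on any node run the job's code.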



--
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
[EMAIL PROTECTED]                        [EMAIL PROTECTED]
+41 44 272 91 61
