Michael Wechner wrote:
> Hi
>
> Please excuse me if I am asking something obvious, but what is
> actually the purpose of the nutch-*.job file?

It contains all the classes and plugins needed to run a Nutch job on a
Hadoop cluster. A Hadoop cluster doesn't have to be used for Nutch;
indeed, there are many other interesting applications for it, so core
Hadoop is independent of any Nutch classes.
So, when a job is submitted to the cluster, there must be a way to
ship all the necessary implementation classes with it, so that the
tasks on individual nodes can execute the Nutch code. This is the
purpose of the job file - it is expanded on each node, and all classes
and plugins are then loaded by the task's classloader.
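
To make the "expanded on each node, then loaded by a classloader" part
a bit more concrete, here is a rough sketch in plain Java. It is not
Hadoop's actual code. It assumes the job file is an ordinary zip/jar
archive with classes/ and lib/ directories inside, and the names
JobFileSketch, unpack and classLoaderFor are made up for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, for illustration only (not part of Hadoop or Nutch).
final class JobFileSketch {

    /** Expands a zip-format job file into targetDir, roughly what a node does before running a task. */
    static void unpack(Path jobFile, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(root);
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(jobFile))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Refuse entries such as "../x" that would land outside the target directory.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    /**
     * Builds a classloader over the expanded job directory.
     * Assumed layout: the directory itself, classes/, and every jar under lib/.
     */
    static URLClassLoader classLoaderFor(Path unpackedDir, ClassLoader parent) throws IOException {
        List<URL> urls = new ArrayList<>();
        // toUri() on an existing directory yields a URL ending in "/",
        // which URLClassLoader needs in order to treat it as a directory.
        urls.add(unpackedDir.toUri().toURL());
        Path classes = unpackedDir.resolve("classes");
        if (Files.isDirectory(classes)) {
            urls.add(classes.toUri().toURL());
        }
        Path lib = unpackedDir.resolve("lib");
        if (Files.isDirectory(lib)) {
            try (DirectoryStream<Path> jars = Files.newDirectoryStream(lib, "*.jar")) {
                for (Path jar : jars) {
                    urls.add(jar.toUri().toURL());
                }
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]), parent);
    }
}
```

A task would then resolve its classes through the loader returned by
classLoaderFor() rather than through the node's own classpath, which
is why nothing Nutch-specific has to be installed on the cluster.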
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com