#1 and your suggestion of just the dependencies seem to work for me.
I also looked into using Classworlds (which lets you package jars into
what they call an "uber JAR"; it's not quite the same as #2), but I
couldn't get it working (not that I spent much time on it). #2 should
also work, but I haven't tried it.
-Grant
On Oct 30, 2006, at 6:29 AM, Vetle Roeim wrote:
On Sat, 28 Oct 2006 22:13:35 +0200, Albert Chern
<[EMAIL PROTECTED]> wrote:
I'm not sure if the first option works. If it does, let me know. One of
the developers taught me to use option 2: create a jar with your
dependencies in a lib/ directory inside it. The tasktrackers will
automatically include everything in lib/ on their classpaths.
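For example, something along these lines should produce such a jar (the
directory layout and names here are just an example; adjust to your build):

  mkdir -p build/jobjar/lib
  cp -r build/classes/* build/jobjar/          # your compiled classes
  cp /path/to/deps/*.jar build/jobjar/lib/     # dependency jars go under lib/
  jar cvf myjob.jar -C build/jobjar .

The resulting myjob.jar has your classes at the top level and the
dependency jars under lib/.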
Yeah, I ended up using this method as well, after getting
ClassNotFoundException on some instances. Haven't tried the first
method in a while, though.
On 10/28/06, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
I'm not sure I am understanding this correctly and I don't see
anything on this in the Getting Started section, so...
It seems that when I want to run my application in distributed mode,
I should invoke <hadoop_home>/bin/hadoop jar <jar> (or bin/hadoop
<main-class>) and it will copy my JAR onto the DFS and then distribute
it so the other nodes in the cluster can access it and run it.
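For example, something like this (jar name, class name, and paths are
just placeholders):

  bin/hadoop jar myjob.jar org.example.MyJob /input /output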
Classpath-wise, there seem to be two options:
1. Have all the appropriate dependencies available so they are read
in by the startup commands and included in the classpath. Does this
mean they all need to be on each node at startup time?
2. Create a single JAR made up of the contents of all the
dependencies (rough sketch below).
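For option 2, I imagine something along these lines would do it (directory
names are just an example): unpack each dependency jar into a staging
directory alongside the compiled classes and re-jar the whole thing:

  mkdir -p build/uber
  cp -r build/classes/* build/uber/
  cd build/uber
  for j in /path/to/deps/*.jar; do jar xf "$j"; done
  cd ../..
  jar cvf myjob-all.jar -C build/uber .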
Also, the paths must be exactly the same on all the nodes, right?
Is this correct or am I missing something?
Thanks,
Grant
--
Vetle Roeim
Team Manager, Information Systems
Opera Software ASA <URL: http://www.opera.com/ >
------------------------------------------------------
Grant Ingersoll
http://www.grantingersoll.com/