Hello Mike,

I completely agree with you. I think bundling the libraries in the job jar file is the correct way to go.
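For the archives, here is roughly how that works: Hadoop unpacks the job jar on each task node and adds any jars sitting under its lib/ directory to the task classpath, so a jar laid out as

myjob.jar
    com/example/MyDriver.class
    lib/opencsv-2.1.jar

carries its dependencies along with it. A minimal driver sketch (the class, job, and path names here are made up for illustration, not taken from anyone's actual code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyDriver {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "my job");
        // setJarByClass() locates the jar that contains this class; Hadoop
        // ships that jar to the task nodes and unpacks it, adding classes/
        // and every jar under lib/ to the task classpath. Bundled
        // dependencies travel with the job, so no cluster restart is needed.
        job.setJarByClass(MyDriver.class);
        // ... setMapperClass/setReducerClass/output types as usual ...
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The nice part is that nothing lands in $HADOOP_HOME/lib, so two jobs using conflicting versions of the same library can coexist on the cluster.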
Thanks,
Farhan

On Thu, Apr 22, 2010 at 9:12 PM, Michael Segel <[email protected]> wrote:

> > Date: Thu, 22 Apr 2010 17:30:13 -0700
> > Subject: Re: Using external library in MapReduce jobs
> > From: [email protected]
> > To: [email protected]
> >
> > Sure, you need to place them into the $HADOOP_HOME/lib directory on each
> > server in the cluster and they will be picked up on the next restart.
> >
> > -- Alex K
>
> While this works, I wouldn't recommend it.
>
> You have to look at it this way... your external m/r Java libs are job
> centric. So every time you want to add jobs that require new external
> libraries, you have to 'bounce' your cloud after pushing out the jars. Then
> you also have the issue of Java class collisions if the cloud carries a
> different version of the same jar you're using. (We've had this happen to
> us already.)
>
> If you're just testing a proof of concept, it's one thing, but after the
> proof, you'll need to determine how to correctly push the jars out to each
> node.
>
> In a production environment, constantly bouncing clouds for each new job
> isn't really a good idea.
>
> HTH
>
> -Mike
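P.S. For jars you'd rather not repackage into every job jar, there is also a per-job route that still avoids touching $HADOOP_HOME/lib: if the driver runs through ToolRunner, GenericOptionsParser understands the -libjars option and ships the listed jars with that job via the distributed cache. A rough sketch, again with illustrative names only:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyTool extends Configured implements Tool {
    public int run(String[] args) throws Exception {
        // getConf() already contains whatever GenericOptionsParser set,
        // including the classpath entries added by -libjars.
        Job job = new Job(getConf(), "my job");
        job.setJarByClass(MyTool.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options (-libjars, -files, -D ...)
        // before handing the remaining args to run().
        System.exit(ToolRunner.run(new Configuration(), new MyTool(), args));
    }
}

Invocation would look something like: hadoop jar myjob.jar MyTool -libjars /local/path/opencsv-2.1.jar in out. The jars travel with that one job, so nothing is installed cluster-wide and nothing needs a restart.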
