The general practice is to install your dependencies into a custom location such as /opt/john-jars, extend YARN_CLASSPATH to include the jars, and configure the classes under the aux-services list. You do need to take care of deploying the right jar versions to /opt/john-jars/ on every node in the cluster, though.
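One way to check that every node holds the same jar versions is to compare checksums of the directory contents. The following is only a sketch under assumptions, not a Hadoop mechanism: `jar_manifest` and `manifest_diff` are hypothetical helper names, and how the manifests are collected from each node is left open.

```python
import hashlib
from pathlib import Path


def jar_manifest(jar_dir):
    """Map each *.jar file name in jar_dir to the SHA-256 hex digest of its bytes."""
    manifest = {}
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        manifest[jar.name] = hashlib.sha256(jar.read_bytes()).hexdigest()
    return manifest


def manifest_diff(expected, actual):
    """Return the sorted jar names that are missing, extra, or differ in content."""
    names = set(expected) | set(actual)
    return sorted(n for n in names if expected.get(n) != actual.get(n))
```

For example, `jar_manifest("/opt/john-jars")` could be run on each node and compared against a reference manifest with `manifest_diff`; an empty list would mean the node matches.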
I think it may be a neat idea to have jars be placed on HDFS or any other DFS, and the yarn-site.xml indicating the location plus class to load. Similar to HBase co-processors. But I'll defer to Vinod on if this would be a good thing to do.

(I know the right next thing with such an ability people will ask for is hot-code-upgrades…)

On Fri, Aug 23, 2013 at 10:11 PM, John Lilley <[email protected]> wrote:
> Are there recommended conventions for adding additional code to a stock
> Hadoop install?
>
> It would be nice if we could piggyback on whatever mechanisms are used to
> distribute hadoop itself around the cluster.
>
> john
>
> From: Vinod Kumar Vavilapalli [mailto:[email protected]]
> Sent: Thursday, August 22, 2013 6:25 PM
> To: [email protected]
> Subject: Re: yarn-site.xml and aux-services
>
> Auxiliary services are essentially administrator-configured services. So, they
> have to be set up at install time - before NM is started.
>
> +Vinod
>
> On Thu, Aug 22, 2013 at 1:38 PM, John Lilley <[email protected]>
> wrote:
>
> Following up on this, how exactly does one *install* the jar(s) for an
> auxiliary service? Can it be shipped out with the LocalResources of an AM?
> MapReduce's aux-service is presumably installed with Hadoop and is just
> sitting there in the right place, but if one wanted to make a whole new
> aux-service that belonged with an AM, how would one do it?
>
> John
>
>
> -----Original Message-----
> From: John Lilley [mailto:[email protected]]
> Sent: Wednesday, June 05, 2013 11:41 AM
> To: [email protected]
> Subject: RE: yarn-site.xml and aux-services
>
> Wow, thanks. Is this documented anywhere other than the code? I hate to
> waste y'alls time on things that can be RTFMed.
> John
>
>
> -----Original Message-----
> From: Harsh J [mailto:[email protected]]
> Sent: Wednesday, June 05, 2013 9:35 AM
> To: <[email protected]>
> Subject: Re: yarn-site.xml and aux-services
>
> John,
>
> The format is ID and sub-config based:
>
> First, you define an ID as a service, like the string "foo". This is the ID
> the applications may look up in their container responses map we discussed
> over another thread (around shuffle handler).
>
> <property>
>   <name>yarn.nodemanager.aux-services</name>
>   <value>foo</value>
> </property>
>
> Then you define an actual implementation class for that ID "foo", like so:
>
> <property>
>   <name>yarn.nodemanager.aux-services.foo.class</name>
>   <value>com.mypack.MyAuxServiceClassForFoo</value>
> </property>
>
> If you have multiple services foo and bar, then it would appear like the
> below (comma separated IDs and individual configs):
>
> <property>
>   <name>yarn.nodemanager.aux-services</name>
>   <value>foo,bar</value>
> </property>
> <property>
>   <name>yarn.nodemanager.aux-services.foo.class</name>
>   <value>com.mypack.MyAuxServiceClassForFoo</value>
> </property>
> <property>
>   <name>yarn.nodemanager.aux-services.bar.class</name>
>   <value>com.mypack.MyAuxServiceClassForBar</value>
> </property>
>
> On Wed, Jun 5, 2013 at 8:42 PM, John Lilley <[email protected]>
> wrote:
>> Good, I was hoping that would be the case. But what are the mechanics of
>> it? Do I just add another entry? And what exactly is "mapreduce.shuffle"?
>> A scoped class name? Or a key string into some map elsewhere?
>>
>> e.g. like:
>>
>> <property>
>>   <name>yarn.nodemanager.aux-services</name>
>>   <value>mapreduce.shuffle</value>
>> </property>
>> <property>
>>   <name>yarn.nodemanager.aux-services</name>
>>   <value>myauxserviceclassname</value>
>> </property>
>>
>> Concerning auxiliary services -- do they communicate with NodeManager via
>> RPC? Is there an interface to implement? How are they opened and closed
>> with NodeManager?
>>
>> Thanks
>> John
>>
>> -----Original Message-----
>> From: Harsh J [mailto:[email protected]]
>> Sent: Tuesday, June 04, 2013 11:58 PM
>> To: <[email protected]>
>> Subject: Re: yarn-site.xml and aux-services
>>
>> Yes, that's what this is for. You can implement, pass in and use your own
>> AuxService. It needs to be on the NodeManager CLASSPATH to run (and NM has
>> to be restarted to apply).
>>
>> On Wed, Jun 5, 2013 at 4:00 AM, John Lilley <[email protected]>
>> wrote:
>>> I notice the yarn-site.xml
>>>
>>> <property>
>>>   <name>yarn.nodemanager.aux-services</name>
>>>   <value>mapreduce.shuffle</value>
>>>   <description>shuffle service that needs to be set for Map Reduce to run</description>
>>> </property>
>>>
>>> Is this a general-purpose hook?
>>>
>>> Can I tell yarn to run *my* per-node service?
>>>
>>> Is there some other way (within the recommended Hadoop framework) to
>>> run a per-node service that exists during the lifetime of the
>>> NodeManager?
>>>
>>> John Lilley
>>> Chief Architect, RedPoint Global Inc.
>>> 1515 Walnut Street | Suite 200 | Boulder, CO 80302
>>> T: +1 303 541 1516 | M: +1 720 938 5761 | F: +1 781-705-2077
>>> Skype: jlilley.redpoint | [email protected] | www.redpoint.net
>>
>> --
>> Harsh J
>
> --
> Harsh J
>
> --
> +Vinod
> Hortonworks Inc.
> http://hortonworks.com/

--
Harsh J
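
The ID and sub-config format described in the quoted reply (one comma-separated list of service IDs, plus one `.class` property per ID) can be generated mechanically. The following is a minimal sketch, not part of Hadoop: `aux_service_properties` and `render` are hypothetical helper names, and the service IDs and class names are the placeholder ones from the thread.

```python
import xml.etree.ElementTree as ET


def aux_service_properties(services):
    """Build <property> elements for a dict of aux-service ID -> implementation class."""
    def prop(name, value):
        element = ET.Element("property")
        ET.SubElement(element, "name").text = name
        ET.SubElement(element, "value").text = value
        return element

    # One property lists all service IDs, comma separated.
    props = [prop("yarn.nodemanager.aux-services", ",".join(services))]
    # One further property per ID names its implementation class.
    for service_id, class_name in services.items():
        props.append(
            prop(f"yarn.nodemanager.aux-services.{service_id}.class", class_name)
        )
    return props


def render(props):
    """Serialize the property elements, one per line, for pasting into yarn-site.xml."""
    return "\n".join(ET.tostring(p, encoding="unicode") for p in props)


# Placeholder IDs and classes taken from the thread.
print(render(aux_service_properties({
    "foo": "com.mypack.MyAuxServiceClassForFoo",
    "bar": "com.mypack.MyAuxServiceClassForBar",
})))
```

The output is three `<property>` elements matching the multi-service example in the thread; they would still need to be placed inside the `<configuration>` element of yarn-site.xml on each NodeManager host, and the NodeManager restarted.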
