This is what I use in production; it has several benefits. In this case mapreduce.application.framework.path points to a tar.gz file that serves as the runtime classpath: a custom-built MapReduce runtime environment, perhaps similar to Nutch.

1) A single tar.gz file is localized instead of many individual jars.
2) The minimal Tez tarball has fewer class conflicts and a smaller footprint.
3) Localizing Tez into its own folder (the #tez fragment) gives better control over the classpath and avoids Java's inconsistent resolution order for jars in the same directory.
4) Setting tez.use.cluster.hadoop-libs to false avoids using the jars from the individual NodeManagers, so the job relies only on the jars listed in tez.lib.uris.
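As a rough illustration of points 1 and 3, the sketch below shows in Python how a tez.lib.uris value built from ${...} references and #fragment suffixes would resolve into archive paths and the folder names they are localized under. It is a simplification, not Hadoop's or Tez's actual code: the helper names expand and localized_names are hypothetical, and the framework version 3.1.4 is assumed purely for the example.

```python
import re

# Property values modelled on the configuration in this message.
# The version "3.1.4" is an assumption made only for illustration.
props = {
    "mapreduce.application.framework.version": "3.1.4",
    "mapreduce.application.framework.path":
        "/hdfs/path/hadoop-mapreduce-"
        "${mapreduce.application.framework.version}.tgz#hadoop-mapreduce",
    "tez.lib.uris":
        "/hdfs/path/tez-0.9.2-minimal.tar.gz#tez,"
        "${mapreduce.application.framework.path}",
}

VAR = re.compile(r"\$\{([^}]+)\}")


def expand(value, props, depth=20):
    """Repeatedly replace ${name} with props[name]; unknown names stay as-is."""
    for _ in range(depth):
        new = VAR.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if new == value:
            return new
        value = new
    return value


def localized_names(lib_uris):
    """Split a comma-separated URI list into (path, link name) pairs."""
    pairs = []
    for uri in lib_uris.split(","):
        path, _, fragment = uri.strip().partition("#")
        # Simplification: without a fragment, fall back to the file name.
        pairs.append((path, fragment or path.rsplit("/", 1)[-1]))
    return pairs


resolved = expand(props["tez.lib.uris"], props)
for path, link in localized_names(resolved):
    print(f"{path} -> ./{link}")
```

Under these assumptions the Tez archive ends up under ./tez and the MapReduce archive under ./hadoop-mapreduce, which is why classpath entries such as ./tez/* and ./tez/lib/* can address the Tez jars separately from everything else.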
<property>
  <name>mapreduce.application.framework.path</name>
  <value>/hdfs/path/hadoop-mapreduce-${mapreduce.application.framework.version}.tgz#hadoop-mapreduce</value>
</property>
<property>
  <name>tez.lib.uris</name>
  <value>/hdfs/path/tez-0.9.2-minimal.tar.gz#tez,${mapreduce.application.framework.path}</value>
</property>
<property>
  <name>tez.lib.uris.classpath</name>
  <value>${mapreduce.application.classpath},./tez/*,./tez/lib/*</value>
</property>
<property>
  <name>tez.use.cluster.hadoop-libs</name>
  <value>false</value>
</property>

On Thu, Dec 17, 2020 at 11:57 AM Lewis John McGibbney <lewi...@apache.org> wrote:

> I tried the following configuration in tez-site.xml with no luck
>
> <configuration>
>   <property>
>     <name>tez.lib.uris</name>
>     <value>${fs.defaultFS}/apps/tez-0.10.1-SNAPSHOT,${fs.defaultFS}/apps/tez-0.10.1-SNAPSHOT/lib,${fs.defaultFS}/apps/nutch/apache-nutch-1.18-SNAPSHOT.job</value>
>   </property>
>
>   <property>
>     <name>tez.lib.uris.classpath</name>
>     <value>${fs.defaultFS}/apps/nutch/apache-nutch-1.18-SNAPSHOT.job</value>
>   </property>
> </configuration>
>
> On 2020/12/17 17:35:28, Lewis John McGibbney <lewi...@apache.org> wrote:
> > Hi Zhiyuan,
> > Thanks for the guidance. I'm making progress but I am still battling initial configuration management issues.
> > I'm running HDFS and YARN v3.1.4 in pseudo-mode.
> > My tez-site.xml contains the following content
> >
> > <configuration>
> >   <property>
> >     <name>tez.lib.uris</name>
> >     <value>${fs.defaultFS}/apps/tez-0.10.1-SNAPSHOT,${fs.defaultFS}/apps/tez-0.10.1-SNAPSHOT/lib,${fs.defaultFS}/apps/nutch</value>
> >   </property>
> > </configuration>
> >
> > N.B. When I attempted to use the compressed Tez tar.gz, I was running into classpath issues which are largely documented in the installation documentation you pointed me to. I overcame these issues by simply uploading the minimal directory. All seems fine at this stage as I can run all of the Tez examples.
> >
> > I run into trouble when I try to run any job from the Nutch application. For example, when I run the Injector, one of the Nutch plugin extension points (x point org.apache.nutch.net.URLNormalizer) cannot be found. The relevant log can be seen at https://paste.apache.org/4whoe.
> > I should note that the entire Nutch .job is available on HDFS at the URI defined in the tez-site.xml above.
> >
> > The output of jar -tf on the nutch.job artifact can be seen at https://paste.apache.org/hl8tk.
> > Am I required to somehow describe the structural hierarchy of this artifact in the tez.lib.uris.classpath configuration property?
> >
> > Thank you again for any guidance.
> >
> > lewismc
> >
> > On 2020/12/14 03:23:48, Zhiyuan Yang <zhiyu...@apache.org> wrote:
> > > Hi Lewis,
> > >
> > > If there is no incompatibility, your existing job will run well on Tez without code change. You can just follow this guide <https://tez.apache.org/install.html> (especially step 4) to try it out.
> > >
> > > Thanks,
> > > Zhiyuan
> > >
> > > On Mon, Dec 14, 2020 at 9:04 AM Lewis John McGibbney <lewi...@apache.org> wrote:
> > >
> > > >