Re: Running Tez with Tachyon
Thank you Bikas and Hitesh for your responses. I believe the problem is in the cluster. Here is the relevant information:

*1) My HADOOP_CLASSPATH:*

$ hadoop classpath
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/hdfs:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/hdfs/lib/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/hdfs/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/lib/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/mapreduce/lib/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/mapreduce/*:/Users/jsimsa/Projects/tez:/Users/jsimsa/Projects/tez/jars/*:/Users/jsimsa/Projects/tez/jars/lib/*:/contrib/capacity-scheduler/*.jar

*2) The contents of /Users/jsimsa/Projects/tez/tez-site.xml:*

tez.lib.uris
  ${fs.defaultFS}/apps/tez-0.8.2-SNAPSHOT/tez-0.8.2-SNAPSHOT.tar.gz
tez.aux.uris
  ${fs.defaultFS}/apps/tachyon-0.8.2-SNAPSHOT/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar

*3) The contents of the /apps HDFS folder:*

$ ./bin/hdfs dfs -lsr /apps
lsr: DEPRECATED: Please use 'ls -R' instead.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/11/12 10:39:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x   - jsimsa supergroup          0 2015-11-11 18:43 /apps/tachyon-0.8.2-SNAPSHOT
-rw-r--r--   1 jsimsa supergroup   43809325 2015-11-11 18:43 /apps/tachyon-0.8.2-SNAPSHOT/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar
drwxr-xr-x   - jsimsa supergroup          0 2015-11-11 18:44 /apps/tez-0.8.2-SNAPSHOT
-rw-r--r--   1 jsimsa supergroup   43884378 2015-11-11 18:44 /apps/tez-0.8.2-SNAPSHOT/tez-0.8.2-SNAPSHOT.tar.gz

*4) Finally, the command I am running and its output:*

$ HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:/Users/jsimsa/Projects/tachyon-amplab/clients/client/target/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar hadoop jar ./tez-examples/target/tez-examples-0.8.2-SNAPSHOT.jar orderedwordcount tachyon://localhost:19998/input.txt tachyon://localhost:19998/output.txt
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/jsimsa/Projects/tez/jars/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/jsimsa/Projects/tachyon-amplab/clients/client/target/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/11/12 10:37:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/12 10:37:29 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.8.2-SNAPSHOT, revision=6562a9d882fc455f511dd9d93af1d159d3e3e71b, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-11-11T19:44:28Z ]
15/11/12 10:37:29 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/11/12 10:37:30 INFO : initialize(tachyon://localhost:19998/input.txt, Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, tez-site.xml). Connecting to Tachyon: tachyon://localhost:19998/input.txt
15/11/12 10:37:30 INFO : Loading Tachyon properties from Hadoop configuration: {}
15/11/12 10:37:30 INFO : Tachyon client (version 0.8.2-SNAPSHOT) is trying to connect with BlockMaster master @ localhost/127.0.0.1:19998
15/11/12 10:37:30 INFO : Client registered with BlockMaster master @ localhost/127.0.0.1:19998
15/11/12 10:37:30 INFO : Tachyon client (version 0.8.2-SNAPSHOT) is trying to connect with FileSystemMaster master @ localhost/127.0.0.1:19998
15/11/12 10:37:30 INFO : Client registered w
Re: Running Tez with Tachyon
The general approach for add-on jars requires 2 steps:

1) On the client host, where the job is submitted, you need to ensure that the add-on jars are in the local classpath. This is usually done by adding them to HADOOP_CLASSPATH. Please pay attention to adding the jars via "/*" instead of just "".

2) Next, "tez.aux.uris". This controls additional files/jars needed at runtime on the cluster. Upload the tachyon jar to HDFS and specify in tez.aux.uris either the path to the dir on HDFS or the full path to the file.

The last thing to note is that you may need to pull in additional transitive dependencies of tachyon if it is not a self-contained jar.

thanks
-- Hitesh

On Nov 12, 2015, at 1:06 AM, Bikas Saha wrote:

> Can you provide the full stack trace?
>
> Are you getting the exception on the client (while submitting the job) or in
> the cluster (after the job started to run)?
>
> For the client side, the fix would be to add tachyon jars to the client
> classpath. Looks like you tried some client side classpath fixes. You could
> run ‘hadoop classpath’ to print the classpath being picked up by the ‘hadoop
> jar’ command. And scan its output to check if your tachyon jars are being
> picked up correctly or not.
>
> Bikas
>
> From: Jiří Šimša [mailto:jiri.si...@gmail.com]
> Sent: Wednesday, November 11, 2015 6:54 PM
> To: user@tez.apache.org
> Subject: Running Tez with Tachyon
>
> Hello,
>
> I have followed the Tez installation instructions
> (https://tez.apache.org/install.html) and was able to successfully run the
> ordered word count example:
>
> $ hadoop jar ./tez-examples/target/tez-examples-0.8.2-SNAPSHOT.jar orderedwordcount /input.txt /output.txt
>
> Next, I wanted to see if I can do the same, this time reading from and
> writing to Tachyon (http://tachyon-project.org/) using:
>
> $ hadoop jar ./tez-examples/target/tez-examples-0.8.2-SNAPSHOT.jar orderedwordcount tachyon://localhost:19998/input.txt tachyon://localhost:19998/output.txt
>
> Unsurprisingly, this resulted in the "Class tachyon.hadoop.TFS not found"
> error because Tez needs the Tachyon client jar that defines the
> tachyon.hadoop.TFS class. To that end, I have tried several options (listed
> below) to provide this jar to Tez, none of which seems to have worked:
>
> 1) Adding the Tachyon client jar to HADOOP_CLASSPATH
> 2) Specifying the Tachyon client jar with the -libjars flag for the above
> command.
> 3) Copying the Tachyon client jar into the
> $HADOOP_HOME/share/hadoop/common/lib directory of my HADOOP installation.
> 4) Copying the Tachyon client jar into HDFS and specifying a path to it
> through the tez.aux.uris property in the tez-site.xml file (in a similar
> fashion the tez.lib.uris property specifies the path to the Tez tarball).
> 5) I modified the source code of the ordered word count example, adding a
> call to TezClient#addAppMasterLocalFiles(...), providing a URI for the
> Tachyon client jar uploaded to HDFS.
>
> Any advice on how to pass the Tachyon client jar to Tez to resolve this issue
> would be greatly appreciated. Thank you.
>
> Best,
>
> --
> Jiří Šimša
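
A minimal shell sketch of the two steps Hitesh describes above (client classpath first, then tez.aux.uris). The helper names with_jar_dir and aux_uris_property, and every path in the commented usage, are hypothetical and not part of Tez or Hadoop. The sketch also reads the "/*" remark as referring to a directory wildcard entry, which is an assumption; what is certain is that a Java classpath entry of the form dir/* picks up every jar in that directory, while a bare directory entry does not.

```shell
# Step 1 helper (hypothetical name): append a jar directory to a classpath
# in wildcard form, so that every jar in the directory is picked up.
with_jar_dir() {
  # $1: existing classpath (may be empty), $2: directory holding add-on jars
  if [ -n "$1" ]; then
    printf '%s:%s/*\n' "$1" "${2%/}"
  else
    printf '%s/*\n' "${2%/}"
  fi
}

# Step 2 helper (hypothetical name): print a tez-site.xml property that
# points tez.aux.uris at a jar, or a directory of jars, on HDFS.
aux_uris_property() {
  # $1: URI of the jar (or its directory) on HDFS
  printf '<property>\n  <name>tez.aux.uris</name>\n  <value>%s</value>\n</property>\n' "$1"
}

# Usage on a real client host (assumed paths; requires hadoop/hdfs on PATH):
#   export HADOOP_CLASSPATH="$(with_jar_dir "$HADOOP_CLASSPATH" /opt/tachyon/client)"
#   hdfs dfs -mkdir -p /apps/tachyon
#   hdfs dfs -put /opt/tachyon/client/tachyon-client.jar /apps/tachyon/
#   aux_uris_property '${fs.defaultFS}/apps/tachyon/tachyon-client.jar'
```

The printed property would go inside the configuration element of tez-site.xml on the client. If the Tachyon jar is not self-contained, its transitive dependencies would need the same treatment in both steps.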
RE: Running Tez with Tachyon
Can you provide the full stack trace?

Are you getting the exception on the client (while submitting the job) or in the cluster (after the job started to run)?

For the client side, the fix would be to add the tachyon jars to the client classpath. It looks like you tried some client-side classpath fixes. You could run ‘hadoop classpath’ to print the classpath being picked up by the ‘hadoop jar’ command, and scan its output to check whether your tachyon jars are being picked up correctly.

Bikas

From: Jiří Šimša [mailto:jiri.si...@gmail.com]
Sent: Wednesday, November 11, 2015 6:54 PM
To: user@tez.apache.org
Subject: Running Tez with Tachyon

Hello,

I have followed the Tez installation instructions (https://tez.apache.org/install.html) and was able to successfully run the ordered word count example:

$ hadoop jar ./tez-examples/target/tez-examples-0.8.2-SNAPSHOT.jar orderedwordcount /input.txt /output.txt

Next, I wanted to see if I can do the same, this time reading from and writing to Tachyon (http://tachyon-project.org/) using:

$ hadoop jar ./tez-examples/target/tez-examples-0.8.2-SNAPSHOT.jar orderedwordcount tachyon://localhost:19998/input.txt tachyon://localhost:19998/output.txt

Unsurprisingly, this resulted in the "Class tachyon.hadoop.TFS not found" error because Tez needs the Tachyon client jar that defines the tachyon.hadoop.TFS class. To that end, I have tried several options (listed below) to provide this jar to Tez, none of which seems to have worked:

1) Adding the Tachyon client jar to HADOOP_CLASSPATH.
2) Specifying the Tachyon client jar with the -libjars flag for the above command.
3) Copying the Tachyon client jar into the $HADOOP_HOME/share/hadoop/common/lib directory of my Hadoop installation.
4) Copying the Tachyon client jar into HDFS and specifying a path to it through the tez.aux.uris property in the tez-site.xml file (in a similar fashion to how the tez.lib.uris property specifies the path to the Tez tarball).
5) Modifying the source code of the ordered word count example, adding a call to TezClient#addAppMasterLocalFiles(...) and providing a URI for the Tachyon client jar uploaded to HDFS.

Any advice on how to pass the Tachyon client jar to Tez to resolve this issue would be greatly appreciated. Thank you.

Best,

--
Jiří Šimša
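
Bikas's suggestion to scan the output of ‘hadoop classpath’ can be sketched in shell as follows. The function name find_in_classpath is hypothetical; it only splits a colon-separated classpath into one entry per line and filters it, so it sees the entries exactly as Hadoop prints them.

```shell
# Hypothetical helper: list the classpath entries that match a pattern.
find_in_classpath() {
  # $1: colon-separated classpath, $2: case-insensitive pattern to look for
  printf '%s\n' "$1" | tr ':' '\n' | grep -i -- "$2"
}

# Usage on a real client host (requires hadoop on PATH):
#   find_in_classpath "$(hadoop classpath)" tachyon
# A non-zero exit status means no entry matched the pattern.
```

One caveat with this kind of scan: wildcard entries such as .../share/hadoop/common/lib/* are printed as-is, so a jar that was copied into such a directory will not show up by name even though the JVM does load it.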