Re: Running Tez with Tachyon

2015-11-12 Thread Jiří Šimša
Thank you Bikas and Hitesh for your responses.

I believe the problem is in the cluster. Here is the relevant information:

*1) My HADOOP_CLASSPATH:*

$ hadoop classpath
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/hdfs:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/hdfs/lib/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/hdfs/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/lib/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/yarn/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/mapreduce/lib/*:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/mapreduce/*:/Users/jsimsa/Projects/tez:/Users/jsimsa/Projects/tez/jars/*:/Users/jsimsa/Projects/tez/jars/lib/*:/contrib/capacity-scheduler/*.jar

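*2) The contents of /Users/jsimsa/Projects/tez/tez-site.xml:*

<configuration>
  <property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/apps/tez-0.8.2-SNAPSHOT/tez-0.8.2-SNAPSHOT.tar.gz</value>
  </property>
  <property>
    <name>tez.aux.uris</name>
    <value>${fs.defaultFS}/apps/tachyon-0.8.2-SNAPSHOT/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar</value>
  </property>
</configuration>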
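
Both values above start with the ${fs.defaultFS} placeholder, so it can help to see the concrete URI each one resolves to. The following is only a minimal sketch: the helper name expand_default_fs and the value hdfs://localhost:9000 are assumptions for illustration and do not come from this thread.

```shell
#!/bin/sh
# Hypothetical helper: replace a leading ${fs.defaultFS} placeholder in a
# tez-site.xml value with a concrete filesystem URI.
expand_default_fs() {
    value=$1
    default_fs=${2%/}                 # drop a trailing slash, if any
    placeholder='${fs.defaultFS}'     # literal text, not a shell expansion
    rest=${value#"$placeholder"}      # strip the placeholder if it is a prefix
    if [ "$rest" != "$value" ]; then
        printf '%s%s\n' "$default_fs" "$rest"
    else
        printf '%s\n' "$value"        # no placeholder: leave the value unchanged
    fi
}

# hdfs://localhost:9000 is an assumed value; on a real installation it could
# be read with: hdfs getconf -confKey fs.defaultFS
expand_default_fs '${fs.defaultFS}/apps/tez-0.8.2-SNAPSHOT/tez-0.8.2-SNAPSHOT.tar.gz' hdfs://localhost:9000
```

The resolved URI could then be passed to "hdfs dfs -test -e" to confirm that the file is really where the property points.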




*3) The contents of the /apps HDFS folder:*

$ ./bin/hdfs dfs -lsr /apps
lsr: DEPRECATED: Please use 'ls -R' instead.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/11/12 10:39:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
drwxr-xr-x   - jsimsa supergroup          0 2015-11-11 18:43 /apps/tachyon-0.8.2-SNAPSHOT
-rw-r--r--   1 jsimsa supergroup   43809325 2015-11-11 18:43 /apps/tachyon-0.8.2-SNAPSHOT/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar
drwxr-xr-x   - jsimsa supergroup          0 2015-11-11 18:44 /apps/tez-0.8.2-SNAPSHOT
-rw-r--r--   1 jsimsa supergroup   43884378 2015-11-11 18:44 /apps/tez-0.8.2-SNAPSHOT/tez-0.8.2-SNAPSHOT.tar.gz


*4) Finally, the command I am running and its output:*

$ HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:/Users/jsimsa/Projects/tachyon-amplab/clients/client/target/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar \
    hadoop jar ./tez-examples/target/tez-examples-0.8.2-SNAPSHOT.jar \
    orderedwordcount tachyon://localhost:19998/input.txt \
    tachyon://localhost:19998/output.txt
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.7.1/libexec/share/hadoop/common/lib/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/jsimsa/Projects/tez/jars/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/jsimsa/Projects/tachyon-amplab/clients/client/target/tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/11/12 10:37:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/12 10:37:29 INFO client.TezClient: Tez Client Version: [ component=tez-api, version=0.8.2-SNAPSHOT, revision=6562a9d882fc455f511dd9d93af1d159d3e3e71b, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-11-11T19:44:28Z ]
15/11/12 10:37:29 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/11/12 10:37:30 INFO : initialize(tachyon://localhost:19998/input.txt, Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, tez-site.xml). Connecting to Tachyon: tachyon://localhost:19998/input.txt
15/11/12 10:37:30 INFO : Loading Tachyon properties from Hadoop configuration: {}
15/11/12 10:37:30 INFO : Tachyon client (version 0.8.2-SNAPSHOT) is trying to connect with BlockMaster master @ localhost/127.0.0.1:19998
15/11/12 10:37:30 INFO : Client registered with BlockMaster master @ localhost/127.0.0.1:19998
15/11/12 10:37:30 INFO : Tachyon client (version 0.8.2-SNAPSHOT) is trying to connect with FileSystemMaster master @ localhost/127.0.0.1:19998
15/11/12 10:37:30 INFO : Client registered w

Re: Running Tez with Tachyon

2015-11-12 Thread Hitesh Shah
The general approach for add-on jars requires 2 steps:

1) On the client host, where the job is submitted, you need to ensure that the
add-on jars are in the local classpath. This is usually done by adding them to
HADOOP_CLASSPATH. Please pay attention to adding the jars via "<dir>/*"
instead of just "<dir>".
2) Next, set "tez.aux.uris". This controls the additional files/jars needed at
runtime on the cluster. Upload the tachyon jar to HDFS and set tez.aux.uris to
either the directory on HDFS or the full path to the file.

The last thing to note is that you may need to pull in additional transitive
dependencies of tachyon if it is not a self-contained jar.
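
A rough sketch of the two steps, for illustration only. It assumes a hypothetical local directory /opt/tachyon/client that holds the Tachyon client jar; the helper name add_jar_dir is made up, and the hdfs commands are shown as comments because they need a running cluster.

```shell
#!/bin/sh
# Hypothetical helper: append the wildcard form of a directory (the directory
# followed by "/*", not the bare directory) to a colon-separated classpath,
# so that the jars inside the directory are picked up.
add_jar_dir() {
    classpath=$1
    dir=${2%/}                      # drop a trailing slash, if any
    if [ -z "$classpath" ]; then
        printf '%s\n' "$dir/*"
    else
        printf '%s\n' "$classpath:$dir/*"
    fi
}

# Step 1 (client side): extend HADOOP_CLASSPATH before running "hadoop jar".
# /opt/tachyon/client is an assumed location, not one taken from this thread.
HADOOP_CLASSPATH=$(add_jar_dir "${HADOOP_CLASSPATH:-}" /opt/tachyon/client)
export HADOOP_CLASSPATH
printf '%s\n' "$HADOOP_CLASSPATH"

# Step 2 (cluster side): upload the jar to HDFS, then point tez.aux.uris in
# tez-site.xml at the uploaded file or at its directory, for example:
#   hdfs dfs -mkdir -p /apps/tachyon-0.8.2-SNAPSHOT
#   hdfs dfs -put tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar /apps/tachyon-0.8.2-SNAPSHOT/
```

The quoting matters here: the wildcard has to reach the JVM unexpanded, so the value is always kept inside double quotes.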

thanks
-- Hitesh




RE: Running Tez with Tachyon

2015-11-12 Thread Bikas Saha
Can you provide the full stack trace?

Are you getting the exception on the client (while submitting the job) or in
the cluster (after the job started to run)?

For the client side, the fix would be to add the tachyon jars to the client
classpath. It looks like you tried some client-side classpath fixes. You could
run 'hadoop classpath' to print the classpath being picked up by the
'hadoop jar' command, and scan its output to check whether your tachyon jars
are being picked up correctly.
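
One possible way to do that scan, as a minimal sketch: split the classpath on colons and keep the entries that mention tachyon. The helper name classpath_entries_matching and the sample classpath are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the entries of a colon-separated classpath that
# contain a given substring (case-insensitive), one entry per line.
classpath_entries_matching() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -i -- "$2"
}

# With Hadoop installed, this would be called as:
#   classpath_entries_matching "$(hadoop classpath)" tachyon
# Self-contained demonstration on a made-up classpath:
sample='/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/tachyon/tachyon-client.jar'
classpath_entries_matching "$sample" tachyon
```

Note that wildcard entries (a directory followed by /*) do not name individual jars, so a jar that is only reachable through such an entry will not show up here; listing the directory itself would be needed in that case.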

 

Bikas

From: Jiří Šimša [mailto:jiri.si...@gmail.com]
Sent: Wednesday, November 11, 2015 6:54 PM
To: user@tez.apache.org
Subject: Running Tez with Tachyon

Hello,

I have followed the Tez installation instructions
(https://tez.apache.org/install.html) and was able to successfully run the
ordered word count example:

$ hadoop jar ./tez-examples/target/tez-examples-0.8.2-SNAPSHOT.jar \
    orderedwordcount /input.txt /output.txt

Next, I wanted to see if I could do the same, this time reading from and
writing to Tachyon (http://tachyon-project.org/), using:

$ hadoop jar ./tez-examples/target/tez-examples-0.8.2-SNAPSHOT.jar \
    orderedwordcount tachyon://localhost:19998/input.txt \
    tachyon://localhost:19998/output.txt

Unsurprisingly, this resulted in the "Class tachyon.hadoop.TFS not found" error
because Tez needs the Tachyon client jar that defines the tachyon.hadoop.TFS
class. To that end, I have tried several options (listed below) to provide this
jar to Tez, none of which seems to have worked:

1) Adding the Tachyon client jar to HADOOP_CLASSPATH.
2) Specifying the Tachyon client jar with the -libjars flag for the above
command.
3) Copying the Tachyon client jar into the $HADOOP_HOME/share/hadoop/common/lib
directory of my Hadoop installation.
4) Copying the Tachyon client jar into HDFS and specifying a path to it through
the tez.aux.uris property in the tez-site.xml file (in the same way that the
tez.lib.uris property specifies the path to the Tez tarball).
5) Modifying the source code of the ordered word count example, adding a call
to TezClient#addAppMasterLocalFiles(...), providing a URI for the Tachyon
client jar uploaded to HDFS.
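
Whichever of the options above is used, it can be worth confirming first that the jar being passed around really contains tachyon.hadoop.TFS. A minimal sketch follows; the helper names are hypothetical and the check relies on "unzip -l" being installed.

```shell
#!/bin/sh
# Hypothetical helper: map a fully qualified class name to the path of its
# .class entry inside a jar (dots become slashes).
class_to_jar_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: succeed if the jar's listing contains that entry.
jar_has_class() {
    unzip -l "$1" 2>/dev/null | grep -qF -- "$(class_to_jar_entry "$2")"
}

# Prints the entry name that would be searched for.
class_to_jar_entry tachyon.hadoop.TFS

# Example call (use the path of the jar that was built or uploaded):
#   jar_has_class tachyon-client-0.8.2-SNAPSHOT-jar-with-dependencies.jar tachyon.hadoop.TFS && echo found
```

This only checks the jar itself. Whether the client JVM or the containers on the cluster actually load it still depends on HADOOP_CLASSPATH and tez.aux.uris.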

 

Any advice on how to pass the Tachyon client jar to Tez to resolve this issue
would be greatly appreciated. Thank you.

Best,

--
Jiří Šimša