Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog merged PR #414: URL: https://github.com/apache/tez/pull/414 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2100152057

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format

Review Comment:
-force worked
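A minimal sketch of the non-interactive format step this comment settles on. The function name `format_namenode` is hypothetical and not part of the PR; only the `-force` flag is taken from the discussion.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (illustration only): format the NameNode without
# stopping at a confirmation prompt when the script is run a second time.
format_namenode() {
  # -force: format even if the name directory already exists
  hdfs namenode -format -force
}
```

On a first run the flag changes nothing; on a re-run it avoids the confirmation prompt that would otherwise block an unattended script.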
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
tez-yetus commented on PR #414:
URL: https://github.com/apache/tez/pull/414#issuecomment-2893802060

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 28m 58s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
|||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 2m 23s | | Maven dependency ordering for branch |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 8s | | Maven dependency ordering for patch |
| +1 :green_heart: | codespell | 0m 4s | | No new issues. |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | shellcheck | 0m 0s | | No new issues. |
|||| _ Other Tests _ |
| +0 :ok: | asflicense | 0m 0s | | ASF License check generated no output? |
| | | 31m 54s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.49 ServerAPI=1.49 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-414/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/414 |
| Optional Tests | dupname asflicense codespell detsecrets shellcheck shelldocs |
| uname | Linux 301ac7039060 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-home/workspace/tez-multibranch_PR-414/src/.yetus/personality.sh |
| git revision | master / bd94d8bc04ce02a7939ee28f7ad818aadeb8 |
| Max. process+thread count | 60 (vs. ulimit of 5500) |
| modules | C: U: |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-414/2/console |
| versions | git=2.34.1 maven=3.6.3 codespell=2.0.0 shellcheck=0.7.1 |
| Powered by | Apache Yetus 0.15.1 https://yetus.apache.org |

This message was automatically generated.
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2097494414

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+

Review Comment:
The history server doesn't load correctly for me on localhost. I would rather go on without it than investigate how to open it correctly; maybe a TODO for later.

https://github.com/user-attachments/assets/3f4b3612-d7b6-4a9a-9b01-693fb6e77fba

Btw, the finished DAG can be seen on the RM UI: http://localhost:8088/
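Besides the web UI, the ResourceManager serves the application list over its REST API under `/ws/v1/cluster/apps`. The sketch below only builds the URL; the helper name `rm_apps_url` is hypothetical, and the default host and port are the ones quoted in the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): build the ResourceManager REST URL
# that lists applications in a given state.
rm_apps_url() {
  local rm_address="${1:-localhost:8088}"   # RM web address, default from the comment
  local state="${2:-FINISHED}"              # YARN application state to filter on
  printf 'http://%s/ws/v1/cluster/apps?states=%s\n' "$rm_address" "$state"
}

# Example call against a running cluster (needs curl and a live RM):
# curl -s "$(rm_apps_url)"
```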
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on PR #414:
URL: https://github.com/apache/tez/pull/414#issuecomment-2893479169

> > Do we need to specify: mapreduce.framework.name as yarn as well?
> > for me earlier it never used to work unless I specified `export HADOOP_USER_CLASSPATH_FIRST=true`
> > does it work for you without that, even BigTop had to add that https://github.com/apache/bigtop/pull/1246/files#diff-f68b85f9302907e466b58d438376afb074df98fdbe571d30c188cd1767ff11eeR18
>
> yeah, I can see this workaround happening everywhere, but here, it has just worked OOTB, maybe a certain state of defining ENV vars like HADOOP_CLASSPATH? I don't know. What about:
>
> 1. I'm playing with it to see if I can reproduce their problems
> 2. can you try the script on your side to see if it works? if the script works without the additional export for you too, we might want to publish it as is, proving that no further classpath hacks are needed
>
> let me check mapreduce.framework.name as well, for me, the script simply ran a Tez DAG, so I haven't configured anything more...but this is really interesting, I'll find out

wow, that's indeed needed, otherwise I get an exotic exception like

```
java.lang.IllegalAccessError: tried to access field com.google.protobuf.AbstractMessage.memoizedSize from class org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto
  at org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto.getSerializedSize(DAGProtos.java:21080)
  at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
  at org.apache.tez.common.TezUtils.writeConfInPB(TezUtils.java:162)
  at org.apache.tez.common.TezUtils.createByteStringFromConf(TezUtils.java:82)
  at org.apache.tez.mapreduce.hadoop.MRInputHelpers.createMRInputPayload(MRInputHelpers.java:717)
  at org.apache.tez.mapreduce.input.MRInput$MRInputHelpersInternal.createMRInputPayload(MRInput.java:712)
  at org.apache.tez.mapreduce.input.MRInput$MRInputConfigBuilder.createGeneratorDataSource(MRInput.java:336)
  at org.apache.tez.mapreduce.input.MRInput$MRInputConfigBuilder.build(MRInput.java:266)
  at org.apache.tez.examples.OrderedWordCount.createDAG(OrderedWordCount.java:130)
  at org.apache.tez.examples.OrderedWordCount.runJob(OrderedWordCount.java:200)
  at org.apache.tez.examples.TezExampleBase._execute(TezExampleBase.java:245)
  at org.apache.tez.examples.TezExampleBase.run(TezExampleBase.java:126)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
  at org.apache.tez.examples.OrderedWordCount.main(OrderedWordCount.java:208)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
  at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
  at org.apache.tez.examples.ExampleDriver.main(ExampleDriver.java:51)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:328)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:241)
```

adding this export right before the Tez DAG submission
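A minimal sketch of placing that export directly before the DAG submission. The function name `submit_example` and its parameters are hypothetical; the jar path and arguments follow the script under review. The reading that the `IllegalAccessError` comes from Hadoop's own protobuf classes being loaded ahead of the ones Tez ships is an assumption, not something the thread confirms.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (illustration only) around the submission step.
submit_example() {
  local tez_home="$1" tez_version="$2"

  # Ask the hadoop launcher to put HADOOP_CLASSPATH entries (here: the Tez
  # jars) ahead of Hadoop's bundled jars on the client classpath.
  export HADOOP_USER_CLASSPATH_FIRST=true

  hadoop jar "$tez_home/tez-examples-$tez_version.jar" orderedwordcount /words.txt /words_out
}
```

Setting the variable only at this point keeps the earlier HDFS and YARN commands on the default classpath ordering.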
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2097117264

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1

Review Comment:
yeah, makes sense, let me do the same
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
ayushtkn commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094994611

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+
+hadoop fs -mkdir /apps/
+hadoop fs -mkdir /apps/tez-$TEZ_VERSION
+hadoop fs -copyFromLocal $TEZ_HOME/share/tez.tar.gz /apps/tez-$TEZ_VERSION
+
+# create a simple tez-site.xml
+cat <<EOF > $TEZ_HOME/conf/tez-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>tez.lib.uris</name>
+    <value>/apps/tez-$TEZ_VERSION/tez.tar.gz</value>
+  </property>
+</configuration>
+EOF
+
+# create a simple input file
+cat <<EOF > ./words.txt
+Apple
+Banana
+Car
+Apple
+Banana
+Car
+Dog
+Elephant
+Friend
+Game
+EOF
+
+hadoop fs -copyFromLocal words.txt /words.txt
+
+# finally run the example
+hadoop jar $TEZ_HOME/tez-examples-$TEZ_VERSION.jar orderedwordcount /words.txt /words_out

Review Comment:
AFAIK for YARN it should be `yarn jar`: https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop.cmd#L186-L190
If you have YARN opts defined, it would print a warning as well. `hadoop jar` was meant for MR jobs, though it doesn't fail for Tez jobs today.
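A minimal sketch of the submission step using the `yarn` launcher, as suggested in this review comment. The function name `run_wordcount` and its parameters are hypothetical; the jar name and the `orderedwordcount` arguments are taken from the script under review.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (illustration only): submit the example through `yarn jar`
# instead of `hadoop jar`.
run_wordcount() {
  local tez_home="$1" tez_version="$2" input="$3" output="$4"
  yarn jar "$tez_home/tez-examples-$tez_version.jar" orderedwordcount "$input" "$output"
}

# Example call, mirroring the last line of the script:
# run_wordcount "$TEZ_HOME" "$TEZ_VERSION" /words.txt /words_out
```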
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
ayushtkn commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094987645

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1

Review Comment:
My take: if the user defines it, use it; otherwise get it from the POM. I believe that is what the Hive docker build script does as well.
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094371697

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+
+hadoop fs -mkdir /apps/
+hadoop fs -mkdir /apps/tez-$TEZ_VERSION
+hadoop fs -copyFromLocal $TEZ_HOME/share/tez.tar.gz /apps/tez-$TEZ_VERSION
+
+# create a simple tez-site.xml
+cat <<EOF > $TEZ_HOME/conf/tez-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>tez.lib.uris</name>
+    <value>/apps/tez-$TEZ_VERSION/tez.tar.gz</value>
+  </property>
+</configuration>
+EOF
+
+# create a simple input file
+cat <<EOF > ./words.txt
+Apple
+Banana
+Car
+Apple
+Banana
+Car
+Dog
+Elephant
+Friend
+Game
+EOF
+
+hadoop fs -copyFromLocal words.txt /words.txt
+
+# finally run the example
+hadoop jar $TEZ_HOME/tez-examples-$TEZ_VERSION.jar orderedwordcount /words.txt /words_out

Review Comment:
I haven't used the yarn executable so far; fine with changing it, but for the record: what advantages does it have?
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on PR #414:
URL: https://github.com/apache/tez/pull/414#issuecomment-2888783269

> Do we need to specify: mapreduce.framework.name as yarn as well?
>
> for me earlier it never used to work unless I specified `export HADOOP_USER_CLASSPATH_FIRST=true`
>
> does it work for you without that, even BigTop had to add that https://github.com/apache/bigtop/pull/1246/files#diff-f68b85f9302907e466b58d438376afb074df98fdbe571d30c188cd1767ff11eeR18

yeah, I can see this workaround happening everywhere, but here, it has just worked OOTB, maybe a certain state of defining ENV vars like HADOOP_CLASSPATH? I don't know. What about:

1. I'm playing with it to see if I can reproduce their problems
2. can you try the script on your side to see if it works? if the script works without the additional export for you too, we might want to publish it as is, proving that no further classpath hacks are needed
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094372547

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format

Review Comment:
there is a "-force" option for namenode format, let me try
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094370102

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+

Review Comment:
-nc (--no-clobber) is exactly what takes care of this
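A minimal sketch of how the `-nc` download and the directory guard in the script combine to make re-runs cheap. The helper name `fetch_once` is hypothetical, and it assumes each archive unpacks into a directory named after the archive without its `.tar.gz` suffix, which is how the script treats both tarballs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): download and unpack an archive once.
fetch_once() {
  local url="$1"
  local archive="${url##*/}"            # file name part of the URL
  local unpacked="${archive%.tar.gz}"   # assumed name of the unpacked directory

  # -nc / --no-clobber: wget does not fetch again if the file is already there
  wget -nc "$url"

  # tar has no such flag, so skip extraction when the directory already exists
  if [ ! -d "$unpacked" ]; then
    tar -xzf "$archive"
  fi
}
```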
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094372315

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1

Review Comment:
good question, it depends on what we want to achieve with this script. Here is what I can think of:

1. get the hadoop version from the tez pom.xml as you advised
2. both HADOOP_VERSION and TEZ_VERSION could be taken from the env if already defined (letting the user pick any version to experiment with)
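The two options above can be sketched as one fallback chain: take the value from the environment when it is set, otherwise read it from the POM. Both helper names are hypothetical, and the sketch assumes the POM declares the version as a single-line property such as `<hadoop.version>…</hadoop.version>`; neither is confirmed by the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of a single-line Maven property,
# e.g. <hadoop.version>3.4.1</hadoop.version> prints 3.4.1.
pom_property() {
  local pom_file="$1" property="$2"
  sed -n "s|.*<${property}>\(.*\)</${property}>.*|\1|p" "$pom_file" | head -n 1
}

# Hypothetical helper: prefer the caller's value, fall back to the POM.
resolve_version() {
  local from_env="$1" pom_file="$2" property="$3"
  if [ -n "$from_env" ]; then
    printf '%s\n' "$from_env"
  else
    pom_property "$pom_file" "$property"
  fi
}

# Example wiring (the pom.xml location is an assumption about the working directory):
# export HADOOP_VERSION=$(resolve_version "${HADOOP_VERSION:-}" pom.xml hadoop.version)
```

Resolving the version this way also avoids a network round trip to the download index when the caller has already chosen a version.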
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094371043

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat < $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+
+
+ dfs.replication
+ 1
+
+
+EOF
+
+cat < $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+
+
+ fs.defaultFS
+ hdfs://localhost:9000
+
+
+EOF
+
+cat < $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+
+
+yarn.nodemanager.aux-services
+mapreduce_shuffle
+
+
+EOF
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+

Review Comment:
makes sense, I'll also add comments on where to check the history server
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094370836

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat < $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+
+
+ dfs.replication
+ 1
+
+
+EOF
+
+cat < $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+
+
+ fs.defaultFS
+ hdfs://localhost:9000
+
+
+EOF
+
+cat < $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+
+
+yarn.nodemanager.aux-services
+mapreduce_shuffle
+
+
+EOF
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+
+hadoop fs -mkdir /apps/
+hadoop fs -mkdir /apps/tez-$TEZ_VERSION

Review Comment:
ack, will do
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
ayushtkn commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2092566970

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat < $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+
+
+ dfs.replication
+ 1
+
+
+EOF
+
+cat < $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+
+
+ fs.defaultFS
+ hdfs://localhost:9000
+
+
+EOF
+
+cat < $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+
+
+yarn.nodemanager.aux-services
+mapreduce_shuffle
+
+
+EOF
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+
+hadoop fs -mkdir /apps/
+hadoop fs -mkdir /apps/tez-$TEZ_VERSION

Review Comment:
```
hadoop fs -mkdir -p /apps/tez-$TEZ_VERSION
```

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1

Review Comment:
Shouldn't the Hadoop version come from the pom? The latest version is not always going to work with Tez.

## dev-support/bin/tez_run_example.sh: ##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Singl
Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]
tez-yetus commented on PR #414:
URL: https://github.com/apache/tez/pull/414#issuecomment-2885995105

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 27m 25s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
|||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 2m 16s | | Maven dependency ordering for branch |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 9s | | Maven dependency ordering for patch |
| +1 :green_heart: | codespell | 0m 4s | | No new issues. |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | shellcheck | 0m 0s | | No new issues. |
|||| _ Other Tests _ |
| +0 :ok: | asflicense | 0m 0s | | ASF License check generated no output? |
| | | 30m 13s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.49 ServerAPI=1.49 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-414/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/tez/pull/414 |
| Optional Tests | dupname asflicense codespell detsecrets shellcheck shelldocs |
| uname | Linux 45e3c0032967 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-home/workspace/tez-multibranch_PR-414/src/.yetus/personality.sh |
| git revision | master / 85bdf17921cf62f9444dc21feddd8294056b77ea |
| Max. process+thread count | 61 (vs. ulimit of 5500) |
| modules | C: U: |
| Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-414/1/console |
| versions | git=2.34.1 maven=3.6.3 codespell=2.0.0 shellcheck=0.7.1 |
| Powered by | Apache Yetus 0.15.1 https://yetus.apache.org |

This message was automatically generated.