Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-28 Thread via GitHub


abstractdog merged PR #414:
URL: https://github.com/apache/tez/pull/414


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@tez.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-21 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2100152057


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
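+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>dfs.replication</name>
+  <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>fs.defaultFS</name>
+  <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF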
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format

Review Comment:
   -force worked






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-20 Thread via GitHub


tez-yetus commented on PR #414:
URL: https://github.com/apache/tez/pull/414#issuecomment-2893802060

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 28m 58s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
   | +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   |||| _ master Compile Tests _ |
   | +0 :ok: | mvndep | 2m 23s | | Maven dependency ordering for branch |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 8s | | Maven dependency ordering for patch |
   | +1 :green_heart: | codespell | 0m 4s | | No new issues. |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | shellcheck | 0m 0s | | No new issues. |
   |||| _ Other Tests _ |
   | +0 :ok: | asflicense | 0m 0s | | ASF License check generated no output? |
   | | | 31m 54s | | |

   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.49 ServerAPI=1.49 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-414/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/tez/pull/414 |
   | Optional Tests | dupname asflicense codespell detsecrets shellcheck shelldocs |
   | uname | Linux 301ac7039060 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-home/workspace/tez-multibranch_PR-414/src/.yetus/personality.sh |
   | git revision | master / bd94d8bc04ce02a7939ee28f7ad818aadeb8 |
   | Max. process+thread count | 60 (vs. ulimit of 5500) |
   | modules | C:  U:  |
   | Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-414/2/console |
   | versions | git=2.34.1 maven=3.6.3 codespell=2.0.0 shellcheck=0.7.1 |
   | Powered by | Apache Yetus 0.15.1 https://yetus.apache.org |

   This message was automatically generated.
   
   





Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-20 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2097494414


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
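+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>dfs.replication</name>
+  <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>fs.defaultFS</name>
+  <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF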
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+

Review Comment:
   The history server doesn't load correctly for me on localhost.
   I would rather go on without it than investigate how to open it correctly; maybe a TODO for later.
   https://github.com/user-attachments/assets/3f4b3612-d7b6-4a9a-9b01-693fb6e77fba

   BTW, the finished DAG can be seen on the RM UI: http://localhost:8088/
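
For a scripted check instead of opening the RM UI in a browser, the ResourceManager also serves a REST API on the same web port. The following is only a sketch: the helper name `rm_apps_url` is hypothetical, and the host and port defaults are taken from the URL in the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the ResourceManager REST URL that lists
# applications in a given state (defaults: FINISHED on localhost:8088).
rm_apps_url() {
  local state="${1:-FINISHED}" host="${2:-localhost}" port="${3:-8088}"
  printf 'http://%s:%s/ws/v1/cluster/apps?states=%s\n' "$host" "$port" "$state"
}

# Usage (needs a running ResourceManager, so it is left commented out):
# curl -s "$(rm_apps_url FINISHED)"
```

The function only builds the URL, so it can be tried without a cluster; the commented `curl` call is the part that needs YARN to be up.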
   






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-20 Thread via GitHub


abstractdog commented on PR #414:
URL: https://github.com/apache/tez/pull/414#issuecomment-2893479169

   > > Do we need to specify: mapreduce.framework.name as yarn as well?
   > > for me earlier it never used to work unless I specified `export HADOOP_USER_CLASSPATH_FIRST=true`
   > > does it work for you without that? even BigTop had to add that: https://github.com/apache/bigtop/pull/1246/files#diff-f68b85f9302907e466b58d438376afb074df98fdbe571d30c188cd1767ff11eeR18
   >
   > yeah, I can see this workaround happening everywhere, but here it has just worked OOTB, maybe a certain state of defining ENV vars like HADOOP_CLASSPATH? I don't know
   > what about:
   >
   > 1. I'm playing with it to see if I can reproduce their problems
   > 2. can you try the script on your side to see if it works? if the script works without the additional export for you too, we might want to publish it as is, proving that no further classpath hacks are needed
   >
   > let me check mapreduce.framework.name as well; for me, the script simply ran a Tez DAG, so I haven't configured anything more... but this is really interesting, I'll find out

   wow, that's indeed needed, otherwise I get an exotic exception like:
   ```
   java.lang.IllegalAccessError: tried to access field com.google.protobuf.AbstractMessage.memoizedSize from class org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto
       at org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto.getSerializedSize(DAGProtos.java:21080)
       at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
       at org.apache.tez.common.TezUtils.writeConfInPB(TezUtils.java:162)
       at org.apache.tez.common.TezUtils.createByteStringFromConf(TezUtils.java:82)
       at org.apache.tez.mapreduce.hadoop.MRInputHelpers.createMRInputPayload(MRInputHelpers.java:717)
       at org.apache.tez.mapreduce.input.MRInput$MRInputHelpersInternal.createMRInputPayload(MRInput.java:712)
       at org.apache.tez.mapreduce.input.MRInput$MRInputConfigBuilder.createGeneratorDataSource(MRInput.java:336)
       at org.apache.tez.mapreduce.input.MRInput$MRInputConfigBuilder.build(MRInput.java:266)
       at org.apache.tez.examples.OrderedWordCount.createDAG(OrderedWordCount.java:130)
       at org.apache.tez.examples.OrderedWordCount.runJob(OrderedWordCount.java:200)
       at org.apache.tez.examples.TezExampleBase._execute(TezExampleBase.java:245)
       at org.apache.tez.examples.TezExampleBase.run(TezExampleBase.java:126)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
       at org.apache.tez.examples.OrderedWordCount.main(OrderedWordCount.java:208)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
       at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
       at org.apache.tez.examples.ExampleDriver.main(ExampleDriver.java:51)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.util.RunJar.run(RunJar.java:328)
       at org.apache.hadoop.util.RunJar.main(RunJar.java:241)
   ```

   adding this export (`HADOOP_USER_CLASSPATH_FIRST=true`) right before the Tez DAG submission
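
The trace above looks like a protobuf version clash on the client classpath, and `HADOOP_USER_CLASSPATH_FIRST=true` asks Hadoop's launcher scripts to put the `HADOOP_CLASSPATH` entries ahead of Hadoop's own jars. One way to apply it only to the submission, rather than exporting it for the whole script, might look like the sketch below; the wrapper name `with_user_classpath_first` is hypothetical and not part of the PR.

```shell
#!/usr/bin/env bash
# Run any command with HADOOP_USER_CLASSPATH_FIRST=true in its environment,
# without changing the calling shell's own environment.
with_user_classpath_first() {
  env HADOOP_USER_CLASSPATH_FIRST=true "$@"
}

# Usage (needs a running cluster, so it is left commented out):
# with_user_classpath_first hadoop jar "$TEZ_HOME/tez-examples-$TEZ_VERSION.jar" \
#   orderedwordcount /words.txt /words_out
```

A plain `export HADOOP_USER_CLASSPATH_FIRST=true` before the submission line, as described in the comment, has the same effect for that command; the wrapper just keeps the setting scoped to a single call.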





Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-19 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2097117264


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1

Review Comment:
   yeah, makes sense, let me do the same






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-19 Thread via GitHub


ayushtkn commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094994611


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
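+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>dfs.replication</name>
+  <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>fs.defaultFS</name>
+  <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF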
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+
+hadoop fs -mkdir /apps/
+hadoop fs -mkdir /apps/tez-$TEZ_VERSION
+hadoop fs -copyFromLocal $TEZ_HOME/share/tez.tar.gz /apps/tez-$TEZ_VERSION
+
+# create a simple tez-site.xml
+cat <<EOF > $TEZ_HOME/conf/tez-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>tez.lib.uris</name>
+  <value>/apps/tez-$TEZ_VERSION/tez.tar.gz</value>
+  </property>
+</configuration>
+EOF
+
+# create a simple input file
+cat <<EOF > ./words.txt
+Apple
+Banana
+Car
+Apple
+Banana
+Car
+Dog
+Elephant
+Friend
+Game
+EOF
+
+hadoop fs -copyFromLocal words.txt /words.txt
+
+# finally run the example
+hadoop jar $TEZ_HOME/tez-examples-$TEZ_VERSION.jar orderedwordcount /words.txt /words_out

Review Comment:
   AFAIK for YARN it should be `yarn jar`:
   https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop.cmd#L186-L190

   If you have YARN opts defined, it would also print a warning. `hadoop jar` was meant for MR jobs, though it doesn't fail for Tez jobs today.
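
As a sketch of what the suggested change could look like, the submission line can go through the `yarn` launcher with the same jar and arguments. The `run` and `submit_wordcount` helpers and the `DRY_RUN` switch are hypothetical additions for illustration only; `TEZ_HOME` and `TEZ_VERSION` are the variables the script already defines.

```shell
#!/usr/bin/env bash
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Submit the ordered word count example through the yarn launcher.
# $1 = input path on HDFS, $2 = output path on HDFS.
submit_wordcount() {
  run yarn jar "$TEZ_HOME/tez-examples-$TEZ_VERSION.jar" orderedwordcount "$1" "$2"
}

# Usage: DRY_RUN=1 submit_wordcount /words.txt /words_out
```

With `DRY_RUN=1` the helper only prints the command, which makes it easy to confirm the launcher and arguments before pointing it at a live cluster.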






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-19 Thread via GitHub


ayushtkn commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094987645


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1

Review Comment:
   My preference: if the user defines it, use it; otherwise get it from the POM. I believe that is what the Hive docker build script does as well.






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-17 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094371697


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
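+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>dfs.replication</name>
+  <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>fs.defaultFS</name>
+  <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF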
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+
+hadoop fs -mkdir /apps/
+hadoop fs -mkdir /apps/tez-$TEZ_VERSION
+hadoop fs -copyFromLocal $TEZ_HOME/share/tez.tar.gz /apps/tez-$TEZ_VERSION
+
+# create a simple tez-site.xml
+cat <<EOF > $TEZ_HOME/conf/tez-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>tez.lib.uris</name>
+  <value>/apps/tez-$TEZ_VERSION/tez.tar.gz</value>
+  </property>
+</configuration>
+EOF
+
+# create a simple input file
+cat <<EOF > ./words.txt
+Apple
+Banana
+Car
+Apple
+Banana
+Car
+Dog
+Elephant
+Friend
+Game
+EOF
+
+hadoop fs -copyFromLocal words.txt /words.txt
+
+# finally run the example
+hadoop jar $TEZ_HOME/tez-examples-$TEZ_VERSION.jar orderedwordcount /words.txt /words_out

Review Comment:
   I haven't used the yarn executable so far. I'm fine with changing it, but for the record: what advantages does it have?






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-17 Thread via GitHub


abstractdog commented on PR #414:
URL: https://github.com/apache/tez/pull/414#issuecomment-2888783269

   > Do we need to specify: mapreduce.framework.name as yarn as well?
   >
   > for me earlier it never used to work unless I specified `export HADOOP_USER_CLASSPATH_FIRST=true`
   >
   > does it work for you without that? even BigTop had to add that: https://github.com/apache/bigtop/pull/1246/files#diff-f68b85f9302907e466b58d438376afb074df98fdbe571d30c188cd1767ff11eeR18

   yeah, I can see this workaround happening everywhere, but here it has just worked OOTB, maybe a certain state of defining ENV vars like HADOOP_CLASSPATH? I don't know
   what about:
   1. I'm playing with it to see if I can reproduce their problems
   2. can you try the script on your side to see if it works? if the script works without the additional export for you too, we might want to publish it as is, proving that no further classpath hacks are needed
   
   





Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-17 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094372547


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
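+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>dfs.replication</name>
+  <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>fs.defaultFS</name>
+  <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF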
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format

Review Comment:
   there is a `-force` option for namenode format, let me try
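
For reference, `hdfs namenode -format` accepts `-force`, which formats even when the name directory already exists instead of stopping to ask for confirmation. A small sketch of how a script might make that choice explicit; the helper name `namenode_format_cmd` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the NameNode format command. Pass "force" to
# format without the confirmation prompt when the name directory exists.
namenode_format_cmd() {
  if [ "${1:-}" = "force" ]; then
    printf '%s\n' "hdfs namenode -format -force"
  else
    printf '%s\n' "hdfs namenode -format"
  fi
}

# Usage (destroys existing HDFS metadata, so it is left commented out):
# $(namenode_format_cmd force)
```

Forcing the format wipes whatever the previous run left in HDFS, which is usually what a rerunnable demo script wants but would be dangerous on a real cluster.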






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-17 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094370102


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+

Review Comment:
   -nc (--no-clobber) is exactly what takes care of this 
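
To make the `-nc` behaviour concrete, it is roughly equivalent to checking for the local file before downloading. The sketch below spells that out; the function name `fetch_once` is hypothetical and the script itself simply relies on `wget -nc`.

```shell
#!/usr/bin/env bash
# Roughly what `wget -nc URL` does: skip the download when a file with the
# target name already exists in the current directory.
fetch_once() {
  local url="$1"
  local file="${url##*/}"   # last path component of the URL
  if [ -e "$file" ]; then
    printf 'skip %s\n' "$file"
  else
    wget "$url"
  fi
}

# Usage: fetch_once "https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz"
```

One caveat that applies to both forms: the check is on existence only, so a tarball left truncated by an interrupted download is also skipped and has to be deleted by hand.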






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-17 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094372315


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1

Review Comment:
   good question, it depends on what we want to achieve with this script, here is what I can think of:
   1. get the hadoop version from the tez pom.xml, as you advised
   2. both HADOOP_VERSION and TEZ_VERSION could be taken from the env if already defined (letting the user pick any version to experiment with)
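
   A minimal sketch of option 2, assuming bash. The two `latest_*_version` functions are hypothetical stand-ins for the `curl | grep` lookups in the script, and they just echo the example values from the script's own comments:

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the curl | grep lookups in the script;
# the values are the examples from the script's own comments.
latest_tez_version()    { echo "0.10.4"; }
latest_hadoop_version() { echo "3.4.1"; }

# ${VAR:-word} expands word, and so runs the lookup, only when VAR is unset or empty.
export TEZ_VERSION="${TEZ_VERSION:-$(latest_tez_version)}"
export HADOOP_VERSION="${HADOOP_VERSION:-$(latest_hadoop_version)}"

echo "TEZ_VERSION=$TEZ_VERSION HADOOP_VERSION=$HADOOP_VERSION"
```

   With this shape, something like `HADOOP_VERSION=3.3.6 ./tez_run_example.sh` would pin Hadoop while Tez still falls back to the lookup.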






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-17 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094371043


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
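+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>dfs.replication</name>
+  <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>fs.defaultFS</name>
+  <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF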
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+

Review Comment:
   makes sense, I'll also add comments on where to check the history server






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-17 Thread via GitHub


abstractdog commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2094370836


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
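+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>dfs.replication</name>
+  <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>fs.defaultFS</name>
+  <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF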
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+
+hadoop fs -mkdir /apps/
+hadoop fs -mkdir /apps/tez-$TEZ_VERSION

Review Comment:
   ack, will do






Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-16 Thread via GitHub


ayushtkn commented on code in PR #414:
URL: https://github.com/apache/tez/pull/414#discussion_r2092566970


##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
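+# https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
+cat <<EOF > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>dfs.replication</name>
+  <value>1</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/core-site.xml
+
+
+
+<configuration>
+  <property>
+  <name>fs.defaultFS</name>
+  <value>hdfs://localhost:9000</value>
+  </property>
+</configuration>
+EOF
+
+cat <<EOF > $HADOOP_HOME/etc/hadoop/yarn-site.xml
+
+
+
+<configuration>
+<property>
+<name>yarn.nodemanager.aux-services</name>
+<value>mapreduce_shuffle</value>
+</property>
+</configuration>
+EOF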
+
+# optionally stop previous clusters if any
+#$HADOOP_HOME/sbin/stop-dfs.sh
+#$HADOOP_HOME/sbin/stop-yarn.sh
+
+hdfs namenode -format
+
+$HADOOP_HOME/sbin/start-dfs.sh
+$HADOOP_HOME/sbin/start-yarn.sh
+
+hadoop fs -mkdir /apps/
+hadoop fs -mkdir /apps/tez-$TEZ_VERSION

Review Comment:
   ```
   hadoop fs -mkdir -p /apps/tez-$TEZ_VERSION
   ```
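
   As a local analogy for why the single `-p` call is enough: the Hadoop FileSystem shell documentation describes `hadoop fs -mkdir -p` as behaving much like Unix `mkdir -p`. The sketch below uses plain `mkdir` on a scratch directory as a stand-in for HDFS (an assumption made so it runs without a cluster), and `0.10.4` is only the example version from the script's comment:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Local stand-in for HDFS: a scratch directory (assumption for this sketch).
root="$(mktemp -d)"
TEZ_VERSION="${TEZ_VERSION:-0.10.4}"  # example value from the script's own comment

# One call creates both apps/ and apps/tez-<version>.
mkdir -p "$root/apps/tez-$TEZ_VERSION"

# Repeating the call succeeds instead of failing with "File exists".
mkdir -p "$root/apps/tez-$TEZ_VERSION"

[ -d "$root/apps/tez-$TEZ_VERSION" ] && echo "created apps/tez-$TEZ_VERSION"
```

   This also makes reruns of the script less fragile, since the directory from a previous run no longer causes an error at this step.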



##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1

Review Comment:
   shouldn't the hadoop version come from the pom? the latest version is not always going to work with Tez
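
   A rough sketch of reading the version from the pom, assuming the Tez `pom.xml` declares a `<hadoop.version>` property on a single line. The helper name `hadoop_version_from_pom` is hypothetical, and the demo runs against a throwaway pom with an example value, not the real file:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first <hadoop.version> value found in a pom.xml.
# Assumes the property sits on one line, e.g. <hadoop.version>3.4.1</hadoop.version>.
hadoop_version_from_pom() {
  sed -n 's:.*<hadoop\.version>\(.*\)</hadoop\.version>.*:\1:p' "$1" | head -n 1
}

# Self-contained demo against a throwaway pom; 3.4.1 is only an example value.
demo_pom="$(mktemp)"
cat > "$demo_pom" <<'EOF'
<project>
  <properties>
    <hadoop.version>3.4.1</hadoop.version>
  </properties>
</project>
EOF

HADOOP_VERSION="$(hadoop_version_from_pom "$demo_pom")"
echo "HADOOP_VERSION=$HADOOP_VERSION"
rm -f "$demo_pom"
```

   A text match like this avoids needing Maven on the machine, at the cost of breaking if the property is ever split across lines or inherited from a parent pom.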



##
dev-support/bin/tez_run_example.sh:
##
@@ -0,0 +1,119 @@
+
+# This script is used to set up a local Hadoop and Tez environment for running a simple word count example.
+# Prerequisites
+# 1. java is installed and JAVA_HOME is set
+# 2. ssh localhost works without password
+
+# configure this if needed, by default it will use the latest stable versions in the current directory
+export TEZ_VERSION=$(curl -s "https://downloads.apache.org/tez/" | grep -oP '\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 0.10.4
+export HADOOP_VERSION=$(curl -s "https://downloads.apache.org/hadoop/common/" | grep -oP 'hadoop-\K[0-9]+\.[0-9]+\.[0-9]+(?=/)' | sort -V | tail -1) # e.g. 3.4.1
+export HADOOP_STACK_HOME=$PWD
+
+echo "Demo script is running in $HADOOP_STACK_HOME with TEZ version $TEZ_VERSION and HADOOP version $HADOOP_VERSION"
+
+cd $HADOOP_STACK_HOME
+wget -nc https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz
+wget -nc https://archive.apache.org/dist/tez/$TEZ_VERSION/apache-tez-$TEZ_VERSION-bin.tar.gz
+
+if [ ! -d "hadoop-$HADOOP_VERSION" ]; then
+tar -xzf hadoop-$HADOOP_VERSION.tar.gz
+fi
+
+if [ ! -d "apache-tez-$TEZ_VERSION-bin" ]; then
+tar -xzf apache-tez-$TEZ_VERSION-bin.tar.gz
+fi
+
+ln -s hadoop-$HADOOP_VERSION hadoop
+ln -s apache-tez-$TEZ_VERSION-bin tez
+
+export HADOOP_HOME=$HADOOP_STACK_HOME/hadoop
+export TEZ_HOME=$HADOOP_STACK_HOME/tez
+export HADOOP_CLASSPATH=$TEZ_HOME/*:$TEZ_HOME/lib/*:$TEZ_HOME/conf
+
+export PATH=$PATH:$HADOOP_HOME/bin
+
+# 
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Singl

Re: [PR] TEZ-4631: Include an official script that installs hadoop and tez and runs a simple example DAG [tez]

2025-05-16 Thread via GitHub


tez-yetus commented on PR #414:
URL: https://github.com/apache/tez/pull/414#issuecomment-2885995105

   :confetti_ball: **+1 overall**


   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 27m 25s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
   | +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   |||| _ master Compile Tests _ |
   | +0 :ok: | mvndep | 2m 16s | | Maven dependency ordering for branch |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: | mvndep | 0m 9s | | Maven dependency ordering for patch |
   | +1 :green_heart: | codespell | 0m 4s | | No new issues. |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | +1 :green_heart: | shellcheck | 0m 0s | | No new issues. |
   |||| _ Other Tests _ |
   | +0 :ok: | asflicense | 0m 0s | | ASF License check generated no output? |
   | | | 30m 13s | | |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.49 ServerAPI=1.49 base: https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-414/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/tez/pull/414 |
   | Optional Tests | dupname asflicense codespell detsecrets shellcheck shelldocs |
   | uname | Linux 45e3c0032967 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-home/workspace/tez-multibranch_PR-414/src/.yetus/personality.sh |
   | git revision | master / 85bdf17921cf62f9444dc21feddd8294056b77ea |
   | Max. process+thread count | 61 (vs. ulimit of 5500) |
   | modules | C:  U:  |
   | Console output | https://ci-hadoop.apache.org/job/tez-multibranch/job/PR-414/1/console |
   | versions | git=2.34.1 maven=3.6.3 codespell=2.0.0 shellcheck=0.7.1 |
   | Powered by | Apache Yetus 0.15.1 https://yetus.apache.org |


   This message was automatically generated.
   
   

