[ 
https://issues.apache.org/jira/browse/HIVE-28683?focusedWorklogId=950566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-950566
 ]

ASF GitHub Bot logged work on HIVE-28683:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 31/Dec/24 14:45
            Start Date: 31/Dec/24 14:45
    Worklog Time Spent: 10m 
      Work Description: zhangbutao commented on code in PR #24:
URL: https://github.com/apache/hive-site/pull/24#discussion_r1900139319


##########
content/docs/latest/manual-installation_283118363.md:
##########
@@ -378,6 +379,76 @@ That directory should contain all the files necessary to 
run Hive. You can run i
 
 From now, you can follow the steps described in the section Installing Hive 
from a Tarball
 
+## Installing with old version hadoop(>=3.1.0)
+
+Although we normally require hive4 to rely on a 
+hadoop 3.3.6+ cluster environment. 
+However, in practice, in an ON YARN environment,
+we can package all the hadoop related dependencies into 
+tez&hive so that they do not need to rely on the lib 
+of the original hadoop cluster environment at runtime. 
+In this way, we can run HIVE4 in a lower version of hadoop, 
+provided that the base APIs of the hadoop 3.x series are common to 
+each other.
+
+The steps are as follows:
+
+1.Download the high version of the Hadoop package, unzip it, and then set the 
hadoop_home finger of the env script in HIVE4 to the path where the high 
version of hadoop is unzipped.
+
+2.Compile TEZ to get tez.tar.gz which contains all hadoop related 
dependencies(not tez minimal tarball), 
+first extract it on the physical machine where HIVE is deployed and configure 
the TEZ_HOME in HIVE to point to it, 
+then place tez.tar.gz in a path in hdfs.
+
+```shell
+## This is an example,Users should install HIVE and TEZ into actual 
directories.
+## In this example, we have installed HIVE-4.0.1 and TEZ-0.10.4 on an Hadoop 
3.1.0 cluster.
+[root@hmsclient01 opt]# cd /opt
+[root@hmsclient01 opt]# ll
+drwxr-xr-x 11 hive hadoop      4096 Nov  7 13:59 apache-hive-4.0.1-bin
+drwxr-xr-x  3 hive hadoop      4096 Nov  7 13:59 apache-tez-0.10.4-bin
+drwxr-xr-x 10 hive hadoop      4096 Nov  7 13:59 hadoop-3.3.6
+lrwxrwxrwx  1 hive hadoop        30 Nov  7 13:59 hive-4.0.0 -> 
apache-hive-4.0.1-bin
+lrwxrwxrwx  1 hive hadoop        21 Nov  7 13:59 tez -> apache-tez-0.10.4-bin
+```
+
+edit `hive-env.sh`
+
+```shell
+# Folder containing extra libraries required for hive compilation/execution 
can be controlled by:
+export TEZ_HOME=/opt/tez
+# Set HADOOP_HOME to point to a specific hadoop install directory
+HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop-3.3.6}
+
+export HIVE_HOME=${HIVE_HOME:-/opt/hive-4.0.0}
+```
+
+3.In tez-site.xml. Set the following two confs to use only the libs that come 
with tez. For nativeLib, 
+you can reuse the cluster's existing libs.
+```xml
+      <property>
+        <name>tez.lib.uris</name><!--hdfs path-->
+        <value>/{hdfs-dir}/apache-tez-0.10.4-bin.tar.gz</value>
+    </property>
+    <property>
+        <name>tez.lib.uris.classpath</name> <!--only use tez self lib-->
+       <value>$PWD/tezlib/*,$PWD/tezlib/lib/*</value>
+    </property>
+
+    <property>
+        <name>tez.am.launch.env</name><!--Example, replace with actual value-->
+        <value>LD_LIBRARY_PATH=/usr/hdp/3.1.0.0-78/hadoop/lib/native</value>
+    </property>

Review Comment:
   Can this property be removed?
   If this configuration is required, please avoid using an HDP-specific version path.



##########
content/docs/latest/manual-installation_283118363.md:
##########
@@ -378,6 +379,76 @@ That directory should contain all the files necessary to 
run Hive. You can run i
 
 From now, you can follow the steps described in the section Installing Hive 
from a Tarball
 
+## Installing with old version hadoop(>=3.1.0)
+
+Although we normally require hive4 to rely on a 
+hadoop 3.3.6+ cluster environment. 
+However, in practice, in an ON YARN environment,
+we can package all the hadoop related dependencies into 
+tez&hive so that they do not need to rely on the lib 
+of the original hadoop cluster environment at runtime. 
+In this way, we can run HIVE4 in a lower version of hadoop, 
+provided that the base APIs of the hadoop 3.x series are common to 
+each other.
+
+The steps are as follows:
+
+1.Download the high version of the Hadoop package, unzip it, and then set the 
hadoop_home finger of the env script in HIVE4 to the path where the high 
version of hadoop is unzipped.
+
+2.Compile TEZ to get tez.tar.gz which contains all hadoop related 
dependencies(not tez minimal tarball), 
+first extract it on the physical machine where HIVE is deployed and configure 
the TEZ_HOME in HIVE to point to it, 
+then place tez.tar.gz in a path in hdfs.
+
+```shell
+## This is an example,Users should install HIVE and TEZ into actual 
directories.
+## In this example, we have installed HIVE-4.0.1 and TEZ-0.10.4 on an Hadoop 
3.1.0 cluster.
+[root@hmsclient01 opt]# cd /opt
+[root@hmsclient01 opt]# ll
+drwxr-xr-x 11 hive hadoop      4096 Nov  7 13:59 apache-hive-4.0.1-bin
+drwxr-xr-x  3 hive hadoop      4096 Nov  7 13:59 apache-tez-0.10.4-bin
+drwxr-xr-x 10 hive hadoop      4096 Nov  7 13:59 hadoop-3.3.6
+lrwxrwxrwx  1 hive hadoop        30 Nov  7 13:59 hive-4.0.0 -> 
apache-hive-4.0.1-bin
+lrwxrwxrwx  1 hive hadoop        21 Nov  7 13:59 tez -> apache-tez-0.10.4-bin
+```
+
+edit `hive-env.sh`
+
+```shell
+# Folder containing extra libraries required for hive compilation/execution 
can be controlled by:
+export TEZ_HOME=/opt/tez
+# Set HADOOP_HOME to point to a specific hadoop install directory
+HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop-3.3.6}
+
+export HIVE_HOME=${HIVE_HOME:-/opt/hive-4.0.0}
+```
+
+3.In tez-site.xml. Set the following two confs to use only the libs that come 
with tez. For nativeLib, 
+you can reuse the cluster's existing libs.
+```xml
+      <property>
+        <name>tez.lib.uris</name><!--hdfs path-->
+        <value>/{hdfs-dir}/apache-tez-0.10.4-bin.tar.gz</value>

Review Comment:
   Please use a correct HDFS path format, such as 
`${fs.defaultFS}/apps/apache-tez-0.10.4-bin.tar.gz` (refer to 
https://tez.apache.org/install.html).
   
   BTW, should apache-tez-0.10.4-bin.tar.gz contain all the dependencies of the 
higher Hadoop version (3.3.6)?
   You can add a description for the property.
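
As a rough illustration of the path format suggested above, the upload step and the resulting `tez.lib.uris` value might look like the sketch below. It only prints the commands rather than running them. `HDFS_DIR` is a hypothetical location chosen to match the example path in this comment, and the tarball name is the one used in the PR; both would need to be replaced with real values.

```shell
#!/bin/sh
# Sketch only: prints the upload commands instead of executing them.
# HDFS_DIR is a hypothetical target directory; adjust it for the real cluster.
HDFS_DIR="/apps"
TEZ_TARBALL="apache-tez-0.10.4-bin.tar.gz"

echo "hadoop fs -mkdir -p ${HDFS_DIR}"
echo "hadoop fs -copyFromLocal ${TEZ_TARBALL} ${HDFS_DIR}/"

# Value for tez.lib.uris. The single quotes keep ${fs.defaultFS} literal so
# that Hadoop's configuration substitution resolves it, not the shell.
TEZ_LIB_URIS='${fs.defaultFS}'"${HDFS_DIR}/${TEZ_TARBALL}"
echo "tez.lib.uris=${TEZ_LIB_URIS}"
```

Leaving `${fs.defaultFS}` unexpanded keeps the value portable across clusters with different NameNode addresses.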



##########
content/docs/latest/manual-installation_283118363.md:
##########
@@ -378,6 +379,76 @@ That directory should contain all the files necessary to 
run Hive. You can run i
 
 From now, you can follow the steps described in the section Installing Hive 
from a Tarball
 
+## Installing with old version hadoop(>=3.1.0)
+
+Although we normally require hive4 to rely on a 
+hadoop 3.3.6+ cluster environment. 
+However, in practice, in an ON YARN environment,
+we can package all the hadoop related dependencies into 
+tez&hive so that they do not need to rely on the lib 
+of the original hadoop cluster environment at runtime. 
+In this way, we can run HIVE4 in a lower version of hadoop, 
+provided that the base APIs of the hadoop 3.x series are common to 
+each other.
+
+The steps are as follows:
+
+1.Download the high version of the Hadoop package, unzip it, and then set the 
hadoop_home finger of the env script in HIVE4 to the path where the high 
version of hadoop is unzipped.
+
+2.Compile TEZ to get tez.tar.gz which contains all hadoop related 
dependencies(not tez minimal tarball), 

Review Comment:
   Please add more detail on how to `compile TEZ to get tez.tar.gz which 
contains all hadoop related dependencies`. For example, which mvn command should 
be used to compile it?
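
For reference, the kind of command being asked about might resemble the sketch below. It prints the build command rather than running it. The `mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true` invocation follows the Tez install guide; overriding `hadoop.version` on the command line is an assumption here (the guide describes editing the property in the top-level `pom.xml`), and the version numbers are the example values from this PR.

```shell
#!/bin/sh
# Sketch only: prints the build command instead of executing it.
# Example versions taken from this PR; replace with the real ones.
HADOOP_VERSION="3.3.6"
TEZ_VERSION="0.10.4"

# Assumption: hadoop.version is overridden via -D instead of editing pom.xml.
BUILD_CMD="mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Dhadoop.version=${HADOOP_VERSION}"

# The full tarball bundles the Hadoop jars; the "-minimal" tarball does not.
FULL_TARBALL="tez-dist/target/tez-${TEZ_VERSION}.tar.gz"

echo "Run in the Tez source root: ${BUILD_CMD}"
echo "Expected full tarball: ${FULL_TARBALL}"
```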



##########
content/docs/latest/manual-installation_283118363.md:
##########
@@ -378,6 +379,76 @@ That directory should contain all the files necessary to 
run Hive. You can run i
 
 From now, you can follow the steps described in the section Installing Hive 
from a Tarball
 
+## Installing with old version hadoop(>=3.1.0)
+
+Although we normally require hive4 to rely on a 
+hadoop 3.3.6+ cluster environment. 
+However, in practice, in an ON YARN environment,
+we can package all the hadoop related dependencies into 
+tez&hive so that they do not need to rely on the lib 
+of the original hadoop cluster environment at runtime. 
+In this way, we can run HIVE4 in a lower version of hadoop, 
+provided that the base APIs of the hadoop 3.x series are common to 
+each other.
+
+The steps are as follows:
+
+1.Download the high version of the Hadoop package, unzip it, and then set the 
hadoop_home finger of the env script in HIVE4 to the path where the high 
version of hadoop is unzipped.
+
+2.Compile TEZ to get tez.tar.gz which contains all hadoop related 
dependencies(not tez minimal tarball), 
+first extract it on the physical machine where HIVE is deployed and configure 
the TEZ_HOME in HIVE to point to it, 
+then place tez.tar.gz in a path in hdfs.
+
+```shell
+## This is an example,Users should install HIVE and TEZ into actual 
directories.
+## In this example, we have installed HIVE-4.0.1 and TEZ-0.10.4 on an Hadoop 
3.1.0 cluster.
+[root@hmsclient01 opt]# cd /opt
+[root@hmsclient01 opt]# ll
+drwxr-xr-x 11 hive hadoop      4096 Nov  7 13:59 apache-hive-4.0.1-bin
+drwxr-xr-x  3 hive hadoop      4096 Nov  7 13:59 apache-tez-0.10.4-bin
+drwxr-xr-x 10 hive hadoop      4096 Nov  7 13:59 hadoop-3.3.6
+lrwxrwxrwx  1 hive hadoop        30 Nov  7 13:59 hive-4.0.0 -> 
apache-hive-4.0.1-bin
+lrwxrwxrwx  1 hive hadoop        21 Nov  7 13:59 tez -> apache-tez-0.10.4-bin
+```
+
+edit `hive-env.sh`
+
+```shell
+# Folder containing extra libraries required for hive compilation/execution 
can be controlled by:
+export TEZ_HOME=/opt/tez
+# Set HADOOP_HOME to point to a specific hadoop install directory
+HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop-3.3.6}
+
+export HIVE_HOME=${HIVE_HOME:-/opt/hive-4.0.0}
+```
+
+3.In tez-site.xml. Set the following two confs to use only the libs that come 
with tez. For nativeLib, 
+you can reuse the cluster's existing libs.
+```xml
+      <property>
+        <name>tez.lib.uris</name><!--hdfs path-->
+        <value>/{hdfs-dir}/apache-tez-0.10.4-bin.tar.gz</value>
+    </property>
+    <property>
+        <name>tez.lib.uris.classpath</name> <!--only use tez self lib-->
+       <value>$PWD/tezlib/*,$PWD/tezlib/lib/*</value>
+    </property>
+
+    <property>
+        <name>tez.am.launch.env</name><!--Example, replace with actual value-->
+        <value>LD_LIBRARY_PATH=/usr/hdp/3.1.0.0-78/hadoop/lib/native</value>
+    </property>
+    
+    <property>
+        <name>tez.task.launch.env</name><!--Example, replace with actual 
value-->
+        <value>LD_LIBRARY_PATH=/usr/hdp/3.1.0.0-78/hadoop/lib/native</value>
+    </property>

Review Comment:
   Can this property be removed?
   If this configuration is required, please avoid using an HDP-specific version path.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 950566)
    Time Spent: 2h 20m  (was: 2h 10m)

> Add doc for install hive4 with old version hadoop
> -------------------------------------------------
>
>                 Key: HIVE-28683
>                 URL: https://issues.apache.org/jira/browse/HIVE-28683
>             Project: Hive
>          Issue Type: Task
>            Reporter: yongzhi.shao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently, many users want to upgrade to HIVE4, but they are often held back 
> because the version of their existing HADOOP cluster is considered too low 
> for HIVE4. In fact, HIVE4 can work with lower versions of HADOOP. We should 
> improve the documentation to show these users how to upgrade to HIVE4.
>  
> see [Running Hive4 in low-version Hadoop environments.-Apache Mail 
> Archives|https://lists.apache.org/thread/f6j6jnk9qnbgywj4c24lql0l7x1p1yv4]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
