This is an automated email from the ASF dual-hosted git repository.
casion pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/incubator-linkis-website.git
The following commit(s) were added to refs/heads/dev by this push:
new 23ba47cf83 Deploy Apache Linkis1.1.1 and DSS1.1.0 based on CDH6.3.2
(#526)
23ba47cf83 is described below
commit 23ba47cf83040e9a92a2e09c3173dc036c6b99a9
Author: kongslove <[email protected]>
AuthorDate: Tue Sep 27 19:39:15 2022 +0800
Deploy Apache Linkis1.1.1 and DSS1.1.0 based on CDH6.3.2 (#526)
---
.../2022-09-27-linkis111-deploy/img/config-err.png | Bin 0 -> 218989 bytes
.../img/jdbc-engine-analyze.png | Bin 0 -> 61055 bytes
.../img/pugin-error.png | Bin 0 -> 218989 bytes
.../img/pyspark-error.png | Bin 0 -> 97383 bytes
.../img/spark-hive-verion-error.png | Bin 0 -> 99411 bytes
blog/2022-09-27-linkis111-deploy/index.md | 205 +++++++++++++++++++++
blog/authors.yml | 8 +-
.../2022-09-27-linkis111-deploy/img/config-err.png | Bin 0 -> 218989 bytes
.../img/jdbc-engine-analyze.png | Bin 0 -> 61055 bytes
.../img/pugin-error.png | Bin 0 -> 218989 bytes
.../img/pyspark-error.png | Bin 0 -> 97383 bytes
.../img/spark-hive-verion-error.png | Bin 0 -> 99411 bytes
.../2022-09-27-linkis111-deploy/index.md | 162 ++++++++++++++++
.../docusaurus-plugin-content-blog/authors.yml | 8 +-
14 files changed, 381 insertions(+), 2 deletions(-)
diff --git a/blog/2022-09-27-linkis111-deploy/img/config-err.png
b/blog/2022-09-27-linkis111-deploy/img/config-err.png
new file mode 100644
index 0000000000..58fd54f11b
Binary files /dev/null and
b/blog/2022-09-27-linkis111-deploy/img/config-err.png differ
diff --git a/blog/2022-09-27-linkis111-deploy/img/jdbc-engine-analyze.png
b/blog/2022-09-27-linkis111-deploy/img/jdbc-engine-analyze.png
new file mode 100644
index 0000000000..dcf5bf302f
Binary files /dev/null and
b/blog/2022-09-27-linkis111-deploy/img/jdbc-engine-analyze.png differ
diff --git a/blog/2022-09-27-linkis111-deploy/img/pugin-error.png
b/blog/2022-09-27-linkis111-deploy/img/pugin-error.png
new file mode 100644
index 0000000000..58fd54f11b
Binary files /dev/null and
b/blog/2022-09-27-linkis111-deploy/img/pugin-error.png differ
diff --git a/blog/2022-09-27-linkis111-deploy/img/pyspark-error.png
b/blog/2022-09-27-linkis111-deploy/img/pyspark-error.png
new file mode 100644
index 0000000000..7f845a4983
Binary files /dev/null and
b/blog/2022-09-27-linkis111-deploy/img/pyspark-error.png differ
diff --git a/blog/2022-09-27-linkis111-deploy/img/spark-hive-verion-error.png
b/blog/2022-09-27-linkis111-deploy/img/spark-hive-verion-error.png
new file mode 100644
index 0000000000..f34b8d2da0
Binary files /dev/null and
b/blog/2022-09-27-linkis111-deploy/img/spark-hive-verion-error.png differ
diff --git a/blog/2022-09-27-linkis111-deploy/index.md
b/blog/2022-09-27-linkis111-deploy/index.md
new file mode 100644
index 0000000000..7b7bc68228
--- /dev/null
+++ b/blog/2022-09-27-linkis111-deploy/index.md
@@ -0,0 +1,205 @@
+---
+title: Deploy Apache Linkis1.1.1 and DSS1.1.0 based on CDH6.3.2
+authors: [kevinWdong]
+tags: [blog,linkis1.1.1,hadoop3.0.0-cdh6.3.2,spark2.4.8,hive2.1.1]
+---
+With the development of our business and the iteration of community products, we found that Linkis 1.x delivers a great performance improvement in resource management and engine management and can better meet the needs of building a data middle platform. Compared with version 0.9.3 and the platform we used before, the user experience has also improved greatly, and problems such as not being able to view details on the task failure page have been fixed. We therefore decided to upgrade Linkis and the WDS suite. The following is the concrete practice, which we hope can serve as a reference.
+
+# 1. Environment
+
+## CDH6.3.2 Component versions
+
+* hadoop:3.0.0-cdh6.3.2
+* hive:2.1.1-cdh6.3.2
+* spark:2.4.8
+
+## Hardware environment
+
+2 × 128G cloud physical machines
+
+# 2. Linkis installation and deployment
+
+## 2.1 Compile code or release installation package?
+
+This deployment uses the release installation package. To adapt to the company's CDH6.3.2 version, the hadoop and hive dependency packages need to be replaced with the CDH6.3.2 versions; here the jars in the installation package were replaced directly. The dependency packages and modules that need to be replaced are listed below.
+
+```plain
+// Modules involved
+
+linkis-engineconn-plugins/spark
+linkis-engineconn-plugins/hive
+/linkis-commons/public-module
+/linkis-computation-governance/
+```
+
+```plain
+// List of cdh packages that need to be replaced
+
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-0.23-2.1.1-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-scheduler-2.1.1-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-auth-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-jobclient-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-server-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-shuffle-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-auth-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-commons/public-module/hadoop-annotations-3.0.0-cdh6.3.2.jar
+./lib/linkis-commons/public-module/hadoop-auth-3.0.0-cdh6.3.2.jar
+./lib/linkis-commons/public-module/hadoop-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-commons/public-module/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-annotations-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-auth-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
+
+```
+
+
+## 2.2 Problems encountered during deployment
+
+### 2.2.1 Kerberos configuration
+
+The following needs to be added to the public configuration linkis.properties.
+
+It also needs to be added to each engine's conf.
+
+```plain
+wds.linkis.keytab.enable=true
+wds.linkis.keytab.file=/hadoop/bigdata/kerberos/keytab
+wds.linkis.keytab.host.enabled=false
+wds.linkis.keytab.host=your_host
+```
+
+### 2.2.2 Error is reported after Hadoop dependency package is replaced
+
+java.lang.NoClassDefFoundError:org/apache/commons/configuration2/Configuration
+
+
+
+Cause: a Configuration class conflict. Adding commons-configuration2-2.1.1.jar under the linkis-commons module resolves the conflict.
+
+### 2.2.3 Running spark, python, etc. in scripts reports no plugin for XXX
+
+Symptom: after modifying the Spark/Python version in the configuration file, starting the engine reports no plugin for XXX.
+
+
+
+Cause: the engine versions are hard-coded in LabelCommonConfig.java and GovernanceCommonConf.scala. Modify them to the corresponding versions, recompile, and then replace all jars containing these two classes (linkis-computation-governance-common-1.1.1.jar and linkis-label-common-1.1.1.jar) in Linkis and the other components (including Schedulis).
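+
+For reference, the hard-coded defaults look roughly like the following. This is a simplified, illustrative sketch only; the exact field names and default values in LabelCommonConfig.java / GovernanceCommonConf.scala may differ in your Linkis 1.1.1 source, so check before editing.
+
+```java
+// Illustrative sketch of the hard-coded engine versions (not the literal Linkis source).
+import org.apache.linkis.common.conf.CommonVars;
+
+public class LabelCommonConfig {
+    // Change these defaults to match the cluster (e.g. spark 2.4.8, hive 2.1.1-cdh6.3.2),
+    // then recompile and replace every jar that bundles this class.
+    public static final CommonVars<String> SPARK_ENGINE_VERSION =
+        CommonVars.apply("wds.linkis.spark.engine.version", "2.4.3");
+    public static final CommonVars<String> HIVE_ENGINE_VERSION =
+        CommonVars.apply("wds.linkis.hive.engine.version", "1.2.1");
+    public static final CommonVars<String> PYTHON_ENGINE_VERSION =
+        CommonVars.apply("wds.linkis.python.engine.version", "python2");
+}
+```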
+
+### 2.2.4 Python engine execution error, initialization failed
+
+* Modify python.py and remove the pandas import
+* Configure the python load directory and modify the python engine's linkis-engineconn.properties
+
+```plain
+pythonVersion=/usr/local/bin/python3.6
+```
+
+### 2.2.5 Running the pyspark task fails with an error
+
+
+
+Cause: PYSPARK_VERSION is not set.
+
+Solution:
+
+Set the following two parameters in /etc/profile:
+```
+export PYSPARK_PYTHON=/usr/local/bin/python3.6
+export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python3.6
+```
+### 2.2.6 Error occurs when executing the pyspark task
+
+java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
+
+
+
+
+
+Cause: Spark 2.4.8 is built with the hive 1.2.1 jars, but our hive has been upgraded to version 2.1.1, and this parameter has been removed in hive 2. The spark-sql code still references the hive parameter, which causes the error.
+
+Therefore, the HIVE_STATS_JDBC_TIMEOUT parameter was removed from the spark-sql/hive code; after recompiling and repackaging, the resulting jar replaces spark-hive_2.11-2.4.8.jar in Spark 2.4.8.
+
+### 2.2.7 Proxy user exception during jdbc engine execution
+
+Symptom: user A runs jdbc task 1 and the engine is reused; then user B runs jdbc task 2, and the submitter of task 2 turns out to be A.
+
+Cause analysis:
+
+ConnectionManager::getConnection
+
+
+
+When a datasource is created, whether to create a new one is decided by a key, and this key is the jdbc url. That granularity may be too coarse, because different users may access the same datasource, for example hive: their urls are the same, but their usernames and passwords are different. So when the first user creates the datasource, the username is already fixed; when the second user comes in and finds that the datasource already exists, it is used directly instead of creating a new one. As a result, the code submitted by user B is executed as user A.
+
+Solution: reduce the key granularity of the datasource cache map by changing it to jdbc.url + jdbc.user.
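+
+A minimal sketch of the idea is shown below. The class and method names here are illustrative only, not the actual Linkis JDBC engine code: the point is that the cache key is built from both the url and the user, so two users with the same url no longer share a datasource created with the first user's credentials.
+
+```java
+// Illustrative sketch: cache JDBC datasources per (url, user) instead of per url.
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import javax.sql.DataSource;
+
+public class DataSourceCache {
+    private final Map<String, DataSource> cache = new ConcurrentHashMap<>();
+
+    public DataSource getOrCreate(String jdbcUrl, String user, String password) {
+        // Old behaviour: key = jdbcUrl, so the first user's credentials were reused by everyone.
+        // New behaviour: key = jdbcUrl + user, so each user gets a datasource with its own credentials.
+        String key = jdbcUrl + "#" + user;
+        return cache.computeIfAbsent(key, k -> createDataSource(jdbcUrl, user, password));
+    }
+
+    private DataSource createDataSource(String jdbcUrl, String user, String password) {
+        // Placeholder: build the real connection pool (e.g. DBCP/Druid) with these credentials.
+        throw new UnsupportedOperationException("configure the connection pool here");
+    }
+}
+```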
+
+# 3. DSS deployment
+
+The installation process follows the official website documentation. The following describes some issues encountered during installation and debugging.
+
+## 3.1 The database list displayed on the left side of DSS is incomplete
+
+Analysis: the database information displayed in the DSS datasource module comes from the hive metastore database. However, because permissions in CDH6 are controlled through Sentry, most of the hive table metadata does not exist in the hive metastore, so the displayed data is incomplete.
+
+Solution:
+
+The original logic was changed to connect to hive via jdbc and obtain the table information from jdbc.
+
+A brief description of the logic:
+
+The jdbc properties are obtained from the IDE jdbc configuration set on the Linkis console.
+
+DBS: get the schemas through connection.getMetaData()
+
+TBS: connection.getMetaData().getTables() gets the tables under the corresponding db
+
+COLUMNS: get the column information of the table by executing describe table
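+
+A rough sketch of this lookup with plain JDBC is shown below; the connection url, credentials and table name are placeholders, and in DSS they would come from the IDE jdbc configuration mentioned above.
+
+```java
+// Illustrative sketch: list databases, tables and columns over a hive JDBC connection.
+import java.sql.Connection;
+import java.sql.DatabaseMetaData;
+import java.sql.DriverManager;
+import java.sql.ResultSet;
+import java.sql.Statement;
+
+public class HiveMetaBrowser {
+    public static void main(String[] args) throws Exception {
+        // Placeholder connection info; replace with the values from the Linkis console IDE jdbc settings.
+        try (Connection conn = DriverManager.getConnection(
+                "jdbc:hive2://your-hiveserver2:10000/default", "user", "password")) {
+            DatabaseMetaData meta = conn.getMetaData();
+
+            // DBS: databases (schemas)
+            try (ResultSet schemas = meta.getSchemas()) {
+                while (schemas.next()) {
+                    System.out.println("db: " + schemas.getString("TABLE_SCHEM"));
+                }
+            }
+
+            // TBS: tables under a given db
+            try (ResultSet tables = meta.getTables(null, "default", "%", new String[]{"TABLE"})) {
+                while (tables.next()) {
+                    System.out.println("table: " + tables.getString("TABLE_NAME"));
+                }
+            }
+
+            // COLUMNS: column info via describe table
+            try (Statement st = conn.createStatement();
+                 ResultSet cols = st.executeQuery("describe your_table")) {
+                while (cols.next()) {
+                    System.out.println("column: " + cols.getString(1) + " " + cols.getString(2));
+                }
+            }
+        }
+    }
+}
+```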
+
+## 3.2 Executing a jdbc script in a DSS workflow reports the error jdbc.name is empty
+
+Analysis: the default creator in the dss workflow is Schedulis. Because the Schedulis engine parameters are not configured in the management console, the parameters read are all empty.
+
+Adding a Schedulis category in the console reports the error "The Schedulis directory already exists". Because the creator in the scheduling system is schedulis, the Schedulis category cannot be added. To better identify each system, the default creator of the dss workflow was changed to nodeexecution; this can be done by adding wds.linkis.flow.job.creator.v1=nodeexecution to dss-flow-execution-server.properties.
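+
+For example, the single line added to dss-flow-execution-server.properties:
+
+```plain
+wds.linkis.flow.job.creator.v1=nodeexecution
+```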
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/blog/authors.yml b/blog/authors.yml
index 9f576fdf85..a8402f0bd0 100644
--- a/blog/authors.yml
+++ b/blog/authors.yml
@@ -26,4 +26,10 @@ ruY9527:
name: ruY9527
title: contributors
url: https://github.com/ruY9527
- image_url: https://avatars.githubusercontent.com/u/43773582?v=4
\ No newline at end of file
+ image_url: https://avatars.githubusercontent.com/u/43773582?v=4
+
+kevinWdong:
+ name: kevinWdong
+ title: contributors
+ url: https://github.com/kongslove
+ image_url: https://avatars.githubusercontent.com/u/42604208?v=4
\ No newline at end of file
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/config-err.png
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/config-err.png
new file mode 100644
index 0000000000..58fd54f11b
Binary files /dev/null and
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/config-err.png
differ
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/jdbc-engine-analyze.png
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/jdbc-engine-analyze.png
new file mode 100644
index 0000000000..dcf5bf302f
Binary files /dev/null and
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/jdbc-engine-analyze.png
differ
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/pugin-error.png
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/pugin-error.png
new file mode 100644
index 0000000000..58fd54f11b
Binary files /dev/null and
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/pugin-error.png
differ
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/pyspark-error.png
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/pyspark-error.png
new file mode 100644
index 0000000000..7f845a4983
Binary files /dev/null and
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/pyspark-error.png
differ
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/spark-hive-verion-error.png
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/spark-hive-verion-error.png
new file mode 100644
index 0000000000..f34b8d2da0
Binary files /dev/null and
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/img/spark-hive-verion-error.png
differ
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/index.md
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/index.md
new file mode 100644
index 0000000000..3b2a76b8c0
--- /dev/null
+++
b/i18n/zh-CN/docusaurus-plugin-content-blog/2022-09-27-linkis111-deploy/index.md
@@ -0,0 +1,162 @@
+---
+title: Deploy Apache Linkis1.1.1 and DSS1.1.0 based on CDH6.3.2
+authors: [kevinWdong]
+tags: [blog,linkis1.1.1,hadoop3.0.0-cdh6.3.2,spark2.4.8,hive2.1.1]
+---
+### Preface
+
+With the development of our business and the iteration of community products, we found that Linkis 1.x delivers a great performance improvement in resource management and engine management and can better meet the needs of building a data middle platform. Compared with version 0.9.3 and the platform we used before, the user experience has also improved greatly, and problems such as not being able to view details on the task failure page have been fixed, so we decided to upgrade Linkis and the WDS suite. The following is the concrete practice, which we hope can serve as a reference.
+
+## 1. Environment
+
+#### CDH6.3.2 component versions
+
+- hadoop:3.0.0-cdh6.3.2
+- hive:2.1.1-cdh6.3.2
+- spark:2.4.8
+
+#### Hardware environment
+
+2 × 128G cloud physical machines
+
+## 2. Linkis installation and deployment
+
+### 2.1 Compile the code or use the release installation package?
+
+This deployment uses the **release installation package**. To adapt to the company's CDH6.3.2 version, the hadoop and hive dependency packages need to be replaced with the CDH6.3.2 versions; here the jars in the installation package were replaced directly. The dependency packages and modules that need to be replaced are listed below.
+
+```
+-- Modules involved
+linkis-engineconn-plugins/spark
+linkis-engineconn-plugins/hive
+/linkis-commons/public-module
+/linkis-computation-governance/
+```
+
+
+
+```
+----- List of cdh packages that need to be replaced
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-0.23-2.1.1-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hive-shims-scheduler-2.1.1-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-auth-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/spark/dist/v2.4.8/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-jobclient-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-server-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-mapreduce-client-shuffle-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/hive/dist/v2.1.1/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-annotations-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-auth-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-mapreduce-client-core-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-commons/public-module/hadoop-annotations-3.0.0-cdh6.3.2.jar
+./lib/linkis-commons/public-module/hadoop-auth-3.0.0-cdh6.3.2.jar
+./lib/linkis-commons/public-module/hadoop-common-3.0.0-cdh6.3.2.jar
+./lib/linkis-commons/public-module/hadoop-hdfs-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-annotations-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-auth-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-api-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-client-3.0.0-cdh6.3.2.jar
+./lib/linkis-computation-governance/linkis-cg-linkismanager/hadoop-yarn-common-3.0.0-cdh6.3.2.jar
+
+```
+
+### 2.2 Problems encountered during deployment
+
+1. Kerberos configuration
+The following needs to be added to the public configuration linkis.properties.
+It also needs to be added to each engine's conf.
+
+```
+wds.linkis.keytab.enable=true
+wds.linkis.keytab.file=/hadoop/bigdata/kerberos/keytab
+wds.linkis.keytab.host.enabled=false
+wds.linkis.keytab.host=your_host
+```
+
+2. After replacing the Hadoop dependency packages, startup fails with java.lang.NoClassDefFoundError: org/apache/commons/configuration2/Configuration
+
+
+
+Cause: a Configuration class conflict; adding commons-configuration2-2.1.1.jar under the linkis-commons module resolves the conflict.
+
+3. Running spark, python, etc. in scripts reports no plugin for XXX
+Symptom: after changing the spark/python version in the configuration file, starting the engine reports no plugin for XXX.
+
+Cause: the engine versions are hard-coded in LabelCommonConfig.java and GovernanceCommonConf.scala. Modify the corresponding versions, recompile, and then replace all jars containing these two classes (linkis-computation-governance-common-1.1.1.jar and linkis-label-common-1.1.1.jar) in Linkis and the other components (including Schedulis).
+
+4. The python engine fails to execute, initialization failed
+
+- Modify python.py and remove the pandas import
+
+- Configure the python load directory and modify the python engine's linkis-engineconn.properties
+
+ ```
+ pythonVersion=/usr/local/bin/python3.6
+ ```
+
+
+
+5. Running the pyspark task fails with an error
+
+Cause: PYSPARK_VERSION is not set
+Solution:
+Set the following two parameters in /etc/profile
+
+```
+export PYSPARK_PYTHON=/usr/local/bin/python3.6
+
+export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python3.6
+```
+
+
+
+6. Executing the pyspark task reports an error
+java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
+
+Cause: spark 2.4.8 uses the hive 1.2.1 jars, but our hive has been upgraded to version 2.1.1, and this parameter has been removed in hive 2, while the spark-sql code still calls this hive parameter, which causes the error.
+Therefore, the HIVE_STATS_JDBC_TIMEOUT parameter was removed from the spark-sql/hive code; after recompiling and packaging, the resulting jar replaces spark-hive_2.11-2.4.8.jar in spark 2.4.8.
+
+
+
+7. Proxy user exception during jdbc engine execution
+
+Symptom: user A runs jdbc task 1 and the engine is reused; then user B runs jdbc task 2, and the submitter of task 2 turns out to be A.
+Cause analysis:
+ConnectionManager::getConnection
+
+When a datasource is created here, whether to create a new one is decided by a key, and this key is the jdbc url. That granularity may be too coarse, because different users may access the same datasource, for example hive: their urls are the same, but their usernames and passwords are different. So when the first user creates the datasource, the username is already fixed; when the second user comes in and finds that the datasource already exists, it is used directly instead of creating a new one, which results in the code submitted by user B being executed as user A.
+Solution: reduce the key granularity of the datasource cache map by changing it to jdbc.url + jdbc.user.
+
+
+
+## 3. DSS deployment
+
+The installation process follows the official website documentation. The following describes some issues encountered during installation and debugging.
+
+#### 3.1 The database list displayed on the left side of DSS is incomplete
+
+Analysis: the database information displayed in the DSS datasource module comes from the hive metastore database. However, because permissions in CDH6 are controlled through Sentry, most of the hive table metadata does not exist in the hive metastore, so the displayed data is incomplete.
+Solution:
+The original logic was changed to connect to hive via jdbc and obtain the table information from jdbc.
+A brief description of the logic:
+The jdbc properties are obtained from the IDE jdbc configuration set on the Linkis console.
+DBS: get the schemas through connection.getMetaData()
+TBS: connection.getMetaData().getTables() gets the tables under the corresponding db
+COLUMNS: get the column information of the table by executing describe table
+
+#### 3.2 Executing a jdbc script in a DSS workflow reports the error jdbc.name is empty
+
+Analysis: the default creator in the dss workflow is Schedulis. Because the Schedulis engine parameters are not configured in the management console, the parameters read are all empty.
+Adding a Schedulis category in the console reports the error "The Schedulis directory already exists". Because the creator in the scheduling system is schedulis, the Schedulis category cannot be added. To better identify each system, the default creator of the dss workflow was changed to nodeexecution, which can be done by adding the line wds.linkis.flow.job.creator.v1=nodeexecution to dss-flow-execution-server.properties.
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-blog/authors.yml
b/i18n/zh-CN/docusaurus-plugin-content-blog/authors.yml
index 4c05b67ed1..1e061d9a0d 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-blog/authors.yml
+++ b/i18n/zh-CN/docusaurus-plugin-content-blog/authors.yml
@@ -26,4 +26,10 @@ ruY9527:
name: ruY9527
title: contributors
url: https://github.com/ruY9527
- image_url: https://avatars.githubusercontent.com/u/43773582?v=4
\ No newline at end of file
+ image_url: https://avatars.githubusercontent.com/u/43773582?v=4
+
+kevinWdong:
+ name: kevinWdong
+ title: contributors
+ url: https://github.com/kongslove
+ image_url: https://avatars.githubusercontent.com/u/42604208?v=4
\ No newline at end of file
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]