Dear:

The chat records of the WeChat group "Apache Linkis (incubating) Community 
Development Group" are as follows:

 

—————  2022-10-27  —————

hcl  09:44







Hi everyone! Is dev1.3.0 a containerized version? Where is the Dockerfile 
kept?







Jack  09:45




dist







hcl  09:53




[OK]







Mr. Flash  09:54




[thumbs up][thumbs up][thumbs up]







hcl  09:55




@jacktao2017-China Communications Services-Tao Zhiqiang Mr. Tao, the Hive 
client is pre-installed in the image. How do we keep its configuration files 
(hive-site.xml, hdfs-site.xml) consistent with the Hadoop cluster in the 
production environment? Are they uploaded through the front end or modified on 
the back end?







Jack  09:56




Modify the ConfigMap







hcl  09:58




Is it impossible to use the production Hadoop cluster's configuration 
directly? If changes are needed, which key settings are usually modified?







Jack  10:00




@China Mobile-hcl Yes, you can. Just modify the ConfigMap in k8s







Jack  10:00




These files are mapped into the container via the cm (ConfigMap)







hcl  10:03




Files such as hive-site.xml and mapred-site.xml contain some settings that are 
specific to the cluster's clients

For example:

hive.aux.jars.path points to a directory on the cluster

hdp.version is an environment variable

Replacing the configuration directly with the files from the cluster always 
causes all sorts of problems.







hcl  10:03




@jacktao2017-China Communications Services-Tao Zhiqiang







Jack  10:04




HDP needs some adaptation







Jack  10:05




Starting from compilation







hcl  10:06




Oh!







hcl  10:07




Is hive-engine necessary? Can we just connect directly via JDBC? Does the Hive 
client offer any advantages?







hcl  10:08




Have you considered mounting the Hive client directly? That would make it 
easier to adapt to different Hadoop vendors







Jack  10:10




@China Mobile-hcl The dependency versions of some execution engines may need 
to be changed







hcl  10:18




[thumbs up] True







Mr. Flash 10:22




Currently, this startup mode uses the Hive driver, which depends strongly on 
the version and the jars







Mr. Flash 10:22




If you are using HiveServer2's JDBC mode, you can try the current JDBC engine







hcl  10:23




jdbc-engine works fine in my tests







Mr. Flash 10:23




To access Hive?







hcl  10:23




Are there any scenarios where JDBC falls short and the Hive driver is 
required?







hcl  10:23




Yes.







Mr. Flash 10:23




Is Kerberos enabled?







Mr. Flash 10:24




It's one option, I suppose...

Under high concurrency, Hive JDBC puts a lot of pressure on the metastore







Mr. Flash 10:25




ds itself uses Hive JDBC







hcl  10:25




It could be made into a plugin and loaded dynamically, like taier does







hcl  10:26




Kerberos is enabled; a login is performed when connecting
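
For readers following the JDBC route discussed here, a minimal sketch of the connection URL a HiveServer2 JDBC client would use on a Kerberos-enabled cluster. The host name and realm below are placeholders, not values from this conversation, and `hive_jdbc_url` is a hypothetical helper. The client still needs a valid Kerberos login (for example via kinit or a keytab) before connecting, which matches the login step hcl describes.

```python
from typing import Optional


def hive_jdbc_url(host: str, port: int = 10000, database: str = "default",
                  principal: Optional[str] = None) -> str:
    """Build a HiveServer2 JDBC URL.

    `principal` is the HiveServer2 service principal; it is only
    appended when the cluster has Kerberos enabled.
    """
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if principal:
        # Kerberos-secured HiveServer2 expects the server principal in the URL.
        url += f";principal={principal}"
    return url


# Placeholder host and realm, for illustration only.
print(hive_jdbc_url("hs2.example.com", principal="hive/_HOST@EXAMPLE.COM"))
```

The resulting string would go into whatever JDBC connection setting the client uses; the exact configuration key for the Linkis jdbc-engine is not stated in this thread.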







Mr. Flash 10:29




I haven't used it that way yet. I assumed it would take a lot of effort; for 
example, the driver is hard to find







Mr. Flash 10:29




I tested it with DBeaver and it works. I'm on CDH 5.16.1 with Kerberos enabled







hcl  10:31




It works well. I find it better than hive-engine. hive-engine often runs into 
problems when running Tez: disabling local mode, large files, and so on







Mr. Flash 10:38




Then again, JDBC has a lot of advantages, such as supporting multiple clusters 
at the same time







hcl  10:39




When adapting to the Hadoop cluster, do I simply overwrite the ConfigMap with 
the configuration files from the cluster, and then change hive.aux.jars.path to 
a directory inside the container?







hcl  10:39




@jacktao2017-China Communications Services Mr. Tao







Jack  10:39




It seems there is a plugin now; mainly you just need to swap the version of 
the hive ecp







hcl  10:39




[thumbs up]







Jack  10:40




@China Mobile-hcl Overwrite it. Also, swap in the client installation package 
when building the Docker image







hcl  10:41




If I overwrite it, does hive.aux.jars.path need to be changed?







Jack  10:42




Yes, you can change it







hcl  10:42




OK. So far this is the only setting I've found that needs to be modified







hcl  10:42




I don't know whether there are any other issues
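
The change discussed above (pointing hive.aux.jars.path at a directory inside the container after copying hive-site.xml from the cluster) could be scripted roughly as follows. This is only a sketch: `set_property` is a hypothetical helper, and the container path `/opt/hive/auxlib` is an assumed example, not a path confirmed in this thread.

```python
import xml.etree.ElementTree as ET


def set_property(xml_text: str, name: str, value: str) -> str:
    """Return Hadoop-style config XML with property `name` set to `value`.

    The property is updated if it exists and appended otherwise.
    """
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No matching <property> was found, so add a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


cluster_xml = (
    "<configuration><property>"
    "<name>hive.aux.jars.path</name><value>/path/on/cluster</value>"
    "</property></configuration>"
)
# "/opt/hive/auxlib" is an assumed in-container directory.
print(set_property(cluster_xml, "hive.aux.jars.path", "file:///opt/hive/auxlib"))
```

Note that ElementTree drops XML comments and the XML declaration when parsing, so the output is functionally equivalent but not byte-identical to the cluster's file.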







Mr. Flash 10:43




@jacktao2017-China Communications Services Can a ConfigMap really handle the 
Hadoop, Hive, and Spark configuration? [sweat]







Mr. Flash 10:43




There are so many configuration files







Mr. Flash 10:43




Dozens would be an underestimate. Possibly hundreds







Jack  10:45




The cm is just the configuration files







Jack  10:45




It is the configuration center in k8s







hcl  10:46




Copy the files over, then create cm --from-file







hcl  10:46




Is that right?







Mr. Flash 10:46




@jacktao2017-China Communications Services I really don't know how to do 
that, haha







Jack  10:47




@China Mobile-hcl Take a look at the cm file in helm







Jack  10:47




The format has to be the same
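
As an illustration of the "create a cm from the cluster's files" idea in this exchange, here is a minimal sketch that builds a ConfigMap manifest from a directory of XML config files. The names `build_configmap`, `hadoop-conf`, and `./cluster-conf` are assumptions for the example; the ConfigMap name and keys that the Linkis helm chart actually expects should be taken from the cm file in helm, as Jack suggests.

```python
import json
from pathlib import Path


def build_configmap(name: str, conf_dir: str, namespace: str = "default") -> dict:
    """Build a ConfigMap manifest with one key per *.xml file in conf_dir."""
    data = {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(conf_dir).glob("*.xml"))
    }
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": data,
    }


def to_json(manifest: dict) -> str:
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

Calling `to_json(build_configmap("hadoop-conf", "./cluster-conf"))` and saving the result to a file gives something that can be applied with `kubectl apply -f`. It is roughly what `kubectl create configmap hadoop-conf --from-file=./cluster-conf` produces, with the file names used as keys.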







hcl  10:48







Jack  10:48




@utopianet_Guangyin Credit Card_Zhang Huajin You can also mount static files 
yourself







hcl  10:50




@jacktao2017-China Communications Services Have you used the Linkis 
hive-engine in a production environment? Did it go smoothly after the 
configuration changes? We haven't gotten Tez fully working yet.







Jack  10:51




@utopianet_Guangyin Credit Card_Zhang Huajin







Jack  10:51




@China Mobile-hcl It worked fine when I tested with ldh







Jack  10:52




I'm not very familiar with Tez. Does it swap in a different parsing engine, 
similar to hive spark?







hcl  10:53




Yes, newer versions all default to Tez







hcl  10:54




Large files cause problems







Mr. Flash 10:55




I've never used it at all...







Jack  10:57




@China Mobile-hcl HDP integrated it early. I don't know which engine the 
Apache version defaults to; it used to be MR







hcl  10:58




Yes.







hcl  10:58




It used to be MR. With MR, hive-engine currently works fine, with no problems







Jack  11:00




Hive on MR: Hive 3.x no longer supports Hive on MR as the underlying execution 
engine, and CDP no longer supports Hive on Spark either. Only Hive on Tez is 
supported (Hive on Tez provides better ETL performance).







hcl  11:20




Yes, the new version uses Tez


--

Best Regards
------
康悦 ritakang 
GitHub:Ritakang0451
E-mail:rita0...@163.com
