Dear all,

The chat records of the WeChat group "Apache Linkis(incubating) 社区开发群" (Apache Linkis (incubating) Community Development Group) are as follows:
————— 2022-10-27 —————
hcl 09:44 Hi all! Is dev1.3.0 a containerized version? Where is the dockerfile kept?
Jack 09:45 dist
hcl 09:53 [OK]
Mr. Flash 09:54 [ThumbsUp][ThumbsUp][ThumbsUp]
hcl 09:55 @jacktao2017-中通服-陶志强 Mr. Tao, the hive client is pre-installed in the image. How do its configuration files (hive-site.xml, hdfs-site.xml) get unified with the Hadoop cluster in the production environment? Are they uploaded through the front end or modified on the back end?
Jack 09:56 Modify the configmap.
hcl 09:58 Does that mean the production Hadoop cluster's configuration can't be used directly? If it has to be modified, which key settings usually need changing?
Jack 10:00 @中国移动-hcl It can. Just modify the configmap in k8s.
Jack 10:00 These files are mapped into the container via the cm.
hcl 10:03 Files such as hive-site and mapred-site.xml contain some settings tied to the cluster client. For example, hive.aux.jars.path points to a directory on the cluster, and hdp.version is an environment variable. Replacing them directly with the cluster's configuration always causes all kinds of problems.
hcl 10:03 @jacktao2017-中通服-陶志强
Jack 10:04 hdp needs to be adapted.
Jack 10:05 Starting from compilation.
hcl 10:06 Oh!
hcl 10:07 Is hive-engine necessary? Would connecting directly over jdbc work? Does the hive client have any advantages?
hcl 10:08 Have you considered mounting the hive client in from outside the image? That would make adapting to Hadoop vendors easier.
Jack 10:10 @中国移动-hcl The dependency versions of some execution engines may need to be changed.
hcl 10:18 [ThumbsUp] True.
Mr. Flash 10:22 The current startup mode uses the hive driver, which depends strongly on the version and the jars.
Mr. Flash 10:22 If you are using the jdbc mode of hive server2, you can try the current jdbc engine.
hcl 10:23 jdbc-engine works fine in my tests.
Mr. Flash 10:23 For accessing hive?
hcl 10:23 Are there any scenarios that jdbc cannot cover and that require the hive driver?
hcl 10:23 Yes.
Mr. Flash 10:23 Is kerberos enabled?
Mr. Flash 10:24 It's one option... Under high concurrency with hive jdbc, the metastore comes under heavy pressure.
Mr. Flash 10:25 ds uses hive jdbc.
hcl 10:25 It could be made into a plugin and loaded dynamically, like taier.
hcl 10:26 kerberos is enabled; a login is performed when connecting.
Mr. Flash 10:29 I haven't used it that way yet. I originally thought it would take a lot of effort; the driver, for instance, is hard to find.
Mr. Flash 10:29 It worked when I tested with dbeaver. I'm on cdh 5.16.1 with kerberos enabled.
hcl 10:31 It works well. I find it easier to use than hive-engine; hive-engine often runs into all sorts of problems when running tez: turning off local, large files, and so on.
Mr. Flash 10:38 jdbc does have a lot of benefits, though, such as supporting multiple clusters at the same time.
hcl 10:39 When adapting to a Hadoop cluster, do I directly overwrite the configmap with the cluster's configuration files, and then change hive.aux.jars.path to a directory inside the container?
hcl 10:39 @jacktao2017-中通服 Mr. Tao
Jack 10:39 It seems there is a plugin now. Mainly change the version of hive ecp.
hcl 10:39 [ThumbsUp]
Jack 10:40 @中国移动-hcl Overwrite. Also swap the client installation package when building the docker image.
hcl 10:41 If I overwrite, do I need to change hive.aux.jars.path?
Jack 10:42 You can change it.
hcl 10:42 OK. So far this is the only setting I've found that needs to be modified.
hcl 10:42 I don't know whether there are other problems.
Mr. Flash 10:43 @jacktao2017-中通服 Can configmap really handle the hadoop, hive and spark configuration? [Sweat]
Mr. Flash 10:43 There are so many configuration files.
Mr. Flash 10:43 Dozens of items would be an understatement. Possibly hundreds.
Jack 10:45 The cm is the configuration file.
Jack 10:45 It's the configuration center in k8s.
hcl 10:46 Copy them over: create cm --from file
hcl 10:46 Is that right?
Mr. Flash 10:46 @jacktao2017-中通服 I really don't know how to do that, haha.
Jack 10:47 @中国移动-hcl Take a look at the cm files in helm.
Jack 10:47 The format has to be the same.
hcl 10:48
Jack 10:48 @utopianet_广银信用卡_张华金 You can also mount static files yourself.
hcl 10:50 @jacktao2017-中通服 Is anyone using the linkis hive-engine in production? Did it go smoothly after the configuration changes? We haven't gotten tez fully working yet.
Jack 10:51 @utopianet_广银信用卡_张华金
Jack 10:51 @中国移动-hcl It works in my tests with ldh.
Jack 10:52 I'm not very familiar with tez. Is it a swapped-in parsing engine, similar to hive spark?
hcl 10:53 Yes, new versions all default to tez.
hcl 10:54 Large files cause problems.
Mr. Flash 10:55 I've never used it at all...
Jack 10:57 @中国移动-hcl hdp integrated it early. I don't know which engine the apache version defaults to; it used to be mr.
hcl 10:58 Yes.
hcl 10:58 It used to be mr. With mr, hive-engine currently works fine, with no problems.
Jack 11:00 Hive on mr: the underlying execution engine of HIVE3.X no longer supports hive on mr, and CDP no longer supports hive on spark; only hive on tez is supported (Hive on Tez provides better ETL performance).
hcl 11:20 Yes, the new version uses tez.

--
Best Regards
------
康悦 ritakang
GitHub: Ritakang0451
E-mail: rita0...@163.com