This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new e1c80b177bc4 [Enhancement](trino-connector) add a document of how to access a new Trino Connector plugin (#491)
e1c80b177bc4 is described below

commit e1c80b177bc4e1ff1a2a71c30cf8fdfcf02abdac
Author: Tiewei Fang <[email protected]>
AuthorDate: Mon Apr 1 11:07:09 2024 +0800

    [Enhancement](trino-connector) add a document of how to access a new Trino Connector plugin (#491)
---
 .../trino-connector-developer-guide.md             | 164 +++++++++++++++++++++
 .../trino-connector-developer-guide.md             | 163 ++++++++++++++++++++
 sidebarsCommunity.json                             |   3 +-
 3 files changed, 329 insertions(+), 1 deletion(-)

diff --git a/community/how-to-contribute/trino-connector-developer-guide.md b/community/how-to-contribute/trino-connector-developer-guide.md
new file mode 100644
index 000000000000..f825b60e6779
--- /dev/null
+++ b/community/how-to-contribute/trino-connector-developer-guide.md
@@ -0,0 +1,164 @@
+---
+{
+    "title": "How to access a new Trino Connector plugin",
+    "language": "en"
+}
+
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# How to access a new Trino Connector plugin
+
+## Background
+
+Starting from version 4.0, Doris supports connecting to Trino Connector plugins. By combining the rich ecosystem of Trino Connector plugins with Doris's Trino-Connector Catalog feature, Doris can query many more data sources.
+
+The purpose of the Trino Connector compatibility framework is to help Doris quickly connect to more data sources to meet user needs.
+
+For data sources such as Hive, Iceberg, Hudi, Paimon, and JDBC, we still recommend using Doris's built-in catalogs, which provide better performance, stability, and compatibility.
+
+This article describes how to adapt a Trino Connector plugin to Doris.
+
+The following uses Trino's Kafka Connector plugin as an example to show, step by step, how to adapt it to Doris and then access the Kafka data source through Doris's `Trino-Connector` catalog feature.
+
+> Note: Trino is open-source software licensed under the Apache License 2.0 and provided by the [Trino Software Foundation](https://trino.io/foundation). For details, please visit the [Trino official website](https://trino.io/docs/current/).
+
+## Step 1: Compile the Kafka Connector plugin
+
+Trino does not provide officially compiled connector plugins, so we need to compile the required connector plugins ourselves.
+
+> Note: Since Doris currently uses version 435 of the `trino-main` package, it is best to compile version 435 of the connector plugin. Other versions of the connector plugin may have compatibility issues. If you encounter any problems, please report them to the Apache Doris community.
+
+1. Clone Trino source code
+`$ git clone https://github.com/trinodb/trino.git`
+2. Switch Trino source code to version 435
+`$ git checkout 435`
+3. Enter the Kafka plugin source code directory
+`$ cd trino/plugin/trino-kafka`
+4. Compile the Kafka plugin
+`$ mvn clean install -DskipTests`
+5. After compilation completes, the `target/trino-kafka-435` directory is generated under the `trino/plugin/trino-kafka/` directory.
+
+> Note: Each connector plugin is a subdirectory, not a JAR package.
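Since each plugin must be a directory rather than a single jar, a quick sanity check of the layout can be scripted. The sketch below builds an illustrative layout in a temporary directory (the names are examples from this guide, not defaults):

```shell
#!/usr/bin/env bash
# Build an illustrative plugin layout and verify the shape Doris expects:
# one subdirectory per plugin, each holding the plugin's jar files.
dir=$(mktemp -d)
mkdir -p "$dir/trino-kafka-435"
touch "$dir/trino-kafka-435/trino-kafka-435.jar"

# every entry directly under the plugin directory should itself be a directory
for p in "$dir"/*; do
  if [ -d "$p" ]; then
    echo "ok: $(basename "$p") is a plugin directory"
  else
    echo "warning: $(basename "$p") is not a directory" >&2
  fi
done
```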
+
+## Step 2: Set up Doris's fe.conf / be.conf
+
+After preparing the Kafka Connector plugin, you need to configure Doris's fe.conf and be.conf so that Doris can load the plugin.
+
+Suppose we store the `trino-kafka-435` directory prepared above under the `/path/to/connectors` directory. We then configure:
+
+1. fe.conf
+
+    Configure `trino_connector_plugin_dir=/path/to/connectors` in the fe.conf file. (If the `trino_connector_plugin_dir` property is not configured in fe.conf, the `${Doris_HOME}/fe/connectors` directory is used by default.)
+
+2. be.conf
+
+    Configure `trino_connector_plugin_dir=/path/to/connectors` in the be.conf file. (If the `trino_connector_plugin_dir` property is not configured in be.conf, the `${Doris_HOME}/be/connectors` directory is used by default.)
+
+> Note: Doris loads Trino Connector plugins lazily. This means that the first time you use the Trino-Connector Catalog feature in Doris, there is no need to restart the FE / BE nodes; Doris loads the plugin automatically. However, a plugin is loaded only once, so if the plugins under the `/path/to/connectors/` directory change, you must restart the FE / BE nodes before the changed plugins can be loaded.
+
+## Step 3: Using the Trino-Connector catalog feature
+
+After completing the previous two steps, we can use the Trino-Connector Catalog feature in Doris.
+
+1. First, let's create a Trino-Connector Catalog in Doris:
+
+    ```sql
+    create catalog kafka_tpch properties (
+        "type"="trino-connector",
+        -- The following four properties come from Trino and are consistent with those in Trino's etc/catalog/kafka.properties
+        "connector.name"="kafka",
+        "kafka.table-names"="tpch.customer,tpch.orders,tpch.lineitem,tpch.part,tpch.partsupp,tpch.supplier,tpch.nation,tpch.region",
+        "kafka.nodes"="localhost:9092",
+        "kafka.table-description-dir" = "/mnt/datadisk1/fangtiewei"
+    );
+    ```
+
+    Explanation:
+    - `type`: the catalog type; it must be set to `trino-connector`.
+    - `connector.name`, `kafka.table-names`, `kafka.nodes`, `kafka.table-description-dir`: these four properties come from Trino; see [Kafka connector](https://trino.io/docs/current/connector/kafka.html#configuration)
+
+    Different Connector plugins require different properties; refer to the official Trino documentation: [Connectors](https://trino.io/docs/current/connector.html#connector--page-root)
+
+2. Use the catalog
+
+    After creating the Trino-Connector catalog, it is used just like any other catalog. Switch to it with the `switch kafka_tpch` statement, and you can then query data from the Kafka data source.
+
+The following are Doris Trino-Connector catalog configurations for several commonly used Connector plugins.
+
+1. Hive
+
+    ```sql
+    create catalog emr_hive properties (
+        "type"="trino-connector",
+
+        "connector.name"="hive",
+        "hive.metastore.uri"="thrift://ip:port",
+        "hive.config.resources"="/path/to/core-site.xml,/path/to/hdfs-site.xml"
+    );
+    ```
+
+    > Note:
+    > - Add the Hadoop user name to the JVM parameters: `-DHADOOP_USER_NAME=ftw`. This can be appended to the `JAVA_OPTS_FOR_JDK_17` parameter in the fe.conf / be.conf file, e.g. `JAVA_OPTS_FOR_JDK_17="...-DHADOOP_USER_NAME=ftw"`
+
+
+2. MySQL
+
+    ```sql
+    create catalog trino_mysql properties (
+        "type"="trino-connector",
+        
+        "connector.name"="mysql",
+        "connection-url" = "jdbc:mysql://ip:port",
+        "connection-user" = "user",
+        "connection-password" = "password"
+    );
+    ```
+
+    > Note:
+    > - If you encounter the error `Unknown or incorrect time zone: 'Asia/Shanghai'`, add `-Duser.timezone=Etc/GMT-8` to the JVM startup parameters. This can be appended to the `JAVA_OPTS_FOR_JDK_17` parameter in the fe.conf / be.conf file.
+
+3. Kafka
+
+    ```sql
+    create catalog kafka properties (
+        "type"="trino-connector",
+        
+        "connector.name"="kafka",
+        "kafka.nodes"="localhost:9092",
+        "kafka.table-description-supplier"="CONFLUENT",
+        "kafka.confluent-schema-registry-url"="http://localhost:8081",
+        "kafka.hide-internal-columns" = "false"
+    );
+    ```
+
+
+4. BigQuery
+
+    ```sql
+    create catalog bigquery_catalog properties (
+        "type"="trino-connector",
+
+        "connector.name"="bigquery",
+        "bigquery.project-id"="steam-circlet-388406",
+        "bigquery.credentials-file"="/path/to/application_default_credentials.json"
+    );
+    ```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-contribute/trino-connector-developer-guide.md b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-contribute/trino-connector-developer-guide.md
new file mode 100644
index 000000000000..805282c8b0c5
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-contribute/trino-connector-developer-guide.md
@@ -0,0 +1,163 @@
+---
+{
+    "title": "How to access a new Trino Connector plugin",
+    "language": "zh-CN"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# How to access a new Trino Connector plugin
+
+## Background
+
+Starting from version 4.0, Doris supports connecting to Trino Connector plugins. By combining the rich ecosystem of Trino Connector plugins with Doris's `Trino-Connector` Catalog feature, Doris can support many more data sources.
+
+The purpose of the Trino Connector compatibility framework is to help Doris quickly connect to more data sources to meet user needs.
+For data sources such as Hive, Iceberg, Hudi, Paimon, and JDBC, we still recommend using Doris's built-in catalogs, which provide better performance, stability, and compatibility.
+
+This article describes how to adapt a Trino Connector plugin to Doris.
+
+The following uses Trino's Kafka Connector plugin as an example to show, step by step, how to adapt it to Doris and then access the Kafka data source through Doris's `Trino-Connector` Catalog feature.
+
+> Note: Trino is open-source software licensed under the Apache License 2.0 and provided by the [Trino Software Foundation](https://trino.io/foundation). For details, please visit the [Trino official website](https://trino.io/docs/current/).
+
+## Step 1: Compile the Kafka Connector plugin
+
+Trino does not provide officially compiled Connector plugins, so we need to compile the required Connector plugins ourselves.
+
+> Note: Since Doris currently uses version 435 of the `trino-main` package, it is best to compile version 435 of the Connector plugin. Other versions of the Connector plugin may have compatibility issues. If you encounter any problems, please report them to the Apache Doris community.
+
+
+1. Clone the Trino source code
+`$ git clone https://github.com/trinodb/trino.git`
+2. Switch the Trino source code to version 435
+`$ git checkout 435`
+3. Enter the Kafka plugin source directory
+`$ cd trino/plugin/trino-kafka`
+4. Compile the Kafka plugin
+`$ mvn clean install -DskipTests`
+5. After compilation completes, the target/trino-kafka-435 directory is generated under the trino/plugin/trino-kafka/ directory
+
+> Note: Each Connector plugin is a subdirectory, not a JAR package.
+
+## Step 2: Set up Doris's fe.conf / be.conf
+
+After preparing the Kafka Connector plugin, you need to configure Doris's fe.conf and be.conf so that Doris can find the plugin.
+
+Suppose we store the `trino-kafka-435` directory prepared above under the /path/to/connectors directory. We then configure:
+
+1. fe.conf
+
+    Configure `trino_connector_plugin_dir=/path/to/connectors` in the fe.conf file. (If the `trino_connector_plugin_dir` property is not configured in fe.conf, the `${Doris_HOME}/fe/connectors` directory is used by default.)
+
+2. be.conf
+
+    Configure `trino_connector_plugin_dir=/path/to/connectors` in the be.conf file. (If the `trino_connector_plugin_dir` property is not configured in be.conf, the `${Doris_HOME}/be/connectors` directory is used by default.)
+
+> Note: Doris loads Trino Connector plugins lazily. This means that the first time you use the Trino-Connector Catalog feature in Doris, there is no need to restart the FE / BE nodes; Doris loads the plugin automatically. However, a plugin is loaded only once, so if the plugins under the `/path/to/connectors/` directory change, you must restart the FE / BE nodes before the changed plugins can be loaded.
+
+## Step 3: Use the Trino-Connector Catalog feature
+
+After completing the previous two steps, we can use the Trino-Connector Catalog feature in Doris.
+
+1. First, let's create a Trino-Connector Catalog in Doris:
+
+    ```sql
+    create catalog kafka_tpch properties (
+        "type"="trino-connector",
+        -- The following four properties come from Trino and are consistent with those in Trino's etc/catalog/kafka.properties
+        "connector.name"="kafka",
+        "kafka.table-names"="tpch.customer,tpch.orders,tpch.lineitem,tpch.part,tpch.partsupp,tpch.supplier,tpch.nation,tpch.region",
+        "kafka.nodes"="localhost:9092",
+        "kafka.table-description-dir" = "/mnt/datadisk1/fangtiewei"
+    );
+    ```
+
+    Explanation:
+    - `type`: the Catalog type; it must be set to `trino-connector`.
+    - `connector.name`, `kafka.table-names`, `kafka.nodes`, `kafka.table-description-dir`: these four properties come from Trino; see [Kafka connector](https://trino.io/docs/current/connector/kafka.html#configuration)
+
+    Different Connector plugins require different properties; refer to the official Trino documentation: [Connectors](https://trino.io/docs/current/connector.html#connector--page-root)
+
+2. Use the Catalog
+
+    After creating the Trino-Connector Catalog, it is used just like any other catalog. Switch to it with the `switch kafka_tpch` statement, and you can then query data from the Kafka data source.
+
+The following are Doris Trino-Connector Catalog configurations for several commonly used Connector plugins.
+
+1. Hive
+
+    ```sql
+    create catalog emr_hive properties (
+        "type"="trino-connector",
+
+        "connector.name"="hive",
+        "hive.metastore.uri"="thrift://ip:port",
+        "hive.config.resources"="/path/to/core-site.xml,/path/to/hdfs-site.xml"
+    );
+    ```
+
+    > Notes for the Hive plugin:
+    > - Add the Hadoop user name to the JVM parameters: `-DHADOOP_USER_NAME=ftw`. This can be appended to the `JAVA_OPTS_FOR_JDK_17` parameter in the fe.conf / be.conf file, e.g. `JAVA_OPTS_FOR_JDK_17="...-DHADOOP_USER_NAME=ftw"`
+
+
+2. MySQL
+
+    ```sql
+    create catalog trino_mysql properties (
+        "type"="trino-connector",
+        
+        "connector.name"="mysql",
+        "connection-url" = "jdbc:mysql://ip:port",
+        "connection-user" = "user",
+        "connection-password" = "password"
+    );
+    ```
+
+    > Notes for the MySQL plugin:
+    > - If you encounter the error `Unknown or incorrect time zone: 'Asia/Shanghai'`, add `-Duser.timezone=Etc/GMT-8` to the JVM startup parameters. This can be appended to the `JAVA_OPTS_FOR_JDK_17` parameter in the fe.conf / be.conf file.
+
+3. Kafka
+
+    ```sql
+    create catalog kafka properties (
+        "type"="trino-connector",
+        
+        "connector.name"="kafka",
+        "kafka.nodes"="localhost:9092",
+        "kafka.table-description-supplier"="CONFLUENT",
+        "kafka.confluent-schema-registry-url"="http://localhost:8081",
+        "kafka.hide-internal-columns" = "false"
+    );
+    ```
+
+
+4. BigQuery
+
+    ```sql
+    create catalog bigquery_catalog properties (
+        "type"="trino-connector",
+
+        "connector.name"="bigquery",
+        "bigquery.project-id"="steam-circlet-388406",
+        "bigquery.credentials-file"="/path/to/application_default_credentials.json"
+    );
+    ```
diff --git a/sidebarsCommunity.json b/sidebarsCommunity.json
index 49a206cb3171..437980cae946 100644
--- a/sidebarsCommunity.json
+++ b/sidebarsCommunity.json
@@ -15,7 +15,8 @@
                 "how-to-contribute/docs-format-specification",
                 "how-to-contribute/pull-request",
                 "how-to-contribute/contribute-doc",
-                "how-to-contribute/how-to-share-blogs"
+                "how-to-contribute/how-to-share-blogs",
+                "how-to-contribute/trino-connector-developer-guide"
             ]
         },
         {


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
