This is an automated email from the ASF dual-hosted git repository. guoyangze pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/flink.git
commit ff5d8f08662bd50479039914e7a8b85ca539e6c1
Author: Xiangyu Feng <[email protected]>
AuthorDate: Wed Nov 1 19:16:15 2023 +0800

    [FLINK-33235][doc] Translate OLAP Quickstart to Chinese
---
 docs/content.zh/docs/dev/table/olap_quickstart.md | 185 ++++++++++++----------
 docs/content.zh/docs/dev/table/overview.md        |   3 +-
 2 files changed, 98 insertions(+), 90 deletions(-)

diff --git a/docs/content.zh/docs/dev/table/olap_quickstart.md b/docs/content.zh/docs/dev/table/olap_quickstart.md
index e0b3afba2bc..e109dfe71ee 100644
--- a/docs/content.zh/docs/dev/table/olap_quickstart.md
+++ b/docs/content.zh/docs/dev/table/olap_quickstart.md
@@ -1,5 +1,5 @@
 ---
-title: "Quickstart for Flink OLAP"
+title: "OLAP Quickstart"
 weight: 91
 type: docs
 aliases:
@@ -24,64 +24,73 @@ specific language governing permissions and limitations
 under the License.
 -->
-# Introduction
+# OLAP Setup Guide
-Flink OLAP has already added to [Apache Flink Roadmap](https://flink.apache.org/roadmap/). It means Flink can not only support streaming and batch computing, but also support OLAP(On-Line Analytical Processing). This page will show how to quickly set up a Flink OLAP service, and will introduce some best practices.
+OLAP (OnLine Analytical Processing) is a key technology in data analytics, typically used to run complex analytical queries over large datasets with second-level latency. As a unified stream and batch compute engine, Flink can now also be deployed as an OLAP service. This page helps you quickly set up a Flink OLAP cluster locally and try it out. It also introduces some practices for using Flink as an OLAP service in real production environments.
-## Architecture
+## Architecture Overview
-The Flink OLAP service consists of three parts: Client, Flink SQL Gateway, Flink Session Cluster.
+This section introduces the overall architecture of Flink as an OLAP service and the advantages of using it.
-* **Client**: Could be any client that can interact with Flink SQL Gateway, such as SQL client, Flink JDBC driver and so on.
-* **Flink SQL Gateway**: The SQL Gateway provides an easy way to submit the Flink Job, look up the metadata, and analyze table stats.
-* **Flink Session Cluster**: We choose session clusters to run OLAP queries, mainly to avoid the overhead of cluster startup.
+### Architecture
-## Advantage
+The Flink OLAP service consists of three parts: the client, Flink SQL Gateway, and Flink Session Cluster.
-* **Massively Parallel Processing**
-  * Flink OLAP runs naturally as an MPP(Massively Parallel Processing) system, which supports low-latency ad-hoc queries
-* **Reuse Connectors**
-  * Flink OLAP can reuse rich connectors in Flink ecosystem.
-* **Unified Engine**
-  * Unified computing engine for Streaming/Batch/OLAP.
+* **Client**: Any client that can interact with [Flink SQL Gateway]({{< ref "docs/dev/table/sql-gateway/overview" >}}), including [SQL Client]({{< ref "docs/dev/table/sqlClient" >}}), [Flink JDBC Driver]({{< ref "docs/dev/table/jdbcDriver" >}}), and so on;
+* **Flink SQL Gateway**: The Flink SQL Gateway service is mainly used for SQL parsing, metadata retrieval, statistics analysis, plan optimization, and job submission to the cluster;
+* **Flink Session Cluster**: We recommend running OLAP queries on a [Session cluster]({{< ref "/docs/deployment/resource-providers/native_kubernetes#starting-a-flink-session-on-kubernetes" >}}), mainly because it reduces the extra overhead of cluster startup;
-# Deploying in Local Mode
+{{< img src="/fig/olap-architecture.svg" alt="Illustration of Flink OLAP Architecture" width="85%" >}}
-## Downloading Flink
+### Advantages
-The same as [Local Installation]({{< ref "docs/try-flink/local_installation" >}}). Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows). We need to have at least Java 11 installed, Java 17 is more recommended in OLAP scenario.
To check the Java version installed, type in your terminal:
+* **Parallel computing architecture**
+  * Flink is a naturally parallel computing architecture. When running OLAP queries, parallelism can be easily adjusted to meet low-latency query requirements at different data scales
+* **Elastic resource management**
+  * Flink cluster resources support min/max scaling well, so the resources in use can be adjusted dynamically according to cluster load
+* **Rich ecosystem**
+  * Flink OLAP can reuse the rich set of [connectors]({{< ref "docs/connectors/table/overview" >}}) in the Flink ecosystem
+* **Unified engine**
+  * A unified compute engine for streaming / batch / OLAP
+
+## Running Locally
+This section guides you through trying out the Flink OLAP service locally.
+
+### Downloading Flink
+
+The procedure is similar to the steps described in [Local Installation]({{< ref "docs/try-flink/local_installation" >}}). Flink runs on any UNIX-like operating system, for example Linux, Mac OS X, and Cygwin (for Windows). You need to have __Java 11__ installed locally. To check the installed Java version, run the following command:
 ```
 java -version
 ```
-Next, [Download](https://flink.apache.org/downloads/) the latest binary release of Flink, then extract the archive:
+Next, [download](https://flink.apache.org/downloads/) the latest Flink binary release and extract the archive:
 ```
 tar -xzf flink-*.tgz
 ```
-## Starting a local cluster
+### Starting a Local Cluster
-To start a local cluster, run the bash script that comes with Flink:
+Run the following script to start a cluster locally:
 ```
 ./bin/start-cluster.sh
 ```
-You should be able to navigate to the web UI at localhost:8081 to view the Flink dashboard and see that the cluster is up and running.
+You can navigate to the local Web UI (http://localhost:8081) to view the Flink Dashboard and check that the cluster is up and running.
-## Start a SQL Client CLI
+### Starting the SQL Client
-You can start the CLI with an embedded gateway by calling:
+You can start the SQL Client from the command line with an embedded Gateway by running:
 ```
 ./bin/sql-client.sh
 ```
-## Running Queries
+### Running SQL Queries
-You could simply execute queries in CLI and retrieve the results.
+From the command line, you can easily submit queries and retrieve the results:
 ```
 SET 'sql-client.execution.result-mode' = 'tableau';
@@ -102,98 +111,98 @@ GROUP BY buyer
 ORDER BY total_cost LIMIT 3;
 ```
-And then you could find job detail information in web UI at localhost:8081.
+You can find detailed job execution information in the local Web UI (http://localhost:8081).
-# Deploying in Production
+## Deploying in Production
-This section guides you through setting up a production ready Flink OLAP service.
+This section introduces some recommendations for using the Flink OLAP service in production.
-## Cluster Deployment
+### Client
-In production, we recommend to use Flink Session Cluster, Flink SQL Gateway and Flink JDBC Driver to build an OLAP service.
+#### Flink JDBC Driver
-### Session Cluster
+Flink JDBC Driver provides low-level connection management, making it easy to submit queries to SQL Gateway. In production, pay attention to reusing JDBC connections so that the Gateway does not frequently create and close sessions, which reduces end-to-end job latency. For more information, please refer to [Flink JDBC Driver]({{ <ref "docs/dev/table/jdbcDriver"> }}).
-For Flink Session Cluster, we recommend to deploy Flink on native Kubernetes using session mode. Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. By deploying on native Kubernetes, Flink Session Cluster is able to dynamically allocate and de-allocate TaskManagers. For more information, please refer to [Native Kubernetes]({{< ref "docs/deployment/resource-providers/native_kubernetes">}}).
+### Cluster Deployment
-### SQL Gateway
+In production, we recommend using a Flink Session cluster and Flink SQL Gateway to build the OLAP service.
-For Flink SQL Gateway, we recommend deploying it as a stateless microservice and register this on the service discovery component. For more information, please refer to the [SQL Gateway Overview]({{< ref "docs/dev/table/sql-gateway/overview">}}).
+#### Session Cluster
-### Flink JDBC Driver
+We recommend deploying the Flink Session cluster on Native Kubernetes in Session mode. As a popular container orchestration system, K8S automates the deployment, scaling, and management of different computing applications. By deploying on Native Kubernetes, a Flink Session cluster can dynamically add and remove TaskManagers. For more information, please refer to [Native Kubernetes]({{< ref "docs/deployment/resource-providers/native_kubernetes">}}). You can also configure [slotmanager.number-of-slots.min]({{< ref "docs/deployment/config#slotmanager-number-of-slots-min" >}}) in the Session cluster, which can significantly reduce the cold-start time of OLAP queries. For details, see [FLIP-362](https://cwiki.apache.org/confluence/display/FLINK/FLIP-362 [...]
-When submitting queries to SQL Gateway, we recommend using Flink JDBC Driver since it provides low-level connection management.
When used in production, we need to pay attention to reuse the JDBC connection to avoid frequently creating/closing sessions in the Gateway. For more information, please refer to the [Flink JDBC Driver]({{{<ref "docs/dev/table/jdbcDriver">}}}).
+#### Flink SQL Gateway
-## Datasource Configurations
+Flink SQL Gateway can be deployed as a stateless microservice and registered with a service discovery component to serve external clients, which makes client-side load balancing straightforward. For more information, please refer to [SQL Gateway Overview]({{< ref "docs/dev/table/sql-gateway/overview">}}).
-### Catalogs
+### Data Source Configuration
-In OLAP scenario, we recommend using FileCatalogStore in the catalog configuration introduced in [FLIP-295](https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations). As a long running service, Flink OLAP cluster's catalog information will not change frequently and can be re-used cross sessions. For more information, please refer to the [Catalog Store]({{< ref "docs/dev/table/catalogs#catalog-store">}}).
+#### Catalogs
-### Connectors
+In OLAP scenarios, we recommend configuring the FileCatalogStore provided in [Catalogs]({{< ref "docs/dev/table/catalogs">}}) as the catalog option. As a long-running service, a Flink OLAP cluster's metadata usually does not change frequently and needs to be reusable across sessions, which reduces the cold-start time of metadata loading. For more information, please refer to [Catalog Store]({{< ref "docs/dev/table/catalogs#catalog-store">}}).
-Both Session Cluster and SQL Gateway rely on connectors to analyze table stats and read data from the configured data source. To add connectors, please refer to the [Connectors and Formats]({{< ref "docs/connectors/table/overview">}}).
+#### Connectors
-## Cluster Configurations
+Both Session Cluster and SQL Gateway rely on connectors to obtain table metadata and to read data from the configured data sources. For more information, please refer to [Connectors]({{< ref "docs/connectors/table/overview" >}}).
-In OLAP scenario, we picked out a few configurations that can help improve user usability and query performance.
+### Recommended Configuration
-### SQL&Table Options
+For OLAP scenarios, appropriate parameter settings can substantially improve overall service availability and query performance. The following lists some settings recommended for production.
-| Parameters | Default | Recommended |
-|:-------------------------------------|:--------|:------------|
-| table.optimizer.join-reorder-enabled | false | true |
-| pipeline.object-reuse | false | true |
+#### SQL&Table Options
-### Runtime Options
+| Parameter | Default | Recommended |
+|:---------------------------------------------------------------------------------------------------------------|:------|:-----|
+| [table.optimizer.join-reorder-enabled]({{<ref "docs/dev/table/config#table-optimizer-join-reorder-enabled">}}) | false | true |
+| [pipeline.object-reuse]({{< ref "docs/deployment/config#pipeline-object-reuse" >}}) | false | true |
-| Parameters | Default | Recommended |
-|:-----------------------------|:-----------------------|:------------------------------------------------------------------------------------------------------------------------------------------|
-| execution.runtime-mode | STREAMING | BATCH |
-| execution.batch-shuffle-mode | ALL_EXCHANGES_BLOCKING | ALL_EXCHANGES_PIPELINED |
-| env.java.opts.all | {default value} | {default value} -XX:PerMethodRecompilationCutoff=10000 -XX:PerBytecodeRecompilationCutoff=10000-XX:ReservedCodeCacheSize=512M -XX:+UseZGC |
-| JDK Version | 11 | 17 |
+#### Runtime Options
-We strongly recommend using JDK17 with ZGC in OLAP scenario in order to provide zero gc stw and solve the issue described in [FLINK-32746](https://issues.apache.org/jira/browse/FLINK-32746).
+| Parameter | Default | Recommended |
+|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------|:------------------------------------------------------------------------------------------------------------------------------------------|
+| [execution.runtime-mode]({{< ref "docs/deployment/config#execution-runtime-mode" >}}) | STREAMING | BATCH |
+| [execution.batch-shuffle-mode]({{< ref "docs/deployment/config#execution-batch-shuffle-mode" >}}) | ALL_EXCHANGES_BLOCKING | ALL_EXCHANGES_PIPELINED |
+| [env.java.opts.all]({{< ref "docs/deployment/config#env-java-opts-all" >}}) | {default value} | {default value} -XX:PerMethodRecompilationCutoff=10000 -XX:PerBytecodeRecompilationCutoff=10000-XX:ReservedCodeCacheSize=512M -XX:+UseZGC |
+| JDK Version | 11 | 17 |
-### Scheduling Options
+We recommend using JDK17 with ZGC for OLAP in production. ZGC helps with the Metaspace garbage collection issue described in [FLINK-32746](https://issues.apache.org/jira/browse/FLINK-32746). ZGC also provides application pause times close to 0 ms during heap garbage collection. OLAP queries need to run in batch mode, because both Pipelined and Blocking edges may appear in the execution plan of an OLAP query. The batch-mode scheduler supports scheduling a job in stages, which avoids scheduling deadlocks.
-| Parameters | Default | Recommended |
-|:---------------------------------------------------------|:------------------|:------------------|
-| jobmanager.scheduler | Default | Default |
-| jobmanager.execution.failover-strategy | region | full |
-| restart-strategy.type | (none) | disable |
-| jobstore.type | File | Memory |
-| jobstore.max-capacity | Integer.MAX_VALUE | 500 |
+#### Scheduling Options
-We would like to highlight the usage of `PipelinedRegionSchedulingStrategy`. Since many OLAP queries will have blocking edges in their jobGraph.
+| Parameter | Default | Recommended |
+|:------------------------------------------------------------------------------------------------------------------------|:------------------|:--------|
+| [jobmanager.scheduler]({{< ref "docs/deployment/config#jobmanager-scheduler" >}}) | Default | Default |
+| [jobmanager.execution.failover-strategy]({{< ref "docs/deployment/config#jobmanager-execution-failover-strategy-1" >}}) | region | full |
+| [restart-strategy.type]({{< ref "docs/deployment/config#restart-strategy-type" >}}) | (none) | disable |
+| [jobstore.type]({{< ref "docs/deployment/config#jobstore-type" >}}) | File | Memory |
+| [jobstore.max-capacity]({{< ref "docs/deployment/config#jobstore-max-capacity" >}}) | Integer.MAX_VALUE | 500 |
-### Network Options
-| Parameters | Default | Recommended |
-|:------------------------------------|:-----------|:---------------|
-| rest.server.numThreads | 4 | 32 |
-| web.refresh-interval | 3000 | 300000 |
-| pekko.framesize | 10485760b | 104857600b |
+#### Network Options
-### ResourceManager Options
+| Parameter | Default | Recommended |
+|:--------------------------------------------------------------------------------------|:----------|:-----------|
+| [rest.server.numThreads]({{< ref "docs/deployment/config#rest-server-numthreads" >}}) | 4 | 32 |
+| [web.refresh-interval]({{< ref "docs/deployment/config#web-refresh-interval" >}}) | 3000 | 300000 |
+| [pekko.framesize]({{< ref "docs/deployment/config#pekko-framesize" >}}) | 10485760b | 104857600b |
-| Parameters | Default | Recommended |
-|:-------------------------------------|:----------|:---------------|
-| kubernetes.jobmanager.replicas | 1 | 2 |
-| kubernetes.jobmanager.cpu.amount | 1.0 | 16.0 |
-| jobmanager.memory.process.size | (none) | 65536m |
-| jobmanager.memory.jvm-overhead.max | 1g | 6144m |
-| kubernetes.taskmanager.cpu.amount | (none) | 16 |
-| taskmanager.numberOfTaskSlots | 1 | 32 |
-| taskmanager.memory.process.size | (none) | 65536m |
-| taskmanager.memory.managed.size | (none) | 65536m |
+#### Resource Management Options
-We prefer to use large taskManager pods in OLAP since this can put more computation in local and reduce network/deserialization/serialization overhead. Meanwhile, since JobManager is a single point of calculation in OLAP scenario, we also prefer large pod.
+| Parameter | Default | Recommended |
+|:--------------------------------------------------------------------------------------------------------------|:-------|:----------------------------------------|
+| [kubernetes.jobmanager.replicas]({{< ref "docs/deployment/config#kubernetes-jobmanager-replicas" >}}) | 1 | 2 |
+| [kubernetes.jobmanager.cpu.amount]({{< ref "docs/deployment/config#kubernetes-jobmanager-cpu-amount" >}}) | 1.0 | 16.0 |
+| [jobmanager.memory.process.size]({{< ref "docs/deployment/config#jobmanager-memory-process-size" >}}) | (none) | 32g |
+| [jobmanager.memory.jvm-overhead.max]({{< ref "docs/deployment/config#jobmanager-memory-jvm-overhead-max" >}}) | 1g | 3g |
+| [kubernetes.taskmanager.cpu.amount]({{< ref "docs/deployment/config#kubernetes-taskmanager-cpu-amount" >}}) | (none) | 16 |
+| [taskmanager.numberOfTaskSlots]({{< ref "docs/deployment/config#taskmanager-numberoftaskslots" >}}) | 1 | 32 |
+| [taskmanager.memory.process.size]({{< ref "docs/deployment/config#taskmanager-memory-process-size" >}}) | (none) | 65536m |
+| [taskmanager.memory.managed.size]({{< ref "docs/deployment/config#taskmanager-memory-managed-size" >}}) | (none) | 16384m |
+| [slotmanager.number-of-slots.min]({{< ref "docs/deployment/config#slotmanager-number-of-slots-min" >}}) | 0 | {taskManagerNumber * numberOfTaskSlots} |
-# Future Work
-There is a big margin for improvement in Flink OLAP, both in usability and query performance, and we trace all of them in underlying tickets.
+You can set `slotmanager.number-of-slots.min` to a reasonable value for your production workload and use it as a reserved resource pool for the cluster to serve OLAP queries. In OLAP scenarios, we recommend large resource specifications for TaskManagers, because this keeps more computation local and reduces network / serialization / deserialization overhead. Because JobManager is a single point of computation in OLAP scenarios, we also recommend a large resource specification for it.
+
+## Future Work
+As part of the [Apache Flink Roadmap](https://flink.apache.org/what-is-flink/roadmap/), the community will continue to improve Flink's usability and availability in OLAP scenarios, as well as query performance and cluster capacity. The related work is tracked in the following jira tickets:
 - https://issues.apache.org/jira/browse/FLINK-25318
 - https://issues.apache.org/jira/browse/FLINK-32898
-
-Furthermore, we are adding relevant OLAP benchmarks to the Flink repository such as [flink-benchmarks](https://github.com/apache/flink-benchmarks).
\ No newline at end of file
diff --git a/docs/content.zh/docs/dev/table/overview.md b/docs/content.zh/docs/dev/table/overview.md
index ce62e434986..13d6dc88d73 100644
--- a/docs/content.zh/docs/dev/table/overview.md
+++ b/docs/content.zh/docs/dev/table/overview.md
@@ -53,8 +53,7 @@ and later use the DataStream API to build alerting based on the matched patterns
 * [Built-in Functions]({{< ref "docs/dev/table/functions/systemFunctions" >}}): Built-in functions in Table API and SQL.
 * [SQL Client]({{< ref "docs/dev/table/sqlClient" >}}): Try Flink SQL without writing code and submit SQL jobs directly to the cluster.
 * [SQL Gateway]({{< ref "docs/dev/table/sql-gateway/overview" >}}): A SQL submission service that lets multiple remote clients submit SQL jobs concurrently.
+* [OLAP Quickstart]({{< ref "docs/dev/table/olap_quickstart" >}}): Flink OLAP service setup guide.
 * [SQL Jdbc Driver]({{< ref "docs/dev/table/jdbcDriver" >}}): A standard JDBC driver that can submit Flink SQL jobs to the Sql Gateway.
-* [OLAP Quickstart]({{< ref "docs/dev/table/olap_quickstart" >}}): Flink OLAP service setup guide.
-
 {{< top >}}
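
As a reading aid for the parameter tables in the `olap_quickstart.md` diff above, the following minimal sketch collects a subset of the recommended values into flat `key: value` lines and derives `slotmanager.number-of-slots.min` as `taskManagerNumber * numberOfTaskSlots`, as the resource table suggests. The helper names `min_slots` and `render_config` and the TaskManager count of 4 are hypothetical and not part of the commit; only the option keys and values are taken from the tables.

```python
# Sketch only: gathers recommended OLAP options from the tables in the diff.
# Function names and the TaskManager count are illustrative assumptions.

RECOMMENDED = {
    "table.optimizer.join-reorder-enabled": "true",
    "pipeline.object-reuse": "true",
    "execution.runtime-mode": "BATCH",
    "execution.batch-shuffle-mode": "ALL_EXCHANGES_PIPELINED",
    "jobmanager.execution.failover-strategy": "full",
    "restart-strategy.type": "disable",
    "jobstore.type": "Memory",
    "jobstore.max-capacity": "500",
    "rest.server.numThreads": "32",
    "web.refresh-interval": "300000",
    "pekko.framesize": "104857600b",
    "taskmanager.numberOfTaskSlots": "32",
}


def min_slots(task_manager_number: int, slots_per_task_manager: int) -> int:
    """Reserved slot pool size: taskManagerNumber * numberOfTaskSlots."""
    if task_manager_number < 0 or slots_per_task_manager <= 0:
        raise ValueError("counts must be non-negative and slots positive")
    return task_manager_number * slots_per_task_manager


def render_config(task_manager_number: int) -> str:
    """Render the options as flat 'key: value' lines, one per option."""
    options = dict(RECOMMENDED)
    slots = int(options["taskmanager.numberOfTaskSlots"])
    options["slotmanager.number-of-slots.min"] = str(
        min_slots(task_manager_number, slots)
    )
    return "\n".join(f"{key}: {value}" for key, value in options.items())


if __name__ == "__main__":
    # Hypothetical cluster with 4 TaskManagers.
    print(render_config(task_manager_number=4))
```

The memory and CPU rows are left out on purpose, since suitable values depend on the hardware actually available; the reserved slot count should likewise be sized from the real number of TaskManagers.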
