This is an automated email from the ASF dual-hosted git repository.
benjobs pushed a commit to branch dev
in repository
https://gitbox.apache.org/repos/asf/incubator-streampark-website.git
The following commit(s) were added to refs/heads/dev by this push:
new 550a5bc0 Review and Improve the translation for blog articles (#295)
550a5bc0 is described below
commit 550a5bc0a106e5b4f4ae7f518f308926c5a984e8
Author: Leomax_Sun <[email protected]>
AuthorDate: Fri Nov 24 16:20:29 2023 +0800
Review and Improve the translation for blog articles (#295)
---
blog/0-streampark-flink-on-k8s.md | 30 ++++++++++++----------
blog/1-flink-framework-streampark.md | 10 ++++----
.../0-streampark-flink-on-k8s.md | 10 +++++---
3 files changed, 27 insertions(+), 23 deletions(-)
diff --git a/blog/0-streampark-flink-on-k8s.md
b/blog/0-streampark-flink-on-k8s.md
index c074be05..4474edba 100644
--- a/blog/0-streampark-flink-on-k8s.md
+++ b/blog/0-streampark-flink-on-k8s.md
@@ -38,7 +38,7 @@ RUN mkdir -p $FLINK_HOME/usrlib
COPY my-flink-job.jar $FLINK_HOME/usrlib/my-flink-job.jar
```
-4. Use Flink client script to start Flink tasks
+3. Use Flink client script to start Flink tasks
```shell
@@ -52,13 +52,13 @@ COPY my-flink-job.jar $FLINK_HOME/usrlib/my-flink-job.jar
local:///opt/flink/usrlib/my-flink-job.jar
```
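For context, a minimal sketch of the application-mode launch command this step refers to, following the standard Flink native-Kubernetes form (the namespace, cluster-id, and image name are illustrative):

```shell
./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.namespace=flink-cluster \
    -Dkubernetes.cluster-id=my-first-application-cluster \
    -Dkubernetes.container.image=my-custom-image \
    local:///opt/flink/usrlib/my-flink-job.jar
```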
-5. Use the Kubectl command to obtain the WebUI access address and JobId of the
Flink job.
+4. Use the Kubectl command to obtain the WebUI access address and JobId of the
Flink job.
```shell
kubectl -n flink-cluster get svc
```
-6. Stop the job using Flink command
+5. Stop the job using Flink command
```shell
./bin/flink cancel
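# A sketch of the full cancel invocation in the Flink native-Kubernetes form
# (the cluster-id is illustrative; <jobId> comes from the previous step):
#   ./bin/flink cancel --target kubernetes-application \
#     -Dkubernetes.cluster-id=my-first-application-cluster <jobId>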
@@ -73,13 +73,13 @@ kubectl -n flink-cluster get svc
There will be higher requirements for using Flink on Kubernetes in
enterprise-level production environments. Generally, you will choose either to
build your own platform or to purchase a related commercial product. Whichever
solution you choose, the expected product capabilities are the same:
large-scale task development and deployment, status tracking, operation and
maintenance monitoring, failure alarms, unified task management, high
availability, and so on are common demands.
- In response to the above issues, we investigated open source projects in the
open source field that support the development and deployment of Flink on
Kubernetes tasks. During the investigation, we also encountered other excellent
open source projects. After comprehensively comparing multiple open source
projects, we came to the conclusion: ** Whether StreamPark is completed The
overall performance such as speed, user experience, and stability are all very
good, so we finally chose Str [...]
+ In response to the above issues, we investigated open source projects that
support the development and deployment of Flink on Kubernetes tasks. During the
investigation we also encountered other excellent open source projects. After
comprehensively comparing them, we came to the conclusion: **StreamPark
performs very well in completeness, user experience, and stability, so we
finally chose StreamPark as our one-stop real-time c [...]
Let’s take a look at how StreamPark supports Flink on Kubernetes:
### **Basic environment configuration**
- Basic environment configuration includes Kubernetes and Docker warehouse
information as well as Flink client information configuration. The simplest way
for the Kubernetes basic environment is to directly copy the .kube/config of
the Kubernetes node to the StreamPark node user directory, and then use the
kubectl command to create a Flink-specific Kubernetes Namespace and perform
RBAC configuration.
+ Basic environment configuration includes Kubernetes and Docker repository
information as well as Flink client information configuration. The simplest way
to set up the Kubernetes basic environment is to directly copy the .kube/config
of the Kubernetes node to the user directory on the StreamPark node, and then
use the kubectl command to create a Flink-specific Kubernetes Namespace and
perform RBAC configuration.
```shell
# Create k8s namespace used by Flink jobs
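# A minimal sketch following the Flink RBAC documentation; the namespace and
# service account names are illustrative:
kubectl create namespace flink-cluster
kubectl -n flink-cluster create serviceaccount flink
kubectl create clusterrolebinding flink-role-binding-flink \
  --clusterrole=edit --serviceaccount=flink-cluster:flink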
@@ -125,13 +125,13 @@ After the job development is completed, the job comes
online. In this step, Stre
- Dependency download in job
- Build job (JAR package)
- Build image
-- Push the image to the remote warehouse
+- Push the image to the remote repository
**For users: Just click the cloud-shaped online button in the task list**

-We can see a series of work done by StreamPark when building and pushing the
image.: **Read the configuration, build the image, and push the image to the
remote warehouse...** I want to give StreamPark a big thumbs up!
+We can see a series of work done by StreamPark when building and pushing the
image: **Read the configuration, build the image, and push the image to the
remote repository...** I want to give StreamPark a big thumbs up!

@@ -181,13 +181,13 @@ Next, let’s take a look at how StreamPark supports this
capability:
## Problems encountered
- Any new technology has a process of exploration and pitfalls. The experience
of failure is precious. Here are some pitfalls and experiences that StreamPark
has stepped into during the implementation of fog core technology. **The
content of this section is not only about StreamPark. I believe it will bring
some reference to all friends who use Flink on Kubernetes.
+ Any new technology involves a process of exploration and falling into
pitfalls. The experience of failure is precious. Here are some pitfalls and
lessons that StreamPark encountered during its implementation at RELX
Technology (雾芯科技). **The content of this section is not only about
StreamPark. I believe it will bring some reference to all friends who use Flink
on Kubernetes**.
### **FAQs are summarized below**
- **Kubernetes pod failed to pull the image**
-The main problem is that Kubernetes pod-template lacks docker’s
imagePullSecrets
+ The main problem is that Kubernetes pod-template lacks docker’s
imagePullSecrets
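A minimal sketch of the fix, assuming a registry secret (here named regsecret,
an illustrative name) created beforehand with kubectl create secret
docker-registry, and referenced from the pod-template:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-template
spec:
  imagePullSecrets:
    - name: regsecret
```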
- **Scala version inconsistent**
@@ -215,9 +215,11 @@ The main problem is that Kubernetes pod-template lacks
docker’s imagePullSecre
- **The changed code did not take effect after it was republished**
-This issue is related to the Kubernetes pod image pull policy. It is
recommended to set the Pod image pull policy to Always:
+ This issue is related to the Kubernetes pod image pull policy. It is
recommended to set the Pod image pull policy to Always:
+```shell
-Dkubernetes.container.image.pull-policy=Always
+```
- **Each restart of the task will result in one more Job instance**
@@ -225,7 +227,7 @@ This issue is related to the Kubernetes pod image pull
policy. It is recommended
- **How to implement kubernetes pod domain name access**
-Domain name configuration only needs to be configured in pod-template
according to Kubernetes resources. I can share with you a pod-template.yaml
template that I summarized based on the above issues:
+ Domain name configuration only needs to be configured in pod-template
according to Kubernetes resources. I can share with you a pod-template.yaml
template that I summarized based on the above issues:
```yaml
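# A sketch of one common approach: hostAliases entries give pods static
# domain-name resolution (the IP and hostnames below are illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: pod-template
spec:
  hostAliases:
    - ip: "192.168.0.1"
      hostnames:
        - "node1.example.com"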
@@ -281,7 +283,7 @@ Create a Dockerfile file and place the Dockerfile file in
the same folder as the
FROM flink:1.13.6-scala_2.11
COPY lib $FLINK_HOME/lib/
```
-**3. Create a basic image and push it to a private warehouse**
+**3. Create a basic image and push it to a private repository**
```shell
docker login --username=xxx
docker \
@@ -295,7 +297,7 @@ push
k8s-harbor.xxx.com/streamx/udf_flink_1.13.6-scala_2.11:latest
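A sketch of the full login, build, and push sequence implied here, using the
registry path shown above (the credentials are illustrative):

```shell
docker login --username=xxx k8s-harbor.xxx.com
docker build -t k8s-harbor.xxx.com/streamx/udf_flink_1.13.6-scala_2.11:latest .
docker push k8s-harbor.xxx.com/streamx/udf_flink_1.13.6-scala_2.11:latest
```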
- **StreamPark supports Flink job metric monitoring**
-It would be great if StreamPark could connect to Flink Metric data and display
Flink’s real-time consumption data at every moment on the StreamPark platform.
+ It would be great if StreamPark could connect to Flink Metric data and
display Flink’s real-time consumption data at every moment on the StreamPark
platform.
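Until such integration lands, one common stopgap is Flink's bundled Prometheus
reporter; a minimal sketch of the flink-conf.yaml entries (the reporter name
and port range are illustrative):

```yaml
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249-9259
```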
- **StreamPark supports Flink job log persistence**
@@ -303,4 +305,4 @@ It would be great if StreamPark could connect to Flink
Metric data and display F
- **Improvement of the problem of too large image**
-StreamPark's current image support for Flink on Kubernetes jobs is to combine
the basic image and user code into a Fat image and push it to the Docker
warehouse. The problem with this method is that it takes a long time when the
image is too large. It is hoped that the basic image can be restored in the
future. There is no need to hit the business code together every time, which
can greatly improve development efficiency and save costs.
+ StreamPark's current image support for Flink on Kubernetes jobs is to
combine the basic image and user code into a Fat image and push it to the
Docker repository. The problem with this approach is that it takes a long time
when the image is too large. We hope that in the future the basic image can be
reused, so that the business code does not need to be packaged into it every
time; this would greatly improve development efficiency and save costs.
diff --git a/blog/1-flink-framework-streampark.md
b/blog/1-flink-framework-streampark.md
index b02b9e48..25070df4 100644
--- a/blog/1-flink-framework-streampark.md
+++ b/blog/1-flink-framework-streampark.md
@@ -4,7 +4,7 @@ title: StreamPark - Powerful Flink Development Framework
tags: [StreamPark, DataStream, FlinkSQL]
---
-Although the Hadoop system is widely used today, its architecture is
complicated, it has a high maintenance complexity, version upgrades are
challenging, and due to departmental reasons, data center scheduling is
prolonged. We urgently need to explore agile data platform models. With the
current prevalence of cloud-native architecture and the backdrop of lake and
warehouse integration, we have decided to use Doris as an offline data
warehouse and TiDB (which is already in production) as [...]
+Although the Hadoop system is widely used today, its architecture is
complicated, it has high maintenance complexity, version upgrades are
challenging, and, due to departmental reasons, data center scheduling is
prolonged. We urgently need to explore agile data platform models. With the
current popularization of cloud-native architecture and the integration of
lake and warehouse, we have decided to use Doris as an offline data warehouse
and TiDB (which is already in production) as a [...]

@@ -12,7 +12,7 @@ Although the Hadoop system is widely used today, its
architecture is complicated
# 1. Background
-Although the Hadoop system is widely used today, its architecture is
complicated, it has a high maintenance complexity, version upgrades are
challenging, and due to departmental reasons, data center scheduling is
prolonged. We urgently need to explore agile data platform models. With the
current prevalence of cloud-native architecture and the backdrop of lake and
warehouse integration, we have decided to use Doris as an offline data
warehouse and TiDB (which is already in production) as [...]
+Although the Hadoop system is widely used today, its architecture is
complicated, it has high maintenance complexity, version upgrades are
challenging, and, due to departmental reasons, data center scheduling is
prolonged. We urgently need to explore agile data platform models. With the
current popularization of cloud-native architecture and the integration of
lake and warehouse, we have decided to use Doris as an offline data warehouse
and TiDB (which is already in production) as a [...]

@@ -217,7 +217,7 @@ It becomes evident that StreamPark essentially uploads the
jar package to the Fl
### Custom Code Mode
-To our delight, StreamPark also provides support for coding
DataStream/FlinkSQL tasks. For special requirements, we can author our
implementations in Java/Scala. You can compose tasks following the scaffold
method recommended by StreamPark or write a standard Flink task. By adopting
this approach, we can delegate code management to git, utilizing the platform
for automated compilation, packaging, and deployment. Naturally, if
functionality can be achieved via SQL, we would prefer not to [...]
+To our delight, StreamPark also provides support for coding
DataStream/FlinkSQL tasks. For special requirements, we can write our own
implementations in Java/Scala. You can compose tasks following the scaffold
method recommended by StreamPark or write a standard Flink task. By adopting
this approach, we can delegate code management to git, utilizing the platform
for automated compilation, packaging, and deployment. Naturally, if
functionality can be achieved via SQL, we would prefer not to [...]
<br/><br/>
@@ -225,11 +225,11 @@ To our delight, StreamPark also provides support for
coding DataStream/FlinkSQL
## Suggestions for Improvement
-StreamPark, as with any new tool, does have areas ripe for enhancement based
on our current evaluations:
+StreamPark, like any other new tool, does have areas for further enhancement
based on our current evaluations:
* **Strengthening Resource Management**: Features like multi-file system jar
resources and robust task versioning are still awaiting additions.
* **Enriching Frontend Features**: For instance, once a task is added,
functionalities like copying could be integrated.
-* **Visualization of Task Submission Logs**: The process of task submission
involves loading class files, jar packaging, building and submitting images,
and more. A failure at any of these stages could halt the task. Yet, error logs
often lack clarity, or due to some anomaly, the exceptions aren't thrown as
expected, leaving users puzzled about rectifications.
+* **Visualization of Task Submission Logs**: The process of task submission
involves loading class files, jar packaging, building and submitting images,
and more. A failure at any of these stages could halt the task. However, error
logs are not always clear, or due to some anomaly, the exceptions aren't thrown
as expected, leaving users puzzled about rectifications.
It's a universal truth that innovations aren't perfect from the outset.
Although minor issues exist and there are areas for improvement with
StreamPark, its merits outweigh its limitations. As a result, we've chosen
StreamPark as our Flink DevOps platform. We're also committed to collaborating
with its main developers to refine StreamPark further. We wholeheartedly invite
others to use it and contribute towards its advancement.
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-blog/0-streampark-flink-on-k8s.md
b/i18n/zh-CN/docusaurus-plugin-content-blog/0-streampark-flink-on-k8s.md
index 4e9096c8..39e96328 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-blog/0-streampark-flink-on-k8s.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-blog/0-streampark-flink-on-k8s.md
@@ -38,7 +38,7 @@ RUN mkdir -p $FLINK_HOME/usrlib
COPY my-flink-job.jar $FLINK_HOME/usrlib/my-flink-job.jar
```
-4. Use the Flink client script to start the Flink task
+3. Use the Flink client script to start the Flink task
```shell
@@ -52,13 +52,13 @@ COPY my-flink-job.jar $FLINK_HOME/usrlib/my-flink-job.jar
local:///opt/flink/usrlib/my-flink-job.jar
```
-5. Use the Kubectl command to obtain the WebUI access address and JobId of the Flink job
+4. Use the Kubectl command to obtain the WebUI access address and JobId of the Flink job
```shell
kubectl -n flink-cluster get svc
```
-6. Stop the job using the Flink command
+5. Stop the job using the Flink command
```shell
./bin/flink cancel
@@ -186,7 +186,7 @@ StreamPark was adopted relatively late at RELX Technology and is currently mainly used for real-time data integration jobs
## Problems encountered
-Any new technology involves a process of exploration and stepping into pitfalls. The experience of failure is precious. Here we introduce some of the pitfalls and lessons StreamPark encountered during its implementation at RELX Technology. **This part is not only about StreamPark; I believe it will bring some reference to all friends who use Flink on Kubernetes.
+Any new technology involves a process of exploration and stepping into pitfalls. The experience of failure is precious. Here we introduce some of the pitfalls and lessons StreamPark encountered during its implementation at RELX Technology. **This part is not only about StreamPark; I believe it will bring some reference to all friends who use Flink on Kubernetes**.
### **FAQs are summarized below**
@@ -222,7 +222,9 @@ HDFS, Alibaba Cloud OSS, and AWS S3 can all store checkpoints and savepoints; Flin
This issue is related to the Kubernetes pod image pull policy. It is recommended to set the Pod image pull policy to Always:
+```shell
-Dkubernetes.container.image.pull-policy=Always
+```
- **Each restart of the task results in one more Job instance**