This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/dolphinscheduler-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
new fc7f364 Automated deployment: ec04ffd7523f34c87a9a9b7b03730713f32efff1
fc7f364 is described below
commit fc7f3642a22686a56e64dbf3c11b6ba9928d575e
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Fri Dec 17 02:45:09 2021 +0000
Automated deployment: ec04ffd7523f34c87a9a9b7b03730713f32efff1
---
en-us/blog/YouZan-case-study.html | 321 +++++++++++++++++++++++++++-----------
en-us/blog/YouZan-case-study.json | 2 +-
2 files changed, 229 insertions(+), 94 deletions(-)
diff --git a/en-us/blog/YouZan-case-study.html
b/en-us/blog/YouZan-case-study.html
index d820d33..0bdc298 100644
--- a/en-us/blog/YouZan-case-study.html
+++ b/en-us/blog/YouZan-case-study.html
@@ -12,157 +12,292 @@
</head>
<body>
<div id="root"><div class="blog-detail-page" data-reactroot=""><header
class="header-container header-container-dark"><div class="header-body"><span
class="mobile-menu-btn mobile-menu-btn-dark"></span><a
href="/en-us/index.html"><img class="logo" src="/img/hlogo_white.svg"/></a><div
class="search search-dark"><span class="icon-search"></span></div><span
class="language-switch language-switch-dark">中</span><div
class="header-menu"><div><ul class="ant-menu whiteClass ant-menu-light ant-m
[...]
-<p><a href="https://imgpp.com/image/i2Fo0"><img
src="https://imgpp.com/images/2021/12/16/1639383815755.md.png"
alt="1639383815755.md.png"></a></p>
-<p>At the recent Apache DolphinScheduler Meetup 2021, Zheqi Song, the Director
of Youzan Big Data Development Platform shared the design scheme and production
environment practice of its scheduling system migration from Airflow to Apache
DolphinScheduler.</p>
-<p>This post-90s young man from Hangzhou, Zhejiang Province joined Youzan in
September 2019, where he is engaged in the research and development of data
development platforms, scheduling systems, and data synchronization modules.
When he first joined, Youzan used Airflow, which is also an Apache open source
project, but after research and production environment testing, Youzan decided
to switch to DolphinScheduler.</p>
-<p>How does the Youzan big data development platform use the scheduling
system? Why did Youzan decide to switch to Apache DolphinScheduler? The message
below will uncover the truth.</p>
+<div align=center>
+<img src="https://imgpp.com/images/2021/12/16/1639383815755.md.png"/>
+</div>
+<p>At the recent Apache DolphinScheduler Meetup 2021, Zheqi Song, the Director
of Youzan Big Data Development Platform
+shared the design scheme and production environment practice of its scheduling
system migration from Airflow to Apache
+DolphinScheduler.</p>
+<p>This post-90s young man from Hangzhou, Zhejiang Province joined Youzan in
September 2019, where he is engaged in the
+research and development of data development platforms, scheduling systems,
and data synchronization modules. When he
+first joined, Youzan used Airflow, which is also an Apache open source
project, but after research and production
+environment testing, Youzan decided to switch to DolphinScheduler.</p>
+<p>How does the Youzan big data development platform use the scheduling system,
+and why did Youzan decide to switch to Apache DolphinScheduler? The summary of
+his talk below answers both questions.</p>
<h2>Youzan Big Data Development Platform(DP)</h2>
-<p>As a retail technology SaaS service provider, Youzan is aimed to help
online merchants open stores, build data products and digital solutions through
social marketing and expand the omnichannel retail business, and provide better
SaaS capabilities for driving merchants' digital growth.</p>
+<p>As a retail technology SaaS provider, Youzan aims to help online merchants
+open stores, build data products and digital solutions through social marketing,
+and expand their omnichannel retail business, and to provide better SaaS
+capabilities that drive merchants' digital growth.</p>
<p>At present, Youzan has established a relatively complete digital product
matrix with the support of the data center:</p>
-<p><a href="https://imgpp.com/image/i2gJb"><img
src="https://imgpp.com/images/2021/12/16/1_Jjgx5qQfjo559_oaJP-DAQ.md.png"
alt="1_Jjgx5qQfjo559_oaJP-DAQ.md.png"></a></p>
-<p>Youzan has established a big data development platform (hereinafter
referred to as DP platform) to support the increasing demand for data
processing services. This is a big data offline development platform that
provides users with the environment, tools, and data needed for the big data
tasks development.</p>
-<p><a href="https://imgpp.com/image/i2jiJ"><img
src="https://imgpp.com/images/2021/12/16/1_G9znZGQ1XBhJva0tjWa6Bg.md.png"
alt="1_G9znZGQ1XBhJva0tjWa6Bg.md.png"></a></p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_Jjgx5qQfjo559_oaJP-DAQ.md.png"/>
+</div>
+<p>Youzan has established a big data development platform (hereinafter referred
+to as the DP platform) to support the increasing demand for data processing
+services. It is an offline big data development platform that provides users
+with the environment, tools, and data needed to develop big data tasks.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_G9znZGQ1XBhJva0tjWa6Bg.md.png"/>
+</div>
<p>Youzan Big Data Development Platform Architecture</p>
-<p>Youzan Big Data Development Platform is mainly composed of five modules:
basic component layer, task component layer, scheduling layer, service layer,
and monitoring layer. Among them, the service layer is mainly responsible for
the job life cycle management, and the basic component layer and the task
component layer mainly include the basic environment such as middleware and big
data components that the big data development platform depends on. The service
deployment of the DP platfo [...]
+<p>Youzan Big Data Development Platform is mainly composed of five modules:
basic component layer, task component layer,
+scheduling layer, service layer, and monitoring layer. Among them, the service
layer is mainly responsible for the job
+life cycle management, and the basic component layer and the task component
layer mainly include the basic environment
+such as middleware and big data components that the big data development
platform depends on. The service deployment of
+the DP platform mainly adopts the master-slave mode, and the master node
supports HA. The scheduling layer is
+re-developed based on Airflow, and the monitoring layer performs comprehensive
monitoring and early warning of the
+scheduling cluster.</p>
<h3>1 Scheduling layer architecture design</h3>
-<p><a href="https://imgpp.com/image/i2nK7"><img
src="https://imgpp.com/images/2021/12/16/1_UDNCmMrZtcswj62aqNXA1g.md.png"
alt="1_UDNCmMrZtcswj62aqNXA1g.md.png"></a></p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_UDNCmMrZtcswj62aqNXA1g.md.png"/>
+</div>
<p>Youzan Big Data Development Platform Scheduling Layer Architecture
Design</p>
-<p>In 2017, our team investigated the mainstream scheduling systems, and
finally adopted Airflow (1.7) as the task scheduling module of DP. In the
design of architecture, we adopted the deployment plan of Airflow + Celery +
Redis + MySQL based on actual business scenario demand, with Redis as the
dispatch queue, and implemented distributed deployment of any number of workers
through Celery.</p>
-<p>In the HA design of the scheduling node, it is well known that Airflow has
a single point problem on the scheduled node. To achieve high availability of
scheduling, the DP platform uses the Airflow Scheduler Failover Controller, an
open-source component, and adds a Standby node that will periodically monitor
the health of the Active node. Once the Active node is found to be unavailable,
Standby is switched to Active to ensure the high availability of the
schedule.</p>
+<p>In 2017, our team investigated the mainstream scheduling systems and finally
+adopted Airflow (1.7) as the task scheduling module of DP. Based on our actual
+business scenarios, we chose a deployment of Airflow + Celery + Redis + MySQL,
+with Redis as the dispatch queue, and used Celery to deploy any number of
+workers in a distributed way.</p>
+<p>As for the HA design of the scheduling node, it is well known that the
+Airflow scheduler is a single point of failure. To achieve high availability of
+scheduling, the DP platform uses the Airflow Scheduler Failover Controller, an
+open-source component, and adds a Standby node that periodically monitors the
+health of the Active node. Once the Active node is found to be unavailable,
+the Standby node is switched to Active to keep scheduling highly available.</p>
<h3>2 Worker nodes load balancing strategy</h3>
-<p>In addition, to use resources more effectively, the DP platform
distinguishes task types based on CPU-intensive degree/memory-intensive degree
and configures different slots for different celery queues to ensure that each
machine's CPU/memory usage rate is maintained within a reasonable range.</p>
+<p>In addition, to use resources more effectively, the DP platform classifies
+task types by how CPU-intensive or memory-intensive they are and configures
+different slots for different Celery queues, so that each machine's CPU and
+memory usage stays within a reasonable range.</p>
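<p>The queue-and-slot idea above can be sketched as follows. This is only an illustration: the queue names, slot counts, and task classification are assumptions made for the example, not values from Youzan's deployment, and the code does not use Celery itself.</p>

```python
# Hypothetical queue names and slot counts; the post gives no actual values.
QUEUE_SLOTS = {"cpu_intensive": 4, "memory_intensive": 2, "default": 8}

# Hypothetical classification of task types by their dominant resource.
TASK_PROFILE = {
    "hive_sql": "cpu_intensive",
    "datax": "cpu_intensive",
    "spark": "memory_intensive",
}


def queue_for(task_type: str) -> str:
    """Pick the queue for a task type, falling back to the default queue."""
    return TASK_PROFILE.get(task_type, "default")


def can_accept(queue: str, running: int) -> bool:
    """A worker accepts another task only while the queue has a free slot."""
    return running < QUEUE_SLOTS[queue]
```

<p>Giving memory-heavy queues fewer slots than CPU-heavy ones is one simple way to keep a single machine from being saturated on either resource.</p>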
<h2>Scheduling System Upgrade and Selection</h2>
-<p>Since the official launch of the Youzan Big Data Platform 1.0 in 2017, we
have completed 100% of the data warehouse migration plan in 2018. In 2019, the
daily scheduling task volume has reached 30,000+ and has grown to 60,000+ by
2021. the platform’s daily scheduling task volume will be reached. With the
rapid increase in the number of tasks, DP's scheduling system also faces many
challenges and problems.</p>
+<p>Since the official launch of Youzan Big Data Platform 1.0 in 2017, we
+completed 100% of the data warehouse migration plan in 2018. The daily
+scheduling task volume reached 30,000+ in 2019 and had grown to 60,000+ by
+2021. With the rapid increase in the number of tasks, DP's scheduling system
+also faces many challenges and problems.</p>
<h3>1 Pain points of Airflow</h3>
<ol>
-<li>In-depth re-development is difficult, the commercial version is separated
from the community, and costs relatively high to upgrade ;</li>
+<li>In-depth re-development is difficult: the commercial version is separated
+from the community version, and upgrading is relatively costly;</li>
<li>Because Airflow is based on the Python technology stack, maintenance and
iteration costs are higher;</li>
<li>Performance issues:</li>
</ol>
-<p><a href="https://imgpp.com/image/iR2cZ"><img
src="https://imgpp.com/images/2021/12/16/1_U33OWzzfw2Dqn3ryCNbSvw.md.png"
alt="1_U33OWzzfw2Dqn3ryCNbSvw.md.png"></a></p>
-<p>Airflow's schedule loop, as shown in the figure above, is essentially the
loading and analysis of DAG and generates DAG round instances to perform task
scheduling. Before Airflow 2.0, the DAG was scanned and parsed into the
database by a single point. It leads to a large delay (over the scanning
frequency, even to 60s-70s) for the scheduler loop to scan the Dag folder once
the number of Dags was largely due to business growth. This seriously reduces
the scheduling performance.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_U33OWzzfw2Dqn3ryCNbSvw.md.png"/>
+</div>
+<p>Airflow's scheduler loop, as shown in the figure above, essentially loads
+and parses the DAGs and generates DAG run instances to perform task scheduling.
+Before Airflow 2.0, the DAGs were scanned and parsed into the database by a
+single node. Once the number of DAGs grew large due to business growth, the
+scheduler loop took a long time to scan the DAG folder (longer than the
+scanning interval, even 60s-70s), which seriously reduced scheduling
+performance.</p>
<ol start="4">
<li>Stability issues:</li>
</ol>
-<p>The Airflow Scheduler Failover Controller is essentially run by a
master-slave mode. The standby node judges whether to switch by monitoring
whether the active process is alive or not. If it encounters a deadlock
blocking the process before, it will be ignored, which will lead to scheduling
failure. After similar problems occurred in the production environment, we
found the problem after troubleshooting. Although Airflow version 1.10 has
fixed this problem, this problem will exist in [...]
+<p>The Airflow Scheduler Failover Controller essentially runs in master-slave
+mode. The standby node decides whether to switch by monitoring whether the
+active process is alive. If the active process is blocked by a deadlock, it is
+still alive, so the standby node does not notice the problem and scheduling
+fails. We ran into this in the production environment and found the cause only
+after troubleshooting. Although Airflow 1.10 has fixed this problem, this kind
+of problem remains inherent in the master-slave mode and cannot be ignored in
+a production environment.</p>
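<p>The failure mode described above can be illustrated with a small sketch. It is not the Airflow Scheduler Failover Controller's actual code; the function names and the 30-second timeout are assumptions chosen for the example. It contrasts a check that only asks whether the process is alive with one that also requires recent progress.</p>

```python
def process_alive_check(pid_is_running: bool) -> bool:
    """Liveness only: a deadlocked scheduler still passes this check."""
    return pid_is_running


def heartbeat_check(last_heartbeat: float, now: float, timeout_s: float = 30.0) -> bool:
    """Progress based: healthy only if the scheduler loop reported recently."""
    return (now - last_heartbeat) <= timeout_s


def should_failover(pid_is_running: bool, last_heartbeat: float, now: float,
                    timeout_s: float = 30.0) -> bool:
    """Fail over when the process is gone OR has stopped making progress."""
    healthy = process_alive_check(pid_is_running) and heartbeat_check(
        last_heartbeat, now, timeout_s
    )
    return not healthy
```

<p>With a deadlocked scheduler, the process is still running but its last heartbeat is stale, so only the combined check triggers a switch to the standby node.</p>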
<p>Taking into account the above pain points, we decided to re-select the
scheduling system for the DP platform.</p>
-<p>In the process of research and comparison, Apache DolphinScheduler entered
our field of vision. Also to be Apache's top open-source scheduling component
project, we have made a comprehensive comparison between the original
scheduling system and DolphinScheduler from the perspectives of performance,
deployment, functionality, stability, and availability, and community
ecology.</p>
+<p>In the process of research and comparison, Apache DolphinScheduler came to
+our attention. As it is also a top-level Apache open-source scheduling project,
+we made a comprehensive comparison between the original scheduling system and
+DolphinScheduler in terms of performance, deployment, functionality, stability
+and availability, and community ecosystem.</p>
<p>The results of the comparative analysis are shown below:</p>
-<p><a href="https://imgpp.com/image/iRJWj"><img
src="https://imgpp.com/images/2021/12/16/1_Rbr05klPmQIc7WPFNeEH-w.md.png"
alt="1_Rbr05klPmQIc7WPFNeEH-w.md.png"></a></p>
-<p><a href="https://imgpp.com/image/iRPvA"><img
src="https://imgpp.com/images/2021/12/16/1_Ity1QoRL_Yu5aDVClY9AgA.md.png"
alt="1_Ity1QoRL_Yu5aDVClY9AgA.md.png"></a>
-Airflow VS DolphinScheduler</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_Rbr05klPmQIc7WPFNeEH-w.md.png"/>
+</div>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_Ity1QoRL_Yu5aDVClY9AgA.md.png"/>
+</div>
+<p>Airflow VS DolphinScheduler</p>
<h3>2 DolphinScheduler evaluation</h3>
-<p><a href="https://imgpp.com/image/iRUHk"><img
src="https://imgpp.com/images/2021/12/16/1_o8c1Y1TFAOis3KozzJnvfA.md.png"
alt="1_o8c1Y1TFAOis3KozzJnvfA.md.png"></a></p>
-<p>As shown in the figure above, after evaluating, we found that the
throughput performance of DolphinScheduler is twice that of the original
scheduling system under the same conditions. And we have heard that the
performance of DolphinScheduler will greatly be improved after version 2.0,
this news greatly excites us.</p>
-<p>In addition, at the deployment level, the Java technology stack adopted by
DolphinScheduler is conducive to the standardized deployment process of ops,
simplifies the release process, liberates operation and maintenance manpower,
and supports Kubernetes and Docker deployment with stronger scalability.</p>
-<p>In terms of new features, DolphinScheduler has a more flexible
task-dependent configuration, to which we attach much importance, and the
granularity of time configuration is refined to the hour, day, week, and month.
In addition, DolphinScheduler's scheduling management interface is easier to
use and supports worker group isolation. As a distributed scheduling, the
overall scheduling capability of DolphinScheduler grows linearly with the scale
of the cluster, and with the release of n [...]
-<p>From the perspective of stability and availability, DolphinScheduler
achieves high reliability and high scalability, the decentralized multi-Master
multi-Worker design architecture supports dynamic online and offline services
and has stronger self-fault tolerance and adjustment capabilities.</p>
-<p></p>
-<p>And also importantly, after months of communication, we found that the
DolphinScheduler community is highly active, with frequent technical exchanges,
detailed technical documents outputs, and fast version iteration.</p>
-<p></p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_o8c1Y1TFAOis3KozzJnvfA.md.png"/>
+</div>
+<p>As shown in the figure above, our evaluation found that the throughput of
+DolphinScheduler is twice that of the original scheduling system under the same
+conditions. We have also heard that the performance of DolphinScheduler will
+improve greatly after version 2.0, which excites us.</p>
+<p>In addition, at the deployment level, the Java technology stack adopted by
+DolphinScheduler helps ops standardize the deployment process, simplifies the
+release process, frees up operation and maintenance manpower, and supports
+Kubernetes and Docker deployment with stronger scalability.</p>
+<p>In terms of new features, DolphinScheduler has a more flexible task
+dependency configuration, to which we attach much importance, and the
+granularity of time configuration is refined to the hour, day, week, and month.
+In addition, DolphinScheduler's scheduling management interface is easier to
+use and supports worker group isolation. As a distributed scheduler, the
+overall scheduling capability of DolphinScheduler grows linearly with the scale
+of the cluster, and with the release of the new task plug-in feature, task-type
+customization is also becoming an attractive feature.</p>
+<p>From the perspective of stability and availability, DolphinScheduler achieves
+high reliability and high scalability. Its decentralized multi-Master,
+multi-Worker architecture supports bringing services online and offline
+dynamically and has stronger fault tolerance and self-adjustment
+capabilities.</p>
+<p>Just as importantly, after months of communication, we found that the
+DolphinScheduler community is highly active, with frequent technical exchanges,
+detailed technical documentation, and fast version iteration.</p>
<p>In summary, we decided to switch to DolphinScheduler.</p>
<h2>DolphinScheduler Migration Scheme Design</h2>
-<p>After deciding to migrate to DolphinScheduler, we sorted out the platform's
requirements for the transformation of the new scheduling system.</p>
-<p></p>
+<p>After deciding to migrate to DolphinScheduler, we sorted out the platform's
+requirements for the new scheduling system.</p>
<p>In summary, the key requirements are as follows:</p>
-<p></p>
<ol>
-<li>Users are not aware of migration. There are 700-800 users on the platform,
we hope that the user switching cost can be reduced;</li>
-<li>The scheduling system can be dynamically switched because the production
environment requires stability above all else. The online grayscale test will
be performed during the online period, we hope that the scheduling system can
be dynamically switched based on the granularity of the workflow;</li>
-<li>The workflow configuration for testing and publishing needs to be
isolated. Currently, we have two sets of configuration files for task testing
and publishing that are maintained through GitHub. Online scheduling task
configuration needs to ensure the accuracy and stability of the data, so two
sets of environments are required for isolation.</li>
+<li>The migration should be transparent to users. There are 700-800 users on
+the platform, and we hope to keep their switching cost as low as possible;</li>
+<li>The scheduling system can be switched dynamically, because the production
+environment requires stability above all else. An online grayscale test will
+be performed during the rollout period, so we hope that the scheduling system
+can be switched dynamically at the granularity of the workflow;</li>
+<li>The workflow configuration for testing and publishing needs to be isolated.
+Currently, we have two sets of configuration files for task testing and
+publishing that are maintained through GitHub. The online scheduling task
+configuration needs to ensure the accuracy and stability of the data, so two
+sets of environments are required for isolation.</li>
</ol>
<p>In response to the above three points, we have redesigned the
architecture.</p>
-<p></p>
<h3>1 Architecture design</h3>
<ol>
<li>Keep the existing front-end interface and DP API;</li>
-<li>Refactoring the scheduling management interface, which was originally
embedded in the Airflow interface, and will be rebuilt based on
DolphinScheduler in the future;</li>
+<li>Refactor the scheduling management interface, which was originally embedded
+in the Airflow interface and will be rebuilt based on DolphinScheduler in the
+future;</li>
<li>Task lifecycle management/scheduling management and other operations
interact through the DolphinScheduler API;</li>
-<li>Use the Project mechanism to redundantly configure the workflow to achieve
configuration isolation for testing and release.</li>
+<li>Use the Project mechanism to redundantly configure the workflow to achieve
configuration isolation for testing and
+release.</li>
</ol>
-<p><a href="https://imgpp.com/image/iRdIC"><img
src="https://imgpp.com/images/2021/12/16/1_eusVhW4QAJ2uO-J96bqiFg.md.png"
alt="1_eusVhW4QAJ2uO-J96bqiFg.md.png"></a>
-Refactoring Design</p>
-<p></p>
-<p>We entered the transformation phase after the architecture design is
completed. We have transformed DolphinScheduler's workflow definition, task
execution process, and workflow release process, and have made some key
functions to complement it.</p>
-<p></p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_eusVhW4QAJ2uO-J96bqiFg.md.png"/>
+</div>
+<p>Refactoring Design</p>
+<p>We entered the transformation phase after the architecture design was
+completed. We have adapted DolphinScheduler's workflow definition, task
+execution process, and workflow release process, and have added some key
+functions to complement it.</p>
<ul>
<li>Workflow definition status combing</li>
</ul>
-<p><a href="https://imgpp.com/image/iRhM6"><img
src="https://imgpp.com/images/2021/12/16/-1.md.png" alt="-1.md.png"></a></p>
-<p>We first combed the definition status of the DolphinScheduler workflow. The
definition and timing management of DolphinScheduler work will be divided into
online and offline status, while the status of the two on the DP platform is
unified, so in the task test and workflow release process, the process series
from DP to DolphinScheduler needs to be modified accordingly.</p>
+<div align=center>
+<img src="https://imgpp.com/images/2021/12/16/-1.md.png"/>
+</div>
+<p>We first sorted out the definition status of the DolphinScheduler workflow.
+In DolphinScheduler, the workflow definition and its timing management each
+have separate online and offline states, while on the DP platform the two share
+a single state. Therefore, in the task test and workflow release processes, the
+sequence of calls from DP to DolphinScheduler needs to be modified
+accordingly.</p>
<ul>
<li>Task execution process transformation</li>
</ul>
-<p>Firstly, we have changed the task test process. After switching to
DolphinScheduler, all interactions are based on the DolphinScheduler API. When
the task test is started on DP, the corresponding workflow definition
configuration will be generated on the DolphinScheduler. After going online,
the task will be run and the DolphinScheduler log will be called to view the
results and obtain log running information in real-time.</p>
-<p><a href="https://imgpp.com/image/iRhM6"><img
src="https://imgpp.com/images/2021/12/16/-1.md.png" alt="-1.md.png"></a></p>
-<p><a href="https://imgpp.com/image/iRtJH"><img
src="https://imgpp.com/images/2021/12/16/-3.md.png" alt="-3.md.png"></a></p>
-<ul>
-<li>Workflow release process transformation</li>
-</ul>
-<p>Secondly, for the workflow online process, after switching to
DolphinScheduler, the main change is to synchronize the workflow definition
configuration and timing configuration, as well as the online status.</p>
-<p><a href="https://imgpp.com/image/iRBNI"><img
src="https://imgpp.com/images/2021/12/16/1_4-ikFp_jJ44-YWJcGNioOg.md.png"
alt="1_4-ikFp_jJ44-YWJcGNioOg.md.png"></a></p>
-<p><a href="https://imgpp.com/image/iRwim"><img
src="https://imgpp.com/images/2021/12/16/-5.md.png" alt="-5.md.png"></a></p>
-<p>The original data maintenance and configuration synchronization of the
workflow is managed based on the DP master, and only when the task is online
and running will it interact with the scheduling system. Based on these two
core changes, the DP platform can dynamically switch systems under the
workflow, and greatly facilitate the subsequent online grayscale test.</p>
+<p>Firstly, we have changed the task test process. After switching to
+DolphinScheduler, all interactions are based on the DolphinScheduler API. When
+a task test is started on DP, the corresponding workflow definition
+configuration is generated on DolphinScheduler. After the definition goes
+online, the task is run, and the DolphinScheduler log interface is called to
+view the results and obtain the running log in real time.</p>
+<div align=center>
+<img src="https://imgpp.com/images/2021/12/16/-1.md.png"/>
+</div>
+<div align=center>
+<img src="https://imgpp.com/images/2021/12/16/-3.md.png"/>
+</div>
+<ul>
+<li>Workflow release process transformation</li>
+</ul>
+<p>Secondly, for the workflow release process, after switching to
+DolphinScheduler, the main change is to synchronize the workflow definition
+configuration and timing configuration, as well as the online status.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_4-ikFp_jJ44-YWJcGNioOg.md.png"/>
+</div>
+<div align=center>
+<img src="https://imgpp.com/images/2021/12/16/-5.md.png"/>
+</div>
+<p>The workflow's metadata maintenance and configuration synchronization are
+managed by the DP master, and DP interacts with the scheduling system only when
+a task goes online and runs. Based on these two core changes, the DP platform
+can switch scheduling systems dynamically at the workflow level, which greatly
+facilitates the subsequent online grayscale test.</p>
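<p>A minimal sketch of switching the scheduling system per workflow is shown below. The class and method names are hypothetical and the backends only return strings; the real DP platform would call the Airflow and DolphinScheduler APIs instead.</p>

```python
from typing import Dict, Protocol


class SchedulerBackend(Protocol):
    def trigger(self, workflow_id: str) -> str: ...


class AirflowBackend:
    """Stand-in for the original scheduling system (hypothetical)."""

    def trigger(self, workflow_id: str) -> str:
        return f"airflow:{workflow_id}"


class DolphinSchedulerBackend:
    """Stand-in for the new scheduling system (hypothetical)."""

    def trigger(self, workflow_id: str) -> str:
        return f"dolphinscheduler:{workflow_id}"


class SchedulerRouter:
    """Routes each workflow to a backend; unassigned workflows stay on legacy."""

    def __init__(self, legacy: SchedulerBackend, target: SchedulerBackend) -> None:
        self._backends: Dict[str, SchedulerBackend] = {"legacy": legacy, "target": target}
        self._assignment: Dict[str, str] = {}

    def switch(self, workflow_id: str, backend: str) -> None:
        if backend not in self._backends:
            raise ValueError(f"unknown backend: {backend}")
        self._assignment[workflow_id] = backend

    def trigger(self, workflow_id: str) -> str:
        backend = self._assignment.get(workflow_id, "legacy")
        return self._backends[backend].trigger(workflow_id)
```

<p>Because the assignment is kept per workflow, a grayscale rollout can move a few workflows to the new backend and move them back if anything goes wrong, without touching the rest.</p>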
<h3>2 Function completion</h3>
-<p></p>
<p>In addition, the DP platform has also filled in some missing functions. The
first is the adaptation of task types.</p>
-<p></p>
<ul>
<li>Task type adaptation</li>
</ul>
-<p>Currently, the task types supported by the DolphinScheduler platform mainly
include data synchronization and data calculation tasks, such as Hive SQL
tasks, DataX tasks, and Spark tasks. Because the original data information of
the task is maintained on the DP, the docking scheme of the DP platform is to
build a task configuration mapping module in the DP master, map the task
information maintained by the DP to the task on DP, and then use the API call
of DolphinScheduler to transfer [...]
-<p><a href="https://imgpp.com/image/iROc4"><img
src="https://imgpp.com/images/2021/12/16/1_A76iOa5LKyPiu-NoopmYrA.md.png"
alt="1_A76iOa5LKyPiu-NoopmYrA.md.png"></a></p>
-<p>Because some of the task types are already supported by DolphinScheduler,
it is only necessary to customize the corresponding task modules of
DolphinScheduler to meet the actual usage scenario needs of the DP platform.
For the task types not supported by DolphinScheduler, such as Kylin tasks,
algorithm training tasks, DataY tasks, etc., the DP platform also plans to
complete it with the plug-in capabilities of DolphinScheduler 2.0.</p>
+<p>Currently, the task types supported by the DolphinScheduler platform mainly
+include data synchronization and data calculation tasks, such as Hive SQL
+tasks, DataX tasks, and Spark tasks. Because the task metadata is maintained
+on DP, the integration approach of the DP platform is to build a task
+configuration mapping module in the DP master, map the task information
+maintained by DP to the corresponding task on DolphinScheduler, and then pass
+the task configuration through DolphinScheduler API calls.</p>
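<p>The mapping module could look roughly like the sketch below. Everything in it is an assumption made for illustration: the <code>DpTask</code> fields, the type labels, and the payload keys are placeholders, not DP's data model or the DolphinScheduler API schema.</p>

```python
from dataclasses import dataclass


@dataclass
class DpTask:
    """Hypothetical shape of a task record maintained on DP."""
    task_id: int
    name: str
    dp_type: str
    content: str


# Placeholder mapping from DP task types to scheduler-side type labels.
TYPE_MAPPING = {"hive_sql": "SQL", "datax": "DATAX", "spark": "SPARK"}


def to_scheduler_payload(task: DpTask) -> dict:
    """Translate a DP task into a payload that an API client could send."""
    if task.dp_type not in TYPE_MAPPING:
        # Unsupported types would need a custom task plug-in instead.
        raise ValueError(f"no mapping for task type: {task.dp_type}")
    return {
        "name": f"dp_{task.task_id}_{task.name}",
        "taskType": TYPE_MAPPING[task.dp_type],
        "params": {"raw": task.content},
    }
```

<p>Keeping the mapping in one table makes it clear which task types are already covered and which still need adaptation.</p>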
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_A76iOa5LKyPiu-NoopmYrA.md.png"/>
+</div>
+<p>Because some of the task types are already supported by DolphinScheduler, it
+is only necessary to customize the corresponding DolphinScheduler task modules
+to meet the actual usage scenarios of the DP platform. For the task types not
+supported by DolphinScheduler, such as Kylin tasks, algorithm training tasks,
+and DataY tasks, the DP platform plans to add them with the plug-in
+capabilities of DolphinScheduler 2.0.</p>
<h3>3 Transformation schedule</h3>
-<p>Because SQL tasks and synchronization tasks on the DP platform account for
about 80% of the total tasks, the transformation focuses on these task types.
At present, the adaptation and transformation of Hive SQL tasks, DataX tasks,
and script tasks adaptation have been completed.</p>
-<p><a href="https://imgpp.com/image/iRYY8"><img
src="https://imgpp.com/images/2021/12/16/1_y7HUfYyLs9NxnTzENKGSCA.md.png"
alt="1_y7HUfYyLs9NxnTzENKGSCA.md.png"></a></p>
-<h3>4 Function complement</h3>
+<p>Because SQL tasks and synchronization tasks on the DP platform account for
+about 80% of the total tasks, the transformation focuses on these task types.
+At present, the adaptation and transformation of Hive SQL tasks, DataX tasks,
+and script tasks have been completed.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_y7HUfYyLs9NxnTzENKGSCA.md.png"/>
+</div>
+<h3>4 Function complement</h3>
<ul>
<li>Catchup mechanism realizes automatic replenishment</li>
</ul>
-<p>DP also needs a core capability in the actual production environment, that
is, Catchup-based automatic replenishment and global replenishment
capabilities.</p>
-<p>The catchup mechanism will play a role when the scheduling system is
abnormal or resources is insufficient, causing some tasks to miss the currently
scheduled trigger time. When the scheduling is resumed, Catchup will
automatically fill in the untriggered scheduling execution plan.</p>
+<p>DP also needs a core capability in the actual production environment, that
+is, Catchup-based automatic replenishment and global replenishment
+capabilities.</p>
+<p>The Catchup mechanism comes into play when the scheduling system is abnormal
+or resources are insufficient, causing some tasks to miss their scheduled
+trigger time. When scheduling resumes, Catchup automatically fills in the
+scheduling execution plans that were not triggered.</p>
<p>The following three pictures show the instance of an hour-level workflow
scheduling execution.</p>
-<p>In Figure 1, the workflow is called up on time at 6 o'clock and tuned up
once an hour. You can see that the task is called up on time at 6 o'clock and
the task execution is completed. The current state is also normal.</p>
-<p><a href="https://imgpp.com/image/iRk6U"><img
src="https://imgpp.com/images/2021/12/16/1_MvQGZ-FKKLMvKrlWihXHgg.md.png"
alt="1_MvQGZ-FKKLMvKrlWihXHgg.md.png"></a></p>
+<p>In Figure 1, the workflow is scheduled to be triggered once an hour. You can
+see that it is triggered on time at 6 o'clock and the task execution is
+completed. The current state is also normal.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_MvQGZ-FKKLMvKrlWihXHgg.md.png"/>
+</div>
<p>Figure 1</p>
-<p>Figure 2 shows that the scheduling system was abnormal at 8 o'clock,
causing the workflow not to be activated at 7 o'clock and 8 o'clock.</p>
-<p><a href="https://imgpp.com/image/iRGHe"><img
src="https://imgpp.com/images/2021/12/16/1_1WxLOtd1Oh2YERmtGcRb0Q.md.png"
alt="1_1WxLOtd1Oh2YERmtGcRb0Q.md.png"></a>
- figure 2</p>
-<p></p>
-<p>Figure 3 shows that when the scheduling is resumed at 9 o'clock, thanks to
the Catchup mechanism, the scheduling system can automatically replenish the
previously lost execution plan to realize the automatic replenishment of the
scheduling.</p>
-<p><a href="https://imgpp.com/image/iRSXD"><img
src="https://imgpp.com/images/2021/12/16/126ec1039f7aa614c.md.png"
alt="126ec1039f7aa614c.md.png"></a>
-Figure 3</p>
-<p></p>
-<p>This mechanism is particularly effective when the amount of tasks is large.
When the scheduled node is abnormal or the core task accumulation causes the
workflow to miss the scheduled trigger time, due to the system's fault-tolerant
mechanism can support automatic replenishment of scheduled tasks, there is no
need to replenish and re-run manually.</p>
+<p>Figure 2 shows that the scheduling system was abnormal at 8 o'clock,
so the workflow was not triggered at 7
+o'clock or 8 o'clock.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_1WxLOtd1Oh2YERmtGcRb0Q.md.png"/>
+</div>
+<p>Figure 2</p>
+<p>Figure 3 shows that when scheduling resumes at 9 o'clock,
the Catchup mechanism lets the scheduling system
+automatically replenish the execution plans that were
missed.</p>
+<div align=center>
+<img src="https://imgpp.com/images/2021/12/16/126ec1039f7aa614c.md.png"/>
+</div>
+<p>Figure 3</p>
+<p>This mechanism is particularly effective when the number of tasks is large.
When a scheduling node is abnormal or a
+backlog of core tasks causes a workflow to miss its scheduled trigger time,
the system's fault-tolerant
+mechanism replenishes the scheduled tasks automatically, so there is no
need to replenish and re-run them manually.</p>
<p>At the same time, this mechanism is also applied to DP's global
complement.</p>
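<p>The Catchup behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DP or Airflow implementation: the function name <code>missed_triggers</code> and its parameters are hypothetical, and the times mirror the hourly workflow in Figures 1 to 3.</p>

```python
from datetime import datetime, timedelta


def missed_triggers(last_triggered, now, interval):
    """Return every scheduled time after last_triggered, up to and including now.

    Hypothetical helper: it only lists the trigger times a scheduler
    would have to replay after an outage.
    """
    missed = []
    next_time = last_triggered + interval
    while next_time <= now:
        missed.append(next_time)
        next_time += interval
    return missed


# Hourly workflow: the last run was at 06:00 and scheduling resumes at 09:00.
last_run = datetime(2021, 12, 16, 6, 0)
recovered_at = datetime(2021, 12, 16, 9, 0)
to_replenish = missed_triggers(last_run, recovered_at, timedelta(hours=1))
print([t.strftime("%H:%M") for t in to_replenish])
```

<p>With these inputs the list contains the 07:00 and 08:00 runs that were lost, plus the 09:00 run that is due at recovery.</p>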
<ul>
<li>Global Complement across Dags</li>
</ul>
-<p><a href="https://imgpp.com/image/iRZa2"><img
src="https://imgpp.com/images/2021/12/16/1_eVyyABTQCLeSGzbbuizfDA.md.png"
alt="1_eVyyABTQCLeSGzbbuizfDA.md.png"></a></p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_eVyyABTQCLeSGzbbuizfDA.md.png"/>
+</div>
<p>DP platform cross-Dag global complement process</p>
-<p>The main use scenario of global complements in Youzan is when there is an
abnormality in the output of the core upstream table, which results in abnormal
data display in downstream businesses. In this case, the system generally needs
to quickly rerun all task instances under the entire data link.</p>
-<p>Based on the function of Clear, the DP platform is currently able to obtain
certain nodes and all downstream instances under the current scheduling cycle
through analysis of the original data, and then to filter some instances that
do not need to be rerun through the rule pruning strategy. After obtaining
these lists, start the clear downstream clear task instance function, and then
use Catchup to automatically fill up.</p>
+<p>At Youzan, global complement is mainly used when the
output of a core upstream
+table is abnormal, which leads to abnormal data in downstream businesses. In
this case, the system generally needs to
+quickly rerun all task instances along the entire data link.</p>
+<p>Building on the Clear function, the DP platform can currently obtain
the specified nodes and all of their downstream instances
+in the current scheduling cycle by analyzing the original data, and
then filter out instances that do not
+need to be rerun through a rule-based pruning strategy. Once this
list is obtained, the platform clears the downstream task
+instances and then uses Catchup to fill them in automatically.</p>
<p>This process uses Clear to achieve a global rerun starting from the core upstream,
which removes the need for manual operations.</p>
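<p>The steps above (collect downstream instances across DAGs, prune by rule, then clear) can be sketched as follows. This is a minimal sketch under stated assumptions, not the DP platform's code: the lineage dictionary, the task names, and the <code>should_skip</code> pruning rule are all hypothetical.</p>

```python
from collections import deque


def instances_to_clear(downstream, start_nodes, should_skip):
    """Collect start_nodes and everything downstream, minus pruned instances.

    downstream:  dict mapping a task to the tasks that depend on it
                 (edges may cross DAG boundaries).
    should_skip: hypothetical pruning rule; returns True for a task
                 that does not need to be rerun.
    """
    seen = set(start_nodes)
    queue = deque(start_nodes)
    order = []
    while queue:
        task = queue.popleft()
        if not should_skip(task):
            order.append(task)
        # Children are still visited even when the parent is pruned.
        for child in downstream.get(task, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


# Hypothetical lineage spanning three DAGs.
lineage = {
    "dag_a.core_table": ["dag_a.report", "dag_b.dashboard"],
    "dag_b.dashboard": ["dag_c.export"],
    "dag_a.report": [],
}
to_clear = instances_to_clear(
    lineage, ["dag_a.core_table"], lambda task: task == "dag_a.report"
)
print(to_clear)
```

<p>In this sketch pruning removes only the matching instance and keeps walking its children. Whether a pruned node should also cut off its whole subtree is a design choice that the source does not specify.</p>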
-<p>Because the cross-Dag global complement capability is important in a
production environment, we plan to complement it in DolphinScheduler.</p>
+<p>Because the cross-Dag global complement capability is important in a
production environment, we plan to add it to
+DolphinScheduler.</p>
<h2>Current Status & Planning & Outlook</h2>
<h3>1 DolphinScheduler migration status</h3>
-<p></p>
-<p>The DP platform has deployed part of the DolphinScheduler service in the
test environment and migrated part of the workflow.</p>
-<p>After docking with the DolphinScheduler API system, the DP platform
uniformly uses the admin user at the user level. Because its user system is
directly maintained on the DP master, all workflow information will be divided
into the test environment and the formal environment.</p>
-<p></p>
-<p><a href="https://imgpp.com/image/iRi0N"><img
src="https://imgpp.com/images/2021/12/16/1_bXwtKI2HJzQuHCMW5y3hgg.md.png"
alt="1_bXwtKI2HJzQuHCMW5y3hgg.md.png"></a></p>
+<p>The DP platform has deployed part of the DolphinScheduler service in the
test environment and migrated some of the
+workflows.</p>
+<p>After integrating with the DolphinScheduler API, the DP platform
uniformly uses the admin user at the user level.
+Because its user system is maintained directly on the DP master, all workflow
information is divided between the test
+environment and the formal environment.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_bXwtKI2HJzQuHCMW5y3hgg.md.png"/>
+</div>
<p>DolphinScheduler 2.0 workflow task node display</p>
-<p>The overall UI interaction of DolphinScheduler 2.0 looks more concise and
more visualized and we plan to directly upgrade to version 2.0.</p>
+<p>The overall UI of DolphinScheduler 2.0 is more concise and
more visual, and we plan to upgrade
+directly to version 2.0.</p>
<h3>2 Access planning</h3>
-<p></p>
-<p>At present, the DP platform is still in the grayscale test of
DolphinScheduler migration., and is planned to perform a full migration of the
workflow in December this year. At the same time, a phased full-scale test of
performance and stress will be carried out in the test environment. If no
problems occur, we will conduct a grayscale test of the production environment
in January 2022, and plan to complete the full migration in March.</p>
-<p><a href="https://imgpp.com/image/iR9PL"><img
src="https://imgpp.com/images/2021/12/16/1_jv3ScivmLop7GYjKIECaiw.md.png"
alt="1_jv3ScivmLop7GYjKIECaiw.md.png"></a></p>
+<p>At present, the DP platform is still in the grayscale test of the
DolphinScheduler migration, and plans to perform a
+full migration of the workflows in December this year. At the same time, a
phased full-scale test of performance and
+stress will be carried out in the test environment. If no problems occur, we
will conduct a grayscale test in the
+production environment in January 2022, and plan to complete the full
migration in March.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_jv3ScivmLop7GYjKIECaiw.md.png"/>
+</div>
<h3>3 Expectations for DolphinScheduler</h3>
-<p>In the future, we strongly looking forward to the plug-in tasks feature in
DolphinScheduler, and have implemented plug-in alarm components based on
DolphinScheduler 2.0, by which the Form information can be defined on the
backend and displayed adaptively on the frontend.</p>
-<p><a href="https://imgpp.com/image/iRbic"><img
src="https://imgpp.com/images/2021/12/16/1_3jP2KQDtFy71ciDoUyW3eg.md.png"
alt="1_3jP2KQDtFy71ciDoUyW3eg.md.png"></a></p>
+<p>Going forward, we are strongly looking forward to the plug-in task feature in
DolphinScheduler. We have already implemented
+plug-in alarm components based on DolphinScheduler 2.0, with which form
information can be defined on the backend and
+displayed adaptively on the frontend.</p>
+<div align=center>
+<img
src="https://imgpp.com/images/2021/12/16/1_3jP2KQDtFy71ciDoUyW3eg.md.png"/>
+</div>
<p>"</p>
-<p>I hope that DolphinScheduler's optimization pace of plug-in feature can be
faster, to better quickly adapt to our customized task types.</p>
+<p>I hope DolphinScheduler can optimize its plug-in feature
faster, so that it can adapt more quickly to our
+customized task types.</p>
<p>——Zheqi Song, Head of Youzan Big Data Development Platform</p>
<p>"</p>
</section><footer class="footer-container"><div
class="footer-body"><div><h3>About us</h3><h4>Do you need feedback? Please
contact us through the following ways.</h4></div><div
class="contact-container"><ul><li><a
href="/en-us/community/development/subscribe.html"><img class="img-base"
src="/img/emailgray.png"/><img class="img-change"
src="/img/emailblue.png"/><p>Email List</p></a></li><li><a
href="https://twitter.com/dolphinschedule"><img class="img-base"
src="/img/twittergray.png"/><im [...]
diff --git a/en-us/blog/YouZan-case-study.json
b/en-us/blog/YouZan-case-study.json
index 279c675..7122b15 100644
--- a/en-us/blog/YouZan-case-study.json
+++ b/en-us/blog/YouZan-case-study.json
@@ -1,6 +1,6 @@
{
"filename": "YouZan-case-study.md",
- "__html": "<h1>From Airflow to Apache DolphinScheduler, the Roadmap of
Scheduling System On Youzan Big Data Development Platform</h1>\n<p><a
href=\"https://imgpp.com/image/i2Fo0\"><img
src=\"https://imgpp.com/images/2021/12/16/1639383815755.md.png\"
alt=\"1639383815755.md.png\"></a></p>\n<p>At the recent Apache DolphinScheduler
Meetup 2021, Zheqi Song, the Director of Youzan Big Data Development Platform
shared the design scheme and production environment practice of its scheduling
sys [...]
+ "__html": "<h1>From Airflow to Apache DolphinScheduler, the Roadmap of
Scheduling System On Youzan Big Data Development Platform</h1>\n<div
align=center>\n<img
src=\"https://imgpp.com/images/2021/12/16/1639383815755.md.png\"/>\n</div>\n<p>At
the recent Apache DolphinScheduler Meetup 2021, Zheqi Song, the Director of
Youzan Big Data Development Platform\nshared the design scheme and production
environment practice of its scheduling system migration from Airflow to
Apache\nDolphinSchedul [...]
"link": "/dist/en-us/blog/YouZan-case-study.html",
"meta": {}
}
\ No newline at end of file