This is an automated email from the ASF dual-hosted git repository.
wuweijie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/shardingsphere.git
The following commit(s) were added to refs/heads/master by this push:
new aff01a06f14 feat(blog): add new articles and images (#19801)
aff01a06f14 is described below
commit aff01a06f14e9f653d3ff2dce6cc2ba2eb62eed0
Author: Yumeiya <[email protected]>
AuthorDate: Thu Aug 4 15:41:03 2022 +0800
feat(blog): add new articles and images (#19801)
* add new blogs
* add new images
* Update
2022_05_24_Your_Guide_to_DistSQL_Cluster_Governance_Capability_Apache_ShardingSphere_Feature_Update.en.md
* delete images
* Revert "delete images"
This reverts commit dc5cc370c3210eb5adb817c74fd6630c6324a12d.
* Delete images
* Update
2022_05_17_How_does_Apache_ShardingSphere_implement_distributed_transactions.en.md
* Update
2022_05_24_Your_Guide_to_DistSQL_Cluster_Governance_Capability_Apache_ShardingSphere_Feature_Update.en.md
---
...concept_and_what_challenges_can_it_solve.en.md" | 175 ++++++++
...Sphere_implement_distributed_transactions.en.md | 277 +++++++++++++
...Transaction_Middle_Platform_Architecture.en.md" | 146 +++++++
...lity_Apache_ShardingSphere_Feature_Update.en.md | 449 +++++++++++++++++++++
4 files changed, 1047 insertions(+)
diff --git
"a/docs/blog/content/material/2022_05_10_What\342\200\231s_the_Database_Plus_concept_and_what_challenges_can_it_solve.en.md"
"b/docs/blog/content/material/2022_05_10_What\342\200\231s_the_Database_Plus_concept_and_what_challenges_can_it_solve.en.md"
new file mode 100644
index 00000000000..3176a55cc73
--- /dev/null
+++
"b/docs/blog/content/material/2022_05_10_What\342\200\231s_the_Database_Plus_concept_and_what_challenges_can_it_solve.en.md"
@@ -0,0 +1,175 @@
++++
+title = "What’s the Database Plus concept and what challenges can it solve?"
+weight = 54
+chapter = true
++++
+
+## Background
+
+For a long time, unification or fragmentation were the two most popular
predictions for the future of databases. With the advance of digitization, a
single scenario cannot meet the needs of diversified applications — making
database fragmentation an irreversible trend. While
[Oracle](https://www.oracle.com/index.html), the commercial database with the
largest market share, has no obvious weaknesses, all sorts of new databases are
still entering the market. Today, over [300 databases are [...]
+
+An increasing number of application scenarios has exacerbated database
fragmentation, making database architecture, protocols, functions, and
application scenarios increasingly diversified. In terms of database
architecture, the centralized database which is evolved from a single-machine
system coexists with the new generation of native distributed databases. In
terms of database protocols, [MySQL](https://www.mysql.com) and
[PostgreSQL](https://www.postgresql.org), two major open source [...]
+
+Today, it is normal for enterprises to leverage diversified databases. In my market of expertise, China, in the Internet industry, MySQL together with data sharding middleware is the go-to architecture, with [GreenPlum](https://greenplum.org), [HBase](https://hbase.apache.org), [Elasticsearch](https://www.elastic.co/cn/), [Clickhouse](https://clickhouse.com) and other big data ecosystems serving as auxiliary computing engines for analytical data. At the same time, some legacy systems (such as [...]
+
+When I claim that fragmentation is an irreversible trend for the database field, I base my assumptions on the above trends. A single type of database can only be applied to a certain scenario (or a few scenarios at best) in which it excels.
+
+## Problems with database fragmentation
+
+With the increasing variety of databases used by enterprises, various problems
and pain points also arise.
+
+### 1. Difficulties in architecture selection
+
+To meet flexible business requirements, the application architecture can be
transformed from monolithic to service-oriented and then to
microservice-oriented. In this case, the database used to store the core data
becomes the focus of the distributed system.
+
+Compared with stateless applications, it is all the more difficult to design a
stateful database. Divide-and-conquer is the best practice for distributed
systems. Obviously, database systems cannot respond to all service requests
with a single product and an integrated cluster.
+
+**First of all, a single database category is unable to meet all kinds of
business application needs, while maintaining high throughput, low latency,
strong consistency, operation and maintenance efficiency, and even stability.**
It is less likely that a single application requires the coexistence of
multiple databases, but the likelihood of multiple applications requiring
multiple databases is much higher.
+
+
+
+**Secondly, neither standalone databases nor All-in-One distributed database clusters can easily become the only storage support for the backends of many microservice applications.**
Standalone databases cannot carry the increasing amount of data and traffic, so
more and more applications choose to adopt distributed solutions. However, if
multiple applications use a unified database cluster, the load on CPU, memory,
disks, and network cannot be completely [...]
+
+Today, most distributed databases are expensive to build, requiring backup and redundant standalone servers for the compute, storage, and governance nodes. Building an independent distributed database for each microservice will
inevitably lead to unnecessary resource consumption and ultimately become
unsustainable for enterprises.
+
+
+
+**Finally, a large number of enterprises adopt a unitized architecture.**
Based on the data sharding solution, the split and unification of databases are
completed on the application side. As the number of databases increases, the
complexity of the architectural design increases exponentially. In the long
run, the dev team will be unable to focus on R&D, but will have to dedicate a
lot of energy to the maintenance of the underlying components. Although the
data sharding feature of [Apach [...]
+
+
+
+### 2. Numerous technical challenges
+
+When the coexistence of fragmented databases becomes the norm, the development
costs and learning curve for R&D teams will inevitably increase constantly.
Although most databases support SQL operation, there are still a lot of
differences between databases in terms of SQL dialects.
+
+If each database needs to be mastered in depth, it will consume a great deal of the R&D teams' energy, and the accumulated knowledge and experience will be hard to pass on. If heterogeneous databases are instead abstracted behind a coarse-grained standard model, the distinctive characteristics of each database will be lost.
+
+### 3. High operation & maintenance complexity
+
+It takes a lot of time and practical experience to master the characteristics
of various databases and formulate effective operation and maintenance
specifications. In addition to the most basic operation and maintenance work,
the database peripheral supporting tools are also very different. Monitoring,
backup, and other automated operation and maintenance work composed of
peripheral supporting tools will incur substantial operation and maintenance
costs.
+
+### 4. Lack of collaboration and unified management among databases
+
+From the perspective of a database, its primary goal is to improve its own
capability instead of improving the online compatibility with other databases.
Features such as correlated queries and distributed transactions across
heterogeneous databases cannot be achieved within the database itself.
+
+Unlike the relatively standard SQL, the protocol of the database itself and
the peripheral ecosystem tools lack a unified standard. Increasing attention
has been dedicated to the unified management and control of heterogeneous
databases. Unfortunately, databases lack upper-level standards, so it is
difficult to effectively promote collaboration and unified management between
databases.
+
+## What is Database Plus?
+
+[Database Plus](https://shardingsphere.apache.org) is a design concept for
distributed database systems, designed to build an ecosystem on top of
fragmented heterogeneous databases. The goal is to provide globally scalable
and enhanced computing capabilities, while maximizing the original database
computing capabilities. The interaction between applications and databases
becomes oriented towards the Database Plus standard, therefore minimizing the
impact of database fragmentation on uppe [...]
+
+**Connect, Enhance and Pluggable** are the three keywords that define the core
values of Database Plus.
+
+### 1. Connect: building upper-level standards for databases
+Rather than providing an entirely new standard, Database Plus provides an
intermediate layer that can be adapted to a variety of SQL dialects and
database access protocols, providing an open interface to connect to various
databases.
+
+Thanks to the implementation of database access protocols, Database Plus provides the same experience as a database and can support any development language and database access client.
+
+Moreover, Database Plus supports maximum conversion between SQL dialects. An
AST (abstract syntax tree) that parses SQL can be used to regenerate SQL
according to the rules of other database dialects. The SQL dialect conversion
makes it possible for heterogeneous databases to access each other. This way,
users can use any SQL dialect to access heterogeneous underlying databases.
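+
+To make the idea of dialect conversion concrete, here is a purely illustrative pair of queries (the table and column names are made up, and the exact SQL produced by ShardingSphere's conversion may differ): the same logical query written in MySQL dialect, and a PostgreSQL form it could be converted into.
+
+```sql
+-- Illustrative only: hypothetical identifiers, output may differ from ShardingSphere's actual conversion.
+
+-- MySQL dialect: backtick-quoted identifiers, paging written as "LIMIT offset, count"
+SELECT `order_id`, `user_id` FROM `t_order` ORDER BY `order_id` LIMIT 10, 5;
+
+-- The same query in PostgreSQL dialect: double-quoted identifiers, "LIMIT ... OFFSET ..."
+SELECT "order_id", "user_id" FROM "t_order" ORDER BY "order_id" LIMIT 5 OFFSET 10;
+```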
+
+**Database gateway is the best interpretation of Connect.** It is the prerequisite for Database Plus to provide a solution for database fragmentation. This is done by building a common open docking layer positioned above the databases, to pool all the access traffic of the fragmented databases.
+
+### 2. Enhance: database computing enhancement engine
+Following decades of development, databases now boast their own query
optimizer, transaction engine, storage engine, and other time-tested storage
and computing capabilities and design models. With the advent of the
distributed and cloud native era, the original computing and storage capabilities of the database are being reassembled and woven into new capabilities at the distributed and cloud-native level.
+
+Database Plus adopts a design philosophy that emphasizes traditional database
practices, while at the same time adapting to the new generation of distributed
databases. Whether centralized or distributed, Database Plus can repurpose and
enhance the storage and native computing capabilities of a database.
+The capability enhancement mainly covers three aspects: distributed capabilities, data control, and traffic control.
+
+Data fragmentation, elastic scaling, high availability, read/write splitting, distributed transactions, and heterogeneous database federated queries based on vertical splitting are all capabilities that Database Plus can provide at the global level for distributed heterogeneous databases. It doesn't focus on the database itself; it sits on top of the fragmented databases and focuses on the global collaboration between multiple databases.
+
+In addition to distributed enhancement, data control and traffic control
enhancements are both in the silo structure. Incremental capabilities for data
control include data encryption, data desensitization, data watermarking, data
traceability, SQL audit etc.
+
+Incremental capabilities for traffic control include shadow library, gray
release, SQL firewall, blacklist and whitelist, circuit-breaker and
rate-limiting and so on. They are all provided by the database ecosystem layer.
However, owing to database fragmentation, it is a huge amount of work to
provide full enhancement capability for each database, and there is no unified
standard. Database Plus provides users like you with the permutation and
combination of supported database types and e [...]
+
+### 3. Pluggable: building a database-oriented functional ecosystem
+
+The Database Plus common layer could become bloated due to docking with an
increasing number of database types, and additional enhancement capability. The
pluggability borne out of Connect and Enhance is not only the foundation of
Database Plus’ common layer, but also the effective guarantee of infinite
ecosystem expansion possibilities.
+
+The pluggable architecture enables Database Plus to truly build a
database-oriented functional ecosystem, unifying and managing the global
capabilities of heterogeneous databases. It is not only for the distribution of
centralized databases, but also for the silo function integration of
distributed databases.
+
+Microkernel design and pluggable architecture are core values of the Database
Plus concept, which is oriented towards a common platform layer rather than a
specific function.
+
+
+
+## ShardingSphere’s exploration of Database Plus
+The Apache ShardingSphere project has a long history, evolving from an open source database sharding middleware into the initiator of the Database Plus concept. Currently, Apache ShardingSphere follows the Database Plus development concept and has completed most of its foundations according to the three core values of Database Plus.
+
+### 1. Connect layers
+ShardingSphere supports the MySQL, PostgreSQL, and [openGauss](https://opengauss.org/en/) database protocols, as well as the SQL dialects of MySQL, PostgreSQL, openGauss, SQL Server, Oracle, and any database that supports the SQL-92 standard.
+The abstract top-level interface of the connect layers is open for other
databases to interconnect, including database protocols, SQL parsing, and
database access.
+
+### 2. Enhance layers
+The function enhancement of ShardingSphere is divided into the kernel layer
and the optional function layer.
+The kernel layer contains query optimizers, distributed transactions,
execution engine, permission engine, and other functions heavily related to the
database kernel, as well as scheduling engine, distributed governance, and
other functions heavily related to the distribution.
+Every kernel function module must exist, but its implementation can be switched. Take the query optimizer as an example. If the
SQL to be executed can be perfectly pushed down to the backend database, a
computational pushdown engine based on the interaction between the original SQL
and the database is used. If the SQL to be executed requires correlated queries
across multiple data sources, a federated query engine based on the interaction
between the query plan tree [...]
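+
+As a purely illustrative sketch (hypothetical tables and sharding rules), the difference between the two optimizer paths could look like this:
+
+```sql
+-- Illustrative only: table names and sharding rules are hypothetical.
+
+-- Routed to a single backend data source, so it can be fully pushed down:
+SELECT * FROM t_order WHERE order_id = 10;
+
+-- t_order and t_user live in different data sources, so the federated query
+-- engine has to merge the results above the databases:
+SELECT o.order_id, u.user_name
+FROM t_order o JOIN t_user u ON o.user_id = u.user_id;
+```
+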
+In addition to the most typical data sharding and read/write splitting, the
functional modules such as high availability, elastic scaling, data encryption,
and shadow library are being gradually improved.
+
+### 3. Pluggable layers
+The project has been completely transformed from the original architecture
model centered on MySQL plus data fragmentation to the current microkernel plus
pluggable architecture. ShardingSphere is completely pluggable from the
database types and enhancements that provide connectivity, to its kernel
capabilities.
+
+The core and periphery of the ShardingSphere architecture are composed of three layers of models realized by the microkernel, pluggable interfaces, and plugins. Dependencies between layers flow in a single direction, which means that the microkernel does not need to be aware of plug-ins at all, and plug-ins do not need to depend on each other. For a large-scale project with 200+ modules, the decoupling and isolation of the architecture is an effective guarantee of open collaboration between communities to mi [...]
+
+### Summary
+Database Plus is the development concept driving ShardingSphere, and
ShardingSphere is the best practitioner of the Database Plus concept. As
ShardingSphere becomes increasingly mature, the Database Plus puzzle will have
found a prime example.
+
+
+
+## The advantages of Database Plus
+Database Plus brings too many benefits for us to be able to list them all in
this article. This article will only illustrate its impact on system
architecture design and technology selection.
+
+### 1. Adaptability to flexible application scenarios
+You can fully customize features according to your needs. Data sharding is no
longer indispensable when using ShardingSphere, and there are many
ShardingSphere users who use the data encryption function independently. The
pluggable capability of ShardingSphere is not limited to the database access layer and the function enhancement modules themselves.
+Take data sharding as an example. As a pluggable part of the data sharding
module, the sharding algorithm can be fully customized. Whether it is the
standard `hash`, `range`, `time` and other sharding algorithms, or custom
sharding algorithms, they all can be configured freely and flexibly according
to your needs in a bid to achieve optimal performance.
+
+### 2. Microservice backend support capability for architects
+Apache ShardingSphere is the best solution for database unitization at the backend of microservices. As mentioned above, having different microservices share one distributed database cluster cannot be viewed as a perfect and elegant solution, given the asymmetry it introduces into the architecture design and its uncontrollable resource isolation. Building a distributed database cluster for each set of microservices would cause unnecessary waste of resources.
+
+Compared with the heavyweight distributed database cluster,
ShardingSphere-Proxy saves a lot of resources, laying a good foundation for
each microservice cluster to have an independent set of database clusters.
However, even if the microservice splitting is refined, it still takes a lot of
resources to build a set of
[ShardingSphere-Proxy](https://shardingsphere.apache.org/document/current/en/overview/#shardingsphere-proxy)
for each set of microservices. In this case, the less resource-i [...]
+
+ShardingSphere-Proxy and ShardingSphere-JDBC can be used in a hybrid
deployment to meet needs such as user-friendliness, cross-language adaptation,
high-performance, and resource management. Additionally, ShardingSphere-Proxy
and ShardingSphere-JDBC can be routed to each other in SQL requests with
different characteristics through `Traffic Rule` to minimize the impact of
application resource usage.
+
+`Traffic Rule` can send the requests consuming more computing resources to
ShardingSphere-Proxy which has exclusive resources according to user-defined
SQL characteristics (such as aggregate calculation and full-route queries
without sharding keys), while reserving transactional lightweight operations at
the microservice application end. This coincides with the concept of edge
computing, which means the computing capability of ShardingSphere-JDBC in
microservice applications is similar t [...]
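+
+For illustration, such a rule could be declared through DistSQL roughly as sketched below. The rule name, the matching SQL, and the properties are hypothetical, and the exact `CREATE TRAFFIC RULE` grammar depends on the ShardingSphere version in use.
+
+```sql
+-- Sketch only: names and properties are illustrative, not authoritative.
+CREATE TRAFFIC RULE heavy_sql_to_proxy (
+  LABELS (OLAP),
+  TRAFFIC_ALGORITHM(TYPE(NAME=SQL_MATCH,
+    PROPERTIES("sql" = "SELECT SUM(balance) FROM account"))),
+  LOAD_BALANCER(TYPE(NAME=RANDOM))
+);
+```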
+
+
+
+The flexible use of ShardingSphere-Proxy, ShardingSphere-JDBC, and `Traffic
Rule` will continue to support architects’ design inspiration and creativity.
As a joke, one could even say that the proper use of ShardingSphere for elegant system design is the threshold for a good architect.
+
+### 3. DistSQL brings a native database operation experience to DBAs
+While enhancing functions such as data sharding and data encryption, the
previous version of Apache ShardingSphere mainly adopted `YAML` configuration.
For developers, the flexible `YAML` configuration is convenient to use, but
`YAML` configuration is actually inconvenient for DBAs. Apart from changing the
SQL operating habits of DBAs, it is unable to connect with third-party systems
such as permission, security, work order, monitoring, and audit.
+
+The new version of Apache ShardingSphere adds DistSQL operation mode. This
allows you to use the enhanced features through SQL at any database terminal
such as the MySQL CLI or Navicat. DistSQL matches all of the powerful functions
such as data sharding, data encryption, and read/write splitting. You can
configure all functions using SQL syntax such as `CREATE`, `ALTER`, `DROP`, or
`SHOW` that the DBA is familiar with. DistSQL also supports the management and
control of authorized statemen [...]
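+
+As a hedged illustration, a sharding rule could be created and inspected with DistSQL roughly as follows; the table and resource names are hypothetical, and the exact grammar varies between ShardingSphere versions.
+
+```sql
+-- A rough sketch; the exact DistSQL grammar varies across ShardingSphere versions.
+CREATE SHARDING TABLE RULE t_order (
+  RESOURCES(ds_0, ds_1),
+  SHARDING_COLUMN=order_id,
+  TYPE(NAME=hash_mod, PROPERTIES("sharding-count"=4)),
+  KEY_GENERATE_STRATEGY(COLUMN=order_id, TYPE(NAME=snowflake))
+);
+
+-- Inspect the resulting rule with a familiar SHOW-style statement.
+SHOW SHARDING TABLE RULES;
+```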
+
+The database table structure is the metadata of the database, and enhancement
function configuration is the metadata of Database Plus. DistSQL not only
improves the user-friendliness, but also completes the final puzzle for the
release, operation and maintenance of Apache ShardingSphere.
+
+### 4. Proxyless model contributes to optimal performance
+In the field of Service Mesh, Istio plus Envoy is the most widespread
architecture. It manages Envoy through Sidecar deployment while remaining
non-intrusive to applications — giving what is called Proxy Service Mesh. It
reduces development, use, and upgrade costs but its performance is lowered due
to the addition of the Proxy/Sidecar layer in access links.
+
+Proxyless Service Mesh adopts another design model, which derives from the
implementation of the xDS protocol by gRPC. The new version of Istio allows application code to communicate directly through the SDK provided by the Istio Agent, using gRPC plus xDS programming, notably improving communication efficiency.
+
+In the field of distributed databases, the architecture design of
storage/computing splitting has been widely accepted. The design of splitting
compute and storage nodes is somewhat similar to the architectural model of
Proxy Service Mesh. ShardingSphere-Proxy is also designed with the
aforementioned model. It can effectively reduce development, use, and upgrade
costs for users, but its performance is inevitably lowered.
+
+For performance-sensitive applications, ShardingSphere-JDBC, which coincides
with the design concept of Proxyless Service Mesh, is undoubtedly more
applicable and can be used to maximize system performance. A recent performance
test with 16 servers to the [TPC-C model using ShardingSphere plus openGauss
achieved over 10 million transactions per minute (tpmC), breaking performance
records.](https://medium.com/codex/apache-shardingsphere-opengauss-breaking-the-distributed-database-performa
[...]
+
+### Summary
+gRPC's Proxyless design concept was born recently, while the ShardingSphere-JDBC project has been around since the open source project began in 2016. Therefore, ShardingSphere-JDBC did not refer to the design concept of Proxyless. It was born out of the demands of Internet-focused businesses in Asia for maximum performance, including high concurrency and low latency.
+
+The same is true for the design of the Database Plus concept, born out of the need to solve real-world enterprise bottlenecks. ShardingSphere is the best example of Database Plus, as its design is derived from real business scenarios. The Chinese Internet vertical, where I spent most of my professional experience, is undoubtedly one of the most comprehensive scenarios in the world. Thus, a design concept born in such scenarios must also have extensive room for development.
+
+## Future plan
+Although the Database Plus concept has witnessed more and more iterations,
Apache ShardingSphere still has a long way to go. The database gateway and
heterogeneous federated query are the important functional pieces of the puzzle
to complete the Database Plus concept.
+
+### 1. Database gateway
+Although Apache ShardingSphere supports the interconnection of heterogeneous
databases, it cannot achieve the conversion of dialects between databases. In
ShardingSphere’s roadmap, SQL dialect conversion is an important
function to achieve database gateway. It is no longer a difficult task for
users to access MongoDB in MySQL dialect via PostgreSQL’s database protocol.
+
+### 2. Heterogeneous federated query
+Apache ShardingSphere currently only supports federated queries between
homogeneous databases. In ShardingSphere’s roadmap, federated query
between heterogeneous databases will also be put on the agenda. It will no
longer be far away for users to query MySQL and HBase with one SQL.
+
+## Conclusion
+The Apache ShardingSphere community has been active in open source for 7 years. Through perseverance, the community has matured, and we’d like to extend our sincere welcome to any devs or contributors who are enthusiastic about open source and coding to collaborate with us.
+
+Among the recent achievements we’re particularly proud of, Apache
ShardingSphere’s pluggable architecture and data sharding philosophy have been
recognized by the academic community. [The paper, Apache ShardingSphere: A
Holistic and Pluggable Platform for Data Sharding, has been published at this
year’s ICDE, a top conference in the database
field.](https://faun.pub/a-holistic-pluggable-platform-for-data-sharding-icde-2022-understanding-apache-shardingsphere-55779cfde16)
+
+## Author
+
+Zhang Liang, the founder & CEO of [SphereEx](https://www.sphere-ex.com), has served as the head of the architecture and database teams at several large, well-known Internet enterprises. He is enthusiastic about open source and is
the founder and PMC chair of Apache ShardingSphere, ElasticJob, and other
well-known open source projects.
+He is now a member of the [Apache Software
Foundation](https://www.apache.org), a [Microsoft
MVP](https://mvp.microsoft.com), [Tencent Cloud
TVP](https://cloud.tencent.com/tvp), and [Huawei Cloud
MVP](https://developer.huaweicloud.com/mvp) and has more than 10 years of
experience in the field of architecture and database. He advocates for elegant
code, and has made great achievements in distributed database technology and
academic research. He has served as a producer and speaker at doze [...]
\ No newline at end of file
diff --git
a/docs/blog/content/material/2022_05_17_How_does_Apache_ShardingSphere_implement_distributed_transactions.en.md
b/docs/blog/content/material/2022_05_17_How_does_Apache_ShardingSphere_implement_distributed_transactions.en.md
new file mode 100644
index 00000000000..384ee9d9859
--- /dev/null
+++
b/docs/blog/content/material/2022_05_17_How_does_Apache_ShardingSphere_implement_distributed_transactions.en.md
@@ -0,0 +1,277 @@
++++
+title = "How does Apache ShardingSphere implement distributed transactions?"
+weight = 55
+chapter = true
++++
+
+With increasing amounts of data and increasing traffic load, business systems
are facing significant challenges, and a strong demand for the scalability of
database systems has emerged. Problems in online scaling of traditional
standalone databases such as [Oracle](https://www.oracle.com/),
[MySQL](https://www.mysql.com/), [SQL
Server](https://www.microsoft.com/en-us/sql-server/sql-server-downloads), and
[PostgreSQL](https://www.postgresql.org/) is now more evident than ever. In
such cas [...]
+
+In this context, ShardingSphere provides a distributed database computing
enhancement engine that can support distributed transactions with an ecosystem
that is easily expandable thanks to a plugin oriented architecture.
+
+## Transaction processing
+**Transaction properties**
+Four properties define transactions: atomicity, durability, consistency, and isolation.
+
+**Atomicity**
+
+In a distributed scenario, the operations of a transaction may be distributed across multiple physical nodes; atomicity ensures that the operations on all nodes either all succeed or all fail.
+
+**Durability**
+
+Once a transaction has been committed, its changes remain valid even if the power is cut off.
+
+**Consistency**
+
+**Tip:** Here “Consistency” is different from the C in the CAP theorem. The C in CAP refers to data consistency between multiple replicas, while here it refers to an abstract constraint on data states at the application level.
+
+From a user’s perspective, data transfers from one state to another, and both states satisfy certain constraints. For example:
+
+Bank account A has $500 and account B has $500, totaling $1,000. After A and B perform a transfer operation within a transaction, their total amount is still $1,000.
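+
+Expressed in SQL, the transfer is simply a transaction (a minimal sketch assuming a hypothetical `account(id, balance)` table):
+
+```sql
+-- Transferring $100 from A to B either applies both updates or neither,
+-- so the $1,000 total invariant still holds after the transaction.
+BEGIN;
+UPDATE account SET balance = balance - 100 WHERE id = 'A';
+UPDATE account SET balance = balance + 100 WHERE id = 'B';
+COMMIT;
+```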
+
+**Isolation**
+
+When transactions are executed concurrently, data correctness must still be ensured. For example, if two transactions modify the same data at the same time, they must be executed in a certain order to keep the data correct.
+
+**Challenges**
+Compared to standalone transactions, distributed transactions face the
following challenges:
+
+1. Atomicity. For standalone transactions, using undo log and redo log
guarantees full commit or full roll-back. However, distributed transactions
involve multiple physical nodes, each of which is different. Some node logs can
be written successfully, while others fail.
+2. Network instability. Communication is stable for a standalone machine and
any operation can be answered whether successful or not. However, in a
distributed scenario, the network is unstable, and an operation may not be
answered. Thus, how to ensure the availability of distributed transactions
(clearing and recovery of abnormal transactions etc.) is a problem.
+3. Concurrency control. With the advent of MVCC, linearizable operation has
become a rigid demand. Globally increasing transaction numbers are easy to generate in a standalone database, but not in a distributed scenario.
+
+**Solutions**
+**Atomic commit**
+
+The mainstream solution to atomicity and network instability problems is [2PC
(two-phase commit
protocol)](https://en.wikipedia.org/wiki/Two-phase_commit_protocol), which
defines two roles: TM (Transaction Manager) and RM (Resource Manager).
+
+In distributed scenarios, the operations of a transaction can be distributed across multiple nodes, and the transaction includes two phases.
+
+**Phase One:** RM locks related resources and performs specific operations and
then returns success or failure to TM.
+
+**Phase Two:** according to the results returned by RM in the first phase, TM
will execute the final commit operations (transaction state change, lock state
deletion, etc.) if all the operations are successful, and roll back if any
fails.
+
+**Note:** There are, of course, some things that need to be optimized, such as
converting transactions that do not involve multiple nodes to one-phase commit
etc.
+
+**Note:** The two-phase commit protocol only addresses the commit issue.
Either the commit succeeds or fails. There is no intermediate state of partial
success. It’s not necessarily related to the transaction isolation level.
+
+**Concurrency control**
+
+Concurrency control is a strategy ensuring that concurrent transactions are
executed at a certain isolation level. Since the advent of Multiversion
Concurrency Control (MVCC), mainstream databases have largely abandoned the
previous two-phase locking model.
+
+Concurrency control essentially controls the concurrency of data read and
write. The concurrency control strategies determine the isolation level, and
concurrency control addresses the following two problems:
+
+1. Determining the granularity of concurrency. For example, MySQL has row
locks (lock granularity is one row) and table locks (lock granularity is one
table) and so on.
+2. Three concurrency scenarios:
+a. Read concurrency. No special processing is required because no data changes
are involved.
+b. Write concurrency. Writes to the same data must not happen concurrently, otherwise data corruption will occur.
+c. Read/write concurrency. Performance optimization is mainly completed in
this scenario. There are a variety of concurrency control mechanisms, among
which Multiversion Concurrency Control is the most popular one.
+
+**MVCC Model**
+Two mainstream implementation modes:
+
+1. Based on transaction ID and ReadView
+The transaction ID is obtained for each transaction to identify the sequence
in which the transaction is started. Snapshots are obtained through the active
list to store multiple versions of data with the transaction ID, to achieve the
effect of concurrency control. MySQL and
[Postgres-XL](https://www.postgres-xl.org/) both take this approach.
+
+2. Based on timestamp
+By introducing timestamps, visibility can be determined by adding timestamp-related attributes to data and comparing the commit timestamp with the snapshot timestamp, achieving linearized concurrency control. That’s what [Spanner](https://research.google/pubs/pub39966/) did.
+
+The above two modes are dependent on the generation of global transaction
numbers. The common generation mechanisms include TrueTime (used by Spanner),
HLC ([CockroachDB](https://www.cockroachlabs.com/) uses HLC with errors), and
TSO (Timestamp Oracle).
+
+## ShardingSphere’s transaction design
+ShardingSphere’s transaction feature is built on the local transaction of the
storage DB, providing three transaction modes: `LOCAL`, `XA` and `BASE`. You’d
only need to use the native transaction interface (`begin`/`commit`/`rollback`) to use the three modes and make appropriate trade-offs between consistency and
performance.
+
+**LOCAL**
+
+`LOCAL` mode is directly built on the local transaction of the storage DB. It
has the highest performance, although there’s a problem with atomicity. If you
can tolerate this problem, it’s a good choice.
+
+**XA**
+
+In `XA` mode, the XA protocol is a set of interaction interfaces based on 2PC. It defines the `xa start/prepare/end/commit/rollback` interfaces. Commonly used implementations include Narayana and Atomikos, and ShardingSphere integrates the XA implementations of both Narayana and Atomikos.
+
+1. The app connects to the Proxy, and the Proxy creates a session object bound
to the connection.
+2. The app executes `begin`. Proxy creates a logical transaction through
Narayana TM, and binds it to the current session.
+3. The app executes specific SQL. The session establishes a connection to the storage DB, and the connection can be registered to the transaction through the `Transaction.enlistResource()` interface. Then `XA START {XID}` is executed to start the transaction, followed by the SQL rewritten by routing.
+4. The app runs the `commit` command. `xa prepare` is executed for each connection registered in the transaction, the transaction status is updated to `prepared`, and then `xa commit` is executed for each connection. If all of them return OK, the transaction status is updated to `Committed` and the transaction is committed successfully. If the `prepare` step fails, the `rollback` command can be run to roll back the data; otherwise, a background process will clean up the abnormal transaction.
+5. The app runs the `rollback` command, and each connection registered in the transaction executes `xa rollback` against its storage DB. A sketch of the underlying statements follows below.
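+
+As a rough illustration of the five steps above, the statement sequence issued to each participating MySQL storage node resembles MySQL's native XA commands; the XID and the routed SQL below are illustrative only.
+
+```sql
+-- 'xid-1' and the routed SQL are illustrative.
+XA START 'xid-1';
+UPDATE account SET balance = balance - 100 WHERE id = 1;  -- actual SQL rewritten by routing
+XA END 'xid-1';
+XA PREPARE 'xid-1';   -- phase one
+XA COMMIT 'xid-1';    -- phase two, or XA ROLLBACK 'xid-1' on failure
+```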
+
+
+**BASE**
+
+[BASE (Basically Available, Soft State, Eventually Consistent)](https://phoenixnap.com/kb/acid-vs-base) mode. A BASE transaction is the result of a balance between C and A in the CAP theorem. The AT mode of [Seata](https://seata.io/en-us/) is an implementation of BASE transactions, and ShardingSphere integrates the AT implementation of Seata.
+
+1. The app connects to the Proxy, and the Proxy creates a session object bound
to the connection.
+2. The app executes `begin`. Proxy creates a logical transaction through Seata
TM, binds it to the current session and registers it with Seata Server.
+3. The app executes logical SQL. The session establishes connections to the storage DB, and each connection is a Seata `ConnectionProxy` instance. The actual SQL rewritten by routing is then parsed and intercepted. For example, for a modification operation, `begin` is executed to obtain the local lock, the SQL is executed, `commit` is executed to release the local lock, and the branch transaction result is reported to the Seata Server.
+4. After the app runs the `commit` command, the Seata TM in Proxy notifies the
Seata Server and directly returns to the app. The Seata Server asynchronously
interacts with the Proxy to delete transaction logs.
+5. The app runs the `rollback` command. After the Seata TM in the Proxy
notifies the Seata Server, the Proxy directly returns to the app. The Seata
Server asynchronously interacts with the Proxy, performs compensation
operations, and deletes transaction logs.
+
+
+
+## Examples
+**Installation package preparation**
+Take XA mode integrated with the Narayana implementation, which has good supporting capability, as an example. Due to the Narayana license issue, it cannot be packaged directly into the installation package, so additional dependencies need to be added.
+
+Download the installation package from the official website, decompress it to the directory `${ShardingSphere}`, and add the following `jar` packages to the directory `${ShardingSphere}/lib`.
+
+(Download: https://mvnrepository.com/)
+
+```
+jta-5.12.4.Final.jar
+arjuna-5.12.4.Final.jar
+common-5.12.4.Final.jar
+jboss-connector-api_1.7_spec-1.0.0.Final.jar
+jboss-logging-3.2.1.Final.jar
+jboss-transaction-api_1.2_spec-1.0.0.Alpha3.jar
+jboss-transaction-spi-7.6.0.Final.jar
+mysql-connector-java-5.1.47.jar
+narayana-jts-integration-5.12.4.Final.jar
+shardingsphere-transaction-xa-narayana-5.1.1-SNAPSHOT.jar
+```
+**MySQL instance preparation**
+Prepare two MySQL instances: 127.0.0.1:3306 and 127.0.0.1:3307.
+Create a user `root` with password `12345678` on each MySQL instance.
+Create a database named `test` on each MySQL instance.
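+
+As a minimal sketch, the preparation on each instance could look like this (adjust hosts and privileges to your environment):
+
+```sql
+-- Run on each MySQL instance; host and privilege scope are illustrative.
+CREATE USER IF NOT EXISTS 'root'@'%' IDENTIFIED BY '12345678';
+GRANT ALL PRIVILEGES ON *.* TO 'root'@'%';
+CREATE DATABASE IF NOT EXISTS test;
+```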
+
+**ShardingSphere-Proxy configuration**
+Modify the transaction configuration of `server.yaml`
+
+```
+rules:
+ - !AUTHORITY
+ users:
+ - root@%:root
+ - sharding@:sharding
+ provider:
+ type: ALL_PRIVILEGES_PERMITTED
+ - !TRANSACTION
+ defaultType: XA
+ providerType: Narayana
+```
+Modify `conf/conf-sharding.yaml`
+
+```yaml
+dataSources:
+ ds_0:
+    url: jdbc:mysql://127.0.0.1:3306/test?serverTimezone=UTC&allowPublicKeyRetrieval=true&useSSL=false
+ username: root
+ password: 12345678
+ connectionTimeoutMilliseconds: 30000
+ idleTimeoutMilliseconds: 60000
+ maxLifetimeMilliseconds: 1800000
+ maxPoolSize: 50
+ minPoolSize: 1
+ ds_1:
+    url: jdbc:mysql://127.0.0.1:3307/test?serverTimezone=UTC&allowPublicKeyRetrieval=true&useSSL=false
+ username: root
+ password: 12345678
+ connectionTimeoutMilliseconds: 30000
+ idleTimeoutMilliseconds: 60000
+ maxLifetimeMilliseconds: 1800000
+ maxPoolSize: 50
+ minPoolSize: 1
+
+rules:
+ - !SHARDING
+ tables:
+ account:
+ actualDataNodes: ds_${0..1}.account${0..1}
+ tableStrategy:
+ standard:
+ shardingColumn: id
+ shardingAlgorithmName: account_inline
+ keyGenerateStrategy:
+ column: id
+ keyGeneratorName: snowflake
+ defaultDatabaseStrategy:
+ standard:
+ shardingColumn: id
+ shardingAlgorithmName: database_inline
+ defaultTableStrategy:
+ none:
+
+ shardingAlgorithms:
+ database_inline:
+ type: INLINE
+ props:
+ algorithm-expression: ds_${id % 2}
+ account_inline:
+ type: INLINE
+ props:
+ algorithm-expression: account${id % 2}
+
+ keyGenerators:
+ snowflake:
+ type: SNOWFLAKE
+ props:
+ worker-id: 123
+```
+**Start ShardingSphere-Proxy**
+Start the Proxy by running the following command:
+
+```bash
+cd ${ShardingSphere}
+./bin/start.sh
+```
+**Use ShardingSphere-Proxy**
+Use the MySQL client to connect to ShardingSphere-Proxy for testing. Refer to the following commands.
+
+```
+mysql -h127.0.0.1 -P3307 -uroot -proot
+mysql> use sharding_db;
+Database changed
+mysql> create table account(id int, balance float ,transaction_id int);
+Query OK, 0 rows affected (0.12 sec)
+
+mysql> select * from account;
+Empty set (0.02 sec)
+
+mysql> begin;
+Query OK, 0 rows affected (0.09 sec)
+
+mysql> insert into account(id, balance, transaction_id) values(1,1,1),(2,2,2);
+Query OK, 2 rows affected (0.53 sec)
+
+mysql> select * from account;
++------+---------+----------------+
+| id | balance | transaction_id |
++------+---------+----------------+
+| 2 | 2.0 | 2 |
+| 1 | 1.0 | 1 |
++------+---------+----------------+
+2 rows in set (0.03 sec)
+
+mysql> commit;
+Query OK, 0 rows affected (0.05 sec)
+
+mysql> select * from account;
++------+---------+----------------+
+| id | balance | transaction_id |
++------+---------+----------------+
+| 2 | 2.0 | 2 |
+| 1 | 1.0 | 1 |
++------+---------+----------------+
+2 rows in set (0.02 sec)
+```
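+
+To also exercise the rollback path, a session along these lines can be run (illustrative; actual output and timings depend on your environment):
+
+```sql
+-- A quick rollback check against the same sharded account table:
+begin;
+insert into account(id, balance, transaction_id) values(3,3,3);
+rollback;
+select * from account where id = 3;   -- expect an empty result set
+```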
+
+## Future plan
+Currently, ShardingSphere’s distributed transaction integrates third-party 2PC implementation schemes to guarantee atomicity. Isolation depends on the isolation guarantees of the storage DBs, providing usable transaction functions.
+
+In the future, an MVCC implementation based on a global timestamp, combined with 2PC, will provide better support for transaction isolation semantics.
+
+**Apache ShardingSphere Project Links:**
+[ShardingSphere
Github](https://github.com/apache/shardingsphere/issues?page=1&q=is%3Aopen+is%3Aissue+label%3A%22project%3A+OpenForce+2022%22)
+
+[ShardingSphere Twitter](https://twitter.com/ShardingSphere)
+
+[ShardingSphere
Slack](https://join.slack.com/t/apacheshardingsphere/shared_invite/zt-sbdde7ie-SjDqo9~I4rYcR18bq0SYTg)
+
+[Contributor Guide](https://shardingsphere.apache.org/community/cn/contribute/)
+
+## Author
+
+Lu Jingshang
+
+> [Apache ShardingSphere](https://shardingsphere.apache.org/) Committer &
Infrastructure R&D Engineer at [SphereEx](https://www.sphere-ex.com/).
+> Enthusiastic about open source and database technology.
+> Focus on developing Apache ShardingSphere transaction module.
+
diff --git
"a/docs/blog/content/material/2022_05_19_Apache_ShardingSphere_Enterprise_User_Case_Zhongshang_Huimin\342\200\231s_Transaction_Middle_Platform_Architecture.en.md"
"b/docs/blog/content/material/2022_05_19_Apache_ShardingSphere_Enterprise_User_Case_Zhongshang_Huimin\342\200\231s_Transaction_Middle_Platform_Architecture.en.md"
new file mode 100644
index 00000000000..0158e1d8528
--- /dev/null
+++
"b/docs/blog/content/material/2022_05_19_Apache_ShardingSphere_Enterprise_User_Case_Zhongshang_Huimin\342\200\231s_Transaction_Middle_Platform_Architecture.en.md"
@@ -0,0 +1,146 @@
++++
+title = "Apache ShardingSphere Enterprise User Case: Zhongshang Huimin’s
Transaction Middle Platform Architecture"
+weight = 56
+chapter = true
++++
+
+The traditional FMCG (fast-moving consumer goods) industry model can no longer meet current development requirements. Merely upgrading the end of the
supply chain cannot achieve high-quality business development.
[Huimin](https://huimin.cn/) focuses on order fulfillment and delivery service
for over one million community supermarkets, transforming and upgrading from
serving retail terminal stores to promoting the digitization of the whole FMCG
industry chain.
+
+It is committed to building a B2B2C data closed-loop operating system,
providing brand owners, distributors, and community supermarkets with
transformed and upgraded solutions for the whole chain covering purchasing,
marketing, operation, and distribution. All these commitments cannot be
fulfilled without its powerful “middle platform” system.
+
+## Introducing the “middle platform”
+Huimin began migrating from the [PHP](https://www.php.net/) technology stack to the Java technology stack and shifted to microservices from 2016 to 2017, and started to build the “middle platform” in 2018.
+
+The phase-one project of its transaction middle platform went live in early
2021, and phase-two in March 2022.
+
+Owing to business growth and various changes, Huimin launched a strong “middle
platform” development strategy to improve efficiency and reduce costs. Measures
were taken to reduce the coupling degree between systems, increase scalability,
and ensure low latency while extracting business commonalities. At the
beginning of 2021, the company finished rebuilding the previous order
management system (OMS) and launched a packaged business capability project for
a transaction middle platform ce [...]
+
+To synergize technical and business requirements during the rebuilding
process, the go-live process was divided into two phases: application system
splitting and data splitting.
+
+## System splitting
+R&D teams having to frequently adjust the order management system and adapt it to multiple business lines can cause inefficiencies. In response to these
problems, the following principles were formulated to split and rebuild the
system:
+
+1. Avoid logic coupling between different lines of business. Each business line should be self-contained and minimize its influence on the others to reduce logic complexity.
+2. Responsibilities are divided according to the order lifecycle to ensure
that each system is independent with an exclusive responsibility.
+3. Read/write splitting. Ensure the stability of core functions and avoid the
impact of frequent iterations on core functions.
+4. [ElasticSearch](https://www.elastic.co/) was introduced to solve the
external index problem, and reduce database index pressure.
+
+
+
+## Data splitting
+The purpose of data splitting is to reduce the maintenance pressure of a
single table. When the amount of data reaches tens of millions, the index and
field maintenance of database tables will have a great impact on the online
environment. Before splitting a table, you have to consider the following
questions:
+
+1. Splitting rules: what tables are data-intensive? What about the number of
splits?
+2. Data read/write: how to solve the read/write problem after splitting tables?
+3. Go-live solution: how to go live smoothly?
+
+### Splitting rules
+
+Over a 3-year iteration period, the number of splits = ((amount of data added per month × 36) + amount of historical data) / maximum amount of data in a single table. For example, with 1 million new rows per month, 14 million historical rows, and a 5-million-row cap per table, the result is ((1M × 36) + 14M) / 5M = 10 tables.
+
+When we use the [InnoDB](https://en.wikipedia.org/wiki/InnoDB) engine for DDL
operation to maintain [Alibaba
Cloud](https://www.alibabacloud.com/?spm=a2796.143805.6791778070.2.b8fa46e3JLecKh)
RDS MySQL, we came up with the following results:
+
+- Less than 5 million rows: about tens of seconds.
+- 5 million to 10 million rows: about 500 seconds.
+- 10 million to 50 million rows: about 1,000 seconds.
+- More than 50 million rows: more than 2,000 seconds.
+
+The upper limit for splitting a single table depends on the impact of a single
table operation on business. Of course, some solutions can tackle the problem
of table lock resulting from DDL operation. For example, dual-table
synchronization and changes in table names can reduce the impact to within
seconds.
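+
+As a hedged sketch of that dual-table approach in MySQL (the table names are illustrative):
+
+```sql
+-- Create a shadow table, apply the schema change to it, backfill and sync data
+-- out of band, then swap the tables with an atomic rename that blocks only briefly.
+CREATE TABLE account_new LIKE account;
+-- ... apply DDL to account_new and synchronize data ...
+RENAME TABLE account TO account_old, account_new TO account;
+```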
+
+The split number of the library table is preferably 2 to the power of N, which
is conducive to later horizontal scaling.
+
+### Technology selection
+
+After the data table is split, data hashing and query issues need to be
resolved. As an SME (small & medium-sized enterprise), the cost of developing a
set of sharding middleware is too high, and it will cause various risks in the
early stage. A reasonable solution is to adopt a set of open source database
sharding middleware.
+
+**[Apache ShardingSphere](https://shardingsphere.apache.org/)**
+
+Apache ShardingSphere is an open source ecosystem that allows you to transform any database into a distributed database system. The project includes ShardingSphere-JDBC, ShardingSphere-Proxy, and ShardingSphere-Sidecar (planned), which can be deployed independently and also support mixed deployment. They all provide functions of data scale-out, distributed transactions, and distributed governance.
+
+Apache ShardingSphere 5.x introduces the concept of [Database
Plus](https://faun.pub/whats-the-database-plus-concepand-what-challenges-can-it-solve-715920ba65aa),
and the new version focuses on a plugin oriented architecture. Currently,
functions including data sharding, read/write splitting, data encryption,
shadow library pressure testing, as well as SQL and protocol support such as
[MySQL](https://www.mysql.com/), [PostgreSQL](https://www.postgresql.org/),
[SQL Server](https://www.mic [...]
+
+Why Huimin chose Apache ShardingSphere:
+
+1. Apache ShardingSphere has a variety of functions and can solve current
challenges with strong scalability.
+
+2. The Apache ShardingSphere community is very active, which means the
possibility to find support quickly.
+
+3. The company adopts the SpringCloud technology stack which is convenient for
integration and can lower costs.
+
+4. Its performance can fully support Huimin’s existing business.
+
+ShardingSphere supports the following 3 modes:
+
+
+
+Currently, Huimin’s server-side technology stack only involves Java language,
so there is no need to consider heterogeneous scenarios. Taking flexibility,
code intrusion and deployment cost into consideration, the implementation of
[ShardingSphere-JDBC](https://shardingsphere.apache.org/document/current/en/overview/#shardingsphere-jdbc)
was finally selected.
+
+### Technology implementation
+
+The most complicated part of the process is actually the go-live part, not the plugin integration part, as one might expect.
+
+To achieve non-perception and rollback, we divided the go-live process into
the following steps, as shown in the figure:
+
+
+
+**Step 1:** Process all existing data to ensure data consistency between the old and new libraries. Since no synchronization tool currently supports a data sharding policy, we developed a data synchronization tool that supports configuring the sharding policy.
+
+**Step 2:** Process incremental data to keep data consistency between the old and new libraries. Our implementation here was to use an open source component to listen to the binlog of the database and synchronize the database changes to a message queue; the data synchronization tool then listens to the messages and writes them into the new library.
+
+The architecture scheme is not complicated, but great importance was attached
to the data consistency in this process. Incremental data cannot be lost, and
single-line data should be written in sequential order to prevent ABA problems.
+
+**Step 3:** Process the gray release of read traffic. At this point, we were writing to the old library and reading from the new library. In this process, we had to pay attention to how sensitive the business is to data consistency, because the replication delay from the old library to the new library is within seconds. Reading from the old library is still needed in sensitive scenarios requiring high consistency.
+
+**Step 4:** Switch all read-traffic of the application to the new library,
maintaining the status of writing the old library and reading the new library.
+
+**Step 5:** To reduce program complexity, processing write traffic did not take gray-release scenarios into consideration. Of course, a grayscale scheme could achieve the same effect. We supported the rollback strategy by writing data from the new library back to the old library, and then uniformly transferred all traffic to the new library within one release.
+
+So far, the whole process of release and going live has been completed. Due to
the low cost of completing the transformation with the help of
ShardingSphere-JDBC middleware to rewrite SQL and merge results, there was
almost no code intrusion and going live went smoothly.
+
+There are also some issues that you may need to be aware of when accessing the
ShardingSphere-JDBC scheme:
+
+1. ShardingSphere-JDBC supports a default processing for data insertion logic.
If there is no shard column in SQL, the same data will be inserted into each
shard table.
+
+2. ShardingSphere-JDBC parses only the logical table of the first SQL for
batches of SQL semantic analysis, which results in an error. This problem has
been reported and should be fixed in the near future. Additionally, a detailed
SQL sample is provided, which lists in detail the scope of support.
+
+3. ShardingSphere connection mode selection. The number of database
connections visited by businesses should be limited for the sake of resource
control. It can effectively prevent a service operation from occupying too many
resources, which depletes the database connection resources and affects the
normal access of other services. Especially in the case of many sub-tables in a
database instance, a logical SQL without shard keys will produce a large number
of actual SQL that falls on dif [...]
+
+From the perspective of execution efficiency, maintaining a separate database
connection for each shard query can make more effective use of multi-threading
to improve execution efficiency. Building a separate thread for each database
connection allows I/O consumption to be processed concurrently, maintains a
separate database connection for each shard, and avoids premature loading of
query result data into memory. A separate database connection can hold a
reference to the cursor locatio [...]
+
+The result merging method by moving the result set cursor down is called
streaming merge. It does not have to load all the result data into memory,
which can effectively save memory resources and reduce the frequency of garbage
collection. If each shard query cannot be guaranteed to have a separate
database connection, the current query result set needs to be loaded into
memory before reusing the database connection to obtain the query result set of
the next shard table. Therefore, even [...]
+
+It is a problem for the ShardingSphere execution engine to balance the
relationship between the control and protection of database connection
resources and saving middleware memory resources by adopting a better merging
mode. Specifically, if a SQL needs to operate on 200 tables under a database
instance after its sharding in ShardingSphere, whether to create 200
connections and execute them concurrently or create one connection and execute
it in sequence? How to choose between efficienc [...]
+
+In response to the above scenario, ShardingSphere provides a solution. It
introduces the concept of Connection Mode and divides it into `MEMORY_STRICTLY`
mode and `CONNECTION_STRICTLY` mode.
+
+`MEMORY_STRICTLY`
+
+The premise of using this pattern is that ShardingSphere does not limit the
number of database connections required by an operation. If the actual SQL
executed needs to operate on 200 tables in a database instance, a new database
connection is created for each table and processed concurrently through
multi-threading to maximize the execution efficiency. When SQL meets the
conditions, streaming merge is preferred to prevent memory overflow or frequent
garbage collection.
+
+`CONNECTION_STRICTLY`
+
+The premise for using this pattern is that ShardingSphere strictly limits the
number of database connections required for an operation. If the actual SQL
executed needs to operate on 200 tables in a database instance, only one single
database connection is created and its 200 tables are processed sequentially.
If shards in an operation are scattered across different databases,
multi-threading is still used to handle operations on different libraries, but
only one database connection is c [...]
+
+The `MEMORY_STRICTLY` mode is applicable for OLAP operations and can improve
system throughput by relaxing limits on database connections. The
`CONNECTION_STRICTLY` mode applies to OLTP operations, which usually have shard
keys and are routed to a single shard. Therefore, it is a wise choice to
strictly control database connections to ensure that database resources in an
online system can be used by more applications.
+
+We found that in the `MEMORY_STRICTLY` mode, the operation would become
I/O-intensive due to the cache buffer loading strategy of MySQL InnoDB engine,
resulting in SQL timeouts. Its solution is to add another layer of processing
without changing database resources. If there is no shard key, verify the shard
key in the external index, and then use the shard key to perform database
retrieval.
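+
+As a hypothetical sketch of that two-step access pattern (table and column names are made up, and the first lookup could equally live in Elasticsearch or a dedicated index table):
+
+```sql
+-- 1) Resolve the shard key (order_id) from an external index by a non-shard column.
+SELECT order_id FROM order_search_index WHERE order_no = '20220517001';
+-- 2) Query the sharded table with the shard key so the SQL routes to a single shard.
+SELECT * FROM t_order WHERE order_id = 1024;  -- order_id obtained from the lookup above
+```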
+
+## ShardingSphere Benefits
+**1. Performance improvement**
+Through architecture rebuilding, the amount of data in a single table is effectively controlled, and slow SQL is greatly reduced, down nearly 50%.
+
+**2. Save R&D resources and lower cost**
+
+The introduction of Apache ShardingSphere does not require the re-development
of sharding components, which reduces the R&D cost and lowers the risks.
+
+**3. Strong scalability**
+
+Apache ShardingSphere has good scalability in terms of data encryption,
distributed transactions, shadow library and other aspects.
+
+## Conclusion
+
+In his book [Building Microservices, Sam
Newman](https://samnewman.io/books/building_microservices/) writes, “Our
requirements shift more rapidly than they do for people who design and build
buildings — as do the tools and techniques at our disposal. The things we
create are not fixed points in time. Once launched into production, our
software will continue to evolve as the way it is used changes. For most things
we create, we have to accept that once the software gets into the hands of [...]
+
+## Author
+**Zhai Yang**
+
+Software architect on the Zhongshang Huimin e-commerce platform team.
+
+He participated in the 0–1 process of building the middle platform and is mainly responsible for the R&D of the transaction and commodity middle platforms and the search platform.
\ No newline at end of file
diff --git
a/docs/blog/content/material/2022_05_24_Your_Guide_to_DistSQL_Cluster_Governance_Capability_Apache_ShardingSphere_Feature_Update.en.md
b/docs/blog/content/material/2022_05_24_Your_Guide_to_DistSQL_Cluster_Governance_Capability_Apache_ShardingSphere_Feature_Update.en.md
new file mode 100644
index 00000000000..b2eac41cbb3
--- /dev/null
+++
b/docs/blog/content/material/2022_05_24_Your_Guide_to_DistSQL_Cluster_Governance_Capability_Apache_ShardingSphere_Feature_Update.en.md
@@ -0,0 +1,449 @@
++++
+title = "Your Guide to DistSQL Cluster Governance Capability——Apache
ShardingSphere Feature Update"
+weight = 57
+chapter = true
++++
+
+Since DistSQL was introduced in the Apache ShardingSphere 5.0.0-Beta version, it has made the project increasingly loved by developers and Ops teams for advantages such as changes that take effect dynamically, no restarts, and an elegant syntax close to standard SQL.
+
+With upgrades to 5.0.0 and 5.1.0, the ShardingSphere community has once again
added abundant syntax to DistSQL, bringing more practical features.
+
+In this post, the community co-authors will share the latest functions of
DistSQL from the perspective of “cluster governance”.
+
+## ShardingSphere Cluster
+In a typical cluster composed of ShardingSphere-Proxy, there are multiple
`compute nodes` and storage nodes, as shown in the figure below:
+
+To make it easier to understand, in ShardingSphere, we refer to Proxy as a
compute node and Proxy-managed distributed database resources (such as `ds_0`,
`ds_1`) as `resources` or `storage nodes`.
+
+
+
+Multiple Proxy compute nodes are connected to the same register center, share configuration and rules, and can sense each other’s online status.
+
+These compute nodes also share the underlying storage nodes, so they can all perform read and write operations on the storage nodes at the same time. The user application can connect to any compute node and perform equivalent operations.
+
+Through this cluster architecture, you can quickly scale Proxy horizontally
when compute resources are insufficient, reducing the risk of a single point of
failure and improving system availability.
+
+A load-balancing mechanism can also be added between the application and the compute nodes.
+
+**Compute Node Governance**
+Compute node governance is suitable for Cluster mode. For more information
about the ShardingSphere modes, please see [Your Detailed Guide to Apache
ShardingSphere’s Operating
Modes](https://medium.com/codex/your-detailed-guide-to-apache-shardingspheres-operating-modes-e50df1ee56e4).
+
+**Cluster Preparation**
+Take three Proxy compute nodes simulated on a single machine as an example. To use Cluster mode, apply the following configuration:
+
+```yaml
+mode:
+ type: Cluster
+ repository:
+ type: ZooKeeper
+ props:
+ namespace: governance_ds
+ server-lists: localhost:2181
+ retryIntervalMilliseconds: 500
+ timeToLiveSeconds: 60
+ maxRetries: 3
+ operationTimeoutMilliseconds: 500
+ overwrite: false
+```
+Execute the bootup command separately for each instance:
+
+```bash
+sh %SHARDINGSPHERE_PROXY_HOME%/bin/start.sh 3307
+sh %SHARDINGSPHERE_PROXY_HOME%/bin/start.sh 3308
+sh %SHARDINGSPHERE_PROXY_HOME%/bin/start.sh 3309
+```
+After the three Proxy instances are successfully started, the compute node
cluster is ready.
+
+`SHOW INSTANCE LIST`
+
+Use the client to connect to any compute node, such as 3307:
+
+```bash
+mysql -h 127.0.0.1 -P 3307 -u root -p
+```
+
+View the list of instances:
+
+```
+mysql> SHOW INSTANCE LIST;
++----------------+-----------+------+---------+
+| instance_id | host | port | status |
++----------------+-----------+------+---------+
+| 10.7.5.35@3309 | 10.7.5.35 | 3309 | enabled |
+| 10.7.5.35@3308 | 10.7.5.35 | 3308 | enabled |
+| 10.7.5.35@3307 | 10.7.5.35 | 3307 | enabled |
++----------------+-----------+------+---------+
+```
+The above fields mean:
+
+- `instance_id`: the ID of the instance, currently composed of host and port.
+- `host`: the host address.
+- `port`: the port number.
+- `status`: the status of the instance, either `enabled` or `disabled`.
+
+`DISABLE INSTANCE`
+
+The `DISABLE INSTANCE` statement is used to set the specified compute node to a disabled state.
+
+💡Note:
+The statement does not terminate the process of the target instance; it only logically disables it.
+
+`DISABLE INSTANCE` supports the following syntax forms:
+
+```
+DISABLE INSTANCE 10.7.5.35@3308;
+#or
+DISABLE INSTANCE IP=10.7.5.35, PORT=3308;
+```
+Example:
+
+```
+mysql> DISABLE INSTANCE 10.7.5.35@3308;
+Query OK, 0 rows affected (0.02 sec)
+
+mysql> SHOW INSTANCE LIST;
++----------------+-----------+------+----------+
+| instance_id | host | port | status |
++----------------+-----------+------+----------+
+| 10.7.5.35@3309 | 10.7.5.35 | 3309 | enabled |
+| 10.7.5.35@3308 | 10.7.5.35 | 3308 | disabled |
+| 10.7.5.35@3307 | 10.7.5.35 | 3307 | enabled |
++----------------+-----------+------+----------+
+```
+After executing the `DISABLE INSTANCE` statement and querying again, you can see that the instance status of port 3308 has been updated to `disabled`, indicating that the compute node has been disabled.
+
+If a client is connected to `10.7.5.35@3308`, executing any SQL statement will prompt an exception:
+
+`1000 - Circuit break mode is ON.`
+💡Note:
+Disabling the compute node you are currently connected to is not allowed. If you send `DISABLE INSTANCE 10.7.5.35@3309` to `10.7.5.35@3309`, you will receive an exception prompt.
+
+`ENABLE INSTANCE`
+
+The `ENABLE INSTANCE` statement is used to set the specified compute node to an enabled state. `ENABLE INSTANCE` supports the following syntax forms:
+
+```
+ENABLE INSTANCE 10.7.5.35@3308;
+#or
+ENABLE INSTANCE IP=10.7.5.35, PORT=3308;
+```
+Example:
+
+```
+mysql> SHOW INSTANCE LIST;
++----------------+-----------+------+----------+
+| instance_id | host | port | status |
++----------------+-----------+------+----------+
+| 10.7.5.35@3309 | 10.7.5.35 | 3309 | enabled |
+| 10.7.5.35@3308 | 10.7.5.35 | 3308 | disabled |
+| 10.7.5.35@3307 | 10.7.5.35 | 3307 | enabled |
++----------------+-----------+------+----------+
+
+mysql> ENABLE INSTANCE 10.7.5.35@3308;
+Query OK, 0 rows affected (0.01 sec)
+
+mysql> SHOW INSTANCE LIST;
++----------------+-----------+------+----------+
+| instance_id | host | port | status |
++----------------+-----------+------+----------+
+| 10.7.5.35@3309 | 10.7.5.35 | 3309 | enabled |
+| 10.7.5.35@3308 | 10.7.5.35 | 3308 | enabled |
+| 10.7.5.35@3307 | 10.7.5.35 | 3307 | enabled |
++----------------+-----------+------+----------+
+```
+
+After executing the `ENABLE INSTANCE` statement, you can query again and see that the instance status of port 3308 has been restored to `enabled`.
+
+## How to Manage Compute Node Parameters
+In the previous article [Integrating SCTL into RAL — Making Apache
ShardingSphere Perfect for Database
Management](https://dzone.com/articles/integrating-sctl-into-distsqls-ral-making-apache-s),
we explained the evolution of SCTL (ShardingSphere Control Language) to RAL
(Resource & Rule Administration Language) and the new `SHOW VARIABLE` and `SET
VARIABLE` syntax.
+
+However, in 5.0.0-Beta, the `VARIABLE` category of DistSQL RAL contained only the following three statements:
+
+```
+SET VARIABLE TRANSACTION_TYPE = xx; (LOCAL, XA, BASE)
+SHOW VARIABLE TRANSACTION_TYPE;
+SHOW VARIABLE CACHED_CONNECTIONS;
+```
+By listening to the community’s feedback, we noticed that querying and
modifying the props configuration of Proxy (located in `server.yaml`) is also a
frequent operation. Therefore, we have added support for props configuration in
DistSQL RAL since the 5.0.0 GA version.
+
+`SHOW VARIABLE`
+
+First, let’s review how to configure props:
+
+```yaml
+props:
+ max-connections-size-per-query: 1
+ kernel-executor-size: 16 # Infinite by default.
+ proxy-frontend-flush-threshold: 128 # The default value is 128.
+ proxy-opentracing-enabled: false
+ proxy-hint-enabled: false
+ sql-show: false
+ check-table-metadata-enabled: false
+ show-process-list-enabled: false
+ # Proxy backend query fetch size. A larger value may increase the memory
usage of ShardingSphere Proxy.
+ # The default value is -1, which means set the minimum value for different
JDBC drivers.
+ proxy-backend-query-fetch-size: -1
+ check-duplicate-table-enabled: false
+ proxy-frontend-executor-size: 0 # Proxy frontend executor size. The default
value is 0, which means let Netty decide.
+ # Available options of proxy backend executor suitable: OLAP(default),
OLTP. The OLTP option may reduce time cost of writing packets to client, but it
may increase the latency of SQL execution
+ # and block other clients if client connections are more than
`proxy-frontend-executor-size`, especially executing slow SQL.
+ proxy-backend-executor-suitable: OLAP
+ proxy-frontend-max-connections: 0 # Less than or equal to 0 means no
limitation.
+ sql-federation-enabled: false
+ # Available proxy backend driver type: JDBC (default), ExperimentalVertx
+ proxy-backend-driver-type: JDBC
+```
+Now, you can perform interactive queries by using the following syntax:
+
+`SHOW VARIABLE PROXY_PROPERTY_NAME;`
+
+Example:
+
+```
+mysql> SHOW VARIABLE MAX_CONNECTIONS_SIZE_PER_QUERY;
++--------------------------------+
+| max_connections_size_per_query |
++--------------------------------+
+| 1 |
++--------------------------------+
+1 row in set (0.00 sec)
+
+mysql> SHOW VARIABLE SQL_SHOW;
++----------+
+| sql_show |
++----------+
+| false |
++----------+
+1 row in set (0.00 sec)
+……
+```
+💡Note:
+In DistSQL syntax, parameter keys are separated by underscores rather than the hyphens used in the YAML configuration.
+
+`SHOW ALL VARIABLES`
+
+Since there are plenty of parameters in Proxy, you can also query all parameter values through `SHOW ALL VARIABLES`:
+
+```
+mysql> SHOW ALL VARIABLES;
++---------------------------------------+----------------+
+| variable_name | variable_value |
++---------------------------------------+----------------+
+| sql_show | false |
+| sql_simple | false |
+| kernel_executor_size | 0 |
+| max_connections_size_per_query | 1 |
+| check_table_metadata_enabled | false |
+| proxy_frontend_database_protocol_type | |
+| proxy_frontend_flush_threshold | 128 |
+| proxy_opentracing_enabled | false |
+| proxy_hint_enabled | false |
+| show_process_list_enabled | false |
+| lock_wait_timeout_milliseconds | 50000 |
+| proxy_backend_query_fetch_size | -1 |
+| check_duplicate_table_enabled | false |
+| proxy_frontend_executor_size | 0 |
+| proxy_backend_executor_suitable | OLAP |
+| proxy_frontend_max_connections | 0 |
+| sql_federation_enabled | false |
+| proxy_backend_driver_type | JDBC |
+| agent_plugins_enabled | false |
+| cached_connections | 0 |
+| transaction_type | LOCAL |
++---------------------------------------+----------------+
+21 rows in set (0.01 sec)
+```
+
+`SET VARIABLE`
+
+Dynamic management of resources and rules is a special advantage of DistSQL.
Now you can also dynamically update props parameters by using the `SET
VARIABLE` statement. For example:
+
+```
+#Enable SQL log output
+SET VARIABLE SQL_SHOW = true;
+#Enable the hint feature
+SET VARIABLE PROXY_HINT_ENABLED = true;
+#Enable federated query
+SET VARIABLE SQL_FEDERATION_ENABLED = true;
+……
+```
+💡Note:
+
+The following parameters can be modified by the `SET VARIABLE` statement, but the new value takes effect only after the Proxy restarts:
+
+- `kernel_executor_size`
+- `proxy_frontend_executor_size`
+- `proxy_backend_driver_type`
+
+The following parameters are read-only and cannot be modified:
+
+- `cached_connections`
+
+Other parameters will take effect immediately after modification.
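+
+Parameters that take effect immediately can be verified on the spot. As an illustration (the output format follows the earlier `SHOW VARIABLE` example; the timings shown are only illustrative):
+
+```
+mysql> SET VARIABLE SQL_SHOW = true;
+Query OK, 0 rows affected (0.01 sec)
+
+mysql> SHOW VARIABLE SQL_SHOW;
++----------+
+| sql_show |
++----------+
+| true     |
++----------+
+1 row in set (0.00 sec)
+```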
+
+## How to Manage Storage Nodes
+In ShardingSphere, storage nodes are not directly bound to compute nodes, because one storage node may play different roles in different schemas at the same time in order to implement different business logic. Instead, storage nodes are always associated with a schema.
+
+In DistSQL, storage nodes are managed through `RESOURCE`-related statements, including:
+
+- `ADD RESOURCE`
+- `ALTER RESOURCE`
+- `DROP RESOURCE`
+- `SHOW SCHEMA RESOURCES`
+
+**Schema Preparation**
+`RESOURCE`-related statements only work on schemas, so before operating you need to create a schema and select it with the `USE` command:
+
+```
+DROP DATABASE IF EXISTS sharding_db;
+CREATE DATABASE sharding_db;
+USE sharding_db;
+```
+
+`ADD RESOURCE`
+
+`ADD RESOURCE` supports the following syntax forms:
+
+Specify `HOST`, `PORT`, `DB`
+
+```
+ADD RESOURCE resource_0 (
+ HOST=127.0.0.1,
+ PORT=3306,
+ DB=db0,
+ USER=root,
+ PASSWORD=root
+);
+```
+Specify `URL`
+
+```
+ADD RESOURCE resource_1 (
+ URL="jdbc:mysql://127.0.0.1:3306/db1?serverTimezone=UTC&useSSL=false",
+ USER=root,
+ PASSWORD=root
+);
+```
+Both of the above syntax forms support the extension parameter `PROPERTIES`, which specifies the attribute configuration of the connection pool between the Proxy and the storage node, for example:
+
+```
+ADD RESOURCE resource_2 (
+ HOST=127.0.0.1,
+ PORT=3306,
+ DB=db2,
+ USER=root,
+ PASSWORD=root,
+ PROPERTIES("maximumPoolSize"=10)
+),resource_3 (
+ URL="jdbc:mysql://127.0.0.1:3306/db3?serverTimezone=UTC&useSSL=false",
+ USER=root,
+ PASSWORD=root,
+ PROPERTIES("maximumPoolSize"=10,"idleTimeout"="30000")
+);
+```
+💡 Note: Specifying JDBC connection parameters, such as `useSSL`, is supported only with the `URL` form.
+
+`ALTER RESOURCE`
+
+`ALTER RESOURCE` is used to modify the connection information of storage
nodes, such as changing the size of a connection pool, modifying JDBC
connection parameters, etc.
+
+Syntactically, `ALTER RESOURCE` is identical to `ADD RESOURCE`.
+
+```
+ALTER RESOURCE resource_2 (
+ HOST=127.0.0.1,
+ PORT=3306,
+ DB=db2,
+ USER=root,
+ PROPERTIES("maximumPoolSize"=50)
+),resource_3 (
+ URL="jdbc:mysql://127.0.0.1:3306/db3?serverTimezone=GMT&useSSL=false",
+ USER=root,
+ PASSWORD=root,
+ PROPERTIES("maximumPoolSize"=50,"idleTimeout"="30000")
+);
+```
+💡 Note: Since modifying a storage node may cause metadata changes or application data exceptions, `ALTER RESOURCE` cannot be used to change the target database of the connection. Only the following values can be modified (a sketch follows the list below):
+
+- User name
+- User password
+- `PROPERTIES` connection pool parameters
+- JDBC parameters
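+
+For instance, here is a minimal sketch that keeps the target database `db2` unchanged and only adjusts the connection pool size of `resource_2` (the pool size value is illustrative):
+
+```
+ALTER RESOURCE resource_2 (
+    HOST=127.0.0.1,
+    PORT=3306,
+    DB=db2,
+    USER=root,
+    PASSWORD=root,
+    PROPERTIES("maximumPoolSize"=20)
+);
+```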
+
+`DROP RESOURCE`
+
+`DROP RESOURCE` is used to remove storage nodes from a schema without deleting any data in the storage nodes. An example statement is as follows:
+
+`DROP RESOURCE resource_0, resource_1;`
+
+💡 Note: To ensure data correctness, a storage node that is referenced by a rule cannot be deleted.
+
+For example, `t_order` is a sharded table whose actual tables are distributed across `resource_0` and `resource_1`. While `resource_0` and `resource_1` are referenced by the `t_order` sharding rule, they cannot be deleted; see the sketch below.
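+
+A hedged sketch of that scenario (assuming the `t_order` sharding rule is no longer needed): drop the rule first, then drop the resources it referenced.
+
+```
+#Dropping resource_0 and resource_1 directly would be rejected while the
+#t_order sharding rule still references them.
+DROP SHARDING TABLE RULE t_order;
+DROP RESOURCE resource_0, resource_1;
+```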
+
+`SHOW SCHEMA RESOURCES`
+
+`SHOW SCHEMA RESOURCES` is used to query storage nodes in schemas and supports
the following syntax forms:
+
+```
+#Query the storage nodes in the current schema
+SHOW SCHEMA RESOURCES;
+#Query the storage nodes in a specified schema
+SHOW SCHEMA RESOURCES FROM sharding_db;
+```
+
+Example:
+
+Add 4 storage nodes through the above-mentioned `ADD RESOURCE` command, and then execute the query:
+
+> The query result actually contains a large number of columns, but only part of them is shown here.
+
+Above, we introduced how to dynamically manage storage nodes through DistSQL.
+
+Compared with modifying YAML files, executing DistSQL statements takes effect in real time, and there is no need to restart the Proxy or compute node, making online operations safer.
+
+Changes executed through DistSQL are synchronized to the other compute nodes in the cluster in real time through the register center, and clients connected to any compute node can also query the storage node changes in real time.
+
+Apache ShardingSphere’s cluster governance is very powerful.
+
+## Conclusion
+If you have any questions or suggestions about Apache ShardingSphere, please
open an issue on the GitHub issue list. If you are interested in contributing
to the project, you’re very welcome to join the Apache ShardingSphere community.
+
+GitHub issue: https://github.com/apache/shardingsphere/issues
+
+## Reference
+**1. ShardingSphere-Proxy Quickstart:** https://shardingsphere.apache.org/document/5.1.0/en/quick-start/shardingsphere-proxy-quick-start/
+
+**2. DistSQL RDL:** https://shardingsphere.apache.org/document/current/en/user-manual/shardingsphere-proxy/distsql/syntax/rdl/resource-definition/
+
+**3. DistSQL RQL:** https://shardingsphere.apache.org/document/current/en/user-manual/shardingsphere-proxy/distsql/syntax/rql/resource-query/
+
+**4. DistSQL RAL:** https://shardingsphere.apache.org/document/current/en/user-manual/shardingsphere-proxy/distsql/syntax/ral/
+
+**Apache ShardingSphere Project Links:**
+[ShardingSphere
Github](https://github.com/apache/shardingsphere/issues?page=1&q=is%3Aopen+is%3Aissue+label%3A%22project%3A+OpenForce+2022%22)
+
+[ShardingSphere Twitter](https://twitter.com/ShardingSphere)
+
+[ShardingSphere
Slack](https://join.slack.com/t/apacheshardingsphere/shared_invite/zt-sbdde7ie-SjDqo9~I4rYcR18bq0SYTg)
+
+[Contributor Guide](https://shardingsphere.apache.org/community/cn/contribute/)
+
+**Authors**
+**Longtao JIANG**
+
+SphereEx Middleware Development Engineer & Apache ShardingSphere Committer
+
+Jiang works on DistSQL and security features R&D.
+
+**Chengxiang Lan**
+
+SphereEx Middleware Development Engineer & Apache ShardingSphere Committer
+
+Lan contributes to DistSQL’s R&D.