This is an automated email from the ASF dual-hosted git repository.
danny0405 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new a702ced7f0f [DOCS] Diagram Changes for Clustering, Rollbacks, Table Types (#10510)
a702ced7f0f is described below
commit a702ced7f0f4e0e058ae0f0eaff28ec278f62fbf
Author: Dipankar Mazumdar <[email protected]>
AuthorDate: Tue Jan 16 21:46:06 2024 -0500
[DOCS] Diagram Changes for Clustering, Rollbacks, Table Types (#10510)
* remaining diagrams
* fixed issue with rollbacks page
---------
Co-authored-by: Dipankar Mazumdar <[email protected]>
---
website/docs/clustering.md | 6 +++---
website/docs/rollbacks.md | 4 ++--
website/docs/table_types.md | 4 ++--
website/static/assets/images/COW_new.png | Bin 0 -> 1034864 bytes
website/static/assets/images/MOR_new.png | Bin 0 -> 1342587 bytes
.../assets/images/blog/clustering/clustering1_new.png | Bin 0 -> 1420549 bytes
.../assets/images/blog/clustering/clustering2_new.png | Bin 0 -> 302821 bytes
.../assets/images/blog/clustering/clustering_3.png | Bin 0 -> 513090 bytes
.../assets/images/blog/rollbacks/Rollback_1.png | Bin 0 -> 311672 bytes
.../assets/images/blog/rollbacks/rollback2_new.png | Bin 0 -> 569899 bytes
10 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/website/docs/clustering.md b/website/docs/clustering.md
index 2feab1902ac..7749292b1cf 100644
--- a/website/docs/clustering.md
+++ b/website/docs/clustering.md
@@ -59,7 +59,7 @@ Clustering Service builds on Hudi’s MVCC based design to allow for writers to
NOTE: Clustering can only be scheduled for tables / partitions not receiving any concurrent updates. In the future, concurrent updates use-case will be supported as well.
-
+
_Figure: Illustrating query performance improvements by clustering_
## Clustering Usecases
@@ -71,7 +71,7 @@ such small files could lead to higher query latency. From our experience support
few users who are using Hudi just for small file handling capabilities. So, you could employ clustering to batch a lot
of such small files into larger ones.
-
+
### Cluster by sort key
@@ -80,7 +80,7 @@ arrival time, while query predicates do not sit well with it. With clustering, y
based on query predicates and so, your data skipping will be very efficient and your query can ignore scanning a lot of
unnecessary data.
-
+
## Clustering Strategies
diff --git a/website/docs/rollbacks.md b/website/docs/rollbacks.md
index 5a2ebf2a70b..c78b8f3b084 100644
--- a/website/docs/rollbacks.md
+++ b/website/docs/rollbacks.md
@@ -35,7 +35,7 @@ for any actions/commits that is not yet committed and that refers to partially f
is triggered and all dirty data is cleaned up followed by cleaning up the commit instants from the timeline.
-
+
_Figure 1: single writer with eager rollbacks_
@@ -63,7 +63,7 @@ information whether the writer that started the commit of interest is still maki
the commit, the heartbeat file is deleted. Or if the write failed midway, the last modification time of the heartbeat
file is no longer updated, so other writers can deduce the failed write after a period of time elapses.
-
+
_Figure 2: multi-writer with lazy cleaning of failed commits_
## Related Resources
diff --git a/website/docs/table_types.md b/website/docs/table_types.md
index 28814d239e8..e280909a9f3 100644
--- a/website/docs/table_types.md
+++ b/website/docs/table_types.md
@@ -69,7 +69,7 @@ Following illustrates how this works conceptually, when data written into copy-o
<figure>
- <img className="docimage" src={require("/assets/images/hudi_cow.png").default} alt="hudi_cow.png" />
+ <img className="docimage" src={require("/assets/images/COW_new.png").default} alt="hudi_cow.png" />
</figure>
@@ -97,7 +97,7 @@ their columnar base file, to keep the query performance in check (larger delta l
Following illustrates how the table works, and shows two types of queries - snapshot query and read optimized query.
<figure>
- <img className="docimage" src={require("/assets/images/hudi_mor.png").default} alt="hudi_mor.png" />
+ <img className="docimage" src={require("/assets/images/MOR_new.png").default} alt="hudi_mor.png" />
</figure>
There are lot of interesting things happening in this example, which bring out the subtleties in the approach.
diff --git a/website/static/assets/images/COW_new.png b/website/static/assets/images/COW_new.png
new file mode 100644
index 00000000000..9a996e01c76
Binary files /dev/null and b/website/static/assets/images/COW_new.png differ
diff --git a/website/static/assets/images/MOR_new.png b/website/static/assets/images/MOR_new.png
new file mode 100644
index 00000000000..519e9eb6fb8
Binary files /dev/null and b/website/static/assets/images/MOR_new.png differ
diff --git a/website/static/assets/images/blog/clustering/clustering1_new.png b/website/static/assets/images/blog/clustering/clustering1_new.png
new file mode 100644
index 00000000000..6aec715ae6c
Binary files /dev/null and b/website/static/assets/images/blog/clustering/clustering1_new.png differ
diff --git a/website/static/assets/images/blog/clustering/clustering2_new.png b/website/static/assets/images/blog/clustering/clustering2_new.png
new file mode 100644
index 00000000000..5ccd84ab083
Binary files /dev/null and b/website/static/assets/images/blog/clustering/clustering2_new.png differ
diff --git a/website/static/assets/images/blog/clustering/clustering_3.png b/website/static/assets/images/blog/clustering/clustering_3.png
new file mode 100644
index 00000000000..8d1ca9275d6
Binary files /dev/null and b/website/static/assets/images/blog/clustering/clustering_3.png differ
diff --git a/website/static/assets/images/blog/rollbacks/Rollback_1.png b/website/static/assets/images/blog/rollbacks/Rollback_1.png
new file mode 100644
index 00000000000..cc3fd458c22
Binary files /dev/null and b/website/static/assets/images/blog/rollbacks/Rollback_1.png differ
diff --git a/website/static/assets/images/blog/rollbacks/rollback2_new.png b/website/static/assets/images/blog/rollbacks/rollback2_new.png
new file mode 100644
index 00000000000..f7bd86a5c0f
Binary files /dev/null and b/website/static/assets/images/blog/rollbacks/rollback2_new.png differ