This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new cb70453 [MINOR] Publishing Concurrency Blog (#4344)
cb70453 is described below
commit cb70453d4a667d1ca5dfc2b683e3d2f721ad18dd
Author: Kyle Weller <[email protected]>
AuthorDate: Thu Dec 16 15:48:55 2021 -0800
[MINOR] Publishing Concurrency Blog (#4344)
---
...se-concurrency-control-are-we-too-optimistic.md | 53 +++++++++++++++++++++
.../concurrency/ConcurrencyControlConflicts.png | Bin 0 -> 48932 bytes
.../assets/images/blog/concurrency/MultiWriter.gif | Bin 0 -> 716723 bytes
.../images/blog/concurrency/SingleWriterAsync.gif | Bin 0 -> 685892 bytes
.../images/blog/concurrency/SingleWriterInline.gif | Bin 0 -> 712558 bytes
5 files changed, 53 insertions(+)
diff --git
a/website/blog/2021-12-16-lakehouse-concurrency-control-are-we-too-optimistic.md
b/website/blog/2021-12-16-lakehouse-concurrency-control-are-we-too-optimistic.md
new file mode 100644
index 0000000..5dd20ca
--- /dev/null
+++
b/website/blog/2021-12-16-lakehouse-concurrency-control-are-we-too-optimistic.md
@@ -0,0 +1,53 @@
+---
+title: "Lakehouse Concurrency Control: Are we too optimistic?"
+excerpt: "Vinoth Chandar, Creator of Apache Hudi, dives into concurrency
control mechanisms"
+author: vinoth
+category: blog
+---
+
+Transactions on data lakes are now considered a key characteristic of a Lakehouse. But what has actually been accomplished so far? What are the current approaches? How do they fare in real-world scenarios? These questions are the focus of this blog.
+
+<!--truncate-->
+
+Having had the good fortune of working on diverse database projects - an RDBMS ([Oracle](https://www.oracle.com/database/)), a NoSQL key-value store ([Voldemort](https://www.slideshare.net/vinothchandar/voldemort-prototype-to-production-nectar-edits)), a streaming database ([ksqlDB](https://www.confluent.io/blog/ksqldb-pull-queries-high-availability/)), a closed-source real-time datastore, and of course Apache Hudi - I can safely say that the nature of the workloads deeply influences the concurrency control mechanisms adopted in each of them.
+
+First, let's set the record straight. Relational databases offer the richest set of transactional capabilities and the widest array of concurrency control [mechanisms](https://dev.mysql.com/doc/refman/5.7/en/innodb-locking-transaction-model.html). Different isolation levels, fine-grained locking, deadlock detection/avoidance, and more are possible because they have to support row-level mutations and reads across many tables while enforcing [key constraints](https://dev.mysql.com/doc/refman/8. [...]
+
+# Pitfalls in Lake Concurrency Control
+
+Historically, data lakes have been viewed as batch jobs reading/writing files on cloud storage, and it's interesting to see how most new work extends this view and implements glorified file version control using some form of "[**Optimistic concurrency control**](https://en.wikipedia.org/wiki/Optimistic_concurrency_control)" (OCC). With OCC, jobs take a table-level lock to check whether they have impacted overlapping files; if a conflict exists, they abort their operations completely. Without [...]
+
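+The file-level OCC check described above can be sketched as follows. This is an illustrative Python sketch, not Hudi's actual implementation; all class and method names here are made up:

```python
# Minimal sketch of optimistic concurrency control (OCC) at file granularity.
# A writer records a snapshot at transaction start; at commit time it checks
# whether any commit that landed since then touched an overlapping file.
class CommitConflict(Exception):
    pass

class Table:
    def __init__(self):
        self.committed = []  # list of (commit_id, files_touched) in commit order

    def begin(self):
        return len(self.committed)  # snapshot: number of commits seen so far

    def try_commit(self, snapshot, files_touched):
        # The table-level lock is held only for this check-and-publish step.
        for _, files in self.committed[snapshot:]:
            if files & files_touched:  # overlapping files => conflict
                raise CommitConflict("overlapping files, abort and retry")
        self.committed.append(("c%d" % len(self.committed), files_touched))

table = Table()
s1 = table.begin(); s2 = table.begin()
table.try_commit(s1, {"f1", "f2"})      # first writer wins
try:
    table.try_commit(s2, {"f2", "f3"})  # second writer overlaps on f2
except CommitConflict:
    print("writer 2 aborted")           # all of its work is thrown away
```

Note that the losing writer's entire run is wasted work; it must redo the whole operation, which is exactly what hurts long-running jobs below.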
+Imagine a real-life scenario with two writer processes: an ingest writer job producing new data every 30 minutes, and a deletion writer job enforcing GDPR that takes 2 hours to issue its deletes. Random deletes are very likely to touch files that overlap with the ingest writer's, and the deletion job is almost guaranteed to starve and fail to commit each time. In database speak, mixing long-running transactions with optimism leads to disappointment, since the longer the transaction, the higher the probability of a conflict.
+
+
+![Concurrency control conflicts](static/assets/images/blog/concurrency/ConcurrencyControlConflicts.png)
+
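+To make the starvation above concrete, here is a back-of-the-envelope calculation with hypothetical numbers (the 0.9 overlap probability is an assumption for illustration, not a measured figure):

```python
# An ingest commit lands every 30 minutes; one delete attempt takes 2 hours,
# so 4 ingest commits land while the delete job is running.
ingest_interval_min = 30
delete_runtime_min = 120
commits_during_delete = delete_runtime_min // ingest_interval_min  # 4

# Assume a random-delete batch overlaps any given ingest commit with
# probability 0.9; the delete succeeds only if it conflicts with none of them.
p_overlap = 0.9
p_delete_succeeds = (1 - p_overlap) ** commits_during_delete
print(commits_during_delete, round(p_delete_succeeds, 4))  # 4 0.0001
```

Under these assumptions the delete job commits roughly once in ten thousand attempts: practical starvation.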
+So, what's the alternative? Locking? Wikipedia also says: "_However, locking-based ('pessimistic') methods also can deliver poor performance because locking can drastically limit effective concurrency even when deadlocks are avoided._" Here is where Hudi takes a different approach, one that we believe is more apt for modern lake transactions, which are typically long-running and even continuous. Data lake workloads share more characteristics with high-throughput stream processing jobs than [...]
+
+# Model 1: Single Writer, Inline Table Services
+
+The simplest form of concurrency control is no concurrency at all. A data lake table often has common services operating on it to ensure efficiency: reclaiming storage space from older versions and logs, coalescing files (clustering in Hudi), merging deltas (compaction in Hudi), and more. Hudi can simply eliminate the need for concurrency control and maximize throughput by supporting these table services out-of-the-box and running them inline after every write to the table.
+
+Execution plans are idempotent, persisted to the timeline, and automatically recover from failures. For most simple use cases, this means that just writing is sufficient to get a well-managed table that needs no concurrency control.
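+A minimal sketch of this single-writer model, with the service names mirroring the prose (the function and table shapes here are illustrative stand-ins, not Hudi's APIs):

```python
# Model 1: one writer process; table services run inline after each commit.
# Because everything happens in one process, in one order, there is nothing
# to coordinate and no concurrency control is needed.
def clean(table):
    table["events"].append("clean")      # reclaim older versions and logs

def cluster(table):
    table["events"].append("cluster")    # coalesce small files

def compact(table):
    table["events"].append("compact")    # merge deltas into base files

def write_inline(table, batch):
    table["commits"].append(batch)       # the actual write
    for service in (clean, cluster, compact):
        service(table)                   # inline: same process, right after

table = {"commits": [], "events": []}
write_inline(table, "batch-1")
```

The trade-off, as the next section shows, is that every write pays the latency of the services it runs inline.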
+
+
+
+# Model 2: Single Writer, Async Table Services
+
+Our delete/ingest example above isn't really that simple. While the ingest writer may just be updating the last N partitions of the table, the deletes may even span the entire table. Mixing them into the same job could slow down ingest latency by a lot. But Hudi provides the option of running the table services in an async fashion, where most of the heavy lifting (e.g., actually rewriting the columnar data via the compaction service) is done asynchronously, eliminating any repeated wasteful retries.
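+The shape of this model, sketched in Python: the writer only *schedules* heavy work (e.g., a compaction plan) and returns, while a background executor performs the rewrite off the write path. This is an illustrative sketch, not Hudi's scheduler:

```python
import queue
import threading

plans = queue.Queue()   # stands in for plans persisted to the timeline
done = []

def executor():
    # Background table service: performs the heavy rewrites asynchronously.
    while True:
        plan = plans.get()
        if plan is None:          # sentinel: shut down
            break
        done.append("compacted:" + plan)

worker = threading.Thread(target=executor)
worker.start()

def write_async(batch):
    # The write path only enqueues a plan, so ingest latency stays low.
    plans.put("plan-for-" + batch)
    return "committed " + batch

write_async("batch-1")
plans.put(None)
worker.join()
```

Because the plan is persisted (to the timeline, in Hudi's case) rather than held in memory, a crashed executor can pick it up again: nothing is retried from scratch.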
+
+
+
+# Model 3: Multiple Writers
+
+But it's not always possible to serialize deletes into the same write stream, and SQL-based deletes may be required. With multiple distributed processes, some form of locking is inevitable, but like real databases, Hudi's concurrency model is intelligent enough to differentiate the actual writing to the table from the table services that manage or optimize it. Hudi offers similar optimistic concurrency control across multiple writers, but table services can still execute completely lock-free.
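+The key distinction can be sketched as follows: writers hold a lock only for the short conflict-check-and-publish step, while table services never take it at all. Again, names and shapes here are illustrative, not Hudi's implementation:

```python
import threading

commit_lock = threading.Lock()
committed = []      # (writer, files_touched) pairs, in commit order
service_log = []

def writer_commit(name, snapshot, files):
    # OCC across writers: the lock guards only this short critical section,
    # not the (potentially hours-long) write itself.
    with commit_lock:
        for _, done_files in committed[snapshot:]:
            if done_files & files:
                return False            # conflict: abort, caller retries
        committed.append((name, files))
        return True

def table_service(action):
    # Table services are differentiated from writers and stay lock-free.
    service_log.append(action)

s = len(committed)                      # both writers start from one snapshot
writer_commit("ingest", s, {"f1"})      # ingest wins
table_service("compact")                # runs concurrently, never blocks
ok = writer_commit("delete", s, {"f1"}) # overlaps f1 -> aborted
```

In this sketch `ok` comes back `False`: the delete writer still loses on overlap, which motivates the improvements listed next.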
+
+
+
+All this said, there are still many ways we can improve upon this foundation.
+
+* For starters, Hudi has already implemented a [marker mechanism](https://hudi.apache.org/blog/2021/08/18/improving-marker-mechanism/) that tracks all the files that are part of an active write transaction, and a heartbeat mechanism that can track active writers to a table. These can be used directly by other active transactions/writers to detect what other writers are doing and [abort early](https://issues.apache.org/jira/browse/HUDI-1575) if conflicts are detected, yielding cluster resources back to other jobs sooner.
+* While optimistic concurrency control is attractive when serializable snapshot isolation is desired, it's neither optimal nor the only method for dealing with concurrency between writers. We plan to implement fully lock-free concurrency control using CRDTs and widely adopted stream processing concepts, over our log [merge API](https://github.com/apache/hudi/blob/bc8bf043d5512f7afbb9d94882c4e43ee61d6f06/hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecordPayload.java#L [...]
+* Touching upon key constraints, Hudi is the only lake transactional layer that ensures unique [key](https://hudi.apache.org/docs/key_generation) constraints today, though limited to the record key of the table. We will be looking to expand this capability in a more general form to non-primary-key fields, with the aforementioned newer concurrency models.
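+The lock-free CRDT direction mentioned in the list above rests on one property: if the per-record merge function is commutative and associative, concurrent writers need no coordination, because any order of applying updates converges to the same state. A toy sketch of such a merge (a last-writer-wins register with a deterministic tie-break; this is an illustration of the idea, not Hudi's payload API):

```python
# CRDT-style merge: pick the version with the larger ordering value.
# Ties are broken deterministically on the value itself, so the result
# never depends on which replica's update arrived first.
def merge(a, b):
    return a if (a["ordering"], a["val"]) >= (b["ordering"], b["val"]) else b

r1 = {"key": "k1", "ordering": 5, "val": "delete"}
r2 = {"key": "k1", "ordering": 3, "val": "update"}

assert merge(r1, r2) == merge(r2, r1)       # commutative: arrival order is irrelevant
assert merge(r1, r2)["val"] == "delete"     # the later (ordering=5) write wins
```

With a merge like this, "conflicting" writes are no longer conflicts at all; they are just log entries to be merged, which is exactly the stream-processing mindset the post argues for.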
+
+Finally, for data lakes to transform successfully into lakehouses, we must learn from the failings of the "Hadoop warehouse" vision, which shared similar goals with the new "lakehouse" vision. Designers did not pay close enough attention to the missing technology gaps relative to warehouses, and created unrealistic expectations of the actual software. As transactions and database functionality finally go mainstream on data lakes, we must apply these lessons and remain candid about the current shortcomings.
\ No newline at end of file
diff --git
a/website/static/assets/images/blog/concurrency/ConcurrencyControlConflicts.png
b/website/static/assets/images/blog/concurrency/ConcurrencyControlConflicts.png
new file mode 100644
index 0000000..0f9b3e5
Binary files /dev/null and
b/website/static/assets/images/blog/concurrency/ConcurrencyControlConflicts.png
differ
diff --git a/website/static/assets/images/blog/concurrency/MultiWriter.gif
b/website/static/assets/images/blog/concurrency/MultiWriter.gif
new file mode 100644
index 0000000..7243658
Binary files /dev/null and
b/website/static/assets/images/blog/concurrency/MultiWriter.gif differ
diff --git
a/website/static/assets/images/blog/concurrency/SingleWriterAsync.gif
b/website/static/assets/images/blog/concurrency/SingleWriterAsync.gif
new file mode 100644
index 0000000..ecc10f5
Binary files /dev/null and
b/website/static/assets/images/blog/concurrency/SingleWriterAsync.gif differ
diff --git
a/website/static/assets/images/blog/concurrency/SingleWriterInline.gif
b/website/static/assets/images/blog/concurrency/SingleWriterInline.gif
new file mode 100644
index 0000000..e18cd73
Binary files /dev/null and
b/website/static/assets/images/blog/concurrency/SingleWriterInline.gif differ