bhasudha commented on code in PR #9372:
URL: https://github.com/apache/hudi/pull/9372#discussion_r1307271216
##########
website/docs/concurrency_control.md:
##########
@@ -2,105 +2,126 @@
 title: "Concurrency Control"
 summary: In this page, we will discuss how to perform concurrent writes to Hudi Tables.
 toc: true
+toc_min_heading_level: 2
+toc_max_heading_level: 4
 last_modified_at: 2021-03-19T15:59:57-04:00
 ---
+Concurrency control defines how different writers/readers coordinate access to the table. Hudi ensures atomic writes, by way of publishing commits atomically to the timeline, stamped with an instant time that denotes the time at which the action is deemed to have occurred. Unlike general purpose file version control, Hudi draws clear distinction between writer processes (that issue user’s upserts/deletes), table services (that write data/metadata to optimize/perform bookkeeping) and readers (that execute queries and read data). Hudi provides snapshot isolation between all three types of processes, meaning they all operate on a consistent snapshot of the table. Hudi provides optimistic concurrency control (OCC) between writers, while providing lock-free, non-blocking MVCC based concurrency control between writers and table-services and between different table services.
-In this section, we will cover Hudi's concurrency model and describe ways to ingest data into a Hudi Table from multiple writers; using the [Hudi Streamer](#hudi-streamer) tool as well as
-using the [Hudi datasource](#datasource-writer).
+In this section, we will discuss the different concurrency controls supported by Hudi and how they are leveraged to provide flexible deployment models; we will cover multi-writing, a popular deployment model; finally, we’ll describe ways to ingest data into a Hudi Table from multiple writers using different writers, like DeltaStreamer, Hudi datasource, Spark Structured Streaming and Spark SQL.
-## Supported Concurrency Controls
-- **MVCC** : Hudi table services such as compaction, cleaning, clustering leverage Multi Version Concurrency Control to provide snapshot isolation
-between multiple table service writers and readers. Additionally, using MVCC, Hudi provides snapshot isolation between an ingestion writer and multiple concurrent readers.
- With this model, Hudi supports running any number of table service jobs concurrently, without any concurrency conflict.
- This is made possible by ensuring that scheduling plans of such table services always happens in a single writer mode to ensure no conflict and avoids race conditions.
+## Deployment models with supported concurrency controls
-- **[NEW] OPTIMISTIC CONCURRENCY** : Write operations such as the ones described above (UPSERT, INSERT) etc, leverage optimistic concurrency control to enable multiple ingestion writers to
-the same Hudi Table. Hudi supports `file level OCC`, i.e., for any 2 commits (or writers) happening to the same table, if they do not have writes to overlapping files being changed, both writers are allowed to succeed.
- This feature is currently *experimental* and requires either Zookeeper or HiveMetastore to acquire locks.
+### Model A: Single writer with inline table services
-It may be helpful to understand the different guarantees provided by [write operations](/docs/write_operations/) via Hudi datasource or the Hudi Streamer.
+This is the simplest form of concurrency, meaning there is no concurrency at all in the write processes. In this model, Hudi eliminates the need for concurrency control and maximizes throughput by supporting these table services out-of-box and running inline after every write to the table. Execution plans are idempotent, persisted to the timeline and auto-recover from failures. For most simple use-cases, this means just writing is sufficient to get a well-managed table that needs no concurrency control.
-## Single Writer Guarantees
+Although there is no actual concurrent writing in this model, there is a need to provide snapshot isolation between readers and writers. **MVCC** is leveraged to provide such isolation between ingestion writer and multiple readers and also between multiple table service writers and readers. Writes to the table either from ingestion or from table services produce versioned data that are available to readers only after the writes are committed. Until then, readers can access only the previous version of the data.
- - *UPSERT Guarantee*: The target table will NEVER show duplicates.
- - *INSERT Guarantee*: The target table wilL NEVER have duplicates if [dedup](/docs/configurations#hoodiedatasourcewriteinsertdropduplicates) is enabled.
- - *BULK_INSERT Guarantee*: The target table will NEVER have duplicates if [dedup](/docs/configurations#hoodiedatasourcewriteinsertdropduplicates) is enabled.
- - *INCREMENTAL PULL Guarantee*: Data consumption and checkpoints are NEVER out of order.
+A single writer with all table services such as cleaning, clustering, compaction, etc can be configured to be inline (such as DeltaStreamer sync-once mode and Spark Datasource with default configs) without any additional configs.
-## Multi Writer Guarantees
+#### Single Writer Guarantees
-With multiple writers using OCC, some of the above guarantees change as follows
+In this model, the following are the guarantees on [write operations](https://hudi.apache.org/docs/write_operations/) to expect:
 - *UPSERT Guarantee*: The target table will NEVER show duplicates.
-- *INSERT Guarantee*: The target table MIGHT have duplicates even if [dedup](/docs/configurations#hoodiedatasourcewriteinsertdropduplicates) is enabled.
-- *BULK_INSERT Guarantee*: The target table MIGHT have duplicates even if [dedup](/docs/configurations#hoodiedatasourcewriteinsertdropduplicates) is enabled.
+- *INSERT Guarantee*: The target table wilL NEVER have duplicates if [dedup](https://hudi.apache.org/docs/configurations#hoodiedatasourcewriteinsertdropduplicates) is enabled.
+- *BULK_INSERT Guarantee*: The target table will NEVER have duplicates if [dedup](https://hudi.apache.org/docs/configurations#hoodiedatasourcewriteinsertdropduplicates) is enabled.
+- *INCREMENTAL PULL Guarantee*: Data consumption and checkpoints are NEVER out of order.
+
+
+### Model B: Single writer with async table services
+
+Hudi provides the option of running the table services in an async fashion, where most of the heavy lifting (e.g actually rewriting the columnar data by compaction service) is done asynchronously. In this model, the async deployment eliminates any repeated wasteful retries and optimizes the table using clustering techniques while a single writer consumes the writes to the table without having to be blocked by such table services. This model avoids the need for taking an [external lock](#external-locking-and-lock-providers) to control concurrency and avoids the need to separately orchestrate and monitor offline table services jobs..
+
+A single writer along with async table services runs in the same process. For example, you can have a DeltaStreamer in continuous mode write to a MOR table using async compaction; you can use Spark Streaming (where [compaction](https://hudi.apache.org/docs/compaction) is async by default), and you can use Flink streaming or your own job setup and enable async table services inside the same writer.
+
+Hudi leverages **MVCC** in this model to support running any number of table service jobs concurrently, without any concurrency conflict. This is made possible by ensuring Hudi 's ingestion writer and async table services coordinate among themselves to ensure no conflicts and avoid race conditions. The same single write guarantees described in Model A above can be achieved in this model as well.
+
+### Model C: Multi-writer
+
+It is not always possible to serialize all write operations to a table (such as UPSERT, INSERT or DELETE) into the same write process and therefore, multi-writing capability may be required. In multi-writing, disparate distributed processes run in parallel or overlapping time windows to write to the same table. In such cases, an external locking mechanism becomes necessary to coordinate concurrent accesses. Here are few different scenarios that would all fall under multi-writing:
+
+- Multiple ingestion writers to the same table:For instance, two Spark Datasource writers working on different sets of partitions form a source kafka topic.
+- Multiple ingestion writers to the same table, including one writer with async table services: For example, a DeltaStreamer with async compaction for regular ingestion & a Spark Datasource writer for backfilling.
+- A single ingestion writer and a separate compaction (HoodieCompactor) or clustering (HoodieClusteringJob) job apart from the ingestion writer: This is considered as multi-writing as they are not running in the same process.
+
+Hudi's concurrency model intelligently differentiates actual writing to the table from table services that manage or optimize the table. Hudi offers similar **optimistic concurrency control across multiple writers**, but **table services can still execute completely lock-free and async** as long as they run in the same process as one of the writers.
+For multi-writing, Hudi leverages file level optimistic concurrency control(OCC). For example, when two writers write to non overlapping files, both writes are allowed to succeed. However, when the writes from different writers overlap (touch the same set of files), only one of them will succeed. Please note that this feature is currently experimental and requires external lock providers to acquire locks briefly at critical sections during the write. More on lock providers below.
+
+#### Multi Writer Guarantees
+
+With multiple writers using OCC, these are the write guarantees to expect:
+
+- *UPSERT Guarantee*: The target table will NEVER show duplicates.
+- *INSERT Guarantee*: The target table MIGHT have duplicates even if dedup is enabled.
+- *BULK_INSERT Guarantee*: The target table MIGHT have duplicates even if dedup is enabled.
 - *INCREMENTAL PULL Guarantee*: Data consumption and checkpoints MIGHT be out of order due to multiple writer jobs finishing at different times.
+
 ## Enabling Multi Writing
-The following properties are needed to be set properly to turn on optimistic concurrency control.
+The following properties are needed to be set appropriately to turn on optimistic concurrency control to achieve multi writing.
 ```
 hoodie.write.concurrency.mode=optimistic_concurrency_control
-hoodie.cleaner.policy.failed.writes=LAZY
 hoodie.write.lock.provider=<lock-provider-classname>
 ```
-There are 4 different lock providers that require different configurations to be set.
+| Config Name| Default| Description |
+| --- | --- | --- |
+| hoodie.write.concurrency.mode | SINGLE_WRITER (Optional) | <u>[Concurrency modes](https://github.com/apache/hudi/blob/c387f2a6dd3dc9db2cd22ec550a289d3a122e487/hudi-common/src/main/java/org/apache/hudi/common/model/WriteConcurrencyMode.java)</u> for write operations.<br />Possible values:<br /><ul><li>`SINGLE_WRITER`: Only one active writer to the table. Maximizes throughput.</li><li>`OPTIMISTIC_CONCURRENCY_CONTROL`: Multiple writers can operate on the table with lazy conflict resolution using locks. This means that only one writer succeeds if multiple writers write to the same file group.</li></ul><br />`Config Param: WRITE_CONCURRENCY_MODE` |
+| hoodie.write.lock.provider | org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider (Optional) | Lock provider class name, user can provide their own implementation of LockProvider which should be subclass of org.apache.hudi.common.lock.LockProvider<br /><br />`Config Param: LOCK_PROVIDER_CLASS_NAME`<br />`Since Version: 0.8.0` |

Review Comment:
> Will add it back.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
