vinothchandar commented on code in PR #6408:
URL: https://github.com/apache/hudi/pull/6408#discussion_r946760002
##########
website/src/pages/tech-specs.md:
##########
@@ -263,68 +274,68 @@ Readers will use snapshot isolation to query a Hudi
dataset at a consistent poin
## Writer Expectations
-Writer into Hudi will have to ingest new records, updates to existing records
or delete records into the dataset. All transactional actions follow the same
state transition as described in the transaction log (timeline) section.
Writers will optimistically create new base and log files and will finally
transition the action state to completed to register all the modifications to
the dataset atomically. Writer merges the data using the following steps
+A writer into Hudi will have to ingest new records, apply updates to existing
records, or delete records from the table. All transactional actions follow the
same state transitions described in the transaction log (timeline) section.
Writers optimistically create new base and log files and finally transition the
action state to completed, registering all modifications to the table
atomically. Writers merge the data using the following steps:
1. The writer will pick a monotonically increasing instant time from the
latest state of the Hudi timeline (**action commit time**) and the last
successful commit instant (**merge commit time**) to merge the changes into.
If the merge succeeds, the action commit time becomes the next successful
commit in the timeline.
-2. For all the incoming records, the writer will have to efficiently determine
if this is an update or insert. This is done by a process called tagging -
which is a batched point lookups of the record key and partition path pairs in
the entire dataset. The efficiency of tagging is critical to the merge
performance. This can be optimized with indexes (bloom, global key value based
index) and caching. New records will not have a tag.
+2. For all the incoming records, the writer has to efficiently determine
whether each is an update or an insert. This is done by a process called
tagging: a batched point lookup of the record key and partition path pairs
across the entire table. The efficiency of tagging is critical to merge
performance. It can be optimized with indexes (bloom, global key-value based
index) and caching. New records will not have a tag.
3. Once records are tagged, the writer can apply them onto the specific file
slice.
- 1. For copy on write, writer will create a new slice (action commit time)
of the base file in the file group
- 2. For merge on read, writer will create a new log file with the action
commit time on the merge commit time file slice
+   1. For CoW, the writer will create a new slice (at the action commit time)
of the base file in the file group.
+   2. For MoR, the writer will create a new log file with the action commit
time on the merge commit time file slice.
4. Deletes are encoded as a special form of update where only the meta fields
and the operation are populated. See the delete block type in the log format
block types.
-5. Once all the writes into the file system is complete, concurrency control
checks happen to ensure there are no overlapping writes and if that succeeds,
the commit action is completed in the timeline atomically making the changes
merged visible for the next reader.
+5. Once all the writes into the file system are complete, concurrency control
checks happen to ensure there are no overlapping writes; if that succeeds, the
commit action is completed in the timeline, atomically making the merged
changes visible to the next reader.
6. Synchronizing indexes and metadata must be done in the same transaction
that commits the modifications to the table.
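The six-step flow above can be modeled end to end in a short sketch. This is a
hypothetical illustration in Python, not Hudi's actual API: `Timeline`,
`tag_records`, and `write` are made-up names, the record index is a plain
dict, and the concurrency-control checks of step 5 are elided.

```python
class Timeline:
    """Minimal timeline model: completed commits plus in-flight instants."""

    def __init__(self):
        self.completed = []    # completed commit instant times, ascending
        self.inflight = set()  # instants requested but not yet completed

    def new_instant(self):
        # Step 1: pick a monotonically increasing instant (action commit time)
        t = max(self.completed + list(self.inflight), default=0) + 1
        self.inflight.add(t)
        return t

    def last_completed(self):
        # Step 1: the merge commit time is the last successful commit
        return self.completed[-1] if self.completed else None

    def complete(self, t):
        # Step 5: completing the action publishes all changes atomically
        self.inflight.discard(t)
        self.completed.append(t)


def tag_records(records, record_index):
    # Step 2: batched point lookup of record keys in the index. A tagged
    # record carries its existing file group (update); None means insert.
    return [(key, value, record_index.get(key)) for key, value in records]


def write(timeline, record_index, records, table_type="MOR"):
    action_time = timeline.new_instant()
    merge_time = timeline.last_completed()
    files = []
    for key, value, file_group in tag_records(records, record_index):
        if file_group is None:
            file_group = f"fg-{key}"   # new file group for an insert
            record_index[key] = file_group
        if table_type == "COW" or merge_time is None:
            # Step 3.1: CoW writes a whole new base file slice
            files.append(f"{file_group}/base_{action_time}.parquet")
        else:
            # Step 3.2: MoR appends a log file onto the merge-time slice
            files.append(f"{file_group}/.log_{action_time}_{merge_time}")
    # Steps 5-6: (concurrency checks and index sync elided) commit atomically
    timeline.complete(action_time)
    return action_time, files
```

The first commit creates base files even for MoR (there is no earlier file
slice to log against); subsequent MoR commits append log files named after
both the action commit time and the merge commit time, mirroring step 3.2.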
-## Balancing data freshness and query performance
+## Balancing write and query performance
-Critical design choice for any dataset is to pick the right trade-offs in the
data freshness and query performance spectrum. Hudi storage format lets the
users decide on this trade-off by picking the table type, record merging and
file sizing.
+A critical design choice for any table is picking the right trade-off on the
data freshness versus query performance spectrum. The Hudi storage format lets
users make this trade-off by choosing the table type, record merging strategy,
and file sizing.
#### Table types
-|                     | Merge Efficiency | Query Efficiency |
-| ------------------- | ---------------- | ---------------- |
-| Copy on Write (COW) | **Inefficient** <br />COW table type creates a new File slice in the file group for every batch of updates. Write amplification can be quite high when the update is spread across multiple file groups. The cost involved can be high over a time period especially on datasets with low data latency requirements. | **Efficient** <br />COW table types create whole readable data files in open source columnar file formats on each merge batch, there is minimum overhead per record in the query engine. Query engines are fairly optimized for accessing files directly in cloud storage. |
-| Merge on Read (MOR) | **Efficient** <br />MOR table type batches the updates to the file slice in a separate optimized Log file, write amplification is amortized over time when sufficient updates are batched. The merge cost involved will be lower than COW since the churn on the records re-written for every update is much lower. | **Inefficient**<br />MOR Table type required record level merging during query. Although there are techniques to make this merge as efficient as possible, there is still a record level overhead to apply the updates batched up for the file slice. The merge cost applies on every query until the compaction applies the updates and creates a new file slice. |
+|                     | Merge Efficiency | Query Efficiency |
+| ------------------- | ---------------- | ---------------- |
+| Copy on Write (COW) | **Tunable** <br />The COW table type creates a new file slice in the file group for every batch of updates. Write amplification can be quite high when the update is spread across multiple file groups, and the cost can add up over time, especially on tables with low data latency requirements. | **Optimal** <br />COW tables produce whole, readable data files in open source columnar file formats on each merge batch, so there is minimal overhead per record in the query engine. Query engines are fairly well optimized for accessing files directly in cloud storage. |
Review Comment:
   @prasannarajaperumal I made this `tunable` vs `optimal`. CoW is optimal for
reads, for example, while you can tune the merge by over-provisioning writers.
This is probably a better way to talk about it?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]