Re: [PR] [DOCS] Add indexing examples and info for 1.0 [hudi]

via GitHub Tue, 03 Dec 2024 04:33:31 -0800


lokeshj1703 commented on code in PR #12409:
URL: https://github.com/apache/hudi/pull/12409#discussion_r1867641088



##########
website/docs/metadata_indexing.md:
##########
@@ -10,18 +10,65 @@ The [pluggable indexing subsystem](https://www.onehouse.ai/blog/introducing-mult
 of Hudi depends on the metadata table. Different types of index, from `files` index for locating records efficiently
 to `column_stats` index for data skipping, are part of the metadata table. A fundamental tradeoff in any data system
 that supports indices is to balance the write throughput with index updates. A brute-force way is to lock out the writes
-while indexing. However, very large tables can take hours to index. This is where Hudi's novel asynchronous metadata
-indexing comes into play.
+while indexing. Hudi supports index creation using SQL, Datasource as well as async indexing. However, very large tables 

Review Comment:
   Addressed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

