prashantwason commented on a change in pull request #4226:
URL: https://github.com/apache/hudi/pull/4226#discussion_r765204484



##########
File path: website/docs/metadata.md
##########
@@ -0,0 +1,30 @@
+---
+title: Metadata Table
+keywords: [ hudi, metadata, S3 file listings]
+---
+
+## Motivation for a Metadata Table
+
+The Apache Hudi Metadata Table can significantly improve read/write 
performance of your queries. The main purpose of the 

Review comment:
       IMO this is debatable in the general case. For queries that access a 
large amount of data, most of the time is spent processing data files. 
Furthermore, on the write side an additional delta-commit is required (we 
observe 10+ seconds for each), which keeps the pipeline time the same as before. 
   
   But I guess for cloud stores (we use HDFS) there will be a significant 
improvement, as listing files is very slow. So nothing needs to change here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

