nsivabalan commented on issue #3324: URL: https://github.com/apache/hudi/issues/3324#issuecomment-894492641
With MOR, there are 3 types of queries that could be of benefit to you. Config: https://hudi.apache.org/docs/configurations#query_type_opt_key

[Snapshot/Realtime read](https://hudi.apache.org/docs/quick-start-guide#query-data): reads the entire data for the latest snapshot.

ReadOptimized query: "read_optimized"

As I was telling you earlier, for a given data file, depending on your compaction schedule, there could be some delta log files. For snapshot reads, these are merged with the base data files and then served, whereas for a ReadOptimized query only the base data files are read. If you can give up some freshness (records that are still only in delta log files will not show up until they are compacted), your queries will be much faster, since there is no real-time merge involved.

And then you have [incremental read](https://hudi.apache.org/docs/quick-start-guide#incremental-query), which will give you the delta records between commits.
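
To make the three query types concrete, here is a minimal PySpark-style sketch of how the read options might be set. The helper names `hudi_read_options` and `read_hudi` are hypothetical, made up for illustration. The option keys `hoodie.datasource.query.type` and `hoodie.datasource.read.begin.instanttime` are what I believe the query type and incremental begin-time configs resolve to, so please check them against the configurations page for your Hudi version.

```python
from typing import Dict, Optional

# Assumed string keys for the Hudi datasource read configs;
# verify against the configurations page for your Hudi version.
QUERY_TYPE_KEY = "hoodie.datasource.query.type"
BEGIN_INSTANT_KEY = "hoodie.datasource.read.begin.instanttime"

VALID_QUERY_TYPES = ("snapshot", "read_optimized", "incremental")


def hudi_read_options(query_type: str,
                      begin_instant: Optional[str] = None) -> Dict[str, str]:
    """Build Spark read options for one of the three query types (hypothetical helper)."""
    if query_type not in VALID_QUERY_TYPES:
        raise ValueError(f"unknown query type: {query_type!r}")
    options = {QUERY_TYPE_KEY: query_type}
    if query_type == "incremental":
        # Incremental reads return records written after this commit time.
        if begin_instant is None:
            raise ValueError("incremental queries need a begin instant time")
        options[BEGIN_INSTANT_KEY] = begin_instant
    return options


def read_hudi(spark, base_path: str, query_type: str,
              begin_instant: Optional[str] = None):
    """Load a Hudi table with the given query type (hypothetical helper).

    `spark` is expected to be an existing SparkSession with the Hudi
    bundle on its classpath; nothing here creates one.
    """
    return (spark.read.format("hudi")
            .options(**hudi_read_options(query_type, begin_instant))
            .load(base_path))
```

With a live session, `read_hudi(spark, base_path, "read_optimized")` would read only the base files, while `read_hudi(spark, base_path, "snapshot")` would merge in the delta log files at read time. The table path and any begin instant are yours to supply.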
