[
https://issues.apache.org/jira/browse/HUDI-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17505284#comment-17505284
]
Raymond Xu commented on HUDI-1460:
----------------------------------
[~x1q1j1] can you please go through the description and design doc to see if
any further work is needed?
> Time Travel (querying the historical versions of data) ability for Hudi Table
> -----------------------------------------------------------------------------
>
> Key: HUDI-1460
> URL: https://issues.apache.org/jira/browse/HUDI-1460
> Project: Apache Hudi
> Issue Type: New Feature
> Components: Common Core
> Reporter: qian heng
> Priority: Major
>
> Hi, all:
> We plan to use Hudi to sync MySQL binlog data. A Flink ETL task will
> consume binlog records from Kafka and save the data to Hudi every hour.
> The binlog records are also grouped by hour, and all records of one hour
> are saved in a single commit. The data transmission pipeline should look
> like this: binlog -> Kafka -> Flink -> Parquet.
> After the data is synced to Hudi, we want to query the historical hourly
> versions of the Hudi table in Hive SQL.
> Here is a more detailed description of our issue, along with a simple design
> of Time Travel for Hudi. The design is still under development and testing:
> [https://docs.google.com/document/d/1r0iwUsklw9aKSDMzZaiq43dy57cSJSAqT9KCvgjbtUo/edit?usp=sharing]
> We need to support Time Travel soon for our business needs. We
> have also seen the [RFC
> 07|https://cwiki.apache.org/confluence/display/HUDI/RFC+-+07+%3A+Point+in+time+Time-Travel+queries+on+Hudi+table].
> We would be glad to receive any suggestions or discussion.
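For illustration only, the hourly-commit time travel described in the issue can be modeled with a small in-memory sketch. This is not Hudi's implementation or API: the `Commit` type, the `snapshot_as_of` function, the `yyyyMMddHH` instant format, and the use of `None` to mark a binlog delete are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: one commit per hour, holding the binlog changes of that hour.
# Each commit is (instant, {record_key: row}); a row of None marks a delete.
Commit = Tuple[str, Dict[str, Optional[dict]]]


def snapshot_as_of(commits: List[Commit], as_of_instant: str) -> Dict[str, dict]:
    """Return the table state after replaying every commit up to as_of_instant.

    Instants are assumed to be fixed-width 'yyyyMMddHH' strings, so plain
    string comparison orders them chronologically.
    """
    table: Dict[str, dict] = {}
    for instant, changes in sorted(commits, key=lambda c: c[0]):
        if instant > as_of_instant:
            break  # later commits are invisible to this query
        for key, row in changes.items():
            if row is None:
                table.pop(key, None)  # delete from the binlog
            else:
                table[key] = row      # insert or update: latest write wins
    return table


# Three hourly commits: initial load, an update, then a delete.
commits: List[Commit] = [
    ("2020121401", {"1": {"name": "a"}, "2": {"name": "b"}}),
    ("2020121402", {"1": {"name": "a2"}}),
    ("2020121403", {"2": None}),
]

if __name__ == "__main__":
    for hour in ("2020121401", "2020121402", "2020121403"):
        print(hour, snapshot_as_of(commits, hour))
```

Querying with an earlier instant returns the table as it stood after that hour's commit, which is the behavior the issue asks to expose through Hive SQL.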
--
This message was sent by Atlassian Jira
(v8.20.1#820001)