sumihehe opened a new issue #2329:
URL: https://github.com/apache/hudi/issues/2329


   Hi all,
   We plan to use Hudi to sync MySQL binlog data. A Flink ETL task will consume 
binlog records from Kafka and save the data to Hudi every hour. The binlog 
records are also grouped by hour, and all records for a given hour are saved in 
a single commit. The data pipeline is: 
binlog -> Kafka -> Flink -> Parquet. 
   
   After the data is synced to Hudi, we want to query the historical hourly 
versions of the Hudi table using Hive SQL.
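
   To illustrate what "the table as of a given hour" means here, below is a 
minimal sketch. It assumes each hourly commit is identified by a timestamp 
string of the form yyyyMMddHHmmss. The function `commit_as_of` and the sample 
commit list are hypothetical and only for illustration; they are not Hudi APIs.

```python
from bisect import bisect_right
from datetime import datetime

# Assumption: commit instants are timestamp strings like "20201214020000".
INSTANT_FORMAT = "%Y%m%d%H%M%S"


def commit_as_of(commit_instants, as_of):
    """Return the latest commit instant at or before `as_of`, or None."""
    # Fixed-width timestamp strings sort lexicographically in time order.
    ordered = sorted(commit_instants)
    target = as_of.strftime(INSTANT_FORMAT)
    idx = bisect_right(ordered, target)
    return ordered[idx - 1] if idx > 0 else None


# Hypothetical hourly commits, one per hour of binlog data.
commits = ["20201214010000", "20201214020000", "20201214030000"]
print(commit_as_of(commits, datetime(2020, 12, 14, 2, 30)))
```

   A time travel query for 02:30 would then read only the data visible up to 
the commit chosen this way, and a query for a time before the first commit 
would have no version to read.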
   
   Here is a more detailed description of our issue, along with a simple design 
of Time Travel for Hudi. The design is still under development and testing:
   
[https://docs.google.com/document/d/1r0iwUsklw9aKSDMzZaiq43dy57cSJSAqT9KCvgjbtUo/edit#](https://docs.google.com/document/d/1r0iwUsklw9aKSDMzZaiq43dy57cSJSAqT9KCvgjbtUo/edit#)
   
   We now need to support Time Travel for our business needs. We have also 
seen [RFC 
07](https://cwiki.apache.org/confluence/display/HUDI/RFC+-+07+%3A+Point+in+time+Time-Travel+queries+on+Hudi+table).
   We would be glad to receive any suggestions or discussion. 
   



