nandini57 edited a comment on issue #1569: URL: https://github.com/apache/incubator-hudi/issues/1569#issuecomment-620749380
Thanks Balaji. Yesterday I changed the parameter to retain 40 commits and changed `_hoodie_record_key` to include my business batch id column along with one of the other columns. Instead of `OverwriteRecordPayload`, I am using a custom payload that simply adds the records in each commit instead of removing them from disk. The business batch id increments with every ingestion, so I can audit based on commit time to get a view of the data at a particular point in the past:

```java
spark.sql("select * from hoodie_ro where cast(_hoodie_commit_time as long) <= " + Long.valueOf(commitTime)).show();
```

Is it a good idea to construct `_hoodie_record_key` as `123_1`, `123_4`, ..., or does it have to be monotonically increasing to help indexing?

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
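The point-in-time audit described above can be sketched in plain Java (illustrative only, not the Hudi API; the record keys and commit timestamps below are hypothetical). Hudi commit times are timestamp strings like `20200428103000`, so comparing them as longs selects all records written at or before a given commit:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch: filtering rows on _hoodie_commit_time, mirroring the
// spark.sql query in the comment. Not Hudi's API; data is made up.
public class CommitTimeFilter {
    // (record key, commit time) pairs, as they might appear in _hoodie_* columns
    static List<String[]> rows = Arrays.asList(
        new String[]{"123_1", "20200427090000"},
        new String[]{"123_2", "20200428103000"},
        new String[]{"123_3", "20200429120000"});

    // keep record keys whose commit time is <= the target commit
    static List<String> asOf(String commitTime) {
        long target = Long.parseLong(commitTime);
        return rows.stream()
                   .filter(r -> Long.parseLong(r[1]) <= target)
                   .map(r -> r[0])
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(asOf("20200428103000")); // prints [123_1, 123_2]
    }
}
```

Since the commit times are fixed-width `yyyyMMddHHmmss` strings, a lexicographic string comparison would give the same ordering as the numeric cast.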