nandini57 edited a comment on issue #1569:
URL: https://github.com/apache/incubator-hudi/issues/1569#issuecomment-620749380
Thanks Balaji. Yesterday, I changed the parameter to retain 40 commits
and changed the _hoodie_record_key to include my business batch id column along
with one of the other columns. Instead of EmptyRecordPayload, I am using a
custom payload which just adds the records in each commit instead of removing
them from disk. The business batch id increments with every ingestion, so I can
audit by going back in time based on commit time.
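For reference, a minimal sketch of the write setup described above. The option keys are standard Hudi write configs, but the column names (batch_id, other_col), the payload class name, and the table path are hypothetical placeholders, and the key-generator class path may differ across Hudi versions:

```java
// Sketch only: column names, payload class, and path are assumptions.
df.write().format("hudi")
    // composite record key: business batch id plus another column
    .option("hoodie.datasource.write.recordkey.field", "batch_id,other_col")
    // multi-field keys need the complex key generator
    .option("hoodie.datasource.write.keygenerator.class",
            "org.apache.hudi.keygen.ComplexKeyGenerator")
    // custom payload that keeps records instead of deleting them
    .option("hoodie.datasource.write.payload.class",
            "com.example.AppendOnlyPayload")
    // retain 40 commits for commit-time-based auditing
    .option("hoodie.cleaner.commits.retained", "40")
    .mode("append")
    .save("/path/to/hoodie_table");
```

Note that the archival settings (hoodie.keep.min.commits / hoodie.keep.max.commits) should be set higher than the retained-commit count, or archived commits may disappear from the timeline before the cleaner releases them.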
spark.sql("select * from hoodie_ro where cast(_hoodie_commit_time as long) <= "
    + Long.valueOf(commitTime)).show();
Is it a good idea to construct _hoodie_record_key values like 123_1, 123_4 .., or
do they have to be monotonically increasing to help indexing?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]