nandini57 edited a comment on issue #1569:
URL: https://github.com/apache/incubator-hudi/issues/1569#issuecomment-620749380


   Thanks Balaji. Yesterday I changed the parameter to retain 40 commits and
updated the _hoodie_record_key to include my business batch id column along
with one of the other columns. Instead of OverwriteRecordPayload, I am using a
custom payload that simply adds the records in each commit instead of
removing them from disk. The business batch id increments with every ingestion,
and I can audit based on commit time to get a view of the data at a particular
point in the past.
   spark.sql("select * from hoodie_ro where cast(_hoodie_commit_time as long) 
<=" + Long.valueOf(commitTime)).show();
   
   Is it a good idea to compose _hoodie_record_key values like 123_1, 123_4, ...,
or do they have to be monotonically increasing to help indexing?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org