PhantomHunt commented on issue #8572:
URL: https://github.com/apache/hudi/issues/8572#issuecomment-1539885087

   > you are using some internal apis. so getCommitsTimeline will give you cleaned up and noncleaned commits. right one to use is
   > 
   > ```
   > timeline=metaClient.getActiveTimeline().getCleanerTimeline().filterCompletedInstants()
   > ```
   
   Hi @nsivabalan, Thanks for the help.
   We tried the code you suggested above, but it didn't give us the desired output.
   
   We observed that getCleanerTimeline() only returns the timelines at which the cleaner ran. What we actually need is the list of timelines that still exist after the cleaner has run.
   
   For example:
   We inserted data into the Hudi table at the following timelines:
   1
   2
   3
   4
   5
   6
   7
   Then, at timeline 8, the cleaner ran and cleaned timelines 1 and 2. We want to fetch all the remaining committed timelines as output, i.e. [3,4,5,6,7]
   
   However, neither of the following calls returns that output:
   committed timelines code: metaClient.getActiveTimeline().getCommitsTimeline().filterCompletedInstants() returns [1,2,3,4,5,6,7]
   cleaner timelines code: metaClient.getActiveTimeline().getCleanerTimeline().filterCompletedInstants() returns [8]
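
   To make the desired result concrete, here is a minimal plain-Java sketch of the filtering we are after. It does not call any Hudi API: the class `RemainingInstants`, the method `remaining`, and the `cleanedInstants` input are hypothetical names used only for illustration, and how to obtain the cleaned instants from Hudi is exactly the open question.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class RemainingInstants {

    // Returns the completed commit instants that were NOT cleaned, in their original order.
    // Both arguments are plain strings here; they stand in for Hudi instant times (assumption).
    static List<String> remaining(List<String> completedCommits, List<String> cleanedInstants) {
        Set<String> cleaned = new HashSet<>(cleanedInstants);
        return completedCommits.stream()
                .filter(instant -> !cleaned.contains(instant))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // What getCommitsTimeline().filterCompletedInstants() returned in our example.
        List<String> commits = List.of("1", "2", "3", "4", "5", "6", "7");
        // The instants the cleaner cleaned at timeline 8 (hypothetical input).
        List<String> cleaned = List.of("1", "2");

        System.out.println(remaining(commits, cleaned)); // prints [3, 4, 5, 6, 7]
    }
}
```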
   
   So, could you please suggest another internal API or approach that gives us the committed timelines that still exist after the cleaner has run?


