gudladona commented on PR #6179:
URL: https://github.com/apache/hudi/pull/6179#issuecomment-1211491636

   > > @danny0405 Few questions.
   > > 
   > > * what table services could cause this?
   > > * Does this impact inserts too or only upserts?
   > > * The fix in this PR 
https://github.com/apache/hudi/pull/4812/files#r818514246 has greatly improved 
the write performance. This rollback would induce degradation. Can some parts 
of the rolled-back code be safely restored?
   > 
   > 1. In our case, it's the clean service for COW table
   > 2. It is upserts
   > 3. We have made the refresh lazy in [[HUDI-4167] Remove the timeline 
refresh with initializing hoodie table 
#5716](https://github.com/apache/hudi/pull/5716), thus the performance gap 
is expected to be small now.
   
   Does the lazy refresh also apply to the TimelineService 
[RequestHandler](https://github.com/apache/hudi/blob/master/hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java)?
 One of the changes in the above-mentioned 
[PR](https://github.com/apache/hudi/pull/4812/files#r818514246) ensures that 
requests handled here do not refresh the local view if it is already ahead of 
the remote view. In your case, did the data loss happen because this refresh DID 
NOT happen? If not, removing the conditional check 
`HoodieTimeline.compareTimestamps(localLastKnownInstant, 
HoodieTimeline.LESSER_THAN, lastKnownInstantFromClient)` would force a refresh 
for a lot of requests. 
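
   To make the concern concrete, here is a minimal, self-contained sketch of the guard I am asking about. It is not the actual `RequestHandler` code: the class `RefreshCheckSketch` and the method `isLocalViewBehind` are hypothetical names, and `LESSER_THAN` / `compareTimestamps` are local stand-ins that only mimic the shape of the `HoodieTimeline` call quoted above. It assumes instant times are fixed-width timestamp strings, so lexicographic order matches chronological order.

```java
import java.util.function.BiPredicate;

// Hypothetical sketch, not Hudi code: models the "refresh only if the
// local view is behind the client" guard discussed in this comment.
final class RefreshCheckSketch {

    // Local stand-in for HoodieTimeline.LESSER_THAN. Assumes instants are
    // fixed-width timestamp strings, so String ordering is chronological.
    static final BiPredicate<String, String> LESSER_THAN =
            (a, b) -> a.compareTo(b) < 0;

    // Local stand-in with the same argument order as the quoted call.
    static boolean compareTimestamps(String ts1,
                                     BiPredicate<String, String> predicate,
                                     String ts2) {
        return predicate.test(ts1, ts2);
    }

    // With the guard: refresh only when the server's local view is strictly
    // behind the instant the client reports as its last known one.
    static boolean isLocalViewBehind(String localLastKnownInstant,
                                     String lastKnownInstantFromClient) {
        return compareTimestamps(localLastKnownInstant, LESSER_THAN,
                lastKnownInstantFromClient);
    }

    public static void main(String[] args) {
        // Made-up instant times, for illustration only.
        System.out.println(isLocalViewBehind("20220810120000", "20220811093000"));
        System.out.println(isLocalViewBehind("20220811093000", "20220810120000"));
    }
}
```

   In this sketch a refresh is triggered only when the local instant is strictly older than the client's. If the guard is dropped, requests where the local view is equal to or ahead of the client's would presumably also trigger a refresh, which is the extra cost I am worried about.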


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

