gudladona commented on PR #6179: URL: https://github.com/apache/hudi/pull/6179#issuecomment-1211491636
> > @danny0405 Few questions.
> >
> > * what table services could cause this?
> > * Does this impact inserts too or only upserts?
> > * The fix in this PR https://github.com/apache/hudi/pull/4812/files#r818514246 has greatly improved the write performance. This rollback would induce degradation. Can some parts of this rolled back code can be safely restored?
>
> 1. In our case, it's the clean service for COW table
> 2. It is upserts
> 3. We have made the refresh lazy in [[HUDI-4167] Remove the timeline refresh with initializing hoodie table #5716](https://github.com/apache/hudi/pull/5716), thus the performance gap expects to be small now.

Does the lazy refresh also apply to the TimelineService [RequestHandler](https://github.com/apache/hudi/blob/master/hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/RequestHandler.java)?

One of the changes in the above-mentioned [PR](https://github.com/apache/hudi/pull/4812/files#r818514246) ensures that requests handled here do not refresh the local view if it is already ahead of the remote view. In your case, did the data loss happen because this refresh did NOT happen? If not, removing the conditional check `HoodieTimeline.compareTimestamps(localLastKnownInstant, HoodieTimeline.LESSER_THAN, lastKnownInstantFromClient)` would force a refresh for a lot of requests.
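For reference, the effect of that conditional check can be sketched roughly as below. This is a standalone illustration, not the actual `RequestHandler` code: the class `RefreshCheckSketch`, the method `shouldRefresh`, and the sample instant times are hypothetical, and `LESSER_THAN` is modeled here as a plain string comparison on the assumption that instant times are fixed-width timestamp strings.

```java
import java.util.function.BiPredicate;

public class RefreshCheckSketch {

    // Hypothetical stand-in for HoodieTimeline.LESSER_THAN. Assumes instant
    // times are fixed-width timestamp strings, so lexicographic order
    // matches chronological order.
    static final BiPredicate<String, String> LESSER_THAN =
        (a, b) -> a.compareTo(b) < 0;

    // Hypothetical stand-in for HoodieTimeline.compareTimestamps.
    static boolean compareTimestamps(String t1,
                                     BiPredicate<String, String> op,
                                     String t2) {
        return op.test(t1, t2);
    }

    // Refresh only when the server's local view is behind the instant
    // reported by the client. Equal or newer local views skip the refresh.
    static boolean shouldRefresh(String localLastKnownInstant,
                                 String lastKnownInstantFromClient) {
        return compareTimestamps(localLastKnownInstant, LESSER_THAN,
                                 lastKnownInstantFromClient);
    }

    public static void main(String[] args) {
        // Local view behind the client: refresh.
        System.out.println(shouldRefresh("20220810120000", "20220811090000")); // true
        // Local view ahead of the client: no refresh.
        System.out.println(shouldRefresh("20220811090000", "20220810120000")); // false
        // Views in sync: no refresh.
        System.out.println(shouldRefresh("20220811090000", "20220811090000")); // false
    }
}
```

With the check in place, only the first case triggers a refresh. Dropping the check would be equivalent to `shouldRefresh` always returning `true`, which is the extra refresh load I am asking about.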
