boneanxs commented on issue #7322: URL: https://github.com/apache/hudi/issues/7322#issuecomment-1330618582
Yes, I think @danny0405 is right. Like Spark's file listing, Hudi also caches file statuses to avoid calling the filesystem repeatedly when listing files, and the cache is updated when you call `refresh` explicitly. But I'm wondering whether we could keep the number of completed commits in a variable and decide whether to use the cache by checking if that number has changed. [This cannot work if the relation is converted to HadoopRelation, since `ToHadoopRelation` uses Spark's `InMemoryFileIndex`.]
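To make the idea concrete, here is a minimal, language-neutral sketch in Python of a listing cache that is invalidated when the completed-commit count changes. It is not Hudi's implementation: `CommitAwareListingCache`, `list_files`, and `count_completed_commits` are hypothetical names, and the sketch assumes that reading the commit count is much cheaper than a full file listing.

```python
from typing import Callable, Dict, List, Optional


class CommitAwareListingCache:
    """Caches file listings per partition and drops them when the
    number of completed commits differs from the last one observed.

    Hypothetical sketch: both callables stand in for whatever the
    real system uses to list files and to read its timeline.
    """

    def __init__(
        self,
        list_files: Callable[[str], List[str]],
        count_completed_commits: Callable[[], int],
    ) -> None:
        self._list_files = list_files
        self._count_commits = count_completed_commits
        self._cache: Dict[str, List[str]] = {}
        self._seen_commits: Optional[int] = None

    def files(self, partition: str) -> List[str]:
        current = self._count_commits()
        if current != self._seen_commits:
            # The commit count moved, so every cached listing may be stale.
            self._cache.clear()
            self._seen_commits = current
        if partition not in self._cache:
            self._cache[partition] = self._list_files(partition)
        return self._cache[partition]

    def refresh(self) -> None:
        """Explicit invalidation, mirroring a manual `refresh` call."""
        self._cache.clear()
        self._seen_commits = None
```

A plain count is the simplest possible signal. It would miss changes that leave the count unchanged, so comparing the latest completed instant instead may be more robust. Either way, the check only helps on code paths that go through this cache, not on ones that delegate listing to Spark's own file index.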
