hudi-bot opened a new issue, #15753:
URL: https://github.com/apache/hudi/issues/15753

   Currently, after most DML operations in Spark SQL, Hudi invokes `Catalog.refreshTable`.
   
   Prior to Spark 3.2, this essentially did the following:
   1. Invalidate the relation cache (forcing the relation to be re-resolved on next access: creating a new `FileIndex`, re-listing files, etc.)
   2. Trigger cascading invalidation (re-caching) of the cached data (in `CacheManager`)
   
   As of Spark 3.2, it additionally calls `LogicalRelation.refresh` for ALL tables (previously this was done only for temporary views), which triggers `FileIndex.refresh` and forces the whole table to be re-listed, a potentially costly operation.
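   The cost difference can be sketched with a toy model (these are not Spark's actual classes; `RelationCache` and the listing counter are illustrative stand-ins). The pre-3.2 path merely drops the cached entry, so the expensive file listing is deferred until the table is actually queried again; the 3.2 path re-lists eagerly on every refresh:

   ```scala
   import scala.collection.mutable

   // Stand-in for Spark's FileIndex; `listings` counts full file re-listings.
   final case class FileIndex(var listings: Int = 0) {
     def refresh(): Unit = listings += 1
   }

   final class RelationCache {
     private val cache = mutable.Map.empty[String, FileIndex]

     // Resolving a table not in the cache builds a new FileIndex (one listing).
     def resolve(table: String): FileIndex =
       cache.getOrElseUpdate(table, { val fi = FileIndex(); fi.refresh(); fi })

     // Pre-3.2 style: drop the entry; the listing happens lazily on next resolve.
     def invalidate(table: String): Unit = cache.remove(table)

     // 3.2 style: eagerly re-list, even if the table is never queried again.
     def eagerRefresh(table: String): Unit = cache.get(table).foreach(_.refresh())
   }
   ```

   With `invalidate`, N consecutive DML operations followed by one query cost a single listing; with `eagerRefresh`, they cost N listings.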
   
   
   We should revert to the preceding behavior from Spark 3.1.
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-5697
   - Type: Bug
   - Epic: https://issues.apache.org/jira/browse/HUDI-3249
   - Fix version(s):
     - 1.1.0


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
