Zhangshunyu opened a new issue, #6075:
URL: https://github.com/apache/hudi/issues/6075

   Hudi's current driver-side cache management has several problems:
     1) The cache is shared only within a session. Because it is not shared 
across sessions, the cache information for the same table is loaded again in 
each session.
     2) When a session is released, its cache is not released, so cached data 
accumulates until the driver runs out of memory (OOM).
     3) When a session first queries a table, it creates the relation for that 
table, and all file status information is loaded while the relation is being 
built.
     
     Together, these three points have the following consequences:
     1) When N sessions connect to the same driver and run concurrently, the 
cache for the same table is held N times (memory * N), which eventually leads 
to driver OOM.
     2) Each query needs to build the relation, so it costs as much as the 
first query on that table.
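
One possible direction, sketched under assumptions: keep a single driver-wide cache keyed by table path, track which sessions use each entry, and evict an entry once the last session using it is released. The class `SharedFileStatusCache`, its methods, and the use of plain strings for file status entries are all hypothetical and are not part of Hudi's API; this only illustrates the sharing and release behavior described above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical driver-wide cache; names and types are illustrative, not Hudi's.
final class SharedFileStatusCache {
    // table path -> cached file listing (one copy per driver, not per session)
    private final Map<String, List<String>> listings = new HashMap<>();
    // table path -> ids of the sessions currently using that entry
    private final Map<String, Set<String>> users = new HashMap<>();

    // Returns the cached listing; the loader runs only if no session loaded it yet.
    synchronized List<String> get(String sessionId, String tablePath,
                                  Function<String, List<String>> loader) {
        users.computeIfAbsent(tablePath, k -> new HashSet<>()).add(sessionId);
        return listings.computeIfAbsent(tablePath, loader);
    }

    // Drops the session's references and evicts entries that no session uses any more.
    synchronized void releaseSession(String sessionId) {
        users.entrySet().removeIf(entry -> {
            entry.getValue().remove(sessionId);
            boolean unused = entry.getValue().isEmpty();
            if (unused) {
                listings.remove(entry.getKey());
            }
            return unused;
        });
    }

    // Number of tables whose listings are currently cached.
    synchronized int cachedTables() {
        return listings.size();
    }
}
```

With a cache like this, two sessions that query the same table would trigger one load instead of two, and calling `releaseSession` when a session closes would free entries that no other session still needs. A real implementation would also have to handle invalidation after new commits, which this sketch leaves out.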
   

