[GitHub] [hudi] guanziyue commented on pull request #6384: [HUDI-4613] Avoid the use of regex expressions when call hoodieFileGroup#addLogFile function

GitBox Wed, 28 Sep 2022 10:40:49 -0700


guanziyue commented on PR #6384:
URL: https://github.com/apache/hudi/pull/6384#issuecomment-1261243810


   I'm not sure whether the author uses Spark, but I do understand that this 
saves a lot of time on huge tables, especially in Spark Streaming mode. In 
Spark, no write task can start until the FileSystemView has finished loading, 
because Hudi on Spark needs the FileSystemView info to identify small files 
before generating write tasks. 
   In my opinion, the memory problem can be solved through other configs, for 
example by using the RocksDB-based FileSystemView, which is nearly compulsory 
for large Hudi tables. But we have few options for reducing the time cost 
that this PR addresses.
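
   For reference, a minimal sketch of what switching to the RocksDB-based 
FileSystemView might look like. The key names and the EMBEDDED_KV_STORE value 
are given from memory of Hudi's file system view storage settings and should 
be treated as assumptions to check against the release in use; the class 
name, helper name, and path are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FileSystemViewOptions {

    // Builds writer options that select the RocksDB-backed FileSystemView.
    // Assumption: these key names and the EMBEDDED_KV_STORE value match the
    // Hudi release in use; verify them against its configuration reference.
    static Map<String, String> rocksDbViewOptions(String rocksDbBasePath) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("hoodie.filesystem.view.type", "EMBEDDED_KV_STORE");
        options.put("hoodie.filesystem.view.rocksdb.base.path", rocksDbBasePath);
        return options;
    }

    public static void main(String[] args) {
        // Hypothetical local path, for illustration only.
        rocksDbViewOptions("/tmp/hoodie_view_rocksdb")
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

   A map like this could be handed to the Spark writer through its options 
call. It only moves the view's state out of the heap; it does nothing for 
the loading time discussed above.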


