vinishjail97 commented on PR #10872:
URL: https://github.com/apache/hudi/pull/10872#issuecomment-2105950888

   > hey @vinishjail97: can you attach the memory profiling you did before 
and after this patch, and rebase w/ master? We are good to go.
   
   
   15th March: Basic OOM test. Consume 2M events, each with a payload of 
approximately 1KB, with maxExecutors set to 2 and 1GB of executor memory. The 
dynamic allocation ratio was 0.002, so effectively only 1 executor is used 
because the job does not spawn enough tasks to request a second one.
   
     driver:
       coreLimit: 2000m
       coreRequest: 1800m
       cores: 2
       labels:
         orgId: 0c043996-9e42-4904-95b9-f98918ebeda4
         version: 3.1.1
       memory: 2g
       serviceAccount: staging-spark
     dynamicAllocation:
       enabled: true
       initialExecutors: 0
       maxExecutors: 2
       minExecutors: 0
     executor:
       coreLimit: 1000m
       coreRequest: 750m
       cores: 1
       labels:
         orgId: 0c043996-9e42-4904-95b9-f98918ebeda4
         version: 3.1.1
       memory: 1g
   Without the fix, the stage failed with an executor OOM after 20 min.
   
![image](https://github.com/apache/hudi/assets/16958856/275b3e4f-6c90-4d48-940e-38f180f2ba7f)
   
   
   With the fix, the same stage completed in 17 min on one executor.
   
![image](https://github.com/apache/hudi/assets/16958856/1bfc9c57-07d7-4a9d-a09f-6201bcc28717)
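
For reference, a rough sketch of why this setup ends up on a single executor. It assumes the "dynamic allocation ratio" in the test description refers to `spark.dynamicAllocation.executorAllocationRatio`, and that the executor target is roughly `ceil(tasks * ratio / tasksPerExecutor)` clamped to the min/max bounds. The function names and the task counts in the usage note are hypothetical; the task count of the actual run is not recorded here.

```python
import math


def target_executors(pending_or_running_tasks, allocation_ratio,
                     executor_cores=1, task_cpus=1,
                     min_executors=0, max_executors=2):
    """Approximate executor target under dynamic allocation.

    Assumption: target = ceil(tasks * ratio / tasks_per_executor),
    clamped to [min_executors, max_executors]. Defaults mirror the
    config above (1 core per executor, minExecutors 0, maxExecutors 2).
    """
    tasks_per_executor = max(1, executor_cores // task_cpus)
    needed = math.ceil(
        pending_or_running_tasks * allocation_ratio / tasks_per_executor
    )
    return max(min_executors, min(needed, max_executors))


def raw_payload_bytes(num_events, payload_bytes):
    """Raw event volume only; ignores serialization and JVM overhead."""
    return num_events * payload_bytes


if __name__ == "__main__":
    # Hypothetical task counts, chosen only to show the threshold.
    for tasks in (200, 501):
        print(tasks, "tasks ->", target_executors(tasks, 0.002), "executor(s)")
    # 2M events at roughly 1KB each, as in the test description.
    print(raw_payload_bytes(2_000_000, 1024), "bytes of raw payload")
```

Under these assumptions, a ratio of 0.002 needs more than 500 pending or running tasks before a second executor is requested, so a stage with fewer tasks runs on one 1g executor while handling roughly 2GB of raw payload.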
   
   
   

