BalaMahesh opened a new issue #2236: URL: https://github.com/apache/hudi/issues/2236
**Describe the problem you faced**

We are running Hudi for data ingestion. When starting the application we initially gave the executor and driver a heap memory limit of 2 GB each. After running for a few hours, the application exited with an OOM error. We then bumped the limit to 3 GB each; this only bought some extra time before the application exited with the same error. After that we configured 6 GB each; since then there has been no error, but memory usage keeps growing linearly over time. It is now using 19 GB after running for two days.

**To Reproduce**

Steps to reproduce the behavior:

1. Run HoodieDeltaStreamer with a JSON Kafka source in continuous mode.

**Expected behavior**

Constant memory utilization based on the ingestion pattern.

**Environment Description**

* Hudi version : 0.6.1
* Spark version : 2.4.5
* Hive version : 1.2
* Hadoop version : 2.8
* Storage (HDFS/S3/GCS..) : s3a
* Running on Docker? (yes/no) : no
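For reference, a minimal sketch of how such a Delta Streamer job might be launched, based on the reproduction step above. The bundle jar name, base path, table name, ordering field, and properties file are hypothetical placeholders, and the memory settings shown are the initial 2 GB limits described in the report:

```shell
# Hypothetical launch command; paths, table name, ordering field,
# and the props file are placeholders, not from the original report.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  --driver-memory 2g \
  --executor-memory 2g \
  hudi-utilities-bundle_2.11-0.6.1.jar \
  --table-type COPY_ON_WRITE \
  --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
  --source-ordering-field ts \
  --target-base-path s3a://bucket/path/to/table \
  --target-table my_table \
  --props kafka-source.properties \
  --continuous
```

With `--continuous`, the Delta Streamer loops over ingestion rounds in a single long-running Spark application, which is why memory behavior over days (rather than per batch) matters here.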
