Varun Nalla created SPARK-44900:
-----------------------------------

Summary: Cached DataFrame keeps growing
Key: SPARK-44900
URL: https://issues.apache.org/jira/browse/SPARK-44900
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.4.0
Reporter: Varun Nalla

Scenario: We have a Kafka streaming application in which data lookups are performed by joining against another DataFrame that is cached with the MEMORY_AND_DISK storage level. However, the size of the cached DataFrame keeps growing with every micro-batch the streaming application processes, and the growth is visible under the Storage tab of the Spark UI.

A similar Stack Overflow thread has already been raised:
https://stackoverflow.com/questions/55601779/spark-dataframe-cache-keeps-growing