[ https://issues.apache.org/jira/browse/SPARK-44900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Varun Nalla updated SPARK-44900:
--------------------------------
Priority: Blocker (was: Critical)
> Cached DataFrame keeps growing
> ------------------------------
>
> Key: SPARK-44900
> URL: https://issues.apache.org/jira/browse/SPARK-44900
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.0
> Reporter: Varun Nalla
> Priority: Blocker
>
> Scenario:
> We have a Kafka streaming application that performs data lookups by
> joining against another DataFrame, which is cached with the
> MEMORY_AND_DISK storage level.
> However, the size of the cached DataFrame keeps growing with every
> micro-batch the streaming application processes, as can be seen under the
> Storage tab of the Spark UI.
> A similar Stack Overflow question has already been raised:
> https://stackoverflow.com/questions/55601779/spark-dataframe-cache-keeps-growing
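
The growth described above is read off the Storage tab by hand. As a rough sketch, the same figures could be sampled once per micro-batch through Spark's monitoring REST API and checked for steady growth. The base URL (driver UI on port 4040), the application id, and the helper names below are assumptions for illustration and are not taken from the report. The `memoryUsed` and `diskUsed` field names should be verified against the Spark version in use.

```python
import json
from urllib.request import urlopen


def fetch_cached_entries(base_url, app_id):
    """Fetch the cached RDD/DataFrame entries listed under the Storage tab.

    base_url is assumed to be the driver UI, e.g. "http://localhost:4040".
    """
    url = f"{base_url}/api/v1/applications/{app_id}/storage/rdd"
    with urlopen(url) as resp:
        return json.load(resp)


def cached_bytes(entries):
    """Sum memory and disk usage over all cached entries (missing fields count as 0)."""
    return sum(e.get("memoryUsed", 0) + e.get("diskUsed", 0) for e in entries)


def is_growing(samples):
    """Return True if each sample is strictly larger than the one before it."""
    return len(samples) > 1 and all(b > a for a, b in zip(samples, samples[1:]))
```

Calling `cached_bytes(fetch_cached_entries(...))` after each batch interval and passing the collected totals to `is_growing` would give a simple, scriptable confirmation of the symptom that could be attached to the ticket.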
--
This message was sent by Atlassian Jira
(v8.20.10#820010)