[
https://issues.apache.org/jira/browse/SPARK-30494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-30494:
----------------------------------
Description:
The problem can be reproduced with the following commands:
{code}
beeline> create or replace temporary view temp1 as select 1
beeline> cache table temp1
beeline> create or replace temporary view temp1 as select 1, 2
beeline> cache table temp1
{code}
The cached RDD for the plan "select 1" stays in memory until the session
closes. This cached data can no longer be used, because the view temp1 has
been replaced by another plan. This is a memory leak.
assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1, 2")).isDefined)
assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isDefined)
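The leak described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions: plans are modeled as plain strings, and `ToyCacheManager`, `ToySession`, and the `uncacheOnReplace` flag are hypothetical names invented for illustration, not Spark APIs and not necessarily how the issue is fixed in Spark. With the flag off, replacing the view leaves the entry for the old plan behind. With the flag on, the old plan is uncached before the view is replaced.

```scala
import scala.collection.mutable

// Toy model only: a "plan" is a plain string and the cached data is a placeholder.
// None of these classes are Spark APIs.
final class ToyCacheManager {
  private val cached = mutable.LinkedHashMap.empty[String, String]

  def cacheQuery(plan: String): Unit = {
    if (!cached.contains(plan)) cached(plan) = s"data for [$plan]"
  }

  def uncacheQuery(plan: String): Unit = {
    cached -= plan
    ()
  }

  def lookupCachedData(plan: String): Option[String] = cached.get(plan)

  def size: Int = cached.size
}

final class ToySession(uncacheOnReplace: Boolean) {
  val cacheManager = new ToyCacheManager
  private val tempViews = mutable.Map.empty[String, String]

  def createOrReplaceTempView(name: String, plan: String): Unit = {
    // Hypothetical mitigation: drop the cache entry of the plan being replaced.
    if (uncacheOnReplace) tempViews.get(name).foreach(cacheManager.uncacheQuery)
    tempViews(name) = plan
  }

  // Caches whatever plan the view name currently points to.
  def cacheTable(name: String): Unit =
    tempViews.get(name).foreach(cacheManager.cacheQuery)
}

object ReplaceViewDemo {
  // Replays the four commands from the reproduction steps.
  def run(uncacheOnReplace: Boolean): ToySession = {
    val session = new ToySession(uncacheOnReplace)
    session.createOrReplaceTempView("temp1", "select 1")
    session.cacheTable("temp1")
    session.createOrReplaceTempView("temp1", "select 1, 2")
    session.cacheTable("temp1")
    session
  }

  def main(args: Array[String]): Unit = {
    val leaky = run(uncacheOnReplace = false)
    println(leaky.cacheManager.lookupCachedData("select 1").isDefined) // prints "true"

    val fixed = run(uncacheOnReplace = true)
    println(fixed.cacheManager.lookupCachedData("select 1").isDefined) // prints "false"
  }
}
```

In the leaky run, the entry for "select 1" is still present even though no view refers to it any more, which mirrors the two assertions above.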
was:
The problem can be reproduced with the following commands:
{code}
beeline> create or replace temporary view temp1 as select 1
beeline> cache table tempView
beeline> create or replace temporary view temp1 as select 1, 2
beeline> cache table tempView
{code}
The cached RDD for the plan "select 1" stays in memory until the session
closes. This cached data can no longer be used, because the view temp1 has
been replaced by another plan. This is a memory leak.
assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1, 2")).isDefined)
assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isDefined)
> Duplicates cached RDD when create or replace an existing view
> -------------------------------------------------------------
>
> Key: SPARK-30494
> URL: https://issues.apache.org/jira/browse/SPARK-30494
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.3, 2.3.4, 2.4.5, 3.0.0
> Reporter: Lantao Jin
> Priority: Major