Does calling sqlContext.cacheTable("oldTableName") remove the cached contents of the oldTable

2015-12-18 Thread Sahil Sareen
Spark 1.5.2

dfOld.registerTempTable("oldTableName")
sqlContext.cacheTable("oldTableName")
//
// do something
//
dfNew.registerTempTable("oldTableName")
sqlContext.cacheTable("oldTableName")

Now when I use the "oldTableName" table I do get the latest contents from dfNew but do the

Re: Does calling sqlContext.cacheTable("oldTableName") remove the cached contents of the oldTable

2015-12-18 Thread Ted Yu
CacheManager#cacheQuery() is called where:

 * Caches the data produced by the logical representation of the given [[Queryable]].
...
val planToCache = query.queryExecution.analyzed
if (lookupCachedData(planToCache).nonEmpty) {

Is the schema for dfNew different from that of dfOld?

Re: Does calling sqlContext.cacheTable("oldTableName") remove the cached contents of the oldTable

2015-12-18 Thread Ted Yu
This method in CacheManager:

private[sql] def lookupCachedData(plan: LogicalPlan): Option[CachedData] = readLock {
  cachedData.find(cd => plan.sameResult(cd.plan))
}

led me to the following in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala :

def
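To make the lookup above concrete, here is a toy model in plain Scala. It is only a sketch: Plan, CachedData and ToyCacheManager are hypothetical stand-ins, not Spark's classes, and the "add an entry only when no equivalent plan is cached" branch is an assumption based on the nonEmpty check quoted earlier. What it illustrates is that entries are matched by plan equivalence (sameResult), not by table name.

```scala
// Toy model only: these types are hypothetical stand-ins,
// not Spark's LogicalPlan, CachedData or CacheManager.
final case class Plan(description: String) {
  // Stand-in for LogicalPlan.sameResult; here it is plain equality.
  def sameResult(other: Plan): Boolean = description == other.description
}

final case class CachedData(plan: Plan, rows: Seq[Int])

final class ToyCacheManager {
  private var cachedData: Vector[CachedData] = Vector.empty

  // Mirrors the shape of lookupCachedData: find an entry with an equivalent plan.
  def lookupCachedData(plan: Plan): Option[CachedData] =
    cachedData.find(cd => plan.sameResult(cd.plan))

  // Returns true if a new entry was added, false if an equivalent plan
  // was already cached (assumed behaviour, see the note above).
  def cacheQuery(plan: Plan, rows: Seq[Int]): Boolean =
    if (lookupCachedData(plan).nonEmpty) false
    else {
      cachedData = cachedData :+ CachedData(plan, rows)
      true
    }

  def size: Int = cachedData.size
}

object ToyCacheDemo {
  // Returns (first add, repeated add, add of a different plan, entry count).
  def run(): (Boolean, Boolean, Boolean, Int) = {
    val cache = new ToyCacheManager
    val oldPlan = Plan("scan dfOld")
    val newPlan = Plan("scan dfNew")
    val first = cache.cacheQuery(oldPlan, Seq(1, 2))
    val repeat = cache.cacheQuery(oldPlan, Seq(1, 2))
    val second = cache.cacheQuery(newPlan, Seq(3, 4))
    (first, repeat, second, cache.size)
  }
}
```

In this model, caching a different plan adds a second entry and leaves the first one in place, because nothing in the lookup refers to a table name.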

Re: Does calling sqlContext.cacheTable("oldTableName") remove the cached contents of the oldTable

2015-12-18 Thread Sahil Sareen
Thanks Ted! Yes, the schema might be different or the same. What would be the answer for each situation?

On Fri, Dec 18, 2015 at 6:02 PM, Ted Yu wrote:
> CacheManager#cacheQuery() is called where:
> * Caches the data produced by the logical representation of the given
>

Re: Does calling sqlContext.cacheTable("oldTableName") remove the cached contents of the oldTable

2015-12-18 Thread Sahil Sareen
So I looked at the function. My only worry is that the cache should be cleared if I'm overwriting the cache with the same table name. I did this experiment and the table shows as not cached, but I want to confirm this. In addition to not using the old table values is it actually

Re: Does calling sqlContext.cacheTable("oldTableName") remove the cached contents of the oldTable

2015-12-18 Thread Ted Yu
When a second attempt is made to cache df3, which has the same schema as the first DataFrame, you would see the warning below:

scala> sqlContext.cacheTable("t1")
scala> sqlContext.isCached("t1")
res5: Boolean = true
scala> sqlContext.sql("select * from t1").show
+---+---+
|  a|  b|
+---+---+
|  1|  1|

Re: Does calling sqlContext.cacheTable("oldTableName") remove the cached contents of the oldTable

2015-12-18 Thread Sahil Sareen
From the UI I see two rows for this on a streaming application:

RDD Name | Storage Level | Cached Partitions | Fraction Cached | Size in Memory | Size in ExternalBlockStore | Size on Disk
In-memory table myColorsTable | Memory Deserialized 1x Replicated | 2 | 100% | 728.2 KB | 0.0 B | 0.0 B
In-memory table myColorsTable | Memory
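
If the goal is to avoid keeping two cached copies under one name, one option is to release the old entry explicitly before rebinding the name. Below is a minimal sketch against the Spark 1.5 SQLContext API (cacheTable, uncacheTable, isCached). The object name RecacheSketch, the sample rows and the local master are made up for illustration, and whether re-caching under the same name ever drops the old entry by itself is not settled in this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical driver; needs Spark 1.5.x on the classpath.
object RecacheSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("recache-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Made-up sample data standing in for dfOld and dfNew.
    val dfOld = sc.parallelize(Seq((1, 1), (2, 2))).toDF("a", "b")
    val dfNew = sc.parallelize(Seq((3, 3), (4, 4))).toDF("a", "b")

    dfOld.registerTempTable("oldTableName")
    sqlContext.cacheTable("oldTableName")

    // Release the old entry while the name still resolves to dfOld.
    sqlContext.uncacheTable("oldTableName")

    // Rebind the name to the new DataFrame and cache it.
    dfNew.registerTempTable("oldTableName")
    sqlContext.cacheTable("oldTableName")

    println(sqlContext.isCached("oldTableName"))
    sc.stop()
  }
}
```

The ordering is the point: uncacheTable resolves the name to a plan, so it has to run before registerTempTable points the name at dfNew. Calling dfOld.unpersist() directly is an alternative when a reference to the old DataFrame is still available.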