Cheng Lian created SPARK-1669:
---------------------------------

             Summary: SQLContext.cacheTable() should be idempotent
                 Key: SPARK-1669
                 URL: https://issues.apache.org/jira/browse/SPARK-1669
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 1.0.0
            Reporter: Cheng Lian


Calling {{cacheTable()}} on a table {{t}} multiple times causes {{t}} to be
cached multiple times. This behavior differs from {{RDD.cache()}}, which is
idempotent.

We can detect whether a table is already cached by checking:

# whether the structure of the table's underlying logical plan matches the
pattern {{Subquery(_, SparkLogicalPlan(inMem @
InMemoryColumnarTableScan(_, _)))}}
# whether {{inMem.cachedColumnBuffers.getStorageLevel.useMemory}} is true
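The two checks above can be sketched as a single pattern match. The snippet below is only an illustration under stated assumptions: the case classes are simplified stand-ins that mirror the shape of the plan nodes named in this issue, not the real Spark SQL classes, and {{Relation}}, {{CachedBuffers}}, {{CacheCheck}}, {{isCached}} and the map-based catalog are hypothetical names introduced for the example.

```scala
import scala.collection.mutable

// Simplified stand-ins for the plan nodes named in this issue.
// They are NOT the real Spark SQL classes; they only mirror the pattern's shape.
sealed trait Plan
case class Relation(name: String) extends Plan // hypothetical leaf node
case class StorageLevel(useMemory: Boolean)
case class CachedBuffers(level: StorageLevel) {
  def getStorageLevel: StorageLevel = level
}
case class InMemoryColumnarTableScan(attributes: Seq[String], child: Plan) extends Plan {
  // Assumption for this sketch: buffers are always held in memory.
  val cachedColumnBuffers: CachedBuffers = CachedBuffers(StorageLevel(useMemory = true))
}
case class SparkLogicalPlan(alreadyPlanned: Plan) extends Plan
case class Subquery(alias: String, child: Plan) extends Plan

object CacheCheck {
  // Check 1: the plan matches the cached-table pattern.
  // Check 2: the cached buffers report in-memory storage.
  def isCached(plan: Plan): Boolean = plan match {
    case Subquery(_, SparkLogicalPlan(inMem @ InMemoryColumnarTableScan(_, _))) =>
      inMem.cachedColumnBuffers.getStorageLevel.useMemory
    case _ => false
  }

  // Idempotent variant: wrap the plan only when it is not already cached.
  def cacheTable(catalog: mutable.Map[String, Plan], name: String): Unit = {
    val plan = catalog(name)
    if (!isCached(plan)) {
      val child = plan match {
        case Subquery(_, c) => c
        case other          => other
      }
      catalog(name) =
        Subquery(name, SparkLogicalPlan(InMemoryColumnarTableScan(Seq.empty, child)))
    }
  }
}
```

With this guard in place, a second {{cacheTable}} call on the same name leaves the catalog entry untouched instead of wrapping it in another in-memory scan.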



--
This message was sent by Atlassian JIRA
(v6.2#6252)
