Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19864
Are these initial statistics important? After the columnar RDD is
materialized, we will get accurate statistics anyway, won't we?
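To sketch the question: a minimal model (not Spark's actual `InMemoryRelation` API; the class and method names below are hypothetical) of falling back to the optimizer's estimate before materialization and switching to the accurate, accumulated size afterwards:

```scala
// Hypothetical model of the stats-fallback idea discussed above.
// Before the columnar RDD is materialized we only have the planner's
// estimate; after materialization the actual byte size is known.

case class Stats(sizeInBytes: Long, rowCount: Option[Long])

class CachedRelation(plannerEstimate: Stats) {
  // Filled in once the columnar RDD has been materialized.
  private var materializedSize: Option[Long] = None

  def markMaterialized(actualBytes: Long): Unit =
    materializedSize = Some(actualBytes)

  // Accurate stats win once available; otherwise use the estimate.
  def stats: Stats =
    materializedSize
      .map(bytes => Stats(bytes, plannerEstimate.rowCount))
      .getOrElse(plannerEstimate)
}

object Demo extends App {
  val rel = new CachedRelation(Stats(sizeInBytes = 1000L, rowCount = Some(10L)))
  println(rel.stats.sizeInBytes) // planner estimate: 1000
  rel.markMaterialized(4096L)
  println(rel.stats.sizeInBytes) // accurate size: 4096
}
```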
On Dec 3, 2017 1:43 AM, "Nan Zhu" <[email protected]> wrote:
> *@CodingCat* commented on this pull request.
> ------------------------------
>
> In sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala
> <https://github.com/apache/spark/pull/19864#discussion_r154501939>:
>
> > - planToCache,
> - InMemoryRelation(
> - sparkSession.sessionState.conf.useCompression,
> - sparkSession.sessionState.conf.columnBatchSize,
> - storageLevel,
> -        sparkSession.sessionState.executePlan(planToCache).executedPlan,
> - tableName)))
> + val inMemoryRelation = InMemoryRelation(
> + sparkSession.sessionState.conf.useCompression,
> + sparkSession.sessionState.conf.columnBatchSize,
> + storageLevel,
> + sparkSession.sessionState.executePlan(planToCache).executedPlan,
> + tableName)
> + if (planToCache.conf.cboEnabled && planToCache.stats.rowCount.isDefined) {
> + inMemoryRelation.setStatsFromCachedPlan(planToCache)
> + }
>
> I have to make InMemoryRelation stateful to avoid breaking APIs...
>
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]