[GitHub] [spark] viirya commented on a change in pull request #27055: [SPARK-30394]Skip DetermineTableStats rule when hive table can be converted to datasource table

GitBox Sun, 02 Feb 2020 23:34:21 -0800

viirya commented on a change in pull request #27055: [SPARK-30394]Skip 
DetermineTableStats rule when hive table can be converted to datasource table
URL: https://github.com/apache/spark/pull/27055#discussion_r373953950


 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CatalogFileIndex.scala
 ##########
 @@ -33,12 +33,10 @@ import org.apache.spark.sql.types.StructType
  *
  * @param sparkSession a [[SparkSession]]
  * @param table the metadata of the table
- * @param sizeInBytes the table's data size in bytes
  */
 class CatalogFileIndex(
     sparkSession: SparkSession,
-    val table: CatalogTable,
-    override val sizeInBytes: Long) extends FileIndex {
 
 Review comment:
   As @cloud-fan said, this change is expensive. It also doesn't follow the 
defined behavior for partitioned data source and Hive tables regarding 
statistics calculation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

