[GitHub] [spark] HyukjinKwon commented on a change in pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-30 Thread GitBox


HyukjinKwon commented on a change in pull request #28761:
URL: https://github.com/apache/spark/pull/28761#discussion_r447507397



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcTest.scala
##
@@ -78,12 +78,16 @@ abstract class OrcTest extends QueryTest with FileBasedDataSourceTest with Befor
   (f: String => Unit): Unit = withDataSourceFile(data)(f)
 
   /**
-   * Writes `data` to a Orc file and reads it back as a `DataFrame`,
+   * Writes `df` dataframe to a Orc file and reads it back as a `DataFrame`,
    * which is then passed to `f`. The Orc file will be deleted after `f` returns.
    */
-  protected def withOrcDataFrame[T <: Product: ClassTag: TypeTag]
-  (data: Seq[T], testVectorized: Boolean = true)
-  (f: DataFrame => Unit): Unit = withDataSourceDataFrame(data, testVectorized)(f)
+  protected def withOrcDataFrame(df: DataFrame, testVectorized: Boolean = true)

Review comment:
   Ok





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-29 Thread GitBox


HyukjinKwon commented on a change in pull request #28761:
URL: https://github.com/apache/spark/pull/28761#discussion_r446887251



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcTest.scala
##
@@ -78,12 +78,16 @@ abstract class OrcTest extends QueryTest with FileBasedDataSourceTest with Befor
   (f: String => Unit): Unit = withDataSourceFile(data)(f)
 
   /**
-   * Writes `data` to a Orc file and reads it back as a `DataFrame`,
+   * Writes `df` dataframe to a Orc file and reads it back as a `DataFrame`,
    * which is then passed to `f`. The Orc file will be deleted after `f` returns.
    */
-  protected def withOrcDataFrame[T <: Product: ClassTag: TypeTag]
-  (data: Seq[T], testVectorized: Boolean = true)
-  (f: DataFrame => Unit): Unit = withDataSourceDataFrame(data, testVectorized)(f)
+  protected def withOrcDataFrame(df: DataFrame, testVectorized: Boolean = true)

Review comment:
   Hm... then I guess it would be better to just have a separate method for it. Most of the changes here seem to be caused by this one in the tests.
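
   A minimal sketch of the separate-method option, not taken from the PR. `FakeDataFrame`, `OrcTestSketch`, and the helper name `withNestedOrcDataFrame` are hypothetical stand-ins so the snippet compiles without Spark. One reason a distinct name can be simpler than an overload: Scala allows only one overloaded alternative of a method to define default arguments, and both helpers here keep `testVectorized: Boolean = true`.

```scala
// Stand-in for Spark's DataFrame (hypothetical type, for illustration only).
final case class FakeDataFrame(rows: Seq[Product])

object OrcTestSketch {
  // Same shape as the existing helper: takes a Seq of products, so existing
  // callers stay untouched.
  def withOrcDataFrame[T <: Product](data: Seq[T], testVectorized: Boolean = true)
      (f: FakeDataFrame => Unit): Unit =
    f(FakeDataFrame(data))

  // Hypothetical separately named helper that accepts an already-built frame.
  // Naming it differently avoids the compile error raised when two overloaded
  // alternatives both define default arguments.
  def withNestedOrcDataFrame(df: FakeDataFrame, testVectorized: Boolean = true)
      (f: FakeDataFrame => Unit): Unit =
    f(df)
}
```

   With this layout, tests that pass a `Seq[T]` keep calling `withOrcDataFrame`, and only the new nested-column tests call the second helper.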








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-28 Thread GitBox


HyukjinKwon commented on a change in pull request #28761:
URL: https://github.com/apache/spark/pull/28761#discussion_r446761037



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcTest.scala
##
@@ -78,12 +78,16 @@ abstract class OrcTest extends QueryTest with FileBasedDataSourceTest with Befor
   (f: String => Unit): Unit = withDataSourceFile(data)(f)
 
   /**
-   * Writes `data` to a Orc file and reads it back as a `DataFrame`,
+   * Writes `df` dataframe to a Orc file and reads it back as a `DataFrame`,
    * which is then passed to `f`. The Orc file will be deleted after `f` returns.
    */
-  protected def withOrcDataFrame[T <: Product: ClassTag: TypeTag]
-  (data: Seq[T], testVectorized: Boolean = true)
-  (f: DataFrame => Unit): Unit = withDataSourceDataFrame(data, testVectorized)(f)
+  protected def withOrcDataFrame(df: DataFrame, testVectorized: Boolean = true)

Review comment:
   @viirya why do we need to change this? It looks like we can just add an overloaded version to test nested DataFrames without touching the other tests.








[GitHub] [spark] HyukjinKwon commented on a change in pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-28 Thread GitBox


HyukjinKwon commented on a change in pull request #28761:
URL: https://github.com/apache/spark/pull/28761#discussion_r446759277



##
File path: sql/core/v2.3/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
##
@@ -74,9 +75,9 @@ class OrcFilterSuite extends OrcTest with SharedSparkSession {
   }
 
   protected def checkFilterPredicate

Review comment:
   It seems we don't need to change this.




