ssimeonov commented on a change in pull request #23620: [SPARK-26696][SQL] 
Makes Dataset encoder public
URL: https://github.com/apache/spark/pull/23620#discussion_r255358318
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
 ##########
 @@ -139,6 +139,14 @@ class DatasetSuite extends QueryTest with 
SharedSQLContext {
       ("a", 1), ("b", 2))
   }
 
+  test("as tuple using instance encoder") {
+    val data = Seq(("a", 1), ("b", 2), ("c", 3))
+    val ds = data.toDS()
+    checkDataset(
+      ds.toDF().as[(String, Int)](ds.encoder),
 
 Review comment:
   @dongjoon-hyun I am not sure what your are trying to suggest with your test 
example other than prove my point, which is that the test I added is the 
minimum test that verifies the added functionality: making `encoder` publicly 
accessible.
   
   The purpose of test cases is not to motivate functionality but to test 
functionality. The motivation is provided in the JIRA issue. I believe that 
explanation to be perfectly clear or else @gatorsmile wouldn't have asked me to 
create a PR. 
   
   I do not understand the point of creating a large & complicated test that 
*requires* a public `encoder`. As outlined in the JIRA issue, a "real-world" 
test would require:
   
   1. A framework interface including something like `def foo[A](ds: 
Dataset[A]): Dataset[A]`
   2. An implementation of that interface
   3. Some complex and rather arbitrary ad hoc data transformations in the 
implementation 
   
   Do you really think this test would be easy to comprehend amidst the simple, 
very much non-real-world tests in the suite? Do you really think the test would 
add any meaningful verification of the tiny code change in the patch? If you 
do, please explain your reasoning.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to