HeartSaVioR commented on a change in pull request #31296:
URL: https://github.com/apache/spark/pull/31296#discussion_r564447933
##########
File path: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala
##########
@@ -2007,6 +2007,54 @@ class DatasetSuite extends QueryTest
checkAnswer(withUDF, Row(Row(1), null, null) :: Row(Row(1), null, null) ::
Nil)
}
+
+ test("SPARK-34205: Pipe Dataset") {
+ assume(TestUtils.testCommandAvailable("cat"))
+
+ val nums = spark.range(4)
+ val piped = nums.pipe("cat", (l, printFunc) => printFunc(l.toString)).toDF
Review comment:
I see what @viirya said. I'd agree that `transform` looks to behave as an
operation (not sure whether that is intended, but it looks that way at least
for now), and `transform` would also require a top-level API to cover it,
like we did for `mapPartitions`.
If we are OK with adding the top-level API (again, not yet decided, so just
my two cents), then which one? I'd rather say `transform` is what we'd like
to be consistent with, instead of `pipe`. It has been exposed as a SQL
statement (`TRANSFORM`), it is probably widely used by Spark SQL users, and
even Hive users. If we want feature parity, then my vote goes to `transform`.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]