[GitHub] [spark] HeartSaVioR commented on a change in pull request #26173: [SPARK-29503][SQL] Remove conversion CreateNamedStruct to CreateNamedStructUnsafe

GitBox Tue, 29 Oct 2019 02:31:30 -0700

HeartSaVioR commented on a change in pull request #26173: [SPARK-29503][SQL] 
Remove conversion CreateNamedStruct to CreateNamedStructUnsafe
URL: https://github.com/apache/spark/pull/26173#discussion_r339968513


 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala
 ##########
 @@ -64,6 +68,24 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSparkSession {
     val ds100_5 = Seq(S100_5()).toDS()
     ds100_5.rdd.count
   }
+
+  test("SPARK-29503 nest unsafe struct inside safe array") {
+    withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false") {
+      val df = spark.sparkContext.parallelize(Seq(Seq(1, 2, 3))).toDF("items")
+
+      // items: Seq[Int] => items.map { item => Seq(Struct(item)) }
+      val result = df.select(
+        new Column(MapObjects(
+          (item: Expression) => array(struct(new Column(item))).expr,
 
 Review comment:
   I haven't spent more time looking for another reproducer, since this one
seems clean and simple. I'm not convinced it is an invalid reproducer just
because it pulls in the catalyst package: Catalyst could analyze a query and
inject MapObjects itself if necessary.
   
   I understand you'd like to revisit #25745; that was WIP and it didn't
include any numbers showing a performance gain. I'd rather choose "safeness"
over "speed", and we haven't even established that there is a significant
difference between the two. This was the only case in which MapObjects could
contain an unsafe struct; allowing it means safe and unsafe structures can get
mixed, which leads to corner cases.

