[jira] [Commented] (SPARK-35079) Transform with udf gives incorrect result
    [ https://issues.apache.org/jira/browse/SPARK-35079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346970#comment-17346970 ]

koert kuipers commented on SPARK-35079:
---------------------------------------

looks to me like this is a duplicate of SPARK-34829

> Transform with udf gives incorrect result
> -----------------------------------------
>
>                 Key: SPARK-35079
>                 URL: https://issues.apache.org/jira/browse/SPARK-35079
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: koert kuipers
>            Priority: Minor
>             Fix For: 3.1.2, 3.2.0
>
>
> i think this is a correctness bug in spark 3.1.1
> the behavior is correct in spark 3.0.1
> in spark 3.0.1:
> {code:java}
> scala> import spark.implicits._
> scala> import org.apache.spark.sql.functions._
> scala> val x = Seq(Seq("aa", "bb", "cc")).toDF
> x: org.apache.spark.sql.DataFrame = [value: array<string>]
> scala> x.select(transform(col("value"), col => udf((_: String).drop(1)).apply(col))).show
> +---------------------------------------------------+
> |transform(value, lambdafunction(UDF(lambda 'x), x))|
> +---------------------------------------------------+
> |                                          [a, b, c]|
> +---------------------------------------------------+
> {code}
> in spark 3.1.1:
> {code:java}
> scala> import spark.implicits._
> scala> import org.apache.spark.sql.functions._
> scala> val x = Seq(Seq("aa", "bb", "cc")).toDF
> x: org.apache.spark.sql.DataFrame = [value: array<string>]
> scala> x.select(transform(col("value"), col => udf((_: String).drop(1)).apply(col))).show
> +---------------------------------------------------+
> |transform(value, lambdafunction(UDF(lambda 'x), x))|
> +---------------------------------------------------+
> |                                          [c, c, c]|
> +---------------------------------------------------+
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
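For reference, the result the report expects can be stated without Spark. `transform` applies its function to each array element independently, so dropping the first character of "aa", "bb", "cc" should give [a, b, c]. The [c, c, c] seen on 3.1.1 would be consistent with every position receiving the last element's value, though the thread itself does not state the cause. Below is a minimal plain-Scala sketch of the expected semantics only. The names `ExpectedTransform`, `dropFirst` and `transformAll` are illustrative and do not come from the report or from Spark.

```scala
// Plain Scala model of the expected element-wise behaviour; no Spark involved.
object ExpectedTransform {
  // Hypothetical stand-in for the UDF body in the report: drop the first character.
  val dropFirst: String => String = _.drop(1)

  // Models an element-wise transform: each element is mapped independently.
  def transformAll(values: Seq[String]): Seq[String] = values.map(dropFirst)

  def main(args: Array[String]): Unit = {
    val result = transformAll(Seq("aa", "bb", "cc"))
    // Format the result the way the shell output in the report shows arrays.
    println(result.mkString("[", ", ", "]"))
  }
}
```

Comparing the output of the Spark snippet in the report against this model is one way to check whether a given build is affected.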
[jira] [Commented] (SPARK-35079) Transform with udf gives incorrect result
    [ https://issues.apache.org/jira/browse/SPARK-35079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342230#comment-17342230 ]

Takeshi Yamamuro commented on SPARK-35079:
------------------------------------------

I've checked it, and the issue has already been resolved in the latest branch-3.1:
{code:java}
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.2-SNAPSHOT
      /_/

Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import spark.implicits._
scala> import org.apache.spark.sql.functions._
scala> val x = Seq(Seq("aa", "bb", "cc")).toDF
x: org.apache.spark.sql.DataFrame = [value: array<string>]
scala> x.select(transform(col("value"), col => udf((_: String).drop(1)).apply(col))).show
+-------------------------------------------------------+
|transform(value, lambdafunction(UDF(lambda 'x_0), x_0))|
+-------------------------------------------------------+
|                                              [a, b, c]|
+-------------------------------------------------------+
{code}
So, I will close this. Anyway, thank you for the report.
[jira] [Commented] (SPARK-35079) Transform with udf gives incorrect result
    [ https://issues.apache.org/jira/browse/SPARK-35079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342182#comment-17342182 ]

Takeshi Yamamuro commented on SPARK-35079:
------------------------------------------

Could you check if branch-3.1 has the issue?
[jira] [Commented] (SPARK-35079) Transform with udf gives incorrect result
    [ https://issues.apache.org/jira/browse/SPARK-35079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17342171#comment-17342171 ]

shahid commented on SPARK-35079:
--------------------------------

Seems it is not reproducible with the master branch?

+-+
|transform(value, lambdafunction(UDF(lambda x_0#3993), namedlambdavariable()))|
+-+
|[a, b, c]|
+-+