[
https://issues.apache.org/jira/browse/SPARK-31563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092168#comment-17092168
]
Maxim Gekk commented on SPARK-31563:
------------------------------------
I am working on this issue.
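To illustrate the mismatch, here is a minimal, self-contained sketch (the names InternalString, renderLiteral, and renderLiteralFixed are toy assumptions, not Spark's API): the set holds internal values while the SQL renderer only understands external Scala types, and one possible fix direction is converting internal values back to their external counterparts before rendering.

```scala
// Toy model of the bug (assumed names; not Spark's actual API).
// InternalString stands in for UTF8String: an internal representation
// that the external-facing SQL renderer does not recognize.
final case class InternalString(bytes: Array[Byte]) {
  override def toString: String = new String(bytes, "UTF-8")
}

object SqlRenderDemo {
  // Mirrors Literal.apply in spirit: only external Scala types are supported.
  def renderLiteral(v: Any): String = v match {
    case s: String => s"'$s'"
    case i: Int    => i.toString
    case other =>
      throw new RuntimeException(
        s"Unsupported literal type class ${other.getClass.getName} $other")
  }

  // A possible fix direction: convert internal values back to external
  // ones before rendering, instead of failing on them.
  def renderLiteralFixed(v: Any): String = v match {
    case is: InternalString => renderLiteral(is.toString)
    case other              => renderLiteral(other)
  }

  def main(args: Array[String]): Unit = {
    // Analogous to InSet's hset: internal values, not plain Strings.
    val hset = Set("a", "b").map(s => InternalString(s.getBytes("UTF-8")))
    // renderLiteral(hset.head) would throw, just like InSet.sql above.
    println(hset.toSeq.map(renderLiteralFixed).sorted.mkString("(", ", ", ")"))
    // prints ('a', 'b')
  }
}
```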
> Failure of InSet.sql for UTF8String collection
> ----------------------------------------------
>
> Key: SPARK-31563
> URL: https://issues.apache.org/jira/browse/SPARK-31563
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.5, 3.0.0, 3.1.0
> Reporter: Maxim Gekk
> Priority: Major
>
> The InSet expression works on collections of Spark's internal Catalyst
> types. This can be seen in the optimization that replaces In by InSet,
> where In's value list is evaluated to internal Catalyst values:
> [https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala#L253-L254]
> {code:scala}
> if (newList.length > SQLConf.get.optimizerInSetConversionThreshold) {
>   val hSet = newList.map(e => e.eval(EmptyRow))
>   InSet(v, HashSet() ++ hSet)
> }
> {code}
> This code predates the optimization in
> https://github.com/apache/spark/pull/25754, which made another incorrect
> assumption about the collection's element types.
> Since InSet accepts only internal Catalyst types, the following code
> shouldn't fail:
> {code:scala}
> InSet(Literal("a"), Set("a", "b").map(UTF8String.fromString)).sql
> {code}
> but it fails with the exception:
> {code}
> Unsupported literal type class org.apache.spark.unsafe.types.UTF8String a
> java.lang.RuntimeException: Unsupported literal type class org.apache.spark.unsafe.types.UTF8String a
>   at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:88)
>   at org.apache.spark.sql.catalyst.expressions.InSet.$anonfun$sql$2(predicates.scala:522)
> {code}
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]