[ https://issues.apache.org/jira/browse/SPARK-33641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245440#comment-17245440 ]
Apache Spark commented on SPARK-33641:
--------------------------------------

User 'yaooqinn' has created a pull request for this issue:
https://github.com/apache/spark/pull/30654

> Invalidate new char-like type in public APIs that result incorrect results
> --------------------------------------------------------------------------
>
>                 Key: SPARK-33641
>                 URL: https://issues.apache.org/jira/browse/SPARK-33641
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Blocker
>             Fix For: 3.1.0
>
>
> 1. udf
> {code:java}
> scala> spark.udf.register("abcd", () => "12345", org.apache.spark.sql.types.VarcharType(2))
> scala> spark.sql("select abcd()").show
> scala.MatchError: CharType(2) (of class org.apache.spark.sql.types.VarcharType)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeFor(RowEncoder.scala:215)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeForInput(RowEncoder.scala:212)
>   at org.apache.spark.sql.catalyst.expressions.objects.ValidateExternalType.<init>(objects.scala:1741)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.$anonfun$serializerFor$3(RowEncoder.scala:175)
>   at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
>   at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
>   at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
>   at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
>   at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
>   at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:198)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.serializerFor(RowEncoder.scala:171)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:66)
>   at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:768)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:611)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:768)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:606)
>   ... 47 elided
> {code}
> 2. spark.createDataFrame
> {code:java}
> scala> spark.createDataFrame(spark.read.text("README.md").rdd, new org.apache.spark.sql.types.StructType().add("c", "char(1)")).show
> +--------------------+
> |                   c|
> +--------------------+
> |      # Apache Spark|
> |                    |
> |Spark is a unifie...|
> |high-level APIs i...|
> |supports general ...|
> |rich set of highe...|
> |MLlib for machine...|
> |and Structured St...|
> |                    |
> |<https://spark.ap...|
> |                    |
> |[![Jenkins Build]...|
> |[![AppVeyor Build...|
> |[![PySpark Covera...|
> |                    |
> |                    |
> |## Online Documen...|
> |                    |
> |You can find the ...|
> |guide, on the [pr...|
> +--------------------+
> only showing top 20 rows
> {code}
> 3. reader.schema
> {code:java}
> scala> spark.read.schema("a varchar(2)").text("./README.md").show(100)
> +--------------------+
> |                   a|
> +--------------------+
> |      # Apache Spark|
> |                    |
> |Spark is a unifie...|
> |high-level APIs i...|
> |supports general ...|
> {code}
> 4. etc

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
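
The three repro cases quoted above share one root cause: the new char-like types leak into public APIs (UDF registration, createDataFrame, reader schemas) without enforcing the length semantics they declare, so Spark either crashes (case 1) or silently returns over-length values (cases 2 and 3). As a Spark-independent sketch of the ANSI semantics those types are supposed to carry, the helpers below model them in plain Scala; the object and method names are hypothetical illustrations, not part of the Spark API:

```scala
// Illustrative model of ANSI CHAR(n) / VARCHAR(n) semantics that the
// APIs in the bug report bypass. Hypothetical helpers, NOT Spark code.
object CharVarcharSemantics {
  // CHAR(n): a value shorter than n is right-padded with spaces to
  // exactly n characters; a longer value violates the declared length.
  def readChar(value: String, n: Int): String = {
    require(value.length <= n, s"value '$value' exceeds CHAR($n) length")
    value.padTo(n, ' ')
  }

  // VARCHAR(n): a value of up to n characters passes through unchanged;
  // a longer value violates the declared length.
  def readVarchar(value: String, n: Int): String = {
    require(value.length <= n, s"value '$value' exceeds VARCHAR($n) length")
    value
  }
}
```

Under these semantics, the UDF in case 1 returning "12345" for a VarcharType(2) result should be rejected as over-length rather than crash with a MatchError, and the char(1) column in case 2 should never surface 20-character README lines untouched.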