[ https://issues.apache.org/jira/browse/SPARK-40282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-40282:
---------------------------------
    Priority: Major  (was: Blocker)

> DataType argument in StructType.add is incorrectly throwing scala.MatchError
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-40282
>                 URL: https://issues.apache.org/jira/browse/SPARK-40282
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: M. Manna
>            Priority: Major
>         Attachments: SparkApplication.kt, retailstore.csv
>
>
> *Problem Description*
> As part of the contract mentioned here, Spark should be able to support
> {{IntegerType}} as an argument to the {{StructType.add}} method. However, it
> currently fails with a {{scala.MatchError}}.
>  
> If we call the overloaded version that accepts the type as a String value,
> e.g. "Integer", it works.
> *How to Reproduce*
>  # Create a Kotlin project - I have used Kotlin, but Java will also work
> (with minor adjustments)
>  # Place the attached CSV file in {{src/main/resources}}
>  # Compile the project with Java 11
>  # Run - it will produce the error below.
> {code:java}
> Exception in thread "main" scala.MatchError: org.apache.spark.sql.types.IntegerType@363fe35a (of class org.apache.spark.sql.types.IntegerType)
>     at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeFor(RowEncoder.scala:240)
>     at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeForInput(RowEncoder.scala:236)
>     at org.apache.spark.sql.catalyst.expressions.objects.ValidateExternalType.<init>(objects.scala:1890)
>     at org.apache.spark.sql.catalyst.encoders.RowEncoder$.$anonfun$serializerFor$3(RowEncoder.scala:197)
>     at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
>     at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
>     at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
>     at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
>     at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
>     at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
>     at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:198)
>     at org.apache.spark.sql.catalyst.encoders.RowEncoder$.serializerFor(RowEncoder.scala:192)
>     at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:73)
>     at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:81)
>     at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:92)
>     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
>     at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:89)
>     at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:444)
>     at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
>     at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
>     at scala.Option.getOrElse(Option.scala:189)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:185)
> {code}
>  # Now change the line (commented as HERE) to use a String value instead, i.e. "Integer"
>  # It works (see the sketch after this list)
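> Below is a minimal Kotlin sketch of the repro, assuming the attached {{SparkApplication.kt}} does roughly the following (column names and CSV options are illustrative, not the attachment's actual contents):
> {code:kotlin}
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.types.DataTypes
> import org.apache.spark.sql.types.IntegerType
> import org.apache.spark.sql.types.StructType
>
> fun main() {
>     val spark = SparkSession.builder()
>         .appName("SPARK-40282-repro")
>         .master("local[*]")
>         .getOrCreate()
>
>     // HERE: a freshly constructed IntegerType() instance (not the Scala
>     // singleton) is what later fails with scala.MatchError inside RowEncoder.
>     val failingSchema = StructType()
>         .add("id", IntegerType())
>         .add("name", "String")
>
>     // Works: the String overload, or the singleton exposed via DataTypes.
>     val workingSchema = StructType()
>         .add("id", DataTypes.IntegerType)   // or .add("id", "Integer")
>         .add("name", DataTypes.StringType)
>
>     val df = spark.read()
>         .option("header", "true")
>         .schema(failingSchema)              // swap in workingSchema and the read succeeds
>         .csv("src/main/resources/retailstore.csv")
>
>     df.show()
>     spark.stop()
> }
> {code}
> Presumably the String overload and {{DataTypes.IntegerType}} both resolve to the singleton {{IntegerType}} object that Spark's internal pattern matches expect, whereas {{IntegerType()}} constructs a distinct instance, hence the {{scala.MatchError}}.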
> *Ask*
>  # Why does it not accept {{IntegerType}}, {{StringType}}, etc. as {{DataType}}
> arguments supplied through the {{add}} function in {{StructType}}?
>  # If this is a bug, do we know when a fix can come?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
