Fengyu Cao created SPARK-44025:
----------------------------------

             Summary: CSV Table Read Error with CharType(length) column
                 Key: SPARK-44025
                 URL: https://issues.apache.org/jira/browse/SPARK-44025
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.4.0
         Environment: {{apache/spark:v3.4.0 image}}
            Reporter: Fengyu Cao


Problem:
 # Read a table stored in CSV format.
 # The table has a `CharType(length)` column.
 # The read fails with an exception: `org.apache.spark.SparkException: Job 
aborted due to stage failure: Task 0 in stage 36.0 failed 4 times, most recent 
failure: Lost task 0.3 in stage 36.0 (TID 72) (10.113.9.208 executor 11): 
java.lang.IllegalArgumentException: requirement failed: requiredSchema 
(struct<name:string,age:int,job:string>) should be the subset of dataSchema 
(struct<name:string,age:int,job:string>).`
 Note that the two schemas in the message print identically, yet the subset 
check still fails.
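
The error message is confusing because both schemas print the same way. One possible reading, offered only as an assumption and not confirmed by this report, is that the two schemas differ in something the printed form omits, such as per-field metadata recording the declared CHAR length. The sketch below is a toy model in plain Python, not Spark code; the `Field` class, the `declared_type` metadata key, and the `is_subset` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Field:
    """Toy stand-in for a schema field (hypothetical, not a Spark class)."""
    name: str
    type_name: str
    # Extra information that takes part in equality but is not printed.
    metadata: Tuple[Tuple[str, str], ...] = ()

    def render(self) -> str:
        # The printed form deliberately ignores metadata.
        return f"{self.name}:{self.type_name}"


def render_schema(fields: Tuple[Field, ...]) -> str:
    return "struct<" + ",".join(f.render() for f in fields) + ">"


def is_subset(required: Tuple[Field, ...], data: Tuple[Field, ...]) -> bool:
    # Field equality compares name, type_name AND metadata.
    return all(f in data for f in required)


# Assumption: one side keeps a marker for the declared CHAR(4) type,
# the other side has lost it. Which side is which is not known here.
data_schema = (
    Field("name", "string"),
    Field("age", "int"),
    Field("job", "string", (("declared_type", "char(4)"),)),
)
required_schema = (
    Field("name", "string"),
    Field("age", "int"),
    Field("job", "string"),
)

same_text = render_schema(required_schema) == render_schema(data_schema)
subset_ok = is_subset(required_schema, data_schema)
print(render_schema(required_schema))
print(render_schema(data_schema))
print("printed forms equal:", same_text)
print("subset check passes:", subset_ok)
```

In this toy model the printed forms match while the subset check fails, which is the same shape as the message above. Whether Spark 3.4.0 fails for this reason would need to be verified against the Spark source.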

 

Reproduce with the official image:
 # {{docker run -it apache/spark:v3.4.0 /opt/spark/bin/spark-sql}}
 # {{CREATE TABLE csv_bug (name STRING, age INT, job CHAR(4)) USING CSV OPTIONS 
('header' = 'true', 'sep' = ';') LOCATION 
"/opt/spark/examples/src/main/resources/people.csv";}}
 # {{SELECT * FROM csv_bug;}}
 # ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: requirement failed: requiredSchema 
(struct<name:string,age:int,job:string>) should be the subset of dataSchema 
(struct<name:string,age:int,job:string>).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
