Fengyu Cao created SPARK-44025:
----------------------------------
Summary: CSV Table Read Error with CharType(length) column
Key: SPARK-44025
URL: https://issues.apache.org/jira/browse/SPARK-44025
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.4.0
Environment: {{apache/spark:v3.4.0 image}}
Reporter: Fengyu Cao
Problem:
# Read a table stored in CSV format.
# The table has a `CharType(length)` column.
# Reading the table fails with an exception: `org.apache.spark.SparkException: Job
aborted due to stage failure: Task 0 in stage 36.0 failed 4 times, most recent
failure: Lost task 0.3 in stage 36.0 (TID 72) (10.113.9.208 executor 11):
java.lang.IllegalArgumentException: requirement failed: requiredSchema
(struct<name:string,age:int,job:string>) should be the subset of dataSchema
(struct<name:string,age:int,job:string>).`
To reproduce with the official image:
# {{docker run -it apache/spark:v3.4.0 /opt/spark/bin/spark-sql}}
# {{CREATE TABLE csv_bug (name STRING, age INT, job CHAR(4)) USING CSV OPTIONS
('header' = 'true', 'sep' = ';') LOCATION
"/opt/spark/examples/src/main/resources/people.csv";}}
# {{SELECT * FROM csv_bug;}}
# ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: requirement failed: requiredSchema
(struct<name:string,age:int,job:string>) should be the subset of dataSchema
(struct<name:string,age:int,job:string>).
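The error message is puzzling because the two schemas it prints are identical. One possible explanation, which this report does not confirm, is that the subset check compares fields on more than what the printed form shows, for example per-field metadata recording the original CHAR(4) type. The pure-Python sketch below does not use Spark; `Field`, `catalog_string`, `is_subset`, and the `char_type` metadata key are hypothetical names chosen only to illustrate how such a mismatch could arise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field (not Spark's StructField)."""
    name: str
    data_type: str
    metadata: tuple = ()  # hashable stand-in for a metadata map


def catalog_string(schema):
    # Render only name and type, so metadata is invisible in the output.
    return "struct<" + ",".join(f"{f.name}:{f.data_type}" for f in schema) + ">"


def is_subset(required, data):
    # Field equality includes metadata, so fields that print the same can differ.
    return set(required) <= set(data)


# Assumption: one side carries metadata about the CHAR(4) column, the other does not.
data_schema = [
    Field("name", "string"),
    Field("age", "int"),
    Field("job", "string", (("char_type", "char(4)"),)),
]
required_schema = [
    Field("name", "string"),
    Field("age", "int"),
    Field("job", "string"),
]

print(catalog_string(required_schema))
print(catalog_string(data_schema))
print(is_subset(required_schema, data_schema))
```

Under this assumption both schemas render the same way while the subset check still fails, which would match the shape of the message above. Whether Spark 3.4.0 behaves this way for this table is not established here.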