Max Gekk created SPARK-46890:
--------------------------------

             Summary: CSV fails on a column with default and without enforcing schema
                 Key: SPARK-46890
                 URL: https://issues.apache.org/jira/browse/SPARK-46890
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Max Gekk
            Assignee: Max Gekk

A query fails when we create a table using CSV on an existing file with a header and:
- a column has a default value
- enforceSchema is false
- the CSV header is taken into account (header is true)

The example below shows the issue:
{code:sql}
CREATE TABLE IF NOT EXISTS products (
  product_id INT,
  name STRING,
  price FLOAT default 0.0,
  quantity INT default 0
)
USING CSV
OPTIONS (
  header 'true',
  inferSchema 'false',
  enforceSchema 'false',
  path '/Users/maximgekk/tmp/products.csv'
);
{code}

The CSV file products.csv:
{code}
product_id,name,price,quantity
1,Apple,0.50,100
2,Banana,0.25,200
3,Orange,0.75,50
{code}

The query fails:
{code:sql}
spark-sql (default)> SELECT price FROM products;
24/01/28 11:43:09 ERROR Executor: Exception in task 0.0 in stage 8.0 (TID 6)
java.lang.IllegalArgumentException: Number of column in CSV header is not equal to number of fields in the schema:
 Header length: 4, schema size: 1
CSV file: file:///Users/maximgekk/tmp/products.csv
{code}
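
The numbers in the error message (header length 4, schema size 1) are consistent with the header being validated against a schema that holds only the selected column. The Python sketch below is not Spark code. It only mimics that kind of length check on the same file contents. The helper check_header and the pruned_schema list are hypothetical names introduced for illustration, and whether column pruning is the actual cause is an assumption that the report itself does not state.

```python
import csv
import io

# Same contents as products.csv in the report.
CSV_TEXT = """product_id,name,price,quantity
1,Apple,0.50,100
2,Banana,0.25,200
3,Orange,0.75,50
"""


def check_header(header, schema_fields):
    """Hypothetical strict check: header length must equal the schema size."""
    if len(header) != len(schema_fields):
        raise ValueError(
            f"Header length: {len(header)}, schema size: {len(schema_fields)}"
        )


# Read only the header row of the file.
header = next(csv.reader(io.StringIO(CSV_TEXT)))

full_schema = ["product_id", "name", "price", "quantity"]
# Assumption: the schema is reduced to the single column in SELECT price.
pruned_schema = ["price"]

# The full table schema matches the header, so this call does not raise.
check_header(header, full_schema)

# The one-column schema does not match the four-column header.
try:
    check_header(header, pruned_schema)
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)

print(mismatch)
```

Under this assumption, a strict header check can only pass when it is given the full table schema rather than the columns required by the query.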