johnkerl opened a new issue, #16017: URL: https://github.com/apache/datafusion/issues/16017
### Describe the bug

Doing `describe` with any upper-case or dot characters in column names results in

```
Error: Execution("Schema error: No field named ...")
```

### To Reproduce

`./Cargo.toml`
```
[package]
name = "describe-with-case"
version = "0.1.0"
edition = "2024"

[dependencies]
"datafusion" = "47"
tokio = { version = "1.45", features = ["rt", "rt-multi-thread", "macros"] }
```

`src/main.rs`
```
use std::env;

use datafusion::error::Result;
use datafusion::execution::context::SessionContext;
use datafusion::prelude::CsvReadOptions;

#[tokio::main]
async fn main() -> Result<()> {
    let args: Vec<String> = env::args().collect();
    let ctx = SessionContext::new();

    for arg in args.iter().skip(1) {
        println!("");
        println!("Filename: {arg}");
        let df = ctx.read_csv(arg, CsvReadOptions::new()).await?;
        let stat = df.describe().await?.collect().await?;
        println!("{stat:?}");
    }

    Ok(())
}
```

`./desc-good.csv`
```
abc,def,ghi
1,2,3
4,5,6
7,8,9
```

`./desc-bad.csv`
```
abc,Def,gh.i
1,2,3
4,5,6
7,8,9
```

### Expected behavior

With column names `abc,def,ghi` we see

`cargo run ./desc-good.csv`
```
Argument ./desc-good.csv
[RecordBatch { schema: Schema { fields: [Field { name: "describe", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "abc", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "def", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "ghi", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, columns: [StringArray [ "count", "null_count", "mean", "std", "min", "max", "median", ], PrimitiveArray<Float64> [ 3.0, 0.0, 4.0, 3.0, 1.0, 7.0, 4.0, ], PrimitiveArray<Float64> [ 3.0, 0.0, 5.0, 3.0, 2.0, 8.0, 5.0, ], PrimitiveArray<Float64> [ 3.0, 0.0, 6.0, 3.0, 3.0, 9.0, 6.0, ]], row_count: 7 }]
```

With column names `abc,Def,gh.i` I would expect similar.
But I actually see:

`cargo run ./desc-bad.csv`
```
Argument ./desc-bad.csv
Error: Execution("Schema error: No field named def. Valid fields are \"?table?\".abc, \"?table?\".\"Def\", \"?table?\".\"gh.i\".")
```

### Additional context

_No response_
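One possible reading of the error message: the lookup asked for `def` while the schema holds `"Def"`, which looks like SQL-style identifier normalization being applied to the column name somewhere inside `describe`. The sketch below is a toy model of that behavior using only the standard library. It is not DataFusion's code; `normalize_ident` and `quote_ident` are hypothetical helpers written for illustration, and they assume column names contain no embedded double quotes.

```rust
/// Hypothetical stand-in for SQL-style identifier normalization (an
/// assumption, not DataFusion's implementation): unquoted text is
/// lowercased and split on '.', while text inside double quotes is kept
/// verbatim, dots and case included.
fn normalize_ident(input: &str) -> Vec<String> {
    let mut parts = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in input.chars() {
        match ch {
            // A double quote toggles quoted mode and is not part of the name.
            '"' => in_quotes = !in_quotes,
            // An unquoted dot ends the current part (qualifier vs. column).
            '.' if !in_quotes => parts.push(std::mem::take(&mut current)),
            // Quoted characters are preserved exactly.
            c if in_quotes => current.push(c),
            // Unquoted characters are folded to lowercase.
            c => current.extend(c.to_lowercase()),
        }
    }
    parts.push(current);
    parts
}

/// Hypothetical helper: wrap a raw column name in double quotes.
/// Assumes the name itself contains no double quotes.
fn quote_ident(name: &str) -> String {
    format!("\"{name}\"")
}

fn main() {
    // The three column names from desc-bad.csv.
    for name in ["abc", "Def", "gh.i"] {
        println!(
            "{name:>5}: unquoted -> {:?}, quoted -> {:?}",
            normalize_ident(name),
            normalize_ident(&quote_ident(name))
        );
    }
}
```

Under this model, `Def` turns into `def` (matching the `No field named def` message), and `gh.i` would be split into a qualifier `gh` and a column `i`, whereas the quoted forms survive unchanged. If this is indeed the cause, referring to the columns by their exact, unnormalized names inside `describe` would presumably avoid the error.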