VungleTienan opened a new issue, #8660:
URL: https://github.com/apache/arrow-datafusion/issues/8660

   ### Describe the bug
   
   Hey there. DataFusion does not seem to recognize an existing field name
   when running an aggregation on a Parquet file. The code fails to run
   with the following error:
   ```
   Error: SchemaError(FieldNotFound { field: Column { relation: None, name: "fl_date" }, valid_fields: [Column { relation: Some(Bare { table: "?table?" }), name: "FL_DATE" }, Column { relation: Some(Bare { table: "?table?" }), name: "DEP_DELAY" }, Column { relation: Some(Bare { table: "?table?" }), name: "FL_DATE" }, Column { relation: Some(Bare { table: "?table?" }), name: "DEP_DELAY" }] })
   ```
   
   Am I making a mistake somewhere?
   
   ### To Reproduce
   
   1. Download the flights 1m data:
   https://www.tablab.app/datasets/sample/parquet
   
   2. Run the code below:
   ```Rust
   use datafusion::{
       arrow::datatypes::{DataType, Field, Schema},
       prelude::*,
   };
   
   #[tokio::main]
   async fn main() -> datafusion::error::Result<()> {
       let ctx: SessionContext = SessionContext::new();
       let schema = Schema::new(vec![
           Field::new("FL_DATE", DataType::Utf8, true),
           Field::new("DEP_DELAY", DataType::Int32, true),
       ]);
       let df = ctx
           .read_parquet(
               "../../dataset/flights.parquet",
               ParquetReadOptions::default().schema(&schema),
           )
           .await?;
       let df = df
           .select_columns(&["FL_DATE", "DEP_DELAY"])?
           .aggregate(vec![col("FL_DATE")], vec![sum(col("DEP_DELAY"))])?;
       df.show().await?;
       Ok(())
   }
   ```
   
   ### Expected behavior
   
   The aggregated data is displayed.
   
   ### Additional context
   
   Cargo.toml
   
   ```
   [package]
   name = "data_engines"
   version = "0.1.0"
   edition = "2021"
   
   # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
   
   [dependencies]
   datafusion = "34"
   tokio = { version = "1.35.1", features = ["full"] }
   
   ```

